Below I have implemented a linear regression algorithm using the least squares method. Everything works fine, but I can't figure out how to plot the best-fit line on the scatterplot that I want to show at the end of the code.

Below is my code with dataset linked.

Dataset: Salary data - Simple linear regression | Kaggle

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def PredictSalary(experience):
    df = pd.read_csv("Salary_Data.csv")

    x_values = np.array(df["YearsExperience"].values)
    y_values = np.array(df["Salary"].values)

    # Calculating Slope & Intercept using the formula.
    n = len(x_values)
    x_mean = np.mean(x_values)
    y_mean = np.mean(y_values)
    numerator_value = 0
    denominator_value = 0

    for i in range(n):
        numerator_value += (x_values[i] - x_mean) * (y_values[i] - y_mean)
        denominator_value += (x_values[i] - x_mean) ** 2

    m = numerator_value / denominator_value
    b = y_mean - (m * x_mean)

    # Parameter passed to the function.
    x = experience
    y = m * x + b

    y_rounded = round(y, 0)
    print("Salary for a Candidate with Experience of {} is {} ".format(x, y_rounded))

    # Plots the dataset in a scatterplot.
    plt.scatter(x_values, y_values, edgecolors="red", color="blue", alpha=0.6)
    plt.xlabel("Experience (years)")
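    plt.ylabel("Salary")
    plt.title("Salary Per Month")
    # Display the figure; without this call a script exits without showing the plot.
    plt.show()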


What I have tried:

I looked at a few examples but didn't find any good ones.
Updated 6-Mar-21 12:00pm
Gerry Schmitz 7-Mar-21 14:08pm    
It's a straight line; plot the line itself: y = m * x + b
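
To make that suggestion concrete, here is a minimal sketch of one way to draw the line over the scatterplot. It assumes the slope and intercept are computed with the same least-squares formula as in the question. The helper names fit_line and plot_fit are hypothetical, and the demo values are made up to stand in for Salary_Data.csv. Since a straight line is fully determined by two points, evaluating y = m * x + b at the smallest and largest x is enough.

```python
import numpy as np
import matplotlib.pyplot as plt


def fit_line(x_values, y_values):
    """Return slope m and intercept b of the least-squares line (hypothetical helper)."""
    x = np.asarray(x_values, dtype=float)
    y = np.asarray(y_values, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Same formula as the loop in the question, written with array operations.
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b = y_mean - m * x_mean
    return m, b


def plot_fit(x_values, y_values, m, b):
    """Scatter the data and overlay the line y = m * x + b (hypothetical helper)."""
    x = np.asarray(x_values, dtype=float)
    fig, ax = plt.subplots()
    ax.scatter(x, y_values, edgecolors="red", color="blue", alpha=0.6)
    # Two endpoints are enough to draw a straight line across the data range.
    x_line = np.array([x.min(), x.max()])
    ax.plot(x_line, m * x_line + b, color="green",
            label=f"y = {m:.2f}x + {b:.2f}")
    ax.set_xlabel("Experience (years)")
    ax.set_ylabel("Salary")
    ax.legend()
    return ax


if __name__ == "__main__":
    # Made-up sample values, NOT taken from the Kaggle dataset.
    x_demo = [1.1, 2.0, 3.2, 4.5, 5.9]
    y_demo = [39000, 43000, 57000, 61000, 81000]
    m, b = fit_line(x_demo, y_demo)
    plot_fit(x_demo, y_demo, m, b)
    plt.show()
```

Inside PredictSalary, the equivalent would probably be a single call such as plt.plot(x_values, m * x_values + b) placed after plt.scatter, since x_values is already a NumPy array and m and b are already computed there.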

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
