Own Implementation of Simple Linear Regression


  • Post category:Data Science

In this post, we build our own implementation of Simple Linear Regression from scratch, writing the math functions ourselves.

To build our own implementation, we first need the hypothesis equation of Simple Linear Regression.
The hypothesis equation is: y = a + b * x
      where y is the predicted value of the dependent variable,
      a is the intercept,
      b is the slope and
      x is the independent variable.

So, to calculate y, we first have to calculate the intercept and the slope.

Intercept formula: a = Ȳ - b * X̄
      where Ȳ is the mean of y (the dependent variable),
      b is the slope and
      X̄ is the mean of X (the independent variable).

Slope formula: b = COVARIANCE( X, y ) / VARIANCE( X )
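
To make the two formulas concrete, here is a minimal sketch that applies them to a tiny dataset. The five points are invented for illustration and are not from the salary dataset, and the sketch uses Python's built-in sum and len for brevity.

```python
# Toy data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_mean = sum(x) / n   # 3.0
y_mean = sum(y) / n   # 4.0

# Sample covariance and sample variance (both divide by n - 1)
covariance = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)
variance = sum((xi - x_mean) ** 2 for xi in x) / (n - 1)

b = covariance / variance   # slope: 1.5 / 2.5
a = y_mean - b * x_mean     # intercept: 4.0 - 0.6 * 3.0

print(round(b, 4), round(a, 4))   # 0.6 2.2
```

So for this toy data the fitted line is roughly y = 2.2 + 0.6 * x.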


Step 1: To implement our own linear regression code, we first need our own versions of a few helper functions:

  1. LEN,
  2. SUM,
  3. MEAN,
  4. VARIANCE,
  5. COVARIANCE,
  6. Y_PREDICTION,
  7. ABSOLUTE,
  8. MEAN_ABSOLUTE_ERROR,
  9. MEAN_SQUARED_ERROR and
  10. R_SQUARED


CODES:

1. Len calculation function

#1 LENGTH CALCULATION FUNCTION

def LEN(List):
    n=0
    for i in List:
        n+=1    
    return n



2. Sum calculation function

#2 SUM CALCULATION FUNCTION

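def SUM(List):
    # Strings are concatenated; an empty list falls through to a numeric sum of 0
    if LEN(List)>0 and type(List[0])==str:
        st=''
        for i in List:
            st+=i
        return st
    total=0   # named total so it does not shadow the SUM function itself
    for i in List:
        total+=i
    return total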



3. Mean calculation function

#3 MEAN CALCULATION FUNCTION

def MEAN(List):
    return SUM(List)/LEN(List)



4. Variance calculation function

#4 VARIANCE CALCULATION FUNCTION

def VARIANCE(List):
    mean=MEAN(List)   # compute the mean once instead of once per element
    return SUM([(x-mean)**2 for x in List])/(LEN(List)-1)
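
As a quick sanity check, the sketch below compares a from-scratch sample variance with statistics.variance from Python's standard library, which also divides by n - 1. The function name sample_variance and the sample values are illustrative and not part of the original code.

```python
import statistics

def sample_variance(values):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = 0
    total = 0
    for v in values:
        n += 1
        total += v
    mean = total / n
    squared_dev = 0
    for v in values:
        squared_dev += (v - mean) ** 2
    return squared_dev / (n - 1)

data = [1.1, 2.0, 3.2, 4.5, 5.9]   # made-up values
print(sample_variance(data))
print(statistics.variance(data))   # should agree up to floating-point rounding
```

Dividing by n - 1 rather than n gives the sample variance, which is the usual choice when the data is a sample from a larger population.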



5. Covariance calculation function

#5 COVARIANCE CALCULATION FUNCTION

def COVARIANCE(list1, list2):
    # Compute each mean once instead of once per element
    mean1=MEAN(list1)
    mean2=MEAN(list2)
    return SUM([(list1[i]-mean1)*(list2[i]-mean2) for i in range(0,LEN(list1))])/(LEN(list1)-1)
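
One detail worth noting: COVARIANCE and VARIANCE both divide by n - 1, so that factor cancels in the slope b = COVARIANCE / VARIANCE. The sketch below computes the slope with the n - 1 divisor, with the n divisor, and with no averaging at all, and the three values should match up to rounding. The helper name slope and the data are made up for illustration.

```python
def slope(x, y, divisor):
    """Slope b = covariance / variance, with both terms divided by `divisor`."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov = sum((x[i] - x_mean) * (y[i] - y_mean) for i in range(n)) / divisor
    var = sum((x[i] - x_mean) ** 2 for i in range(n)) / divisor
    return cov / var

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

b_sample = slope(x, y, n - 1)    # sample statistics (divide by n - 1)
b_population = slope(x, y, n)    # population statistics (divide by n)
b_raw = slope(x, y, 1)           # plain sums, no averaging

print(b_sample, b_population, b_raw)
```

The choice of divisor therefore matters when you report the variance or covariance themselves, but not for the fitted slope.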



6. Y-Prediction calculation function

#6 Y-PREDICTION CALCULATION FUNCTION

def Y_PREDICTION(a, b, x):
    return [a+b*x[i] for i in range(0,LEN(x))]



7. Absolute calculation function

#7 ABSOLUTE CALCULATION FUNCTION

def ABSOLUTE(value):
    if (value<0):
        value*=-1
    return value



8. Mean-Absolute-Error calculation function

#8 MEAN ABSOLUTE ERROR CALCULATION FUNCTION

def MEAN_ABSOLUTE_ERROR(y_pred, y_test):
    return SUM([ABSOLUTE(y_pred[i]-y_test[i]) for i in range(0,LEN(y_test))])/LEN(y_test)



9. Mean-Squared-Error calculation function

#9 MEAN SQUARED ERROR CALCULATION FUNCTION

def MEAN_SQUARED_ERROR(y_pred, y_test):
    return SUM([(y_pred[i]-y_test[i])**2 for i in range(0,LEN(y_test))])/LEN(y_test)



10. R-Squared calculation function

#10 R SQUARED CALCULATION FUNCTION

def R_SQUARED(y_pred, y_test, yMean):
    # R^2 = 1 - (residual sum of squares)/(total sum of squares)
    ss_res=SUM([(y_test[i]-y_pred[i])**2 for i in range(0,LEN(y_test))])
    ss_tot=SUM([(y_test[i]-yMean)**2 for i in range(0,LEN(y_test))])
    return 1-ss_res/ss_tot
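
To see what the three metrics measure, here is a small worked example with three actual values and three predictions. It is only a sketch using built-in sum, len and abs; the numbers are invented and unrelated to the salary data.

```python
y_test = [3, 5, 7]   # actual values (made up)
y_pred = [2, 5, 9]   # predictions (made up); the errors are -1, 0 and +2

n = len(y_test)

# MAE: average size of the errors -> (1 + 0 + 2) / 3
mae = sum(abs(p - t) for p, t in zip(y_pred, y_test)) / n

# MSE: average squared error -> (1 + 0 + 4) / 3
mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_test)) / n

# R-squared: 1 - (residual sum of squares) / (total sum of squares)
y_mean = sum(y_test) / n                                      # 5.0
ss_res = sum((t - p) ** 2 for p, t in zip(y_pred, y_test))    # 5
ss_tot = sum((t - y_mean) ** 2 for t in y_test)               # 8.0
r_squared = 1 - ss_res / ss_tot

print(mae)         # 1.0
print(mse)
print(r_squared)   # 0.375
```

Notice that the single error of 2 contributes 4 of the 5 units in the squared sum, which is why MSE reacts more strongly to large errors than MAE does.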




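Step 2: Now we import our dataset and put the independent variable into X and the dependent variable into y_hat (the name used in the code; it holds the actual salaries, not predictions). Get the dataset from this Link – Dataset.csv and save it as SimpleLinearRegression.csv, the file name the code below reads.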

import pandas as pd

df = pd.read_csv('SimpleLinearRegression.csv')
print(df.head())

X = df['YearsExperience']
y_hat = df['Salary']
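
If you want to stay dependency-free at this step too, the two columns can also be read with the standard-library csv module instead of pandas. This is only a sketch: load_columns is a hypothetical helper name, and it assumes the file has a header row containing the YearsExperience and Salary columns used above.

```python
import csv

def load_columns(path, x_col, y_col):
    """Read two numeric columns from a CSV file into plain Python lists."""
    xs, ys = [], []
    with open(path, newline='') as f:
        for row in csv.DictReader(f):   # the header row supplies the keys
            xs.append(float(row[x_col]))
            ys.append(float(row[y_col]))
    return xs, ys

# Example usage (assumes the file is in the working directory):
# X, y_hat = load_columns('SimpleLinearRegression.csv', 'YearsExperience', 'Salary')
```

Plain lists are indexed by position, so they work directly with the helper functions from Step 1.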



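Step 3: We split the dataset into training and test sets and reset their indexes with drop=True. The reset matters because our functions access elements by position (for example list1[i]), and after a shuffled split the original row labels no longer run from 0 to n-1.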

from sklearn.model_selection import train_test_split

X_train,  X_test, y_train, y_test = train_test_split(X, y_hat, test_size = 1/3, random_state = 42 )

X_train = X_train.reset_index(drop=True)
X_test = X_test.reset_index(drop=True)
y_train = y_train.reset_index(drop=True)
y_test = y_test.reset_index(drop=True)
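
In the same from-scratch spirit, the split itself can be written without scikit-learn. The sketch below is a simple stand-in, not a reimplementation of train_test_split: the name simple_train_test_split is made up, it works on plain lists, and the same seed will not reproduce the rows scikit-learn selects.

```python
import random

def simple_train_test_split(x, y, test_size=1/3, seed=42):
    """Shuffle the row positions, then cut them into train and test parts."""
    n = len(x)
    positions = list(range(n))
    random.Random(seed).shuffle(positions)   # reproducible shuffle

    n_test = int(round(n * test_size))
    test_pos = positions[:n_test]
    train_pos = positions[n_test:]

    x_train = [x[i] for i in train_pos]
    x_test = [x[i] for i in test_pos]
    y_train = [y[i] for i in train_pos]
    y_test = [y[i] for i in test_pos]
    return x_train, x_test, y_train, y_test

# Made-up data: 9 rows, so a 1/3 test size gives 3 test rows
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [10, 20, 30, 40, 50, 60, 70, 80, 90]
x_train, x_test, y_train, y_test = simple_train_test_split(x, y)
print(len(x_train), len(x_test))   # 6 3
```

Because the results are new lists, their positions already run from 0 upward and no index reset is needed.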



Step 4: Lastly, we calculate the intercept a and the slope b from their formulas, and then compute the predictions, MAE, MSE and R-Squared value with our own functions.

b = COVARIANCE(X_train,y_train)/VARIANCE(X_train)
a = MEAN(y_train)-b*MEAN(X_train)

y_pred = Y_PREDICTION(a, b, X_test)

mae = MEAN_ABSOLUTE_ERROR(y_pred, y_test)
mse = MEAN_SQUARED_ERROR(y_pred, y_test)
r_squared = R_SQUARED(y_pred, y_test, MEAN(y_test))

print('Mean Absolute Error :',mae)
print('Mean Squared Error  :',mse)
print('R-Squared Value     :',r_squared)
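
The steps above can also be bundled into a small class so that fitting and predicting live in one place. This is a sketch rather than part of the original code: the class name SimpleLinearRegression and its fit and predict methods are hypothetical, and it uses built-in sum and len on plain lists.

```python
class SimpleLinearRegression:
    """Fits y = a + b * x using b = covariance(x, y) / variance(x)."""

    def fit(self, x, y):
        n = len(x)
        x_mean = sum(x) / n
        y_mean = sum(y) / n
        # The (n - 1) divisors of covariance and variance cancel, so plain sums suffice
        sxy = sum((x[i] - x_mean) * (y[i] - y_mean) for i in range(n))
        sxx = sum((x[i] - x_mean) ** 2 for i in range(n))
        self.b = sxy / sxx                  # slope
        self.a = y_mean - self.b * x_mean   # intercept
        return self

    def predict(self, x):
        return [self.a + self.b * xi for xi in x]

# Made-up points lying exactly on y = 1 + 2x
model = SimpleLinearRegression().fit([0, 1, 2, 3], [1, 3, 5, 7])
print(model.a, model.b)        # 1.0 2.0
print(model.predict([4, 5]))   # [9.0, 11.0]
```

Since the sample points lie exactly on a line, the fit recovers that line's intercept and slope, which makes this a handy smoke test before running on real data.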

Bonus: We can also plot our linear regression line. In the code below, I plot the graph for the training dataset.
# Salary vs. Years of experience (For Training Set)

import matplotlib.pyplot as plt

plt.scatter(X_train, y_train, color='red')
plt.plot(X_train, Y_PREDICTION(a, b, X_train), color='blue')
plt.title('Salary vs. Years of experience (For Training Set)')
plt.xlabel('Years of experience')
plt.ylabel('Salary')
plt.show()

Training dataset graph
I also plotted the graph for the testing dataset.
# Salary vs. Years of experience (For Test Set)

plt.scatter(X_test, y_test, color='red')
plt.plot(X_test, y_pred, color='blue')
plt.title('Salary vs. Years of experience (For Test Set)')
plt.xlabel('Years of experience')
plt.ylabel('Salary')
plt.show()

Testing dataset graph



If you want to learn how to implement Simple Linear Regression with Python built-in functions, click here.

If you want to take the Datacamp course “Supervised Learning with scikit-learn”, click here.

Thank you for reading my blog. If you have any questions about this code, feel free to ask in the comments. Thank you again.
