Ordinary Least Squares (OLS) Regression - Estimate the Relationship between Stock Average Price and SMA Value

Ordinary Least Squares (OLS) is a way to figure out the line that best fits a bunch of data points. Imagine you have a scatterplot with dots all over it. You want to draw a straight line that comes as close as possible to all those dots. That’s what OLS does.

[Figure: scatter of data points with several candidate lines]

OLS measures the vertical distance between each dot and the line (the residual). It squares these distances, so that positive and negative errors cannot cancel each other out, and adds them all up. The best-fitting line is the one whose slope and intercept make this total squared distance as small as possible.
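To make the "total squared distance" concrete, here is a minimal sketch that scores two candidate lines on a handful of made-up points. The data and the helper name sum_squared_residuals are assumptions for illustration only; they are not part of the Reliance script.

```python
import numpy as np


def sum_squared_residuals(x, y, slope, intercept):
    """Total squared vertical distance between the points and a candidate line."""
    predicted = slope * x + intercept
    return float(np.sum((y - predicted) ** 2))


# Made-up points that lie close to the line y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Two candidate lines: one close to the points, one clearly off
sse_close = sum_squared_residuals(x, y, slope=2.0, intercept=0.0)
sse_far = sum_squared_residuals(x, y, slope=1.0, intercept=2.0)

print("close line:", sse_close)
print("far line:  ", sse_far)
```

The line with the smaller total is the better fit; OLS picks the slope and intercept for which no other line has a smaller total.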

Overview:

This project analyzes Reliance stock data using linear regression, specifically the Ordinary Least Squares (OLS) method. The Python script takes historical daily data and calculates the best-fit line, that is, its intercept and slope. The data starts on January 1, 2020, and runs up to the end date passed to the download function (2024-12-31 by default in the script), with a timeframe of one day (daily bars, not intraday).

The regression is run between the average of the Open, High, Low, and Close prices and the Simple Moving Average (SMA) trading indicator computed from that average price.

Understand Equation Behind the OLS:

How to calculate Ordinary Least Squares (OLS)

Standard Equation to represent Linear Regression

  1. Set up your regression equation: Start with the equation for a straight line: $y = mx + b$

  2. Calculate the mean of your independent and dependent variables: Find the average of all your x-values (call it $\bar{x}$, "x bar") and all your y-values (call it $\bar{y}$, "y bar").

  3. Calculate the deviations: Subtract the mean of each variable from every data point. For each data point $(x_i, y_i)$ you'll have: $(x_i - \bar{x})$ and $(y_i - \bar{y})$

  4. Calculate the slope (m): $m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$

  5. Calculate the y-intercept (b): Use the formula: $b = \bar{y} - m \times \bar{x}$

  6. Plug the slope and intercept into your regression equation: Once you've found m and b, you can use them to write your regression equation:

$\hat{y} = mx + b$

  • $\hat{y}$ ("y cap") represents the predicted value of y.
  • $x$ is your independent variable.
  • m and b are the slope and y-intercept you found earlier.
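The six steps above can be sketched directly in NumPy. This is only an illustration on made-up data; the function name ols_fit is an assumption and does not appear in the project code, which relies on sklearn instead.

```python
import numpy as np


def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 2: means of the independent and dependent variables
    x_bar, y_bar = x.mean(), y.mean()

    # Step 3: deviations of every point from the means
    dx = x - x_bar
    dy = y - y_bar

    # Step 4: slope = sum(dx * dy) / sum(dx^2)
    m = np.sum(dx * dy) / np.sum(dx ** 2)

    # Step 5: intercept = y_bar - m * x_bar
    b = y_bar - m * x_bar
    return m, b


# Made-up points lying exactly on y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

m, b = ols_fit(x, y)

# Step 6: use the fitted line to predict y for a new x
y_hat = m * 6 + b
print(m, b, y_hat)
```

Because the points sit exactly on a line, the fit recovers that line's slope and intercept; with noisy data the same formulas give the line with the smallest total squared error.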


Let’s Code the above in Python

Data Source:

The script uses historical data retrieved from Yahoo Finance through the yfinance library. The data starts on January 1, 2020, with a timeframe of 1 day (daily bars, not intraday).

Download Stocks Data From Yahoo Finance API

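import pandas as pd
import yfinance as yf


class FinancialData:

    def __init__(self):
        print("Get Historical Data")
        print("Skipping major functions - Validation of Data, etc,etc")

    def get_historical_data(self, ticker="RELIANCE.NS", starting_Date='2020-01-01', last_date='2024-12-31'):
        try:
            # Daily bars by default; yfinance treats `end` as exclusive
            data = yf.download(tickers=ticker, start=starting_Date, end=last_date)
            # Some yfinance versions return MultiIndex columns (field, ticker)
            # even for one ticker; keep only the field level so data['Open'] is a Series
            if isinstance(data.columns, pd.MultiIndex):
                data.columns = data.columns.get_level_values(0)
            return data.reset_index()
        except Exception as e:
            print(f"Error occurred while downloading data from API: {e}")
            # Return an empty DataFrame so callers always get the same type
            return pd.DataFrame()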

        

Implementation:

Code Structure: The code is written in Python and follows a class-based structure. This structure encapsulates functions and variables, making the code organized and modular.

Importing Libraries: We import LinearRegression from sklearn.linear_model for the regression analysis and matplotlib.pyplot for visualization.

sklearn.linear_model provides tools for fitting linear models, including LinearRegression.
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt        

The Q_OLS class encapsulates the Ordinary Least Squares (OLS) regression analysis. Its __init__ method stores the historical data of Reliance stock and immediately runs the OLS regression model.

class Q_OLS:

    def __init__( self , df_data ):
        print("Processing Linear Regression Model to get the Best fit line [OLS ]")        

Data preprocessing:

The code calculates the average price (Avg) of the stock for each day using the formula:

Avg = (Open + High + Low + Close) / 4

It then calculates the Simple Moving Average (SMA) trading indicator based on the average price. The period chosen for SMA calculation is 9 days.

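The first 8 rows have no SMA value because a 9-day window is not yet full, so the rolling mean returns NaN for them. We handle these missing values by dropping every record that contains NaN.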

            # Average of the four daily prices
            self.data['Avg'] = (self.data['Open'] + self.data['High'] + self.data['Low'] + self.data['Close']) / 4
            period = 9
            ema_col_name = f"SMA_{period}"
            # 9-day simple moving average of the average price
            self.data[ema_col_name] = self.data['Avg'].rolling(window=period).mean()
            # Remove records that contain NaN values
            self.data.dropna(inplace=True)
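To see what this preprocessing does to the rows, here is a small sketch on a made-up five-day price table. The prices and the shorter 3-day window are assumptions chosen so the effect is easy to read; the project itself uses a 9-day window on Reliance data.

```python
import pandas as pd

# Made-up OHLC prices for five days (illustration only)
prices = pd.DataFrame({
    "Open":  [10.0, 11.0, 12.0, 13.0, 14.0],
    "High":  [12.0, 13.0, 14.0, 15.0, 16.0],
    "Low":   [8.0, 9.0, 10.0, 11.0, 12.0],
    "Close": [10.0, 11.0, 12.0, 13.0, 14.0],
})

# Average of the four prices for each day
prices["Avg"] = (prices["Open"] + prices["High"] + prices["Low"] + prices["Close"]) / 4

# Simple moving average over a 3-day window (the article uses 9)
period = 3
sma_col = f"SMA_{period}"
prices[sma_col] = prices["Avg"].rolling(window=period).mean()

# The first period - 1 rows have no full window, so their SMA is NaN
print(prices[["Avg", sma_col]])

# Dropping NaN rows leaves only the days with a complete window
cleaned = prices.dropna()
print(len(prices), "rows before,", len(cleaned), "rows after")
```

With a window of n days, exactly n - 1 leading rows are lost, which is why the regression runs on slightly fewer records than were downloaded.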

The __Apply_OLS_reg_model method calculates the average price and Simple Moving Average (SMA) trading indicator, applies the Linear Regression model, and prints the summary of the model.

    def __Apply_OLS_reg_model(self):
        try:
            # Average of the four daily prices
            self.data['Avg'] = (self.data['Open'] + self.data['High'] + self.data['Low'] + self.data['Close']) / 4
            period = 9
            ema_col_name = f"SMA_{period}"
            # 9-day simple moving average of the average price
            self.data[ema_col_name] = self.data['Avg'].rolling(window=period).mean()
            # Remove records that contain NaN values
            self.data.dropna(inplace=True)

            # Feature (independent variable) and target (dependent variable)
            X = self.data[[ema_col_name]]  # 2-D: sklearn expects a feature matrix
            y = self.data['Avg']

            print("Variables : Test & Prepare")

            model = LinearRegression()

            # fit() returns the fitted estimator itself
            result = model.fit(X, y)

            # Summary
            print("Summary of OLS Model is [Y= mX + C]:")
            print('Intercept:', result.intercept_)
            print('Slope:', result.coef_[0])
            print("     -----End-----    ")
            self.__View_output(X, result.predict(X), y)

        except Exception as e:
            print(f"Error occurred while applying model: {e}")

Linear Regression Model:

Variables:

  • X: The independent variable (X) is set as the SMA values.
  • y: The dependent variable (y) is set as the average price.

Output:

  • Intercept: The intercept value of the regression line.
  • Slope: The slope value of the regression line.

Getting Model Summary [Output]: We fit our data into the Linear Regression model and get some important numbers: the intercept (where our line crosses the y-axis) and the slope (how steep our line is). These numbers help us understand the relationship between the average price and the SMA.
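As a sanity check on what intercept_ and coef_ mean, the following sketch fits the same made-up points twice: once with the mean-and-deviation formulas from the equation section and once with LinearRegression. The data are an assumption for illustration; the two results should agree up to floating-point error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up points (illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form OLS estimates
dx = x - x.mean()
dy = y - y.mean()
slope_manual = np.sum(dx * dy) / np.sum(dx ** 2)
intercept_manual = y.mean() - slope_manual * x.mean()

# The same fit with sklearn; X must be 2-D (n_samples, n_features)
model = LinearRegression().fit(x.reshape(-1, 1), y)
slope_sklearn = model.coef_[0]
intercept_sklearn = model.intercept_

print("manual :", slope_manual, intercept_manual)
print("sklearn:", slope_sklearn, intercept_sklearn)
```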


 Model 2 
Get Historical Data
Skipping major functions - Validation of Data, etc,etc
[*********************100%***********************]  1 of 1 completed
number of Record Found is  : 1074
Processing Linear Regression Model to get the Best fit line [OLS ]
Variables : Test & Prepare
Summary of OLS Model is [Y= mX + C]:
Intercept: 11.692441031825638
Slope: 0.9971953168155703
     -----End-----           
Terminal Output
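The intercept and slope printed above define the fitted line Avg = intercept + slope × SMA. As a small worked example, the sketch below plugs a hypothetical SMA value of 2500 into that line; the value 2500 and the function name predict_avg_price are assumptions for illustration.

```python
# Coefficients copied from the terminal output above
intercept = 11.692441031825638
slope = 0.9971953168155703


def predict_avg_price(sma_value):
    """Predicted average price for a given 9-day SMA value."""
    return intercept + slope * sma_value


example_sma = 2500.0  # hypothetical SMA value, not taken from the data
predicted = predict_avg_price(example_sma)
print(round(predicted, 2))  # 2504.68
```

A slope very close to 1 with an intercept that is small relative to the price level suggests that, over this period, the 9-day SMA tracked the daily average price closely, which is expected since the SMA is built from that same average.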

Visualizing the Results:

Finally, we create a plot to visualize our results. It shows the actual data as a scatter of SMA values (x-axis) against average prices (y-axis), with the fitted regression line drawn on top. This helps us see how well our model fits the data.
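Alongside the plot, one common way to put a number on how well a line fits is the coefficient of determination, R². The original script does not compute it; the sketch below shows the idea on synthetic data, using np.polyfit for the straight-line fit. The data, the noise level and the random seed are all assumptions.

```python
import numpy as np

# Synthetic data: a straight line plus random noise (illustration only)
rng = np.random.default_rng(0)
x = np.linspace(100.0, 200.0, 50)
y = 1.0 * x + 5.0 + rng.normal(0.0, 2.0, size=x.size)

# Degree-1 least-squares fit; polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
predicted = slope * x + intercept

# R^2 = 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print("R^2:", r_squared)
```

An R² near 1 means the line explains almost all of the variation in y; values closer to 0 mean the scatter around the line is large compared with the overall spread.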

Complete Python code of the above model:


"""  Developer Details
Name  : Kamal Kumar Chanchal

"""


#Step 1: Import necessary libraries
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt


class Q_OLS:

    def __init__( self , df_data ):
        print("Processing Linear Regression Model to get the Best fit line [OLS ]")
        self.data = df_data

        self.__Apply_OLS_reg_model()


    def __Apply_OLS_reg_model(self):
        try:
            # Average of the four daily prices
            self.data['Avg'] = (self.data['Open'] + self.data['High'] + self.data['Low'] + self.data['Close']) / 4
            period = 9
            ema_col_name = f"SMA_{period}"
            # 9-day simple moving average of the average price
            self.data[ema_col_name] = self.data['Avg'].rolling(window=period).mean()
            # Remove records that contain NaN values
            self.data.dropna(inplace=True)

            # Feature (independent variable) and target (dependent variable)
            X = self.data[[ema_col_name]]  # 2-D: sklearn expects a feature matrix
            y = self.data['Avg']

            print("Variables : Test & Prepare")

            model = LinearRegression()

            # fit() returns the fitted estimator itself
            result = model.fit(X, y)

            # Summary
            print("Summary of OLS Model is [Y= mX + C]:")
            print('Intercept:', result.intercept_)
            print('Slope:', result.coef_[0])
            print("     -----End-----    ")
            self.__View_output(X, result.predict(X), y)

        except Exception as e:
            print(f"Error occurred while applying model: {e}")

    def __View_output(self, X, results, y):
        try:
            # X holds the SMA values, y the actual average prices
            plt.scatter(X, y, color='blue', label='Actual data')
            plt.plot(X, results, color='red', label='Regression line')
            plt.xlabel('SMA Values')
            plt.ylabel('Avg Price')
            plt.title('OLS Regression Analysis')
            plt.legend()
            plt.show()
        except Exception as e:
            print(f"Error occurred while plotting summary of OLS regression model: {e}")

Get Complete Machine Learning Repository on Quant from my GitHub:

GitHub permalink: https://github.com/Coderixc/MachineLearning/blob/437b1fb88571c7123ca480aa73ee18b3848795c6/OrdinaryLeastSquares.py

Calling above code:

The project_2() function is the entry point of the script. It fetches historical stock data for Reliance using the FinancialData class shown earlier, then passes that data to the Q_OLS class from the OrdinaryLeastSquares module to perform the Linear Regression analysis.


import OrdinaryLeastSquares
import getHistoricalData as feed


def project_2():
    print(" Model 2 ")
    # Get stock data -- Reliance
    _t = feed.FinancialData()
    df_stocks_data = _t.get_historical_data()
    print(F"number of Record Found is  : {len(df_stocks_data)}")
    # Constructing Q_OLS runs the regression and shows the plot
    m2 = OrdinaryLeastSquares.Q_OLS(df_stocks_data)


if __name__ == '__main__':
    # Model 2: get the best-fit line (Reliance stocks)
    project_2()
        

Thank you for taking the time to read this post. If you found it informative or interesting, please consider clapping to show your appreciation!


LinkedIn: https://www.dhirubhai.net/in/kamalchanchal

Gmail: [email protected]

You can also read my other posts, such as:

  • BackTesting Strategy Setup: Building a Python Trading Strategy Analyzer
  • View Indicators Value in Trading System with C# and WinForms
  • Black-Scholes in C# Options Pricing Model
  • Algorithmic Trading: Use Tick Data to OHLC Candlesticks with Python
  • Algorithmic Finance: View Historical Index (Nifty 50) Wick to Wick

Explore the full potential of this project by visiting our GitHub repository.

Subscribe for more updates on Algorithmic Trading, financial analysis, and coding adventures using C# and Python. Thanks for reading!

Let’s stay connected and continue the conversation.
