Cost function and Gradient Descent

In the world of machine learning, understanding the concepts of cost function and gradient descent is essential for developing accurate predictive models. These two components work hand in hand to optimize our models and enhance their predictive power. Let's delve into what they are and how they work.

A cost function is a measure of how well our model's predictions match the actual data. It quantifies the errors, i.e., the differences between the actual and predicted values. The greater the error, or in other words, the greater the value of the cost function, the lower the accuracy of the model.

Depending on the type of machine learning problem, different cost functions are used, such as:

  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • Mean Absolute Error (MAE)
  • R² score (coefficient of determination)

MEAN SQUARED ERROR:

It calculates the average of the squared differences between the predicted and actual values. Squaring the errors ensures that negative values do not cancel out positive values.

MSE = (1/n) Σ (yᵢ − ŷᵢ)²

where n is the number of data points, yᵢ is the actual value, and ŷᵢ is the predicted value.
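As a quick numeric check, the formula can be evaluated directly (the sample values below are illustrative):

```python
# Mean Squared Error: average of the squared differences
# between actual and predicted values.
def mse(actual, predicted):
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n

actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]
print(mse(actual, predicted))  # (0.25 + 0 + 1) / 3 ≈ 0.4167
```

Note how squaring makes every term non-negative, so the 0.5 underestimate and the 1.0 overestimate add up instead of cancelling.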

Our main goal is to minimize this error, i.e., the value of the cost function, and the standard technique for doing so is Gradient Descent.

Gradient descent is an optimization algorithm used to find the parameter values (coefficients) that minimize the cost function. In simple linear regression, the predicted line is given by y = mx + c, and the goal is to find the best values of "m" and "c" that produce the best-fit line: "c" determines where the line starts on the y-axis, and "m" determines how it proceeds from there, following the rise/run concept.

Steps Involved in Gradient Descent:

  • Start with initial parameter values (e.g., slope and intercept in linear regression).
  • Make a predicted line using these parameters and calculate MSE.
  • Update the parameters, draw a new line with them, and calculate the MSE again.
  • Repeat the process until the change in the cost function is negligible or a predefined number of iterations is completed.

But the question that arises here is: how to update the parameters?

Well, there are two approaches: one uses a fixed step, the other a variable step.

In the fixed approach, we change "m" and "c" by a fixed amount at each step. The risk is missing the best values of "m" and "c": MSE is minimal only at one specific combination of these parameters, and it starts increasing again once "m" and "c" move past that point. The following graph illustrates this best:


Consider the y-axis as MSE and the x-axis as the parameter values (m and c). At point A, MSE is at its maximum because of the randomly assumed initial parameters, but as we gradually change the parameters, the MSE value changes too. In the variable approach, we update the parameter values based on a calculation, using the following update rule:

m = m − LR × PD(m)

c = c − LR × PD(c)

LR is the learning rate, typically a small value such as 0.001

PD(m) is the partial derivative of the cost function with respect to m

PD(c) is the partial derivative of the cost function with respect to c
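For MSE with predictions ŷᵢ = mxᵢ + c, these partial derivatives have a standard closed form, obtained by applying the chain rule to the MSE formula above:

```latex
% MSE and its partial derivatives for simple linear regression
\text{MSE}(m, c) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (m x_i + c)\bigr)^2

\text{PD}(m) = \frac{\partial\,\text{MSE}}{\partial m}
             = -\frac{2}{n}\sum_{i=1}^{n} x_i\,\bigl(y_i - (m x_i + c)\bigr)

\text{PD}(c) = \frac{\partial\,\text{MSE}}{\partial c}
             = -\frac{2}{n}\sum_{i=1}^{n} \bigl(y_i - (m x_i + c)\bigr)
```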

We keep iterating until we reach point B, i.e., the lowest value of MSE. The iterations and learning rate must be chosen carefully: if the update steps are too large, an iteration can jump past point B, leaving us with a greater MSE than the value at point B.
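This overshooting risk can be demonstrated directly. The toy one-parameter example below (data and learning rates are illustrative, not from the original) fits y = m·x and compares a small learning rate with one that is too large:

```python
# Toy dataset generated from y = 2x, so the best slope m is 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def mse(m):
    # Mean squared error of the line y = m*x on the toy data.
    return sum((y - m * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(m):
    # Partial derivative of MSE with respect to m.
    return -2 / len(xs) * sum(x * (y - m * x) for x, y in zip(xs, ys))

results = {}
for lr in (0.01, 0.5):
    m = 0.0
    for _ in range(10):
        m -= lr * grad(m)      # gradient descent update
    results[lr] = mse(m)
    print(lr, results[lr])
# The small learning rate lowers MSE toward the minimum;
# the large one jumps past it and MSE blows up instead.
```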

Implemented in code, the loop repeatedly computes the partial derivatives, updates "m" and "c", and recalculates the MSE; over about 100 iterations the error settles close to its minimum.
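A minimal sketch of the full procedure for simple linear regression follows; the dataset, learning rate, and iteration count here are illustrative stand-ins, not the article's original snippet:

```python
# Gradient descent for y = m*x + c, minimizing MSE.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]   # generated from y = 2x + 1
n = len(xs)

m, c = 0.0, 0.0   # initial parameter guesses
lr = 0.01         # learning rate (LR)

for i in range(100):
    preds = [m * x + c for x in xs]
    errors = [y - p for y, p in zip(ys, preds)]
    mse = sum(e ** 2 for e in errors) / n
    pd_m = -2 / n * sum(x * e for x, e in zip(xs, errors))  # PD(m)
    pd_c = -2 / n * sum(errors)                             # PD(c)
    m -= lr * pd_m   # variable-step update
    c -= lr * pd_c
    if i % 20 == 0:
        print(f"iter {i}: m={m:.3f}, c={c:.3f}, MSE={mse:.4f}")
```

After 100 iterations, m and c approach the best-fit values (here 2 and 1) and the printed MSE shrinks toward its minimum with each report.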
In summary, understanding cost functions and gradient descent is crucial for optimizing machine learning models. The cost function measures the error between predicted and actual values, while gradient descent is an iterative method used to minimize this error by adjusting model parameters. Mastery of these concepts allows for the development of accurate and efficient predictive models, forming a foundational skill set for any aspiring data scientist or machine learning practitioner.
