Exploring Gradient Descent: A Step-by-Step Implementation in R

Introduction:

Gradient descent is a fundamental optimization algorithm used in machine learning and numerical optimization to find the minimum of a function. In this article, we'll work through a hands-on implementation of gradient descent in the R programming language. Our goal is to minimize the function f(x) = 1.4*(x-2)^2 + 3.2 through a step-by-step process, shedding light on the key parameters and decisions involved in the algorithm.
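
For reference, the generic gradient descent update is x_new = x_old - eta * f'(x_old), where eta is the learning rate. For our target function, f'(x) = 1.4 * 2 * (x - 2) = 2.8 * (x - 2), which is zero at x = 2, so the true minimum is f(2) = 3.2; the iterations below should drive x toward that point.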

Understanding the Objective:

The function f(x) serves as our optimization target, and its gradient (derivative with respect to x) is computed in the "grad" function. The algorithm starts by initializing crucial parameters, including the number of iterations, the stopping threshold, the initial value of x, and the learning rate.

f <- function(x) {
  1.4 * (x - 2)^2 + 3.2
}

grad <- function(x) {
  1.4 * 2 * (x - 2)
}

iterations <- 100   # maximum number of updates
threshold  <- 1e-5  # stop when f(x) changes by less than this
stepSize   <- 0.05  # learning rate
x <- -5             # starting point

xtrace <- x         # history of x values
ftrace <- f(x)      # history of f(x) values
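
Before running the descent, it can be reassuring to verify the hand-derived gradient numerically. The snippet below is a small optional check, not part of the original walkthrough, that compares grad() against a central finite difference:

# Optional sanity check: compare the analytic gradient with a
# central finite difference at a few sample points
h <- 1e-6
for (x0 in c(-5, 0, 3)) {
  numeric_grad <- (f(x0 + h) - f(x0 - h)) / (2 * h)
  cat(sprintf("x = %5.1f  analytic = %8.4f  numeric = %8.4f\n",
              x0, grad(x0), numeric_grad))
}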

Visualizing the Function:

To gain insight into the function f(x), we visualize it over a range of x values. The resulting plot gives a clear picture of the function's behavior.

xs <- seq(-6, 10, len = 1000)
plot(xs, f(xs), type="l", xlab="X", ylab=expression(1.4(x-2)^2 + 3.2))        
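
Since we already know the minimum sits at (2, 3.2), one optional addition, not in the original code, is to mark it on the plot before running the descent, which makes the trajectory easier to read:

# Optional: mark the known minimum at (2, 3.2) for visual reference
abline(v = 2, lty = 2, col = "gray")
points(2, 3.2, pch = 19, col = "blue")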

Executing Gradient Descent:

The main loop of the algorithm iteratively updates the value of x based on the gradient and learning rate. During each iteration, the current x and f(x) values are stored in the vectors xtrace and ftrace. The process continues until the change in f(x) falls below the specified threshold.

for (iter in 1:iterations) {
  x <- x - stepSize * grad(x)   # gradient descent update
  xtrace <- c(xtrace, x)        # record x
  ftrace <- c(ftrace, f(x))     # record f(x)
  points(xtrace, ftrace, type = "b", col = "red", pch = 1)
  # ftrace now has iter + 1 entries; stop once the most recent update
  # changed f(x) by less than the threshold
  if (abs(ftrace[iter + 1] - ftrace[iter]) < threshold) break
}
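
Once the loop exits, a quick check of how many updates ran and where x ended up (it should be close to the true minimizer x = 2) can be done with a couple of lines like these:

# How many updates ran, and where did the descent end up?
cat("updates performed:", length(xtrace) - 1, "\n")
cat("final x:", tail(xtrace, 1), "  final f(x):", tail(ftrace, 1), "\n")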

Analyzing Results:

The tracked values of x and f(x) during the iterations are stored in a data frame, 'df'. This allows for further analysis and visualization of the optimization process.

df <- data.frame(x = xtrace, f = ftrace)
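
For example, a minimal sketch of such an analysis is to plot f(x) against the iteration number, which should show a monotone decrease toward 3.2:

# Convergence curve: f(x) per iteration (iteration 0 is the starting point)
plot(seq_len(nrow(df)) - 1, df$f, type = "b",
     xlab = "Iteration", ylab = "f(x)")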

Conclusion:

By implementing gradient descent in R, we gain a deeper understanding of the optimization process. The choice of learning rate, stopping criteria, and initialization parameters significantly impacts the algorithm's convergence and efficiency. Readers are encouraged to experiment with different learning rates and explore the effects on the optimization outcome. This hands-on approach provides a solid foundation for understanding and implementing gradient descent in various optimization scenarios.
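
One way to run that experiment is to wrap the update loop in a small helper and compare several step sizes. The sketch below is illustrative only; the helper name run_gd and the chosen step sizes are not part of the original article. For this quadratic, f''(x) = 2.8, so step sizes above 2 / 2.8 ≈ 0.71 make the iterates diverge.

# Hypothetical helper for comparing learning rates (illustrative sketch)
run_gd <- function(stepSize, x0 = -5, iterations = 100, threshold = 1e-5) {
  x <- x0
  ftrace <- f(x)
  for (iter in 1:iterations) {
    x <- x - stepSize * grad(x)
    ftrace <- c(ftrace, f(x))
    if (abs(ftrace[iter + 1] - ftrace[iter]) < threshold) break
  }
  c(stepSize = stepSize, iterations = iter, final_x = x)
}

# Compare a slow, a moderate, and a fairly aggressive (but still stable) step size
sapply(c(0.01, 0.05, 0.3, 0.6), run_gd)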

The plot (generated by the code above) shows the function curve with the red points tracing the descent path from x = -5 toward the minimum near x = 2.

The complete code, ready to run:

# This R script defines two functions: `f` and `grad`.
# `f` calculates the value of the function 1.4 * (x-2)^2 + 3.2
# `grad` calculates the gradient of `f`
# Update rule: x_new = x_old - eta * f'(x_old)
# where eta is the learning rate and f'(x_old) is the gradient of f at x_old

f <- function(x) {
  1.4 * (x-2)^2 + 3.2
}
grad <- function(x){
  1.4*2*(x-2)
}

iterations <- 100 
threshold <- 1e-5

#learning rate 
stepSize <- 0.05 

# initialize x 
x <- -5

# initialize vectors to store x and f(x)
xtrace <- x 
ftrace <- f(x) 

# generate series of x values within some range
xs <- seq(-6, 10, len = 1000)
plot(xs, f(xs), type = "l", xlab = "X", ylab = expression(1.4(x-2)^2 + 3.2))

for (iter in 1:iterations) {
  x <- x - stepSize * grad(x)
  xtrace <- c(xtrace, x)
  ftrace <- c(ftrace, f(x))
  points(xtrace, ftrace, type = "b", col = "red", pch = 1)
  # stop once the most recent update changed f(x) by less than the threshold
  if (abs(ftrace[iter + 1] - ftrace[iter]) < threshold) break
}
df <- data.frame(x = xtrace, f = ftrace)

