BAYESIAN OPTIMIZATION - A Hyperparameter Tuning Method
Hi there, everyone!
I am Saurav Kumar, pursuing a master's in Data Science at the University of Massachusetts, and a former Machine Learning Intern and Software Engineer.
This article is for everyone out there struggling to find a good approach for tuning the hyperparameters of a deep neural network.
LET'S START!
BAYESIAN OPTIMIZATION
WHAT
Bayesian Optimization is a method for optimizing objective functions that are expensive and time-consuming to evaluate, such as those that take minutes or hours to compute. It is particularly useful when dealing with functions that have multiple parameters and you want to find the optimal set of parameter values within a specific range.
WHY
Apart from Bayesian Optimization, there are two other common approaches for optimizing objective functions, particularly in the context of machine learning and hyperparameter tuning: Grid Search and Random Search.
So WHY Bayesian Optimization? It is effective because it strategically chooses the next set of parameters to evaluate based on previous results, concentrating on the most promising areas of the parameter space. Grid Search, by contrast, exhaustively evaluates every combination in a fixed grid, and Random Search samples combinations at random; neither learns anything from the evaluations it has already made. The sketch below illustrates the contrast.
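To make this concrete, here is a minimal sketch in R of what Grid Search and Random Search look like for two hyperparameters. The function train_and_score is a hypothetical stand-in for whatever model training and validation you actually run; the key point is that neither strategy uses the scores it has already collected to choose its next candidate.

# Hypothetical stand-in for "train a model with these settings, return validation accuracy"
train_and_score <- function(lstm_units, conv_filters) {
  # ... train and validate a model here ...
  runif(1)  # placeholder score for illustration only
}

# Grid Search: evaluate every combination in a fixed grid
grid <- expand.grid(lstm_units = c(32, 64, 128),
                    conv_filters = c(64, 128, 256))
grid$score <- mapply(train_and_score, grid$lstm_units, grid$conv_filters)

# Random Search: evaluate randomly sampled combinations
n_trials <- 9
random <- data.frame(lstm_units = sample(32:128, n_trials, replace = TRUE),
                     conv_filters = sample(64:256, n_trials, replace = TRUE))
random$score <- mapply(train_and_score, random$lstm_units, random$conv_filters)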
PROCESS
The process involves building a probabilistic model, typically a Gaussian Process, of the objective function. This model is then used to predict the function's behavior across the parameter space and identify the most promising areas to evaluate next. The goal is to balance exploration (trying out new parameter values) and exploitation (refining known good parameter values) to efficiently converge on the optimal solution.
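As a minimal, self-contained illustration of this loop before we get to the neural network, here is a toy run of rBayesianOptimization on a cheap one-dimensional function. The parameter x and the function itself are made up purely for demonstration; the library fits a Gaussian Process to the points evaluated so far and uses the acquisition function to pick the next candidate.

library(rBayesianOptimization)  # installed in the PERFORM section below

# A cheap toy objective; the library expects a list with Score (to maximize) and Pred
toy_objective <- function(x) {
  score <- -(x - 2)^2 + 10   # maximum at x = 2
  list(Score = score, Pred = 0)
}

toy_result <- BayesianOptimization(
  FUN = toy_objective,
  bounds = list(x = c(-5, 5)),
  init_points = 5,    # random points to seed the Gaussian Process model
  n_iter = 10,        # points chosen by the acquisition function afterwards
  acq = "ucb",        # Upper Confidence Bound balances exploration and exploitation
  kappa = 2.576,
  verbose = TRUE
)
toy_result$Best_Par   # should land close to x = 2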
PERFORM
Let's start performing Bayesian Optimization. I will be using R here; the same can be done in Python as well.
We will be using the rBayesianOptimization library. For reference, in Python you can use the BayesianOptimization class, which is part of the bayesian-optimization package.
install.packages("rBayesianOptimization")
library(rBayesianOptimization)
library(keras)  # needed for the model-building code below
# Model architecture
model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 1500,        # vocabulary size
                  output_dim = 32,         # embedding dimension
                  input_length = 300) %>%  # input sequence length
  layer_conv_1d(filters = 64,
                kernel_size = 2,
                activation = "relu",
                strides = 2) %>%
  layer_max_pooling_1d(pool_size = 4) %>%
  layer_lstm(units = 128, activation = "tanh") %>%
  layer_dense(units = 50, activation = "softmax")  # 50 output classes

# Compile model
model %>% compile(optimizer = "adam",
                  loss = "categorical_crossentropy",
                  metrics = c("acc"))

# Fitting the model
model_one <- model %>% fit(trainx, trainy,
                           epochs = 30,
                           batch_size = 32,
                           validation_data = list(validx, validy))
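Before tuning, it is worth recording how this baseline configuration performs on the validation set, so you have a number to compare the tuned model against. A minimal sketch, assuming validx and validy are the same validation tensors used above:

# Baseline validation performance, kept for comparison with the tuned model
baseline_score <- model %>% evaluate(validx, validy, verbose = 0)
baseline_score

Next, we wrap a similar architecture in a function that takes the hyperparameters we want to tune and returns a score for the optimizer to maximize.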
cnn_lstm_cv <- function(lstm_units, conv_filters, kernel_size, strides_rate) {
  # Define the model architecture with the candidate hyperparameters
  model <- keras_model_sequential() %>%
    layer_embedding(input_dim = 1500,
                    output_dim = 32,
                    input_length = 400) %>%
    layer_conv_1d(filters = as.integer(conv_filters),
                  kernel_size = as.integer(kernel_size),
                  activation = "relu",
                  strides = as.integer(strides_rate)) %>%
    layer_max_pooling_1d(pool_size = 4) %>%
    layer_lstm(units = as.integer(lstm_units), activation = "tanh") %>%
    layer_dense(units = 50, activation = "softmax")

  # Compile the model
  model %>% compile(optimizer = "adam",
                    loss = "categorical_crossentropy",
                    metrics = c("acc"))

  # Fit the model
  history <- model %>% fit(
    trainx, trainy,
    epochs = 10,
    batch_size = 32,
    validation_data = list(validx, validy),
    verbose = 0
  )

  # Evaluate the model on validation data
  score <- model %>% evaluate(validx, validy, verbose = 0)

  # Return the validation accuracy as the metric to maximize
  return(list(Score = score[[2]], Pred = score[[2]]))
}
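Before handing this function to the optimizer, a quick sanity check with one fixed set of values confirms that it runs end to end and returns a score (this single call trains a model once, so it may take a few minutes; the values below are arbitrary examples):

# Hypothetical spot check of the objective function with fixed hyperparameters
cnn_lstm_cv(lstm_units = 64, conv_filters = 128, kernel_size = 3, strides_rate = 2)

With the objective function in place, we define the range each hyperparameter is allowed to take: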
bounds <- list(
  lstm_units   = c(32L, 128L),   # LSTM layer size
  conv_filters = c(64L, 256L),   # number of convolution filters
  kernel_size  = c(3L, 10L),     # convolution kernel width
  strides_rate = c(1L, 5L)       # convolution stride
)
set.seed(520)
opt_result <- BayesianOptimization(
  FUN = cnn_lstm_cv,   # objective function defined above
  bounds = bounds,
  init_points = 10,    # number of random starting points
  n_iter = 30,         # number of Bayesian optimization iterations
  acq = "ucb",         # acquisition function (Upper Confidence Bound)
  kappa = 2.576,       # exploration-exploitation trade-off for UCB
  verbose = TRUE
)
print(opt_result$Best_Par)
In my case, running this returned the best-performing combination of lstm_units, conv_filters, kernel_size, and strides_rate found during the search.
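Once you have the best parameters, a natural final step is to retrain the model with them. A minimal sketch, assuming opt_result$Best_Par carries the names used in the bounds above:

# Rebuild and refit the model with the best hyperparameters found by the search
best <- opt_result$Best_Par
final_model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 1500, output_dim = 32, input_length = 400) %>%
  layer_conv_1d(filters = as.integer(best["conv_filters"]),
                kernel_size = as.integer(best["kernel_size"]),
                activation = "relu",
                strides = as.integer(best["strides_rate"])) %>%
  layer_max_pooling_1d(pool_size = 4) %>%
  layer_lstm(units = as.integer(best["lstm_units"]), activation = "tanh") %>%
  layer_dense(units = 50, activation = "softmax")

final_model %>% compile(optimizer = "adam",
                        loss = "categorical_crossentropy",
                        metrics = c("acc"))

final_model %>% fit(trainx, trainy,
                    epochs = 30,
                    batch_size = 32,
                    validation_data = list(validx, validy))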
CONCLUSION
Bayesian Optimization provides a powerful and efficient approach for hyperparameter tuning, particularly when dealing with complex models and expensive evaluations. By strategically selecting parameter values based on previous evaluations, it reduces the number of trials needed to find optimal settings, making it a valuable tool for enhancing model performance.
In this article, we've demonstrated how to apply Bayesian Optimization to tune a Convolutional Recurrent Neural Network (CRNN) using the rBayesianOptimization library in R. The same principles can be applied in Python with the bayesian-optimization package. This method not only streamlines the optimization process but also helps achieve better results by focusing on the most promising regions of the parameter space. By leveraging Bayesian Optimization, you can efficiently enhance your models and improve their performance with fewer resources.
I hope this article has provided a clear understanding of how to leverage Bayesian Optimization for hyperparameter tuning in deep neural networks. If you have any suggestions, questions or need further details on implementing these techniques, please feel free to reach out. I'm here to help and discuss more on this topic!