Bayesian Optimization
Source: Photo derived from Shakir Mohamed's blog (https://blog.shakirm.com/2015/10/bayesian-reasoning-and-deep-learning/)


In machine learning (ML), Bayesian optimization is one of the most interesting and, at the same time, most difficult tasks. It is widely used for hyperparameter optimization; note that hyperparameters are ML parameters whose values are selected prior to training, e.g., the learning rate for training a neural network or the value of k in k-nearest neighbors.

The motivation for Bayesian optimization is to find the global minimum of a function f over a very large parameter space when each evaluation is computationally expensive. It is especially suited to ML objectives whose derivatives are unknown.

Bayesian optimization targets problems that restrict the optimizer in several ways; the most important of these include the following:

  • High computational cost: each evaluation of the objective function is expensive.
  • The derivative is unknown. Gradient descent and its variants, which are common in ML and deep learning, rely on the derivative to give the optimizer a direction; here that information is unavailable.
  • The task is to find a global minimum, which is not always easy, even with well-studied methods such as gradient descent. The model used for Bayesian optimization must therefore be designed so that the mechanism does not get trapped in a local minimum.

The Bayesian optimization framework addresses the problem of finding a global minimum in as few steps as possible. To understand it better, we need to know what makes it Bayesian. In Bayesian statistics and modeling, the essence is to combine new information with prior beliefs and then update them into a posterior once data is obtained. That is, we use statistical methods that assign probabilities to events based on experience or a best guess before experimentation and data collection, and then apply Bayes' theorem to obtain the updated probabilities.
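As a toy illustration of this prior-to-posterior update (the two hypotheses and all the numbers below are made up for the example, not taken from the article), Bayes' theorem can be applied in a few lines of Python:

```python
# A minimal sketch of a Bayesian update with illustrative numbers:
# prior beliefs about two hypotheses, the likelihood of the observed
# data under each, and the posterior obtained via Bayes' theorem.

prior = {"H1": 0.5, "H2": 0.5}        # beliefs before seeing data
likelihood = {"H1": 0.8, "H2": 0.3}   # P(data | hypothesis)

# Unnormalized posterior: P(H | data) is proportional to P(data | H) * P(H)
unnormalized = {h: likelihood[h] * prior[h] for h in prior}
evidence = sum(unnormalized.values())  # P(data), the normalizing constant
posterior = {h: p / evidence for h, p in unnormalized.items()}

print(posterior)  # {'H1': ~0.727, 'H2': ~0.273}
```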

In short, at each optimization step we use the acquisition function to select the next sample, evaluate that sample with the true objective function, and then update the data and, in turn, the surrogate function. Bayesian optimization thus builds a probability model of the objective function and uses it to choose which hyperparameters to evaluate next on the true objective; a minimal sketch of this loop follows.
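Here is one possible sketch of that loop, assuming a Gaussian-process surrogate from scikit-learn and an expected-improvement acquisition function; the toy objective f, its domain, and the evaluation budget are illustrative choices, not part of the original article:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):                                   # expensive black-box objective (toy stand-in)
    return np.sin(3 * x) + 0.5 * x ** 2

X_grid = np.linspace(-2, 2, 400).reshape(-1, 1)   # candidate points in the domain
X = np.array([[-1.5], [0.0], [1.5]])              # a few initial evaluations
y = f(X).ravel()

# Surrogate model; alpha adds jitter for numerical stability
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)

for _ in range(15):                         # optimization budget
    gp.fit(X, y)                            # update the surrogate with all data so far
    mu, sigma = gp.predict(X_grid, return_std=True)

    # Expected Improvement acquisition (for minimization)
    best = y.min()
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    x_next = X_grid[np.argmax(ei)]          # point the acquisition deems most promising
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next))             # one (expensive) evaluation of the true objective

print("best x:", X[np.argmin(y)].item(), "best f:", y.min())
```

Each pass fits the surrogate to all evaluations so far, scores every candidate with expected improvement, and spends exactly one expensive evaluation of f on the most promising point, which is what makes the method sample-efficient.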

The question arises: when should we use Bayesian optimization? It is suggested for objective functions that are expensive to evaluate; most commonly it is used in hyperparameter tuning, and libraries such as HyperOpt exist for this purpose.
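For instance, a hedged sketch using HyperOpt's fmin with its TPE algorithm (a sequential model-based optimizer in the same spirit as Bayesian optimization); the search space and the stand-in objective below are illustrative assumptions:

```python
import math
from hyperopt import fmin, tpe, hp

def objective(params):
    lr = params["learning_rate"]            # in practice: train a model with this lr
    return (lr - 0.01) ** 2                 # stand-in for a validation loss

# log-uniform search space over learning rates between 1e-5 and 1e-1
space = {"learning_rate": hp.loguniform("learning_rate", math.log(1e-5), math.log(1e-1))}

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)  # e.g., {'learning_rate': ~0.01}
```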

Resource: Bayesian Optimization Algorithm
