Data Optimization Techniques in Machine Learning



Curve fitting is one of the more challenging parts of machine learning, not least because of how strongly the quality of the fit shapes the end result. It may not pose a problem on relatively simple datasets with a few features, but in more complicated projects an improper fit becomes much more likely.

Suppose we have collected sensor, motor, and joint data from the problem domain, consisting of example inputs and their corresponding outputs.

The x-axis is the independent variable or the input to the function.

The y-axis is the dependent variable or the output of the function.


We don’t know the form of the function that maps examples of inputs to outputs, but we suspect that we can approximate the function with a standard function form.

Curve fitting involves first defining the functional form of the mapping function (also called the basis function or objective function), then searching for the parameters of the function that result in the minimum error.

Error is calculated by taking observations from the domain, passing their inputs through the candidate mapping function to calculate outputs, and comparing those calculated outputs to the observed outputs.

Once fit, we can use the mapping function to interpolate or extrapolate new points in the domain. It is common to run a sequence of input values through the mapping function to calculate a sequence of outputs, then create a line plot of the result to show how output varies with input and how well the line fits the observed points.
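
To make this workflow concrete, here is a minimal sketch using SciPy's curve_fit. The dataset and the straight-line mapping function are illustrative assumptions, not from the article; the steps (define the form, search for parameters, measure error, interpolate) are the ones described above.

```python
import numpy as np
from scipy.optimize import curve_fit

def mapping(x, a, b):
    """Candidate mapping function: a straight line y = a*x + b (assumed form)."""
    return a * x + b

# Observed (input, output) pairs from the problem domain (made up here).
x_obs = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_obs = np.array([0.8, 2.1, 2.9, 4.2, 5.1, 5.8])

# Search for the parameters that minimize the squared error.
params, _ = curve_fit(mapping, x_obs, y_obs)
a, b = params

# Error: compare calculated outputs to observed outputs.
mse = np.mean((y_obs - mapping(x_obs, a, b)) ** 2)
print(f"a={a:.3f}, b={b:.3f}, MSE={mse:.4f}")

# Once fit, run a sequence of inputs through the mapping function
# to interpolate (or, beyond the observed range, extrapolate) new points.
x_new = np.linspace(x_obs.min(), x_obs.max(), 50)
y_new = mapping(x_new, *params)
```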

The key to curve fitting is the form of the mapping function.


Four Scenarios

All curve fitting (for machine learning, at least) can be separated into four categories based on the a priori knowledge about the problem at hand:


  1. Completely known. If f(x) is known, there is no fitting problem to solve: the function can be applied directly, without any guessing, and all future data will fall neatly onto the curve.
  2. Unknown, but the structure is known. The form of the curve is known, for example a straight line, but its parameters are not.
  3. Unknown, but can be guessed. Sometimes we know nothing in advance, but in two-dimensional data the layout is often simple enough that we can make a reasonable assumption about what the curve should be.
  4. Unknown. The model function f(x) is completely unknown, there are no guesses to be made, and the parameters are a mystery.

A polynomial regression model obtained by adding squared terms to the objective function.

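A sketch of how such a model might be produced, assuming synthetic data and a hand-picked quadratic form (both are illustrative):

```python
import numpy as np
from scipy.optimize import curve_fit

def objective(x, a, b, c):
    # Linear term plus an added squared term.
    return a * x + b * x ** 2 + c

# Synthetic observations (assumed; the article's dataset is not given).
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = 0.5 * x + 1.2 * x ** 2 + rng.normal(0, 0.5, x.size)

popt, _ = curve_fit(objective, x, y)
print("fitted coefficients (a, b, c):", popt)
```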

A fifth-degree polynomial fit to the data.

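For a higher-degree fit like this one, NumPy's polynomial fitting is the usual shortcut. A sketch on stand-in data (the real dataset is not given):

```python
import numpy as np

# Stand-in observations with some noise (assumed).
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 60)
y = np.sin(x) + rng.normal(0, 0.2, x.size)

coeffs = np.polyfit(x, y, deg=5)   # least-squares fifth-degree fit
poly = np.poly1d(coeffs)           # callable polynomial
y_fit = poly(x)                    # outputs along the fitted curve
```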

Curve fitting with sine functions.

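A sine-shaped mapping function is nonlinear in its parameters, so the optimizer benefits from a starting guess. A sketch, with the functional form, data, and initial guess p0 all assumed for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def sine(x, amp, freq, phase, offset):
    return amp * np.sin(freq * x + phase) + offset

# Synthetic periodic observations (assumed).
rng = np.random.default_rng(2)
x = np.linspace(0, 4 * np.pi, 80)
y = 2.0 * np.sin(x + 0.3) + 0.5 + rng.normal(0, 0.2, x.size)

# Nonlinear fits usually need a reasonable starting point to converge.
p0 = [1.0, 1.0, 0.0, 0.0]
popt, _ = curve_fit(sine, x, y, p0=p0)
print("fitted (amp, freq, phase, offset):", popt)
```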



Underfitting and Overfitting

First, curve fitting is an optimization problem. Each time, the goal is to find a curve that properly matches the dataset. There are two ways of doing it improperly: underfitting and overfitting.

Underfitting is the easier of the two to grasp. It happens whenever the function fails to capture the complexity of the data's distribution in, say, a scatter plot. These cases are easiest to visualize in two dimensions, but curve fitting often has to be done in more.

The problem with underfitting is quite clear. A model with such a curve will make erroneous predictions because it attempts to simplify everything to a significant degree. It might, for example, capture just a few data points out of dozens.

Overfitting is a bit more complicated. Intuitively, it may seem desirable to maximize a model's accuracy by fitting the curve to the data perfectly. In practice, an overfitted model produces numerous errors when tested on new data.

There are many ways to understand why overfitting is an issue. One is to think of any dataset as incomplete. Unless you acquire every possible data point, there will be unseen points that follow a similar, but not identical, distribution. An overfitted model has learned the observed patterns so well that it expects future data to repeat them exactly.

In the end, one might think of overfitting as bringing the model closer to determinism instead of leaving it stochastic. Proper fit is somewhere in between underfitting and overfitting.
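
One common way to see both failure modes is to vary the polynomial degree and compare error on held-out data. The sketch below uses synthetic data and arbitrary degrees chosen purely for illustration; typically the lowest degree underfits (high error everywhere) while the highest fits the training points closely but does worse on the held-out ones.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic noisy observations of a smooth curve (assumed).
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 40))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)

# Hold out every other point to expose overfitting.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

for degree in (1, 4, 15):  # underfit, reasonable fit, likely overfit
    poly = Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((poly(x_train) - y_train) ** 2)
    test_mse = np.mean((poly(x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```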
