Logistic regression introduction
The link between logistic regression and the sigmoid function:
Consider an example where we want to classify cats vs. dogs. Say we have 2 features: whisker length and an ear flappiness index. We want to find the best straight line that separates cats from dogs based on these 2 features.
ML practitioners started doing linear classification using something called the "zero-one" hypothesis (also called the sign or signum function).
It looks like this:
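In equation form (writing $\mathbf{w}$ for the weight vector and $b$ for the bias, a standard but assumed notation):

$$h(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^\top \mathbf{x} + b) = \begin{cases} +1 & \text{if } \mathbf{w}^\top \mathbf{x} + b \geq 0, \\ -1 & \text{otherwise.} \end{cases}$$

Everything on one side of the line gets predicted "cat", everything on the other side "dog".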
It turns out that if we use the sign function, the optimization problem of finding the best straight line becomes essentially intractable.
Why? Because the sign function is not differentiable at the jump, and its derivative is zero everywhere else. A gradient-based optimizer gets no signal telling it which way to nudge the line.
The solution?
Smooth it out.
What if our hypothesis class looks like this:
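A minimal sketch of that smoother hypothesis, the sigmoid:

$$h(\mathbf{x}) = \sigma(\mathbf{w}^\top \mathbf{x} + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$

Instead of jumping from $-1$ to $+1$, $\sigma$ glides smoothly from $0$ to $1$, and it is differentiable everywhere, with the tidy derivative $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$.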
It turns out that this smoothing drastically improves the optimization: the loss is now differentiable everywhere, so gradient descent can indeed find the best straight line.
This is why the sigmoid function is so commonly used in logistic regression.
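To make this concrete, here is a minimal NumPy sketch of logistic regression trained by gradient descent. The data, feature values, and variable names are all made up for illustration, not taken from the video:

```python
import numpy as np

def sigmoid(z):
    """Smooth squashing function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: columns are [whisker length, ear flappiness index]
# Labels: 1 = cat, 0 = dog (entirely invented for this sketch)
X = np.array([[5.1, 0.2],
              [4.8, 0.3],
              [1.2, 0.9],
              [0.9, 0.8]])
y = np.array([1.0, 1.0, 0.0, 0.0])

w = np.zeros(2)   # weights for the two features
b = 0.0           # bias (intercept of the line)
lr = 0.1          # learning rate

for _ in range(1000):
    p = sigmoid(X @ w + b)            # smooth predictions in (0, 1)
    grad_w = X.T @ (p - y) / len(y)   # gradient of the average log loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w                  # gradient descent step
    b -= lr * grad_b

# The learned line w[0]*x1 + w[1]*x2 + b = 0 separates cats from dogs
print("weights:", w, "bias:", b)
```

Because the sigmoid is smooth, each gradient step nudges the line in a well-defined direction, which is exactly what the sign function couldn't offer.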
Watch the full video in Day 15 of the ML: Teach by Doing project here:
Stay tuned for Day 16!