Why do you need to normalize your data in Machine Learning?
Md Rasel Khondokar
Ph.D. Student | Natural Language Processing | LLM | RAG | Machine Learning | Computer Vision | Team Lead
What is normalization?
Normalization is the process of rescaling feature values so that they fall within a common range, typically 0 to 1.
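As a minimal sketch of the idea, min-max scaling maps the smallest value of a feature to 0 and the largest to 1. The function name `min_max_normalize` and the choice to return all zeros for a constant feature are assumptions made for this illustration, not something the article prescribes.

```python
def min_max_normalize(values):
    """Rescale a list of numbers into the 0 to 1 range (min-max scaling)."""
    low, high = min(values), max(values)
    if high == low:
        # A constant feature has no spread; returning zeros is an assumed convention.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical example: three house sizes in square feet.
print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
```

In practice a library scaler is usually preferable, because the minimum and maximum should be learned from the training data only and then reused on new data.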
Why is normalization needed?
- If one feature in the dataset has a much larger scale than the others, that feature dominates distance calculations and weight updates, and therefore the predictions.
- For neural networks, very large input values produce large activations in the forward pass and large gradients during backpropagation, which makes training unstable. As a result, the model converges slowly if the inputs are not normalized.
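The first point can be illustrated with a small distance calculation. The sketch below uses made-up (age, income) records and a hypothetical helper `min_max_columns`. On the raw values, income decides which record looks closest. After scaling, both features contribute.

```python
import math


def min_max_columns(rows):
    """Min-max scale each column of a list of equal-length rows into 0 to 1."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi != lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Made-up records: (age in years, income in dollars).
people = [
    (25, 50_000),  # a
    (60, 52_000),  # b: very different age, similar income
    (27, 58_000),  # c: similar age, somewhat higher income
    (40, 90_000),  # d: only here to widen the income range
]

# Raw Euclidean distances from a: income differences swamp age differences.
raw_ab = math.dist(people[0], people[1])
raw_ac = math.dist(people[0], people[2])

# The same distances after min-max scaling every column.
scaled = min_max_columns(people)
norm_ab = math.dist(scaled[0], scaled[1])
norm_ac = math.dist(scaled[0], scaled[2])

print(raw_ab < raw_ac)    # True: on raw values, b looks closer to a
print(norm_ac < norm_ab)  # True: after scaling, c is closer to a
```

Which neighbour is "right" depends on the problem. The point is only that without scaling, the age column has almost no say in the result.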
Where is normalization very important?
- K-Means
- K-Nearest-Neighbours
- Principal Component Analysis (PCA)
- Gradient Descent
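For gradient descent, the effect shows up as slow convergence. The following sketch fits a line to five made-up points with plain batch gradient descent, once on the raw feature and once on the min-max scaled feature. The function `fit_line`, the data, and both learning rates are assumptions chosen by hand for this illustration. The raw run needs a tiny learning rate to avoid diverging, and at that rate the intercept barely moves.

```python
def fit_line(xs, ys, learning_rate, steps=1000):
    """Fit y = w * x + b by batch gradient descent; return the final mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


# Made-up data lying exactly on the line y = 0.002 * x + 1.
xs_raw = [1000, 2000, 3000, 4000, 5000]
ys = [3, 5, 7, 9, 11]

# Min-max scale the feature into 0 to 1.
low, high = min(xs_raw), max(xs_raw)
xs_scaled = [(x - low) / (high - low) for x in xs_raw]

# Learning rates are hand-picked: the raw run diverges with a much larger one.
loss_raw = fit_line(xs_raw, ys, learning_rate=1e-8)
loss_scaled = fit_line(xs_scaled, ys, learning_rate=0.1)

print(loss_scaled < loss_raw)  # True: same number of steps, far lower error when scaled
```

With the same 1000 steps, the scaled run reaches a near-perfect fit while the raw run stalls at a visibly higher error.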
When Should You Use Normalization?
- When the distribution of the data is unknown
- When the distribution is not Gaussian (bell-shaped)
- When the algorithm makes no assumptions about the distribution of the data, as with k-nearest neighbors and artificial neural networks