Don't Settle for Mediocre: Optimize Your Models with GridSearchCV - Part 1
Karthik Sai Twarakavi (MSBA)
Data Analyst | Masters - Business Analytics | Seeking Full-Time Analyst role | Microsoft Certified Data Analyst Associate | Tableau Desktop Specialist | AWS Certified Cloud Practitioner | SQL, Python, Excel, Power-BI
Ever wonder how machine learning models achieve peak performance? While the algorithms themselves play a crucial role, the secret sauce often lies in hyperparameter tuning: the process of finding the best values for the settings (hyperparameters) that control how the model learns.
OK! What are Hyperparameters?
Imagine you're a gym enthusiast trying to maximize your gains. The machine learning algorithm is your workout routine, a specific program of exercises. But just like in the gym, factors outside the routine itself, such as sets and reps, diet, sleep, and weights, also shape your results. Those are the hyperparameters: settings you choose before training rather than values the model learns from the data.
Although there are many hyperparameter tuning techniques, GridSearchCV is one of the most widely used.
Wait! What? Why was it named GridSearchCV? Let’s break it down.
GridSearch – It tries every combination from a grid of candidate hyperparameter values.
Imagine you're building a Random Forest model with two hyperparameters: n_estimators (the number of trees) and max_depth (the maximum depth of each tree).
Now the Grid Search would construct a grid with these possible combinations:
n_estimators    max_depth
100             3
100             5
100             7
200             3
200             5
200             7
300             3
300             5
300             7
Each row is a unique combination of hyperparameters, 3 x 3 = 9 in total.
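To make the grid idea concrete, here is a minimal sketch that builds the same nine combinations with Python's standard library. The dictionary name param_grid is chosen for illustration; the candidate values are the ones from the table.

```python
from itertools import product

# Candidate values for each hyperparameter, taken from the table above.
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [3, 5, 7],
}

# The grid is the Cartesian product of the candidate lists:
# one dictionary per unique combination.
names = list(param_grid)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid.values())
]

for combo in combinations:
    print(combo)

print(len(combinations), "combinations in total")
```

Adding one more value to either list adds a whole new row or column to the grid, which is why grids grow quickly.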
Cross Validation (CV) – It is a technique that divides the data into multiple folds, training on some folds and evaluating the model's performance on the others. This guards against picking hyperparameters that merely overfit one particular train/test split and gives a more reliable assessment of their effectiveness.
Let's imagine we have data and divide it into 5 folds (like 5 equal pieces of eggless cake; I'm a vegetarian).
Now GridSearchCV would evaluate the first combination (n_estimators = 100, max_depth = 3):
Holding out Fold 1: train a model with these settings on Folds 2, 3, 4, and 5, then evaluate its performance on Fold 1.
Holding out Fold 2: train a model on Folds 1, 3, 4, and 5, then evaluate its performance on Fold 2.
This repeats for Folds 3, 4, and 5.
It then calculates the average performance across all 5 folds and repeats the same process for the remaining hyperparameter combinations.
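The hold-one-fold-out loop can be sketched in plain Python. This is only an illustration of the mechanics, not how scikit-learn implements it: the helper names k_fold_indices and cross_validate are made up for this example, and dummy_score is a hypothetical stand-in for "train on these rows, evaluate on those rows".

```python
def k_fold_indices(n_samples, k=5):
    """Split the row indices 0..n_samples-1 into k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(score_fn, n_samples, k=5):
    """Hold out each fold once, score on it, and return the average score."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for held_out, test_idx in enumerate(folds):
        # Train on every fold except the one being held out.
        train_idx = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        scores.append(score_fn(train_idx, test_idx))
    return sum(scores) / len(scores)


# Hypothetical placeholder: a real score_fn would fit a model on train_idx
# and return its performance on test_idx.
def dummy_score(train_idx, test_idx):
    return len(train_idx) / (len(train_idx) + len(test_idx))


print(k_fold_indices(10, k=5))
print(cross_validate(dummy_score, n_samples=10, k=5))
```

In practice scikit-learn's own splitters handle details this sketch ignores, such as shuffling and keeping class proportions balanced across folds.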
So, how is the winning hyperparameter setting decided? Whichever combination scores the highest average performance across the folds wins, as simple as that.
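Putting the grid and the cross-validation together, a minimal scikit-learn sketch might look like the following. The synthetic dataset, the accuracy metric, and the random_state values are assumptions made only to keep the example self-contained; they are not from the article.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, used here only so the example runs on its own (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# The same 3 x 3 grid discussed in the article.
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [3, 5, 7],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=5,                # 5-fold cross-validation
    scoring="accuracy",  # metric averaged across the folds (assumed choice)
)
search.fit(X, y)

# The winner is the combination with the highest mean score across folds.
print("Best combination:", search.best_params_)
print("Best mean CV accuracy:", round(search.best_score_, 3))
```

The per-combination averages are available in search.cv_results_["mean_test_score"], which is handy for checking how close the runner-up combinations were.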
Before GridSearchCV, data scientists tuned hyperparameters manually: trying different values, observing model performance, and adjusting iteratively, which is laborious and time-consuming. Later they started using grid search without cross-validation, which is prone to overfitting. GridSearchCV addresses both of these limitations.
Why does GridSearchCV Take Time?
It depends on multiple factors.
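One factor that can be stated precisely is the number of models that must be trained: every combination in the grid is fitted once per fold, and by default scikit-learn refits the best combination once more on the full data. A rough sketch of that arithmetic, with total_fits being a helper name invented for this example:

```python
from math import prod


def total_fits(param_grid, cv, refit=True):
    """Count model fits: combinations x folds, plus one final refit."""
    n_combinations = prod(len(values) for values in param_grid.values())
    return n_combinations * cv + (1 if refit else 0)


grid = {"n_estimators": [100, 200, 300], "max_depth": [3, 5, 7]}
print(total_fits(grid, cv=5))

# Adding a third hyperparameter with 4 candidate values multiplies the work.
# The values below are illustrative only.
bigger_grid = {**grid, "min_samples_split": [2, 5, 10, 20]}
print(total_fits(bigger_grid, cv=5))
```

The count is only part of the story: how long each individual fit takes also depends on the dataset size and the model itself.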
Uses & Benefits of GridSearchCV
Uses:
Benefits:
When to Use GridSearchCV?
Limitations and When Not to Use GridSearchCV
As every hero has weaknesses, so does our GridSearchCV.
Disadvantages:
Wrapping Up
That's it for our journey into GridSearchCV! We've seen how it's a key player in tuning our machine learning models, despite its quirks and challenges. Up next, I'll compare it with other tuning methods and dive into some real-world stories.
Stay tuned, and feel free to share your own experiences or questions about GridSearchCV in the comments. Let's keep the learning going!