Machine Learning Blog – 8
Mahtab Syed
Data and AI Leader | AI Solutions | Cloud Architecture(Azure, GCP, AWS) | Data Engineering, Generative AI, Artificial Intelligence, Machine Learning and MLOps Programs | Coding and Kaggle
Multi-Layer Stacking Ensemble and Optuna Hyperparameter Tuning
In this blog I illustrate, and link to the code for, a Multi-Layer Stacking Ensemble (also called Model Stacking) applied to a Kaggle competition problem. It took me some time to grasp this concept, so I will explain it with a simple diagram in the hope that it is easier to follow. And, as usual, the best way to learn is to write the code and try it yourself.
Input data and output predictions
1. We are given a training set with a few features (x) and a label (y).
a. Divide this training set into a training set (xtrain, ytrain) and a validation set (xvalid, yvalid).
b. Train a model on xtrain, ytrain and validate against xvalid, yvalid to get preds_valid.
c. For good cross-validation, the best approach is to use folds, which generate a different xtrain, ytrain and xvalid, yvalid in each fold.
2. We are also given a test set with no label (y).
a. After we train and validate a model, we can predict the answers for the test set as test_preds.
b. We then submit test_preds to the competition.
3. For a given model and training set we need to optimise the hyperparameters to get the best results, for which I used Optuna hyperparameter tuning (a minimal sketch follows this list).
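To make step 3 concrete, here is a minimal Optuna tuning sketch for an XGBRegressor. The synthetic data, parameter search ranges and trial count are my stand-ins for illustration, not the exact settings from the competition notebook.

import optuna
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Synthetic data stands in for the competition's training features (x) and label (y)
x, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
xtrain, xvalid, ytrain, yvalid = train_test_split(x, y, test_size=0.2, random_state=42)

def objective(trial):
    # Search ranges are illustrative assumptions, not the blog's exact settings
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
    }
    model = XGBRegressor(**params, random_state=42)
    model.fit(xtrain, ytrain)
    preds_valid = model.predict(xvalid)
    return mean_squared_error(yvalid, preds_valid) ** 0.5  # validation RMSE

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print("Best params:", study.best_params)

Optuna then suggests new parameter combinations trial by trial, and study.best_params holds the best combination found, which is used to train the final model.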
Single-Layer Model – GitHub code
This is a simple single-model approach: train on xtrain, ytrain and validate against xvalid, yvalid across 5 folds, then generate test_preds and submit them to the competition.
XGBRegressor gave the best results.
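Below is a minimal sketch of this single-layer setup: 5 folds, one XGBRegressor per fold, and test predictions averaged across folds for the submission. The synthetic data and hyperparameters are placeholders rather than the exact ones in the linked code.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold
from xgboost import XGBRegressor

# Placeholders for the competition's train and test data
x, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
x_test, _ = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
test_preds = np.zeros(len(x_test))

for fold, (train_idx, valid_idx) in enumerate(kf.split(x)):
    xtrain, xvalid = x[train_idx], x[valid_idx]
    ytrain, yvalid = y[train_idx], y[valid_idx]

    model = XGBRegressor(n_estimators=500, learning_rate=0.05, random_state=42)
    model.fit(xtrain, ytrain)

    preds_valid = model.predict(xvalid)
    rmse = mean_squared_error(yvalid, preds_valid) ** 0.5
    print(f"Fold {fold}: RMSE = {rmse:.4f}")

    # Average each fold model's test predictions for the final submission
    test_preds += model.predict(x_test) / kf.n_splits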
Multi-Layer Ensemble (using multiple models) – GitHub code
The concept is to train multiple models on exactly the same training data. The key is that the models should be sufficiently diverse, meaning each model is good at some parts of the data and weaker at others. If we stack them, the combined result of the 5 models (a strong learner) can be better than any individual model (a weak learner).
On top of this we can have multiple layers, where the output of one layer serves as the input for the next.
It's best to use folds for cross-validation and do hyperparameter tuning at each layer.
Layer 0 – 5 models: XGBRegressor, LGBMRegressor, CatBoostRegressor, RandomForestRegressor, LinearRegression
Layer 1 – 3 models: XGBRegressor, LGBMRegressor, CatBoostRegressor
Layer 2 – final model: XGBRegressor
And in every layer I used Optuna hyperparameter tuning to get the best results.
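To make the data flow concrete, here is a minimal stacking sketch along these lines: each layer's out-of-fold (OOF) predictions become the training features of the next layer, and the matching averaged test predictions become that layer's test input. The models use default hyperparameters and synthetic placeholder data here, whereas the blog's code tunes every layer with Optuna.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

# Placeholders for the competition's train and test data
x, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
x_test, _ = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

def oof_predictions(models, x, y, x_test, n_splits=5):
    """Return OOF train predictions and fold-averaged test predictions per model."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    oof = np.zeros((len(x), len(models)))
    test = np.zeros((len(x_test), len(models)))
    for m, model in enumerate(models):
        for train_idx, valid_idx in kf.split(x):
            model.fit(x[train_idx], y[train_idx])
            oof[valid_idx, m] = model.predict(x[valid_idx])
            test[:, m] += model.predict(x_test) / n_splits
    return oof, test

# Layer 0: five diverse models trained on the original features
layer0 = [XGBRegressor(random_state=42), LGBMRegressor(random_state=42),
          CatBoostRegressor(verbose=0, random_state=42),
          RandomForestRegressor(random_state=42), LinearRegression()]
oof0, test0 = oof_predictions(layer0, x, y, x_test)

# Layer 1: three models trained on Layer 0's predictions
layer1 = [XGBRegressor(random_state=42), LGBMRegressor(random_state=42),
          CatBoostRegressor(verbose=0, random_state=42)]
oof1, test1 = oof_predictions(layer1, oof0, y, test0)

# Layer 2: a single final model produces the submission predictions
final_model = XGBRegressor(random_state=42)
final_model.fit(oof1, y)
test_preds = final_model.predict(test1)

Training each layer on out-of-fold predictions, rather than on predictions for data the lower-layer models have already seen, is what keeps the higher layers from simply memorising leaked labels.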
Mahtab Syed, Melbourne 20 Nov 2021
Acknowledgements:
Thanks Abhishek Thakur and Aurélien Géron – I think I understand Model Stacking now. I need to find more diverse models to get better results.