Quick Reads: My Journey into Machine Learning & Hyperparameter Optimisation
Charles Glah
Founder | Dynamic Machine Learning Researcher | Product Management Expert | Senior Leader in Asset Management | CFA UK
Introduction
When I first started my journey into machine learning over 10 years ago, I thought it was all about feeding data into an algorithm and watching it work its magic. I quickly learned that ML is far from magic—it’s a structured, rigorous process full of challenges, breakthroughs, and constant learning. One of the toughest yet most fascinating aspects I encountered was hyperparameter optimisation—the art of fine-tuning an algorithm for peak performance.
Let me take you through my journey, the struggles I faced, and the parts that kept me hooked. I’ll also share how advanced hyperparameter optimisation techniques helped me define and implement a machine learning innovation strategy.
The Learning Curve: Understanding the Machine Learning Pipeline
I started by building simple models, eager to predict stock prices. I thought having good data was enough—big mistake. The first hurdle? Data Preprocessing.
1. Data Collection & Preprocessing: The Unexpected Hurdle
Stock price data seemed straightforward, but soon I realised it was messy. Missing values, inconsistent formats, and outliers threw my models off. I had to learn how to (a quick sketch follows this list):
- Handle missing values sensibly instead of blindly dropping rows
- Standardise inconsistent formats, starting with dates and numeric fields
- Detect and treat outliers before they skewed the model
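As a rough illustration, here is a minimal sketch of those preprocessing steps with pandas. The file and column names (prices.csv, date, close) are placeholders, not my actual dataset.

import pandas as pd

# Load raw price data (file name and column names are illustrative)
prices = pd.read_csv("prices.csv")

# Standardise formats: parse dates and sort chronologically
prices["date"] = pd.to_datetime(prices["date"])
prices = prices.sort_values("date")

# Handle missing values: forward-fill gaps in the close price
prices["close"] = prices["close"].ffill()

# Treat outliers: clip extreme daily returns to the 1st/99th percentiles
returns = prices["close"].pct_change()
lower, upper = returns.quantile([0.01, 0.99])
prices["return"] = returns.clip(lower, upper)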
Once I overcame this, I moved on to something even trickier—feature engineering.
2. Feature Engineering: The Art & Science of Picking the Right Inputs
Not all data points are useful. I had to dig deep into which features actually mattered, and for predicting stock prices I tested everything I could think of.
After a lot of trial and error, I learned that good features make or break a model. But even after selecting the best features, my models still underperformed. That’s when I discovered hyperparameters.
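For illustration only, here is a minimal sketch of the kind of derived features one might test on daily price data. The specific indicators (returns, moving averages, rolling volatility) are common candidates used as examples here, not a record of exactly what I used, and the code assumes the prices DataFrame from the earlier sketch.

import pandas as pd

# Assume `prices` is a DataFrame with a chronological "close" column
features = pd.DataFrame(index=prices.index)

# Daily return: the most basic momentum signal
features["return_1d"] = prices["close"].pct_change()

# Moving averages over short and long windows
features["ma_5"] = prices["close"].rolling(5).mean()
features["ma_20"] = prices["close"].rolling(20).mean()

# Rolling volatility of daily returns
features["vol_20"] = features["return_1d"].rolling(20).std()

# Drop the warm-up rows that contain NaNs from the rolling windows
features = features.dropna()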
The Turning Point: Mastering Hyperparameter Optimisation
When I first heard about hyperparameters, I thought, "Do they really make that much of a difference?" The answer: absolutely!
3. Choosing the Right Model & Facing Hyperparameter Hell
I started with Random Forest, an algorithm well-suited for stock predictions. But it had settings I had never seen before: n_estimators, max_depth, min_samples_split, and a handful of others.
I initially guessed values and hoped for the best. That was a disaster. My model swung between overfitting and underfitting with every minor tweak.
4. Hyperparameter Optimisation: The Game-Changer
I finally stopped guessing and used Grid Search and Random Search to test different hyperparameter combinations. That was a turning point! But things really got exciting when I discovered Bayesian Optimisation—a method that learns from past performance to find better parameters faster.
This was the moment I fell in love with ML: the thrill of iterating, testing, and improving.
Python Code: Implementing Randomised Search for Hyperparameter Tuning
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# Define the model and the hyperparameter distributions to sample from
model = RandomForestClassifier(random_state=42)
param_dist = {
    "n_estimators": randint(10, 200),
    "max_depth": randint(3, 20),
    "min_samples_split": randint(2, 10),
}

# Perform Randomised Search with 5-fold cross-validation, scoring each
# sampled configuration by F1
# (X_train and y_train are the features and labels prepared earlier)
random_search = RandomizedSearchCV(
    model, param_dist, n_iter=50, cv=5, scoring="f1", n_jobs=-1, random_state=42
)
random_search.fit(X_train, y_train)

# Print the best combination found
print("Best Hyperparameters:", random_search.best_params_)
Taking It to the Next Level: Advanced Hyperparameter Optimisation
Ranked roughly from simple to complex, the main techniques are Grid Search, Random Search, Bayesian Optimisation, Hyperband, Genetic Algorithms, and the Tree-structured Parzen Estimator (TPE). Comparing them this way gives a structured sense of which to reach for: the simpler methods are fine for small search spaces, while the more complex ones earn their keep on expensive, large-scale problems.
While Bayesian Optimisation worked well, I needed even more efficiency for large-scale models. That’s when I explored:
1. Hyperband
Rather than giving every configuration a full training budget, Hyperband allocates resources dynamically: promising hyperparameter configurations receive more budget while weak ones are discarded early, which makes it far cheaper than exhaustive Grid Search.
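scikit-learn does not ship Hyperband itself, but it does ship successive halving, the resource-allocation idea Hyperband builds on. As a rough sketch under that substitution, reusing X_train and y_train from earlier and treating the number of trees as the budgeted resource:

from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import HalvingRandomSearchCV
from scipy.stats import randint

halving_dist = {
    "max_depth": randint(3, 20),
    "min_samples_split": randint(2, 10),
}

# Start many configurations with few trees, keep only the best third each
# round, and give the survivors a larger tree budget
halving_search = HalvingRandomSearchCV(
    RandomForestClassifier(random_state=42),
    halving_dist,
    resource="n_estimators",
    max_resources=200,
    factor=3,
    scoring="f1",
    cv=5,
    n_jobs=-1,
    random_state=42,
)
halving_search.fit(X_train, y_train)
print("Best (halving):", halving_search.best_params_)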
2. Genetic Algorithms
Inspired by evolution, Genetic Algorithms optimise hyperparameters through mutation, crossover, and selection, evolving the best configurations over time.
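Libraries exist for this, but the core loop is simple enough to sketch by hand. This is a deliberately minimal, illustrative version rather than production code, and it again assumes X_train and y_train from earlier:

import random
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

SEARCH_SPACE = {
    "n_estimators": (10, 200),
    "max_depth": (3, 20),
    "min_samples_split": (2, 10),
}

def random_individual():
    # One candidate = one dict of hyperparameters sampled uniformly
    return {k: random.randint(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}

def fitness(params):
    # Cross-validated F1 score of a forest built with these hyperparameters
    model = RandomForestClassifier(**params, random_state=42)
    return cross_val_score(model, X_train, y_train, cv=3, scoring="f1").mean()

def crossover(a, b):
    # Each gene (hyperparameter) comes from one parent at random
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(params, rate=0.2):
    # Occasionally resample a gene to keep exploring the space
    return {
        k: random.randint(*SEARCH_SPACE[k]) if random.random() < rate else v
        for k, v in params.items()
    }

population = [random_individual() for _ in range(10)]
for generation in range(5):
    # Selection: keep the fittest half of the population
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[: len(scored) // 2]
    # Crossover and mutation refill the population for the next generation
    children = [
        mutate(crossover(*random.sample(parents, 2)))
        for _ in range(len(scored) - len(parents))
    ]
    population = parents + children

print("Best (genetic):", max(population, key=fitness))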
3. Tree-structured Parzen Estimator (TPE)
A variant of Bayesian Optimisation, TPE models the distributions of "good" and "bad" hyperparameter values separately, samples new candidates from the regions where good values are more likely, and refines those estimates with every trial.
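TPE is the default sampler in Optuna, so a minimal sketch looks like this (Optuna is a separate install, and X_train and y_train are again assumed from earlier):

import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def objective(trial):
    # TPE proposes each value based on which past values scored well
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 10, 200),
        "max_depth": trial.suggest_int("max_depth", 3, 20),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 10),
    }
    model = RandomForestClassifier(**params, random_state=42)
    return cross_val_score(model, X_train, y_train, cv=5, scoring="f1").mean()

# TPESampler is Optuna's default; it is spelled out here for clarity
study = optuna.create_study(direction="maximize", sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=50)
print("Best (TPE):", study.best_params)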
Final Thoughts: What I Wish I Knew Earlier
If I could go back and give my beginner self some advice, it would be:
- Clean data and careful preprocessing matter more than any clever algorithm.
- Good features make or break a model, so invest real time in feature engineering.
- Stop guessing hyperparameters; use a systematic search and let the results guide you.
So if you’re struggling with ML, keep pushing. The breakthroughs are worth it. And remember, optimisation never stops.
Want to hear more about my ML journey? Stay tuned for the next edition of Quick Reads!