Problem Solving vs Parameter Tuning

When building machine learning models, it is very tempting to spend a lot of time optimizing them by adjusting and fine-tuning hyper-parameters – the knobs and switches of the model. Hyper-parameter tuning is certainly an important step: it sometimes yields a few percentage points of improvement (better accuracy, better ROC/AUC, etc.) over the default settings. A minimal sketch of this below.
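
Here is a minimal sketch (scikit-learn, with a synthetic dataset and an illustrative grid – both assumptions, not from this article) of comparing default settings against a grid search; the gap is typically a few points at most:

```python
# Compare a model with default hyper-parameters against a grid-searched one.
# Dataset, grid values, and metric are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Default settings: no tuning at all.
default_model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
default_auc = roc_auc_score(y_test, default_model.predict_proba(X_test)[:, 1])

# Grid-searched settings over a small, hypothetical grid.
grid = {"max_depth": [5, 10, None], "n_estimators": [100, 300]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), grid, scoring="roc_auc", cv=5
)
search.fit(X_train, y_train)
tuned_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print(f"Default AUC: {default_auc:.3f}, Tuned AUC: {tuned_auc:.3f}")
```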

But an inordinate amount of focus goes to this tuning task and to applying multiple modeling techniques (KNNs, random forests, gradient boosting, deep neural nets), each with its own set of knobs to adjust (gamma, learning rate, max depth, epochs, batch size, optimizer, etc.). So much time is spent wringing out the last bit of performance that not enough time (often no time at all) is spent revisiting the problem definition and the data collection. Unfortunately, this is especially true when model performance is mediocre: instead of revisiting the problem we are trying to solve, we tend to 'improve' the (bad) performance by a few percentage points.

There appears to be a false belief that by searching a large enough grid of hyper-parameter values, the 'best' model can be achieved. This is far from the truth. The real best model is achieved by properly defining the problem question, capturing the data elements that inherently affect the outcome, and then creating/engineering new, information-laden features (variables like BMI that combine height and weight, or replacing postal code with median home values, or replacing state with unemployment rate, etc. – see the sketch below). When the performance metrics are less than ideal, we should revisit the problem definition and the data collection – hard tasks. Instead, through inertia (or escapism), a lot of time is spent on tuning/optimizing the model.
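
A minimal pandas sketch of those feature-engineering ideas; the column names and the median-home-value lookup table are hypothetical stand-ins:

```python
import pandas as pd

# Hypothetical raw data: height, weight, and a high-cardinality postal code.
df = pd.DataFrame({
    "height_m": [1.70, 1.80, 1.65],
    "weight_kg": [70, 95, 55],
    "postal_code": ["10001", "94105", "60601"],
})

# Combine height and weight into a single information-laden feature (BMI).
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

# Replace the postal code with a numeric proxy that carries more signal
# (median home value per postal code -- an assumed lookup table).
median_home_value = {"10001": 1_200_000, "94105": 1_500_000, "60601": 450_000}
df["median_home_value"] = df["postal_code"].map(median_home_value)

# Drop the raw columns now that the engineered features capture the signal.
df = df.drop(columns=["height_m", "weight_kg", "postal_code"])
print(df)
```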

When in doubt, it always helps to remember (where ">" means "more important than"):

Problem definition > data identification > feature creation > modeling technique > hyper-parameter tuning

Finally, when current model performance is mediocre, we must recall Tukey's words: "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data".

So, focus on getting better data and more info-rich predictors. Good luck!
