Resisting the urge to overfit

I drive back from Gurgaon to Delhi via NH8 on a daily basis. For folks familiar with the drive, the entry to Delhi opposite Ambience Mall is fairly congested. As a commuter, you can take either the service lane or the highway, and you can use Google Maps to decide. The problem with Google Maps is that it is often wrong during congestion and cannot be followed blindly. For the first few days of the drive, I did what many CS folks do: I took the two paths on alternate days and observed which was faster. Once the exploration phase was over, I settled on sticking to the service lane before Ambience Mall and cutting into the highway just at the entry point. I also make it a point to observe whether the overall behaviour is drifting away from my original hypothesis. However, there are days when I get this urge to deviate from the "optimal" strategy. It could be a little extra congestion at the entry to the highway or something else: signals that are not strongly correlated with the travel time but do have a weak correlation.

I am generally considered a very logical person, but even I find it very difficult to resist this urge: adding one additional attribute to the model to solve that one extra false negative.

In my experience, there are two distinct phases in building any data science model. The first phase, or the lab phase, is where you collect the data to build your model. In this phase, you go out and talk to everyone under the sun, trying to get as representative a data set as possible. Once you have the data, you disappear into a lab and come out only after you have a model and all sorts of metrics (precision, recall, F1, a confusion matrix) in hand. You have a smug smile on your face as you throw these metrics at your business stakeholders, eager to get to the real-world "deployment phase".
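
To make that lab phase concrete, here is a minimal sketch of the metrics hand-off, assuming a binary classification problem and scikit-learn; the synthetic data and the random forest are placeholders chosen for illustration, not the model from this story.

```python
# A minimal, illustrative sketch of the "lab phase": train on collected data,
# hold out a test set, and report aggregate metrics. All names here are
# placeholders, not anything from the article.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# The data you gathered by "talking to everyone under the sun".
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Precision, recall and F1 on the held-out set, plus the confusion matrix:
# these aggregates, not any single sample, are what you carry to the business.
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
```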

Of course, no one on the business side understands any of these metrics. So they do what they understand. They call someone in the field and ask them to send you 10 data points. You run the data through the model and, voila, it is correct 9 times out of 10. You claim success again: see, the precision is 90%.

The business exec, however, looks glumly at the one sample that you missed. Can we solve for that? Even better, they already have an answer: if you just put a condition on attribute X > Y, you get what you need. That is what you need to change in your algorithm. It does not seem to matter that you were using a deep learning model that already includes X (and there is no "algorithm" to change), or a stochastic decision tree that achieves higher accuracy at thresholds other than the Y that happens to work for this one false negative. That negative sample is all that matters at this point in the discussion.
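
To see why that request is dangerous, here is a hedged sketch, continuing the illustrative setup above, of what happens when you bolt a hand-picked rule onto the model's predictions; FEATURE_X and THRESHOLD_Y are hypothetical stand-ins for the exec's attribute X and cut-off Y.

```python
# Illustrative only: override the model's prediction with a rule chosen to fix
# one false negative, then compare aggregate precision on the held-out set.
# `model`, `X_test`, `y_test` come from the earlier sketch.
import numpy as np
from sklearn.metrics import precision_score

FEATURE_X = 7      # hypothetical feature index, not from the article
THRESHOLD_Y = 0.0  # hypothetical cut-off that "works" for that one sample

y_pred_model = model.predict(X_test)
y_pred_patched = np.where(X_test[:, FEATURE_X] > THRESHOLD_Y, 1, y_pred_model)

print("precision before patch:", precision_score(y_test, y_pred_model))
print("precision after patch: ", precision_score(y_test, y_pred_patched))
# The patch may rescue the one sample in the room, but it typically flips many
# correct negatives to positives as well, dragging aggregate precision down.
```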

This is where it is absolutely important to hold your ground. It is important to bring the discussion back to the data set and the metric you are targeting. If the data set was not representative, it needs to be fixed. If a precision of 80% is not good enough, we need to take it up to 90%. However, under no circumstances can you guarantee that the model will work on that one negative sample. The only goal you can take back is high precision (or whatever metric makes sense for the business) on a real-world test set.

Often enough, I have seen data scientists and engineers come back and tweak the model to somehow fit that particular sample. Sometimes this is done by choosing model parameters that make that sample work; at other times, by including the sample in the training set and picking a deeper network that fits the training set more closely. It is much simpler to change just enough in your model to get that offending sample out of the way (at the cost of overfitting and lower precision in the real world) than to convince others that they need to change the way they think: to look at aggregates, not individual samples, when assessing whether a model is working.
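
As a rough illustration of that memorization trap, again under the same assumed setup as the earlier sketches, compare a shallow tree with one grown deep enough to fit every training sample, including the offending one.

```python
# Illustrative only: a fully grown tree can fit the training set (and the one
# sample the business cares about) while doing worse on the held-out set.
# `X_train`, `y_train`, `X_test`, `y_test` come from the earlier sketch.
from sklearn.metrics import precision_score
from sklearn.tree import DecisionTreeClassifier

shallow = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
deep = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X_train, y_train)

for name, clf in [("shallow", shallow), ("memorizing", deep)]:
    train_p = precision_score(y_train, clf.predict(X_train))
    test_p = precision_score(y_test, clf.predict(X_test))
    print(f"{name}: train precision={train_p:.2f}, test precision={test_p:.2f}")
# A model that "always works in training" is exactly the warning sign:
# it has memorized, not learned.
```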

This urge to overfit is common to us all. But take a deep breath and let that one sample fail. Never forget that a model that always works in training does not learn; it only memorizes.

Zafar Ahmed Ansari

Engineer ? Philosopher ! Writer ?

2y

It all goes back to the first principles. Do the right thing, period.

