A continuously learning ML/Neural_Network model: is it possible?
So after some long hours of back-testing a model on a stream of real-time data, something interesting happened.
The model's prediction accuracy was pristine, but it didn't last more than an hour or two. I thought there might be some problem with the data being fed into the model ("YES, Garbage In -> Garbage Out"). I started investigating and stumbled upon a concept called "Concept Drift" that explains exactly this phenomenon.
It does describe my problem, and I had accidentally stumbled into a whole research world that is trying to solve it. "Concept Drift" occurs when the model is fed inputs whose characteristics are totally different from the data it was trained on, and it seems inevitable for any model you deploy in a real-life scenario, or at least that's my impression (PS: I am an undergrad student who learned to implement ML/Neural_Net ideas from reading a lot of blog posts, websites and books). There might be a way to overcome this, and companies may well use it in deployment, but after hours and hours of Googling I couldn't find any article that helped me solve the problem.
I was using an #LSTM, and never in my life had I wished so much that such an advanced system would have an "Adaptive Learning" feature that can be turned on/off. You see, once the model is trained, everything is fixed and there is NO MORE LEARNING while it is predicting, which is not really desirable in an ever-changing environment. A baby might like vanilla and despise chocolate today, but that preference might change within a month, or even a week. We can't have a model that requires constant pampering after it is deployed and left out in the wild.
So I thought of making a model that keeps learning while simultaneously making predictions. But there is a basic problem here.
The MinMax Scaler I used while training the model has its min and max parameters set from the training data, but
"WHAT IF THE SCALER HAS TO SCALE SOMETHING WHICH HAS A RANGE THAT IS DIFFERENT FROM THE ONE IT WAS FITTED WITH??"
For example, say the scaler used during training had its min and max parameters set to 0 and 20 respectively, and the values it now has to transform range from -5 to 25. In this scenario the scaler will definitely not perform as intended, leading to a garbage input.
The MinMax Scaler formula, x_scaled = (x - x_min) / (x_max - x_min), proves my point: anything outside the fitted [x_min, x_max] range lands outside [0, 1]. Now you may ask, "IS IT TRUE FOR ALL THE SCALERS?" and my answer is yes, at least when you work on challenging datasets that involve time-series regression, because any scaler whose parameters are fitted once on historical data can be caught off guard the same way.
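To see this in code, here is a quick check with scikit-learn's MinMaxScaler: fit it on data spanning 0 to 20, then transform values spanning -5 to 25. The toy arrays are just made up to mirror the example above.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(np.array([[0.0], [20.0]]))     # min and max learned from "training" data: 0 and 20

new_data = np.array([[-5.0], [25.0]])     # live data outside the fitted range
print(scaler.transform(new_data))         # [[-0.25], [1.25]] -- no longer inside [0, 1]
```

The scaler happily produces -0.25 and 1.25, which is exactly the garbage input I was worried about.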
OK, now let's imagine we are making a model for smart predictions on stock prices, and we have a stream of data entering the model. At the start it performs with the intended accuracy, but after a few hours the accuracy drops, and now you have ended up with huge losses because of that degrading accuracy.
What might the problem be here? Could it be the varying data range? The varying standard deviation and variance? The varying trends or seasonality, or something else entirely? It could be anything, but the point is that the model is performing poorly and we need to swap it for another model, which is not ideal after deployment.
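One thing we can at least do is watch the incoming stream for statistical shifts. Below is a rough sketch of that idea; the stream() generator, the stored training statistics and the threshold are all made-up placeholders for illustration, not a proper drift-detection algorithm.

```python
import numpy as np
from collections import deque

TRAIN_MEAN, TRAIN_STD = 10.0, 2.0   # assumed statistics saved from the training data
WINDOW = 200                        # size of the sliding window of live values

def stream():
    """Placeholder for the real-time feed (assumption); deliberately drifted for the demo."""
    while True:
        yield np.random.normal(12.0, 3.0)

recent = deque(maxlen=WINDOW)
for value in stream():
    recent.append(value)
    if len(recent) == WINDOW:
        # Flag a shift when the window mean moves more than one training std away
        shift = abs(np.mean(recent) - TRAIN_MEAN) / TRAIN_STD
        if shift > 1.0:             # threshold chosen arbitrarily for the sketch
            print(f"possible drift: window mean {np.mean(recent):.2f}")
            break
```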
Why not have a model that is simultaneously trained on the same stream of data it is given for prediction? Yes, that can be a good solution, but making a model that can adaptively change its weights during prediction has its own complications. Parallel processing can be an answer, but it requires us as programmers to time the threads accurately, which is daunting once you take unknown, variable latency into account. And what about the data points we miss in real time while the model is training, before the prediction phase starts?
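For what it's worth, here is a minimal sketch of the "predict, then learn" idea with a Keras LSTM: serve the prediction for the newest window, then immediately call train_on_batch on that same window. The next_window() generator and the model dimensions are made-up placeholders, and this single-threaded version sidesteps the threading and latency questions above entirely.

```python
import numpy as np
import tensorflow as tf

TIMESTEPS, FEATURES = 30, 1

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(TIMESTEPS, FEATURES)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

def next_window():
    """Placeholder for the real-time feed (assumption): yields scaled (window, target) pairs."""
    while True:
        yield (np.random.rand(TIMESTEPS, FEATURES).astype("float32"),
               np.random.rand(1).astype("float32"))

source = next_window()
for _ in range(100):                              # pretend we watch 100 ticks of the stream
    window, target = next(source)
    x = window[np.newaxis, ...]                   # shape (1, timesteps, features)
    y = target[np.newaxis, ...]                   # shape (1, 1)
    prediction = model.predict(x, verbose=0)      # serve the prediction first ...
    model.train_on_batch(x, y)                    # ... then update the weights on the same point
```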
Can we trick the model into giving out predictions by keeping it in a training state indefinitely and extracting the raw outputs of its last layer manually? That might be a possible way, but I am no good at it without prior experience in doing so.
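As far as I can tell, with a custom training loop this doesn't even need a trick: the forward pass needed to compute the loss is already the raw output of the last layer, so it can simply be returned as the prediction. A sketch of that, assuming the same model and data shapes as the previous snippet:

```python
import tensorflow as tf

loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_and_predict(model, x, y):
    # One step of the stream: the forward pass we need for the loss
    # doubles as the prediction we want to serve.
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)          # raw output of the last layer
        loss = loss_fn(y, y_pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return y_pred, loss
```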
These are only a few challenges that I could come up with as a fresher, and these questions might be silly, but hey, everybody asks a silly question now and then out of pure curiosity during the learning phase. The companies that are built solely on providing ML solutions might hold the key to my questions, but for now I feel like a clueless idiot, even though I have built 10+ ML models and solved a variety of problems. And these are exactly the questions that the 1000+ blogs/websites/YouTube channels that teach beginners how to code ML/ANN/DNN don't touch upon.
Anyways, I guess this post has become too long. If you have made it this far, thank you, and if you feel I would be a good candidate for discussions such as this one, do feel free to reach out.
BYE