Static models in a rapidly changing dynamic world

pradeep ponduri

Develops on AWS...... Data Engineer @Amazon

Published: August 2, 2021

We always develop machine learning solutions to solve real-life problems. The data we use to train the models is nothing but static numerical or categorical values. Measuring model performance and deploying the model makes us feel like achievers. But the problems start when the client is not happy with the predictions and we go looking for the root cause. In my data journey, the first rule I learned was "Never Trust Data". We build models on static data and forget that data changes constantly. Using historical data for model building is fine, but that data becomes outdated at some point, while our models still treat certain features as highly important. This story is about how the models we build stay static while things in real life change, and what steps we can take to keep our ML models consistent.

The pressing question is whether we are keeping track of data changes and model performance. Upstream data changes quickly, and our model is not built to account for these changes. So consistent data checks should be run on upstream data before it is fed into the ML model. Some of the checks I would do are making sure the distribution is consistent and that the trend and pattern follow the historical data (the data used to train the model). Distribution similarity checks help us learn about changes, and we can trigger an alarm if the similarity score falls below a chosen threshold.
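
One way to picture such a distribution similarity check is the rough sketch below. It uses histogram intersection as the similarity score, which is only one of several possible measures and is my choice for illustration, not a method prescribed above. The function names, the bin count, and the 0.8 alarm threshold are all assumptions made for the example.

```python
from typing import List, Sequence


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Return the fraction of values in each of `bins` equal-width buckets over [lo, hi]."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        # Values outside [lo, hi] are clamped into the first or last bucket.
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def similarity_score(train: Sequence[float], incoming: Sequence[float], bins: int = 10) -> float:
    """Histogram intersection: close to 1.0 when the binned distributions match,
    close to 0.0 when they barely overlap."""
    lo, hi = min(train), max(train)  # bucket edges come from the training data only
    p = histogram(train, lo, hi, bins)
    q = histogram(incoming, lo, hi, bins)
    return sum(min(a, b) for a, b in zip(p, q))


def should_alarm(train: Sequence[float], incoming: Sequence[float], threshold: float = 0.8) -> bool:
    """Flag the incoming batch when its similarity to the training data is too low.
    The 0.8 default is an arbitrary illustrative threshold."""
    return similarity_score(train, incoming) < threshold


if __name__ == "__main__":
    training_feature = [float(v) for v in range(100)]
    shifted_feature = [v + 200.0 for v in training_feature]  # simulated upstream change
    print(similarity_score(training_feature, training_feature))
    print(should_alarm(training_feature, shifted_feature))
```

In practice the check would run per feature on each upstream batch before scoring, and the threshold would be tuned against historical batches that are known to be healthy.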

We should also validate model performance at regular intervals and retrain the model when needed. A real-time monitoring dashboard for tracking the data checks is useful and makes it easy to send notifications when certain thresholds are crossed. If the model is given a data point unlike anything it has seen before, it is likely to produce wrong predictions.
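
As a rough sketch of validating performance at intervals, the snippet below tracks accuracy over a sliding window of recent labelled predictions and flags when retraining may be worth considering. The `PerformanceMonitor` class, the window size, and the accuracy threshold are hypothetical names and values chosen for illustration. It also assumes that true labels eventually become available for comparison.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window: int = 100, min_accuracy: float = 0.9):
        self.window = window
        self.min_accuracy = min_accuracy
        # deque with maxlen drops the oldest result once the window is full.
        self.results = deque(maxlen=window)

    def record(self, predicted, actual) -> None:
        """Store whether a single prediction matched the true label."""
        self.results.append(predicted == actual)

    def accuracy(self) -> float:
        if not self.results:
            raise ValueError("no labelled predictions recorded yet")
        return sum(self.results) / len(self.results)

    def needs_retraining(self) -> bool:
        # Wait until the window is full to avoid noisy alarms on a few samples.
        if len(self.results) < self.window:
            return False
        return self.accuracy() < self.min_accuracy


if __name__ == "__main__":
    monitor = PerformanceMonitor(window=4, min_accuracy=0.75)
    for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
        monitor.record(predicted, actual)
    print(monitor.accuracy(), monitor.needs_retraining())
```

The value returned by `needs_retraining()` could feed a dashboard or notification; whether to actually retrain remains a judgment call that depends on the cost of retraining and the cost of wrong predictions.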

Comment below with your thoughts on how ML model monitoring can be done.
