
Incremental learning - The Live Machine learning model training Approach

Ashish Patel

Sr AWS AI ML Solution Architect at IBM | Generative AI Expert Strategist | Author Hands-on Time Series Analytics with Python | IBM Quantum ML Certified | 12+ Years in AI | IIMA | 100k+Followers | 6x LinkedIn Top Voice |

Published: December 26, 2019

“Why is data so essential to every industry?”

People now produce enormous amounts of data on social media platforms such as Facebook, LinkedIn, Snapchat, Instagram, and WhatsApp. Industries follow the same pattern: they log every piece of data they may later need to examine. Data that is generated continuously in real time is known as streaming data (live data). Industries such as healthcare, retail, manufacturing, finance, banking, insurance, education, transportation, supply chain management and logistics, agriculture, energy, government, hospitality, professional services, and sports generate large amounts of data every day in different formats such as text, audio, video, and images. In this article, we will discuss a practical implementation with streaming data.

“What is Online Learning (Incremental Learning)?”

Online learning, also known as stream learning or incremental learning, is a technique in which input data continuously extends the model's existing knowledge, so that the model keeps training as new data arrives. In other words, incremental learning refers to learning from streaming data.

Machine learning provides robust solutions to industry. Most industry applications today use the batch method: the data is given to the model as a batch, together with hyperparameters (meta-parameters) for training. The hyperparameters are tuned so that the model learns as much as possible from the data, and the model stops learning once it produces an optimal result. In this approach, the model can be chosen carefully based on the given dataset. Incremental learning, in contrast, refers to continuous model optimization based on continuously incoming data streams. Models of this kind are found in systems that behave autonomously, such as self-driving cars and robots.

Little maths of Incremental Learning

In supervised learning, the data D = ((x_1, y_1), (x_2, y_2), …, (x_m, y_m)) consists of inputs x and outputs y.

The task is to infer a model M ≈ p(y|x) from this data. Machine learning often trains on such data in batch mode.

In incremental learning, the data D is not available in advance but arrives over time.

The task is to infer a reliable model M_t after every time step t, based only on the example (x_t, y_t) and the previous model M_{t-1}.

This is realized by online learning approaches, which use training samples one by one, without knowing their number in advance, to optimize an internal cost function.
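The update from the previous model to the next one can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the model is a logistic regression stored as a dictionary of weights, the cost function is the log loss, and the helper names (sigmoid, predict_proba, update), the learning rate, and the toy stream are all made up for this example.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, x):
    # p(y = 1 | x) under the current model.
    z = sum(weights.get(name, 0.0) * value for name, value in x.items())
    return sigmoid(z)

def update(weights, x, y, lr=0.1):
    # One SGD step on the log loss: builds M_t from M_{t-1} and (x_t, y_t) only.
    error = predict_proba(weights, x) - float(y)
    new_weights = dict(weights)
    for name, value in x.items():
        new_weights[name] = new_weights.get(name, 0.0) - lr * error * value
    return new_weights

# Toy stream (made-up values): the label is True when "a" dominates "b".
stream = [
    ({"a": 1.0, "b": 0.0}, True),
    ({"a": 0.0, "b": 1.0}, False),
    ({"a": 0.9, "b": 0.2}, True),
    ({"a": 0.1, "b": 0.8}, False),
] * 50

weights = {}  # M_0: an empty model
for x_t, y_t in stream:
    weights = update(weights, x_t, y_t)  # no past examples are stored

p_pos = predict_proba(weights, {"a": 1.0, "b": 0.0})
p_neg = predict_proba(weights, {"a": 0.0, "b": 1.0})
```

Each step touches a single example and the current weights, so memory use does not grow with the length of the stream.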

Algorithms that Support Incremental Learning

Incremental learning is easily achieved with stochastic optimization techniques such as online back-propagation. Incremental variants also exist for the following algorithms:

Support vector machine (SVM)
Radial Basis Function Networks (RBF)
Learning Vector Quantization (LVQ)
k-nearest neighbor (k-NN)
Logistic Regression
Decision Tree(DT)
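To make one entry of this list concrete, here is a rough sketch of how Learning Vector Quantization can learn one sample at a time. It assumes the basic LVQ1 rule (pull the nearest prototype toward the sample when the labels match, push it away otherwise). The function names, the starting prototypes, the learning rate, and the data are hypothetical and chosen only for illustration.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(prototypes, x):
    # Index of the prototype closest to x.
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))

def lvq1_update(prototypes, x, y, lr=0.1):
    # LVQ1 step: move the winning prototype toward x if its label matches y,
    # away from x otherwise. Only the winner changes.
    i = nearest(prototypes, x)
    vector, label = prototypes[i]
    sign = 1.0 if label == y else -1.0
    moved = [w + sign * lr * (xi - w) for w, xi in zip(vector, x)]
    prototypes[i] = (moved, label)

def predict(prototypes, x):
    return prototypes[nearest(prototypes, x)][1]

# One prototype per class, with illustrative starting points.
prototypes = [([0.4, 0.4], "low"), ([0.6, 0.6], "high")]

# Toy stream (made-up values), processed one sample at a time.
stream = [([0.1, 0.2], "low"), ([0.9, 0.8], "high"),
          ([0.2, 0.1], "low"), ([0.8, 0.9], "high")] * 10
for x, y in stream:
    lvq1_update(prototypes, x, y)
```

After the loop, predict(prototypes, [0.0, 0.0]) returns "low" and predict(prototypes, [1.0, 1.0]) returns "high".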

Practical Implementation of Incremental Learning

Creme is a library for online machine learning, also known as incremental learning. Online learning is a machine learning regime in which a model learns one observation at a time, in contrast to batch learning, where all the data is processed at once.

Incremental learning is desirable when the data is too large to fit in memory, or simply when you want to handle streaming data. In addition to many online machine learning algorithms, Creme provides utilities to extract features from a stream of data.

In the following example, we will train a logistic regression to predict whether the price of electricity will increase or decrease in the subsequent 30 minutes. We will use real-world electricity price data from New South Wales, Australia. The dataset can be streamed with the fetch_electricity() function from the datasets module. The notebook code below shows what the first observation looks like.
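As an illustration of what computing a feature from a stream can look like, the sketch below keeps a running mean and variance with Welford's algorithm and uses them to standardize a value. It is not Creme's code; the RunningStats class and the sample values are assumptions made for this example.

```python
class RunningStats:
    """Running mean and variance over a stream (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        # Constant-time, constant-memory update for one new observation.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)
        return self

    @property
    def variance(self):
        # Population variance of the values seen so far.
        return self._m2 / self.n if self.n > 0 else 0.0

    def scale(self, value):
        # Standardize a value with the statistics seen so far.
        std = self.variance ** 0.5
        return (value - self.mean) / std if std > 0 else 0.0

stats = RunningStats()
for price in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # made-up values
    stats.update(price)
```

Only three numbers are stored regardless of how many observations have been seen, which is why this style of computation suits data that does not fit in memory.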

Installation of Creme: https://creme-ml.github.io/install.html

NoteBook Code:

# installation

> !pip install creme

Collecting creme

Downloading https://files.pythonhosted.org/packages/25/7f/11df4db8cdc957fc3134c9ac18d2f6446e9810416cd376ca7ed777e3c091/creme-0.4.4-cp37-cp37m-win_amd64.whl (558kB)

Requirement already satisfied: scipy>=1.3.0 in c:\users\prompt\anaconda3\envs\prompt\lib\site-packages (from creme) (1.3.3)

Collecting scikit-learn>=0.21.2

Using cached https://files.pythonhosted.org/packages/9d/10/1dd2e3436e13402cc2b16c61b5f7407fb2e8057dcc18461db0d8e3523202/scikit_learn-0.22-cp37-cp37m-win_amd64.whl

Requirement already satisfied: numpy>=1.16.4 in c:\users\prompt\anaconda3\envs\prompt\lib\site-packages (from creme) (1.17.4)

Requirement already satisfied: joblib>=0.11 in c:\users\prompt\anaconda3\envs\prompt\lib\site-packages (from scikit-learn>=0.21.2->creme) (0.14.1)

Installing collected packages: scikit-learn, creme

Found existing installation: scikit-learn 0.20.4

Uninstalling scikit-learn-0.20.4:

Successfully uninstalled scikit-learn-0.20.4

Successfully installed creme-0.4.4 scikit-learn-0.22

ERROR: pyod 0.7.5.1 has requirement scikit-learn<=0.21.*,>=0.19.1, but you'll have scikit-learn 0.22 which is incompatible.

In [2]:

from creme import datasets

In [3]:

X_y = datasets.fetch_electricity()

In [4]:

x, y = next(X_y)

In [5]:

x

Out[5]:

{'date': 0.0,
 'day': 2,
 'period': 0.0,
 'nswprice': 0.056443,
 'nswdemand': 0.439155,
 'vicprice': 0.003467,
 'vicdemand': 0.422915,
 'transfer': 0.414912}

In [6]:

y

Out[6]:

True

In [7]:

from creme import datasets

from creme import linear_model

from creme import metrics

from creme import optim

from creme import preprocessing

In [8]:

X_y = datasets.fetch_electricity()

In [9]:

model = preprocessing.StandardScaler()

In [10]:

model |= linear_model.LogisticRegression(optimizer=optim.SGD(.1))

In [11]:

metric = metrics.Accuracy()

In [12]:

for x, y in X_y:
    y_pred = model.predict_one(x)      # Make a prediction before seeing the label
    metric = metric.update(y, y_pred)  # Update the metric
    model = model.fit_one(x, y)        # Update the model with the new observation

In [13]:

print(metric)

Accuracy: 0.894642
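The loop above predicts each observation before learning from it, so every prediction is scored on data the model has not yet seen. The same predict-then-learn pattern can be sketched without any library. The MajorityClassifier baseline, the progressive_accuracy function, and the toy stream below are hypothetical and only mirror the predict_one / fit_one method names used in the notebook.

```python
from collections import Counter

class MajorityClassifier:
    """Hypothetical baseline: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def fit_one(self, x, y):
        self.counts[y] += 1
        return self

def progressive_accuracy(stream, model):
    # Score each prediction first, then let the model learn from the example.
    correct = 0
    total = 0
    for x, y in stream:
        y_pred = model.predict_one(x)
        correct += int(y_pred == y)
        total += 1
        model.fit_one(x, y)
    return correct / total if total else 0.0

# Toy stream (made-up values).
stream = [({"f": 0}, True), ({"f": 1}, True), ({"f": 2}, False), ({"f": 3}, True)]
accuracy = progressive_accuracy(stream, MajorityClassifier())
print(accuracy)  # 0.5
```

No separate test set is needed here, because each example serves as test data once and as training data afterwards.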

More examples with the creme package:

A quick overview of batch learning
A hands-on introduction to incremental learning
Bike-sharing forecasting (regression)
Building a simple time series model
The art of using pipelines
Debugging a pipeline
Handling uncertainty with quantile regression

Here are a few resources if you want to do some reading:

Online learning – Wikipedia
What is online machine learning? – Max Pagels
Introduction to Online Learning – USC course
Online Methods in Machine Learning – MIT course
Online Learning: A Comprehensive Survey
Streaming 101: The world beyond batch
Machine learning for data streams
Data Stream Mining: A Practical Approach

References:

1. https://www.researchgate.net/publication/224720096_Overview_of_Some_Incremental_Learning_Algorithms

2. https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2016-19.pdf

3. https://creme-ml.github.io/

