登录查看更多内容

Catastrophic forgetting in machine learning: What it is and how to overcome it

Scanbot SDK

Wir bauen die schnellsten, genausten und verl?sslichsten L?sungen zur mobilen Datenerfassung für Ihre App oder Webseite.

发布日期: 2022年11月14日

Deep learning models have been making great strides by emulating the neural networks of the human brain. However, continual learning, which comes naturally to all of us, remains a challenge for artificial intelligence. One of the reasons for this is “catastrophic forgetting.” We’ll tell you why it occurs and what can be done to mitigate its effects.

What is catastrophic forgetting?

Modern software utilizes increasingly advanced deep learning techniques to perform tasks that seemed impossible for machines only a few years prior. In some areas, however, artificial neural networks still have a long way to go before they can compete with the human brain, especially when it comes to lifelong learning. The reason for this lies in the way these algorithms operate.

Training a machine learning engine involves so-called weights. Simply put, when a machine learning model receives an input, a weight is applied to approximate a certain desired output. For example, if you input a dog’s photo, you want your model to categorize it as a dog. If it outputs “cat” instead, the weights have to be adjusted.

Now imagine that you’ve successfully trained your model on dog pictures, which it recognizes 99% of the time. Then, you start another training session, this time for bird pictures. The model readjusts its weights to recognize birds – and thereby loses its ability to identify dogs. This effect is called catastrophic forgetting or catastrophic interference.

How to prevent catastrophic forgetting

Several solutions have been proposed to mitigate the effects of this phenomenon. After all, training a machine learning model for a specific task takes a lot of time, so losing progress is highly inefficient.

One way to minimize catastrophic forgetting is to ensure that when weights fine-tuned for task 1 are trained for task 2, they stay within a certain error threshold for task 1. This is called Elastic Weight Consolidation, where “elastic” means that changes in the weights are constrained more strongly the farther they drift from their previous state. Additionally, the weights most critical for good performance in task 1 are restricted more than less important ones.

Ajit Jaokar 5 个月前

DEEP LEARNING BASED OJECT RECOGNITION SYTEM: Analyzing…

Baron Ntambwe 6 个月前

Leveraging Transfer Learning for Computer Vision

DesiCrew Solutions Private Limited 1 年前

Another approach mimics the different functions of the human brain’s hippocampus and neocortex. Generally speaking, episodic memories are stored in the hippocampus, whereas the neocortex stores more general information. When certain memories are relevant in a broader sense, this information is transferred from the hippocampus to the neocortex.

In Bi-level Continual Learning, inputs are accordingly divided up between two models representing the hippocampus and the neocortex. The former bases its weight allocations on the latter and returns insights for future training. The goal is to combine a vast repository of general knowledge with the ability to quickly learn new tasks, just like humans do.

About Scanbot SDK's engine

Our own machine learning engine needs to recognize many different types of documents in order to reliably extract information from them. This makes minimizing the effects of catastrophic forgetting a priority.?

Our approach is modularization: Instead of letting our engine handle all of the tasks involved in recognizing and extracting document data by itself, we divide the tasks between several smaller modules. This means that modules can be trained individually to optimize their performance without affecting the others.

Previously, we worked on a single large model, but there were clear drawbacks. For one, training took up to 8 hours per task. For another, training this engine on different document types?– which nonetheless share many similarities –?sometimes led to mix-ups during the recognition phase. The modularization approach alleviates these problems, with a single module training session now only taking 15 to 120 minutes and less potential for confusion.

---------------------------------------------------------------------------------------------------------------

If you would like to learn more about Scanbot’s?Data Capture SDK, don’t hesitate to get in touch with our solution experts.?

Catastrophic forgetting in machine learning: What it is and how to overcome it

Scanbot SDK

Wir bauen die schnellsten, genausten und verl?sslichsten L?sungen zur mobilen Datenerfassung für Ihre App oder Webseite.

What is catastrophic forgetting?

How to prevent catastrophic forgetting

领英推荐

Applied Computer Vision

1,353 位关注者

Scanbot SDK的更多文章

社区洞察

其他会员也浏览了

Essential Concepts From Little Book of Deep Learning

Automating Neural Network Configuration with Keras-Tuner

What Are The Mechanics Of AI

An AI Evening Lecture

The Most Commonly Used Machine Learning Techniques, Explained

Friendly?? Guide to Neural Net!

Unleashing ResNet: A Game-Changer in Deep Learning

The Double Descent Hypothesis: How Bigger Models and More Data Can Hurt Performance

Deep Learning and Neural Networks

Adam, AdaGrad, RMSProp, Delta-Bar-Delta - Which Learning Rate Strategy Will Enhance Your Model?

What is catastrophic forgetting?

How to prevent catastrophic forgetting

领英推荐

Applied Computer Vision

1,353 位关注者

Scanbot SDK的更多文章

November 2024 – ???Compose Multiplatform SDK, C++ tutorial, and VIN cloning

How to use Firebase Remote Config to automatically update the Scanbot SDK license key – Capacitor guide

Creating a React Native Vision Camera Code Scanner – a step-by-step tutorial

What is the Expo framework and should you use it for React Native development?

October 2024 – ?? Our Document Scanner RTU UI v.2.0 is here!

Storage Wars: Web Edition – or, how we learned to store binary data effectively

ZXing Barcode Scanner tutorial – JavaScript library implementation

September 2024 – ?? Document Scanner RTU UI?v.2.0, Expo for React Native, and more

Kotlin vs. Swift: Comparing the powerhouses of Android and iOS development

August 2024 – ?? RTU UI for Flutter and MAUI, automatic license checks with Firebase, and more

社区洞察

其他会员也浏览了

Essential Concepts From Little Book of Deep Learning

Automating Neural Network Configuration with Keras-Tuner

What Are The Mechanics Of AI

An AI Evening Lecture

The Most Commonly Used Machine Learning Techniques, Explained

Friendly?? Guide to Neural Net!

Unleashing ResNet: A Game-Changer in Deep Learning

The Double Descent Hypothesis: How Bigger Models and More Data Can Hurt Performance

Deep Learning and Neural Networks

Adam, AdaGrad, RMSProp, Delta-Bar-Delta - Which Learning Rate Strategy Will Enhance Your Model?