Zero-Shot Learning
When we started working together, Abhijit (AdeptDC's Co-Founder) was skeptical of our approach: "Doesn't machine learning take a lot of training data? How can we detect anomalies with a low volume of data?" His apprehension was legitimate. Machine learning, and especially deep learning, typically takes a LOT of high-quality labeled data and hyperparameter tuning. These algorithms tend to be data hungry and struggle to learn from non-stationary data.
Human intelligence has two remarkable characteristics: quick learning and slow forgetting. A human can quickly learn from new experiences and update existing knowledge without forgetting prior knowledge. Ideally, an AI agent should show similar capabilities, learning continually from a small volume of data while preserving the memory of past learning. This defines a new class of continual learning, called continual low-shot learning.
The three key facets of continual low-shot learning are:
- Non-stationary data: The model is trained on a continual data stream where new data becomes available at regular or irregular intervals, and the new data may follow a different distribution from the previous data.
- Efficiency: During model training and testing, system resource consumption and computational complexity should be bounded.
- Small labeled data chunks: The volume of labeled training data points can be small (often fewer than 10).
Low-shot learning can be considered a meta-learning approach in which the model adapts its representation based on the input data.
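To make the low-shot idea concrete, here is a minimal sketch (not AdeptDC's implementation) of one common few-shot approach: a prototype-based classifier. Each class is represented by the mean embedding of its few labeled examples, and a query is assigned to the nearest prototype. The `embed` function is a placeholder standing in for any pretrained feature extractor.

```python
import numpy as np

def embed(x):
    # Placeholder for a pretrained feature extractor; identity here.
    return np.asarray(x, dtype=float)

def build_prototypes(support_x, support_y):
    """Average the embeddings of the few labeled examples per class."""
    protos = {}
    for label in set(support_y):
        vecs = [embed(x) for x, y in zip(support_x, support_y) if y == label]
        protos[label] = np.mean(vecs, axis=0)
    return protos

def classify(query, protos):
    """Assign the query to the class with the nearest prototype."""
    q = embed(query)
    return min(protos, key=lambda c: np.linalg.norm(q - protos[c]))

# 2-way, 2-shot example: two tiny clusters in feature space.
support_x = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
support_y = ["normal", "normal", "anomaly", "anomaly"]
protos = build_prototypes(support_x, support_y)
print(classify([0.95, 0.95], protos))  # -> anomaly
```

Because the classifier is just "nearest mean in embedding space," adding a new class only requires averaging a handful of new examples, with no retraining of the extractor.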
For dynamic applications and natural language processing problems, it is extremely hard to capture high-quality training data. In that context, few-shot learning can be extremely useful. The most challenging case in this domain is zero-shot learning, where no labeled examples of the target classes are available at all.
What is Zero-Shot Learning?
Zero-shot learning solves a task without having received any example of that task during the training phase. Recognizing an object in a collection of sample images, when no example images of that object appeared during training, is a typical zero-shot learning task. In short, it allows us to recognize objects we have not seen before.
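One common way to achieve this is to describe each class by auxiliary information, such as a hand-crafted attribute vector, and match an input's predicted attributes against the descriptions of classes that were never seen in training. Below is a minimal sketch of this idea; the attribute values and the `predict_attributes` placeholder are made up for illustration.

```python
import numpy as np

# Class descriptions via attributes: [has_stripes, has_mane, domestic].
# "zebra" was never seen during training; only its description is known.
class_attributes = {
    "horse": np.array([0.0, 1.0, 1.0]),
    "tiger": np.array([1.0, 0.0, 0.0]),
    "zebra": np.array([1.0, 1.0, 0.0]),  # unseen class
}

def predict_attributes(image_features):
    # Placeholder for a model trained on seen classes (horse, tiger)
    # that maps raw image features to attribute scores; identity here.
    return np.asarray(image_features, dtype=float)

def zero_shot_classify(image_features):
    """Pick the class whose attribute vector is most similar (cosine)."""
    a = predict_attributes(image_features)
    def cosine(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return max(class_attributes, key=lambda c: cosine(a, class_attributes[c]))

# An input with stripes and a mane matches the unseen "zebra" class.
print(zero_shot_classify([0.9, 0.8, 0.1]))  # -> zebra
```

The attribute mapping is learned only on seen classes, yet the classifier can name an unseen class because that class's description lives in the same attribute space.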
Why do we need zero-shot learning?
Conventional deep learning requires as many, and as diverse, samples as possible to avoid potential bias and data imbalance. Imagine that we want to recognize a species that lives in the deep ocean, where humans can hardly visit. It is not easy to collect sample images of such animals. Even if you managed to collect a few images, they should not be near-duplicates; they should be as diverse as possible, which takes a lot of effort. In addition to the difficulty of data collection, data labeling can be challenging: it might take significant subject-matter expertise to accurately recognize a near-extinct species.