The brains behind Google's Bard
In this newsletter, we will discuss one of the components behind Google's Bard AI: Bidirectional Encoder Representations from Transformers (BERT), an approach that analyses a given text in a bidirectional manner, using context from both the left and the right of each word.
Fundamentally, there are 2 approaches to applying pre-trained language representations to downstream tasks: feature-based (for example, ELMo) and fine-tuning (for example, OpenAI GPT).
However, both of these approaches have a flaw: they use a unidirectional language model (left-to-right or right-to-left) to produce general language representations. BERT instead uses bidirectional pre-training, so every token can draw on context from both sides. At a basic level, there are 2 steps in this framework: pre-training and fine-tuning.
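To make the unidirectional vs. bidirectional distinction concrete, here is a minimal toy sketch (my own illustration, not Google's implementation): at the attention level, the difference comes down to the mask applied over the attention scores. A left-to-right model blocks each token from attending to positions on its right, while BERT's encoder lets every token attend to every other token.

```python
# Toy self-attention masking, illustrating unidirectional vs. bidirectional context.
import numpy as np

seq_len = 5  # a toy sequence of 5 tokens

# Unidirectional (left-to-right): token i may only attend to positions <= i.
causal_mask = np.tril(np.ones((seq_len, seq_len)))

# Bidirectional (BERT): every token may attend to every other token.
bidirectional_mask = np.ones((seq_len, seq_len))

scores = np.random.randn(seq_len, seq_len)                # raw attention scores (random toy values)
masked_scores = np.where(causal_mask == 1, scores, -1e9)  # block attention to future positions

# Softmax over each row turns the blocked entries into ~0 attention weight.
weights = np.exp(masked_scores) / np.exp(masked_scores).sum(axis=-1, keepdims=True)
print(np.round(weights, 2))  # the upper triangle is 0: no right-hand context is used
```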
There are 2 model sizes currently available, BERT-Base and BERT-Large, specified by the following hyperparameters:
L = Total number of layers (Transformer blocks)
H = Hidden size
A = No. of self-attention heads
TP = Total parameters
BERT-Base: L = 12, H = 768, A = 12, TP = 110M
BERT-Large: L = 24, H = 1024, A = 16, TP = 340M
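If you want to verify these numbers yourself, one option (an assumption on my part, not something from the paper) is to read them off the published checkpoints with the open-source Hugging Face transformers library:

```python
# Inspect L, H and A for the two published BERT checkpoints.
# Requires `pip install transformers` and an internet connection to fetch the configs.
from transformers import BertConfig

for name in ("bert-base-uncased", "bert-large-uncased"):
    cfg = BertConfig.from_pretrained(name)
    print(name, "L =", cfg.num_hidden_layers, "H =", cfg.hidden_size, "A =", cfg.num_attention_heads)
```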
Pre-Training
Task 1: Masked LM
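Instead of predicting the next word, BERT masks out 15% of the input tokens at random and learns to predict the original tokens using context from both sides. Of the selected tokens, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged. Below is a minimal sketch of that masking recipe; the toy vocabulary and helper function are my own simplified illustration, not the paper's code.

```python
# A simplified version of BERT's masking recipe (15% selected; 80/10/10 split).
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "[MASK]"]  # toy vocabulary

def mask_tokens(tokens, mask_prob=0.15):
    inputs, labels = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            labels.append(tok)                       # the model must predict the original token
            r = random.random()
            if r < 0.8:
                inputs.append("[MASK]")              # 80%: replace with the [MASK] token
            elif r < 0.9:
                inputs.append(random.choice(VOCAB))  # 10%: replace with a random token
            else:
                inputs.append(tok)                   # 10%: keep the original token
        else:
            inputs.append(tok)
            labels.append(None)                      # unmasked tokens are not predicted
    return inputs, labels

print(mask_tokens(["the", "cat", "sat", "on", "the", "mat"]))
```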
Task 2: Next Sentence Prediction
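To teach the model relationships between sentences, BERT is also pre-trained on sentence pairs: 50% of the time sentence B actually follows sentence A in the corpus (labelled IsNext), and 50% of the time B is a random sentence from another document (labelled NotNext). Here is a rough sketch of how such pairs can be built; the toy corpus and helper function are my own illustration.

```python
# Build a Next Sentence Prediction training example from a toy corpus.
import random

corpus = [
    ["the cat sat on the mat", "it then fell asleep"],
    ["rain is expected today", "carry an umbrella"],
]

def make_nsp_example(doc, corpus):
    sent_a = doc[0]
    if random.random() < 0.5:
        sent_b, label = doc[1], "IsNext"                 # the true next sentence
    else:
        other = random.choice([d for d in corpus if d is not doc])
        sent_b, label = random.choice(other), "NotNext"  # a random sentence from another document
    return f"[CLS] {sent_a} [SEP] {sent_b} [SEP]", label

print(make_nsp_example(corpus[0], corpus))
```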
Pre-Training Data:
BERT is pre-trained on the BooksCorpus (800M words) and English Wikipedia (2,500M words), with tables, lists, and headers removed from Wikipedia, since a document-level corpus is more reliable than a shuffled sentence-level one for extracting long contiguous sequences.
Fine-Tuning
All pre-trained parameters are now fine-tuned end to end. During fine-tuning, the input sentences A and B are analogous to (see the sketch after this list):
1: sentence pairs in paraphrasing
2: hypothesis-premise pairs in entailment
3: question-passage pairs in question answering
4: a degenerate text-∅ pair (the second segment left empty) in text classification or sequence tagging
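As a concrete illustration of end-to-end fine-tuning on a sentence-pair task such as paraphrasing, here is a sketch that uses the open-source Hugging Face transformers library rather than Google's original codebase; the checkpoint name, toy labels, and learning rate are my own illustrative choices.

```python
# Fine-tune a pre-trained BERT for sentence-pair classification (one toy training step).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Sentence A and sentence B are packed into a single input, separated by [SEP].
enc = tokenizer("The car is red.", "The automobile is red.", return_tensors="pt")
labels = torch.tensor([1])  # 1 = paraphrase, 0 = not a paraphrase (toy label)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**enc, labels=labels)  # every pre-trained parameter receives gradients,
outputs.loss.backward()                # i.e. the whole network is fine-tuned end to end
optimizer.step()
```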
It is impossible to cover the complete working of BERT on a single page. For the full details, see the research paper published by Google, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2018, arXiv:1810.04805). If you aspire to become an algorithm designer like me, I would recommend reading it.
A few more technical newsletters are coming up. Stay tuned!