The brains behind Google's Bard
In this newsletter, we will discuss one of the components behind Google's AI chatbot Bard: Bidirectional Encoder Representations from Transformers (BERT), an approach that analyses a given text in a bidirectional manner.
Fundamentally, there are two approaches to applying pre-trained language representations:
- Embeddings from Language Models (ELMo)
- Generative Pre-trained Transformer (GPT)
However, both of these approaches share a flaw: they use a unidirectional language model (left-to-right or right-to-left) to produce general language representations. BERT instead uses bidirectional pre-training to learn general language representations. At a basic level, there are two steps in this framework.
- Pre-training: the model is trained on unlabeled data across different pre-training tasks
- Fine-tuning: the parameters learned during pre-training are fine-tuned with labelled data for the downstream task
Two models are currently available (a sketch for inspecting their sizes follows the list):
- BERTbase (L = 12, H = 768, A = 12, TP = 110M)
- BERTlarge (L = 24, H = 1024, A = 16, TP = 340M)
L = Total number of layers (Transformer layers)
H = Hidden Size
A = No. of self-attention heads
TP = Total Parameters
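To make these sizes concrete, here is a minimal sketch, assuming the Hugging Face transformers library and its hosted bert-base-uncased and bert-large-uncased checkpoints (neither of which is mentioned above), that reads L, H and A from the model configuration and counts the total parameters:

```python
# Minimal sketch: inspect L, H, A and total parameters of the public BERT checkpoints.
# Assumes the Hugging Face `transformers` library and its hosted models.
from transformers import BertConfig, BertModel

for name in ["bert-base-uncased", "bert-large-uncased"]:
    config = BertConfig.from_pretrained(name)
    model = BertModel.from_pretrained(name)
    total_params = sum(p.numel() for p in model.parameters())
    print(
        f"{name}: L={config.num_hidden_layers}, "
        f"H={config.hidden_size}, A={config.num_attention_heads}, "
        f"TP~{total_params / 1e6:.0f}M"
    )
```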
Pre-training
Task 1: Masked LM
- The training data generator randomly picks 15% of the token positions in a sentence. If a position is chosen, the token is replaced by [MASK] 80% of the time, by a random token 10% of the time, and left unchanged the remaining 10%. The final hidden vector T(i) of each chosen position is then used to predict the original token with a cross-entropy loss.
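The exact data generator is part of the original BERT codebase; the following is only a small illustrative sketch of the 80/10/10 masking rule on a whitespace-tokenised sentence (the tiny vocabulary and helper name are hypothetical):

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]  # hypothetical tiny vocabulary

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """BERT-style masking: choose ~15% of positions; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged.
    Returns the corrupted tokens and the prediction targets."""
    corrupted = list(tokens)
    targets = [None] * len(tokens)           # None = position not selected, no loss
    for i, token in enumerate(tokens):
        if random.random() < mask_prob:
            targets[i] = token               # the model must recover the original token
            roll = random.random()
            if roll < 0.8:
                corrupted[i] = mask_token
            elif roll < 0.9:
                corrupted[i] = random.choice(VOCAB)
            # else: leave the token unchanged
    return corrupted, targets

print(mask_tokens("the cat sat on the mat".split()))
```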
Task 2: Next Sentence Prediction:
- Question Answering (QA) and Natural Language Inference (NLI) are two tasks based on understanding the relationship between two sentences. To capture this, the model is pre-trained on a binarized next-sentence prediction task: given sentences A and B from the corpus, B is the actual sentence that follows A 50% of the time (IsNext) and a random sentence from the corpus the other 50% (NotNext).
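A minimal sketch of how such binarized training pairs could be assembled (the corpus structure and helper name are hypothetical; the real generator also packs pairs up to a maximum sequence length):

```python
import random

def make_nsp_example(corpus):
    """Build one next-sentence-prediction example from a corpus given as a
    list of documents, each document being a list of sentences."""
    doc = random.choice([d for d in corpus if len(d) > 1])
    i = random.randrange(len(doc) - 1)
    sentence_a = doc[i]
    if random.random() < 0.5:
        sentence_b, label = doc[i + 1], "IsNext"        # the actual next sentence
    else:
        # For simplicity this may occasionally pick the same document again.
        other_doc = random.choice(corpus)
        sentence_b, label = random.choice(other_doc), "NotNext"
    return sentence_a, sentence_b, label

corpus = [
    ["BERT is pre-trained on two tasks.", "One of them is next sentence prediction."],
    ["The weather was fine.", "We went for a walk."],
]
print(make_nsp_example(corpus))
```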
Pre-training data:
BERT uses the BooksCorpus (800M words) and the English Wikipedia (2,500M words), with tables, lists, and headers removed, as a document-level corpus is more reliable for extracting long contiguous sequences.
Fine-tuning
All pre-trained parameters are then fine-tuned end to end. During fine-tuning, the input sentences A and B are analogous to the cases below (a short fine-tuning sketch follows the list):
1: sentence pairs in paraphrasing
2: hypothesis-premise pairs in entailment
3: question-passage pairs in question answering
4: a degenerate text-∅ pair in text classification or sequence tagging
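As a sketch of case 2, assuming the Hugging Face transformers library (the sentence pair and the two-label setup below are made up for illustration), sentences A and B are packed into a single input and a classification head is fine-tuned on top of the pre-trained encoder:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Assumes Hugging Face `transformers`; the example pair below is made up.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Sentence A (premise) and sentence B (hypothesis) are packed into one sequence:
# [CLS] A [SEP] B [SEP]
inputs = tokenizer(
    "A man is playing a guitar on stage.",   # sentence A
    "Someone is performing music.",          # sentence B
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits  # classification head is random until fine-tuned
print(logits)
```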
It is impossible to cover the complete workings of BERT on a single page; the research paper published by Google, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", explains it in full. If you aspire to become an algorithm designer like me, I would recommend reading it.
A few more technical newsletters are coming up, so stay tuned!