Curiosity is All You Need: Learn How GPT Was Created in Just a Few Minutes


GPT, or Generative Pre-trained Transformer, generates human-like text by predicting the next word in a sequence based on the provided context. Its development proceeds through four stages:

  1. Pretraining: Collect vast amounts of internet text and transform it into tokens for GPT to predict the next token in a sentence.
  2. Supervised Fine-Tuning: Train GPT with question-answer pairs to improve accuracy and helpfulness in generating complete responses.
  3. Reward Modeling: Use human feedback to assign higher rewards to preferred answers, guiding GPT to choose the best responses.
  4. Reinforcement Learning: Refine GPT's responses through continuous feedback, maximizing high-reward interactions over time.

Let's briefly explain what happens at each of these stages:

1. Pretraining: Learning from the Internet

  • Data Collection: The first step is collecting vast amounts of text from the internet, including books, articles, and other resources. Some popular sources include CommonCrawl, Wikipedia, and other public datasets.

Pretraining - Step 1: Collect a huge amount of data from the internet.


  • Tokenization: All this text (sentences or chunks of words) gets turned into numbers, or tokens. Why? It's easier for computers. Transforming complex content into manageable numerical tokens enables easier processing and analysis, and improves model performance.

Pretraining - Step 2: Tokenization - Transforming text into tokens
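
To make this concrete, here is a minimal sketch of tokenization using OpenAI's open-source tiktoken library. The library choice and the example sentence are mine, not from the article; the point is simply that text becomes a list of integer IDs and can be turned back into text.

```python
# pip install tiktoken
import tiktoken

# Load a byte-pair-encoding vocabulary (here, the GPT-2 one).
enc = tiktoken.get_encoding("gpt2")

text = "To be or not to be"
tokens = enc.encode(text)     # text -> a short list of integer token IDs
print(tokens)                 # exact values depend on the vocabulary
print(enc.decode(tokens))     # back to the original text
```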


  • Learning Patterns: GPT learns by predicting the next token in a sentence. For example, if it reads "To be or not to...", it learns that "be" might come next. Through billions of these predictions, GPT gets better at understanding language. Think of how children learn to speak: when children hear the same statements over and over, they start to understand which words follow each other. However, GPT (at the time of writing this article) only guesses the next token and, unlike a human child, doesn't truly understand or reason. Guessing, really? Yes, that is how it works: GPT predicts the next word to complete a sentence based on the provided context.
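
As a toy illustration of "learning which words follow each other", the sketch below counts word pairs in a tiny made-up corpus and predicts the most frequent follower. Real GPT does this with a neural network over tokens, so treat this purely as an analogy; the corpus and code are illustrative only.

```python
from collections import Counter, defaultdict

# Tiny stand-in "pretraining corpus" -- in reality, billions of tokens.
corpus = "to be or not to be that is the question to be is to exist".split()

# Count, for each word, how often every other word follows it.
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequently observed next word -- a crude stand-in
    for GPT predicting the most probable next token."""
    return followers[word].most_common(1)[0][0]

print(predict_next("to"))   # -> "be", the word seen most often after "to"
```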


Here are two important terms to understand at this stage:

  • Transformers: The engines behind GPT. They analyze the tokens and determine the context, using this information to predict what token comes next.
  • Positional encoding: A technique that gives GPT a sense of word order, which is crucial since the order in which words appear can completely change their meaning.

Pretraining - GPT completes text by predicting the next token in a sequence
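
For readers who want to see how a model can be given "a sense of word order", below is a small sketch of the sinusoidal positional encoding from the original Transformer paper ("Attention Is All You Need"). GPT-style models typically learn their position embeddings instead, so this is only an illustration of the idea.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Build a (seq_len, d_model) matrix where each row encodes one position
    with sine/cosine waves of different frequencies."""
    positions = np.arange(seq_len)[:, None]        # (seq_len, 1)
    dims = np.arange(d_model)[None, :]             # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])          # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])          # odd dimensions use cosine
    return pe

# Adding a distinct row to each token embedding means "not to be" and
# "to be not" produce different inputs even with identical word embeddings.
print(sinusoidal_positional_encoding(seq_len=4, d_model=8).shape)   # (4, 8)
```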


2. Supervised Fine-tuning: Becoming an Assistant

With pretraining alone, we only get a text completion program. Pretraining teaches GPT to predict the next word in a sentence based on vast amounts of text from the internet. This results in a model that's good at completing sentences but not necessarily at answering questions or providing helpful responses. To build a helpful assistant, GPT needs a stage called "Supervised Fine-Tuning." Here's how it works:

  • GPT is given high-quality examples of questions and answers. For instance: Question: "What is the capital of France?" Answer: "The capital of France is Paris."

Supervised Fine-tuning - The model ingests a large number of human-provided question-answer pairs

  • The model processes these pairs of questions and answers and learns to generate the correct response. How? Through algorithms like backpropagation (backward propagation of errors), the model adjusts its parameters to reduce the error in its responses. For example, if the model initially responds with just "Paris," it learns that a more complete response like "The capital of France is Paris" is preferred.

Supervised Fine-tuning - The model adjusts its parameters based on the provided questions and answers.
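
To ground the idea of adjusting parameters through backpropagation, here is a heavily simplified PyTorch training step. The tiny model, the made-up token IDs, and the hyperparameters are placeholders; real fine-tuning runs the same loop over a full GPT with tokenized question-answer pairs.

```python
import torch
import torch.nn as nn

# A tiny stand-in "language model": maps a token ID to scores over the vocabulary.
vocab_size = 100
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Pretend these are tokenized (question, answer) pairs:
# each input token should be followed by the matching target token.
inputs  = torch.tensor([3, 17, 42])    # e.g. tokens from "What is the capital of France?"
targets = torch.tensor([9, 28, 55])    # e.g. tokens from "The capital of France is Paris."

for step in range(100):
    logits = model(inputs)              # forward pass: predict next-token scores
    loss = loss_fn(logits, targets)     # measure how wrong the predictions are
    optimizer.zero_grad()
    loss.backward()                     # backpropagation: compute gradients of the error
    optimizer.step()                    # adjust parameters to reduce that error
```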

By adjusting its parameters based on these questions and answers, the model improves its accuracy and helpfulness. It learns to recognize the question format, response patterns, and to produce complete, grammatically correct sentences.

The major mechanism behind this process is called Self-Attention:

  • Self-attention: A mechanism that allows GPT to weigh the importance of each word relative to others in a sentence. This process is similar to how you might highlight or pay more attention to specific key words while reading to grasp the overall meaning.
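
A minimal sketch of scaled dot-product self-attention, the computation described above, might look like the following. The shapes and projection matrices follow the standard textbook formulation rather than any specific GPT codebase.

```python
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """x: (seq_len, d_model). Every token builds a query, key, and value,
    then weighs every other token by how relevant it is."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project tokens
    scores = q @ k.T / (k.shape[-1] ** 0.5)          # relevance of each token to each other token
    weights = F.softmax(scores, dim=-1)              # attention weights sum to 1 per token
    return weights @ v                               # blend values according to relevance

d_model = 8
x = torch.randn(5, d_model)                          # embeddings for 5 tokens
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)        # torch.Size([5, 8])
```

In GPT, the scores are additionally masked so a token can only attend to earlier positions (causal attention); that mask is omitted here for brevity.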

Through this process, GPT transforms from a general text completer into an agent capable of responding to specific human queries.

3. Reward Modeling: Choosing the Best Answers

Reward Modeling helps GPT choose the best answers by reinforcing preferred responses through human feedback. It's like a multiple-choice test. Here's how:

  1. GPT generates multiple answers to a question.
  2. Humans evaluate these answers and select the best ones.
  3. The model assigns higher rewards to the selected best answers.
  4. Over time, GPT adjusts its parameters to favor these high-reward answers, improving its ability to provide accurate and helpful responses.

Reward Modeling - Reinforcing preferred responses through human feedback
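
One common way to turn human comparisons into a trainable signal is a pairwise ranking loss: the reward model should score the human-preferred answer higher than the rejected one. The sketch below shows that loss in PyTorch; the linear scorer and random "answer embeddings" are placeholders for a real reward model built on top of GPT.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder reward model: maps a (pretend) answer embedding to a single score.
reward_model = nn.Linear(16, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Pretend embeddings of two answers to the same question,
# where human labelers preferred the first one.
preferred_answer = torch.randn(1, 16)
rejected_answer  = torch.randn(1, 16)

for step in range(100):
    r_preferred = reward_model(preferred_answer)
    r_rejected  = reward_model(rejected_answer)
    # Ranking loss: push the preferred answer's score above the rejected one's.
    loss = -F.logsigmoid(r_preferred - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```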

4. Reinforcement Learning: Getting Even Better

In the previous stage, we learned how Reward Modeling identifies and assigns rewards to preferred answers. In this stage, Reinforcement Learning uses those rewards in a continuous feedback loop to refine the model's ability to generate high-quality responses over time. Here's how it works:

  • Interaction with Environment: GPT continues to receive feedback through dynamic interactions, learning from responses that lead to new questions or scenarios.
  • Trial and Error: The model tries different responses and learns from the outcomes. High-reward responses are reinforced, leading to parameter adjustments that favor similar responses in the future. Self-attention helps GPT learn from trial and error by focusing on which parts of previous interactions were successful in generating rewarding outcomes.
  • Maximizing Cumulative Rewards: The goal is to maximize total rewards over many interactions. The model continually improves by learning which types of responses yield the best outcomes.


Reinforcement Learning from Human Feedback - High-reward responses are reinforced in a continuous feedback loop
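
As a very rough sketch of the reinforcement-learning loop: the policy samples a response, the reward model scores it, and a policy-gradient update makes high-reward responses more likely. Production systems use algorithms such as PPO with many more safeguards; the toy policy, prompt, and reward function below are entirely illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy "policy": given a fixed prompt representation, pick one of 4 candidate responses.
policy = nn.Linear(8, 4)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
prompt = torch.randn(1, 8)

def reward_model(response_id: int) -> float:
    """Stand-in for the trained reward model: pretend response 2 is the preferred one."""
    return 1.0 if response_id == 2 else 0.0

for step in range(200):
    probs = F.softmax(policy(prompt), dim=-1)
    response = torch.multinomial(probs, 1).item()   # trial: sample a response
    reward = reward_model(response)                 # feedback: how good was it?
    # REINFORCE-style update: make high-reward responses more likely next time.
    loss = -torch.log(probs[0, response]) * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```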


I hope this step-by-step guide has helped you understand GPT better. Feel free to ask questions or suggest additions in the comments below.

Thank you for reading!

If you found this article helpful, please consider sharing it with your connections. Engage with us in the comments below to show your support and motivate us to invest more time in simplifying complex topics and sharing knowledge.


