Large Language Models - How are the OpenAI GPT models trained?

Large Language Models (LLMs) are based on the principles of neural networks: networks of artificial neurons connected in layers. Each neuron receives inputs from other neurons and produces an output determined by its weights, which are adjusted as the model is trained. LLMs take in large amounts of text data and use it to learn the relationships between words and phrases.
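
To make this concrete, here is a minimal sketch of the next-token prediction task that underlies this learning. It uses the small open GPT-2 model from the Hugging Face transformers library as a stand-in for a large language model; the library, model choice, and prompt are illustrative assumptions, not OpenAI's actual training setup:

```python
# Minimal sketch of next-token prediction, the core task LLMs learn.
# Assumes the `transformers` and `torch` packages are installed.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "The capital of France is"
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# Probabilities the model assigns to the *next* token after the prompt,
# reflecting the word relationships it learned during training.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, tok_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(tok_id)):>10}  {prob.item():.3f}")
```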

There are two types of language models, illustrated in the sketch after this list:

  • Generative language models generate text and other related content based on learned language patterns, given language input.
  • Discriminative language models analyze and sort a given text into pre-defined categories.
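
A minimal sketch contrasting the two types, using Hugging Face pipelines as illustrative stand-ins (the library and the default checkpoints are assumptions for demonstration only):

```python
# Assumes the `transformers` package is installed.
from transformers import pipeline

# Generative: continues text from a prompt based on learned patterns.
generator = pipeline("text-generation", model="gpt2")
print(generator("Large language models are", max_new_tokens=20)[0]["generated_text"])

# Discriminative: sorts text into pre-defined categories (here, sentiment,
# using the pipeline's default classification checkpoint).
classifier = pipeline("text-classification")
print(classifier("This model works remarkably well."))
```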

GPT stands for Generative Pre-trained Transformer. GPT-3, GPT-4, and ChatGPT are all large language models that use deep learning to perform specific tasks; they are aligned with user intent on a wide range of tasks by fine-tuning with human feedback.

How are the OpenAI language models trained to follow instructions?


The process starts with a set of labeler-written prompts and prompts submitted through the OpenAI API. OpenAI collects a dataset of labeler demonstrations of the desired model behavior, which it uses to fine-tune GPT-3 with supervised learning. Next, a dataset of rankings of model outputs is collected and used to further fine-tune this supervised model using reinforcement learning from human feedback; the resulting models are called InstructGPT. In human evaluations, outputs from the 1.3B-parameter InstructGPT model are preferred to outputs from the 175B-parameter GPT-3: they show improvements in truthfulness and reductions in toxic output generation, with minimal performance regressions on public NLP datasets. These results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.

To start with, there is a pretrained language model, a distribution of prompts on which the model is to produce aligned outputs, and a team of trained human labelers. The following three steps are then applied:

Step 1: Supervised fine-tuning (SFT): Collect demonstration data and train a supervised policy.

The labelers provide demonstrations of the desired behavior on the input prompt distribution, and a pretrained GPT-3 model is fine-tuned on this data using supervised learning.
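
A minimal sketch of what this step looks like in code, with GPT-2 standing in for the pretrained GPT-3 model and two hypothetical prompt/demonstration pairs. (In practice the prompt tokens are often masked out of the loss; this simplification trains on the full sequence.)

```python
# Step 1 sketch: supervised fine-tuning on labeler demonstrations.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Hypothetical (prompt, labeler-written demonstration) pairs.
demonstrations = [
    ("Explain gravity to a child.", "Gravity is the force that pulls things down."),
    ("Translate 'bonjour' to English.", "Hello."),
]

model.train()
for prompt, answer in demonstrations:
    # The demonstration is appended to the prompt and the model is
    # trained with the standard next-token (language modeling) loss.
    ids = tokenizer(prompt + " " + answer, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"SFT loss: {loss.item():.3f}")
```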

Step 2: Reward model (RM) training: Collect comparison data and train a reward model.

A dataset of comparisons between model outputs is collected, where labelers indicate which output they prefer for a given input; a reward model is then trained to predict the human-preferred output.
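
A minimal sketch of the pairwise ranking loss used to train such a reward model. The tiny linear model over random feature vectors is a toy stand-in for a transformer with a scalar reward head:

```python
# Step 2 sketch: train a reward model so that r(chosen) > r(rejected).
import torch

torch.manual_seed(0)
feature_dim = 16
reward_model = torch.nn.Linear(feature_dim, 1)  # scalar reward head
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

# Hypothetical feature vectors for a human-preferred and a rejected output.
chosen = torch.randn(1, feature_dim)
rejected = torch.randn(1, feature_dim)

for step in range(100):
    r_chosen = reward_model(chosen)
    r_rejected = reward_model(rejected)
    # Pairwise ranking loss: -log sigmoid(r_chosen - r_rejected).
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

print(f"r(chosen)={r_chosen.item():.2f}  r(rejected)={r_rejected.item():.2f}")
```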

Step 3: Reinforcement learning via proximal policy optimization (PPO): Optimize a policy against the reward model using PPO.

The output of the RM is used as a scalar reward, and the supervised policy is fine-tuned to optimize this reward using the PPO algorithm.
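
A minimal sketch of the shaped reward used in this step: the RM score minus a KL penalty that keeps the policy close to the SFT model. The numbers are hypothetical, and the full PPO loop (available in libraries such as trl) is omitted:

```python
# Step 3 sketch: the per-sample reward signal optimized by PPO.
def rlhf_reward(rm_score, logprob_policy, logprob_sft, kl_coef=0.2):
    """Scalar reward: RM score minus a KL penalty toward the SFT policy."""
    kl_estimate = logprob_policy - logprob_sft  # per-sample KL estimate
    return rm_score - kl_coef * kl_estimate

# Hypothetical numbers: the RM likes the output (score 1.5), but the policy
# assigns it a higher log-probability than the SFT model does, so the KL
# penalty pulls the reward down slightly.
print(rlhf_reward(rm_score=1.5, logprob_policy=-2.0, logprob_sft=-2.6))  # 1.38
```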

Steps 2 and 3 can be iterated continuously: more comparison data is collected on the current best policy and used to train a new RM and then a new policy.

Different OpenAI Models:

GPT-4

GPT-4 is a large-scale, multimodal, transformer-style model (accepting text inputs and emitting text outputs today, with image inputs coming in the future) pre-trained to predict the next token in a document, using both publicly available data (such as internet data) and data licensed from third-party providers, then fine-tuned using Reinforcement Learning from Human Feedback (RLHF). It improves on GPT-3.5 and can understand as well as generate natural language or code. GPT-4 substantially improves on previous OpenAI models in its ability to follow user intent and to understand and generate natural-language text, particularly in more complex and nuanced scenarios. GPT-4 is optimized for chat but works well for traditional completions tasks.

The figure below shows the performance of GPT-4 in a variety of languages compared to prior models in English on the Massive Multitask Language Understanding (MMLU) benchmark:

The latest model is gpt-4, which can handle a maximum of 8,192 tokens.


GPT-3.5

GPT-3.5 models can understand and generate natural language or code. The most capable and cost-effective model in the GPT-3.5 family is gpt-3.5-turbo, which has been optimized for chat but works well for traditional completions tasks as well; it can handle a maximum of 4,096 tokens.
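
As a quick illustration of staying within that limit, the sketch below counts tokens with OpenAI's tiktoken tokenizer (assuming tiktoken is installed; the prompt is hypothetical):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt = "Summarize the plot of Hamlet in two sentences."
n_tokens = len(enc.encode(prompt))
print(f"{n_tokens} tokens used out of a 4,096-token limit")
```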

GPT-3

GPT-3 is one of the largest publicly disclosed language models: it has 175 billion parameters and was trained on 570 gigabytes of text. For comparison, its predecessor GPT-2 (which is functionally similar to GPT-3) has 1.5 billion parameters and was trained on 40 gigabytes of text. While GPT-2 displayed some zero-shot generalization to downstream tasks, GPT-3 further displayed the ability to learn novel tasks when given examples in context. It has an unusually large set of capabilities, including text summarization, chatbot behavior, search, code generation, and essay generation.
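
The sketch below illustrates this in-context (few-shot) learning: task examples are placed directly in the prompt, with no weight updates. It assumes the pre-1.0 openai Python package, an API key in the OPENAI_API_KEY environment variable, and davinci (the original GPT-3 base model) as the completions model:

```python
import openai

# Task examples are provided in the prompt itself; the model infers the
# pattern and continues it, with no fine-tuning involved.
few_shot_prompt = """Convert each sentence to past tense.
Sentence: I walk to work. -> I walked to work.
Sentence: She sings loudly. -> She sang loudly.
Sentence: They build houses. ->"""

response = openai.Completion.create(
    model="davinci",
    prompt=few_shot_prompt,
    max_tokens=20,
    temperature=0,
)
print(response["choices"][0]["text"].strip())
```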

ChatGPT

ChatGPT is a sibling model to InstructGPT; it is a conversational AI model that can chat with users, answer follow-up questions, and challenge incorrect assumptions.
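
A minimal sketch of how follow-up questions work in practice: the full message history is resent on each turn, so the model can resolve references to earlier answers. It assumes the pre-1.0 openai Python package, with gpt-3.5-turbo standing in for the underlying model:

```python
import openai

messages = [
    {"role": "user", "content": "Who wrote Pride and Prejudice?"},
]
first = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
messages.append(first["choices"][0]["message"])  # keep the assistant's turn

# A follow-up that only makes sense given the earlier turns.
messages.append({"role": "user", "content": "What else did she write?"})
second = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(second["choices"][0]["message"]["content"])
```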

DALL·E

DALL·E is an AI system that can create realistic images and art from a natural-language description. It can create a new image of a given size, edit an existing image, or create variations of a user-provided image.
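
A minimal sketch of those three operations with the pre-1.0 openai Python package (the file names and prompts are hypothetical placeholders):

```python
import openai

# 1. Generate a new image of a chosen size from a text description.
gen = openai.Image.create(prompt="an astronaut reading a book", n=1, size="512x512")
print(gen["data"][0]["url"])

# 2. Edit an existing image (the mask marks the region to replace).
edit = openai.Image.create_edit(
    image=open("photo.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="add a red balloon in the sky",
)
print(edit["data"][0]["url"])

# 3. Create a variation of a user-provided image.
var = openai.Image.create_variation(image=open("photo.png", "rb"), n=1)
print(var["data"][0]["url"])
```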

Whisper

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
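
A minimal sketch using the open-source whisper package, where audio.mp3 is a hypothetical input file:

```python
import whisper

model = whisper.load_model("base")

# Multilingual speech recognition (the source language is auto-detected).
result = model.transcribe("audio.mp3")
print(result["language"], result["text"])

# Speech translation: transcribe non-English audio directly into English.
translated = model.transcribe("audio.mp3", task="translate")
print(translated["text"])
```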


Embeddings

Embeddings are numerical representations of text that can be used to measure the relatedness between two pieces of text; they are useful for search, clustering, recommendations, anomaly detection, and classification tasks. text-embedding-ada-002 is designed to replace the previous 16 first-generation embedding models at a fraction of the cost.
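
A minimal sketch of measuring relatedness with embeddings, assuming the pre-1.0 openai Python package and cosine similarity as the distance measure (the two example texts are hypothetical):

```python
import math
import openai

def embed(text):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return resp["data"][0]["embedding"]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Related texts score closer to 1.0 than unrelated ones.
print(cosine_similarity(embed("How do I reset my password?"),
                        embed("I forgot my login credentials.")))
```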

Moderation

The Moderation models provide classification capabilities that flag content in the following categories: hate, hate/threatening, self-harm, sexual, sexual/minors, violence, and violence/graphic.
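
A minimal sketch of calling the moderation endpoint with the pre-1.0 openai Python package (the input string is a placeholder):

```python
import openai

resp = openai.Moderation.create(input="some user-generated text to screen")
result = resp["results"][0]
print("flagged:", result["flagged"])
for category, value in result["categories"].items():
    print(f"{category}: {value}")
```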

Bringing it all together: GPT models are LLMs that are pre-trained on massive amounts of data and can perform a variety of natural language processing tasks (including Natural Language Understanding and Natural Language Generation). These models are fine-tuned using Reinforcement Learning from Human Feedback (RLHF) to understand human intent, and they can be further fine-tuned for specific use cases.

This could change the way businesses run today and reinvent existing enterprise systems. For example, a GPT model could generate a personalized report for a customer based on their preferences and past interactions, allowing businesses to provide a more customized experience, which could lead to increased customer satisfaction and loyalty. Additionally, GPT models could streamline and automate many business processes, such as customer service or order fulfillment, freeing employees to focus on more creative and strategic tasks and increasing efficiency and productivity. In short, GPT models have the potential to revolutionize the way businesses operate, making them more customer-centric and efficient.

References:

Training language models to follow instructions with human feedback (arXiv:2203.02155)

GPT-4 Technical Report (arXiv:2303.08774)

Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models (arXiv:2102.02503)

Models - OpenAI API (platform.openai.com)

Introducing ChatGPT (openai.com)
