Is ChatGPT only a Big Model? (1)

StartPoint

ChatGPT has gained a lot of attention. But most NLP practitioners think that reinforcement learning (RL) is only for big language models like GPT-3.

RL for text generation has long been a method in text summarization. Before 2019, most people thought that reinforcement learning for text summarization was useless in industry and good only for writing papers.

Now ChatGPT has proved that RL has a higher upper bound. Lacking a big model or enough GPUs, most engineers do not want to give it a try, and many do not even read the paper [1].

If we look carefully at the paper, there is a sentence:

"On our test set, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3."

It means reinforcement learning can work on a model small enough to be trained on a single 48 GB GPU.
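
As a quick sanity check (my own back-of-the-envelope estimate, not a figure from the paper), the training memory for a 1.3B model fits comfortably in 48 GB:

params = 1.3e9
# Assumption: mixed-precision Adam, i.e. fp16 weights and gradients
# plus fp32 master weights and two fp32 optimizer moments.
bytes_per_param = 2 + 2 + 4 + 4 + 4
print(params * bytes_per_param / 1e9)  # ~20.8 GB for weights + optimizer state

Activations take the remaining memory, which gradient checkpointing can keep small.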

And if we read the paper more deeply, we can find that the cost of data and compute is also small, small enough to train on as few as 4 GPUs.

Let's have a look.

"The cost of collecting our data and the compute for training runs, including experimental runs is a fraction of what was spent to train GPT-3: training our 175B SFT model requires 4.9 petaflops/s-days and training our 175B PPO-ptx model requires 60 petaflops/s-days, compared to 3,640 petaflops/s-days for GPT-3 (Brown et al., 2020). "

Referring to the Wikipedia GPU list [2] and the OpenAI blog [3], we can convert petaflop/s-days into A6000 GPU cost. To train a 1.3B SFT model, we would need only 4 GPUs for about 20 days, roughly a thousand times less compute than was spent pretraining GPT-3.

One A6000 running for one day delivers about 0.0379 petaflop/s-days. If we assume that training compute scales roughly with model parameters, a 1.3B model needs on the order of 3 petaflop/s-days, which 4 GPUs can accumulate in 20 days (4 × 20 × 0.0379 ≈ 3).
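
A minimal sketch of that conversion (the 0.0379 petaflop/s figure is an assumption taken from the A6000's ~38 TFLOPS fp32 peak; sustained throughput would be lower):

A6000_PFLOPS = 0.0379                # assumed peak petaflop/s per A6000
gpus, days = 4, 20
budget = gpus * days * A6000_PFLOPS  # petaflop/s-days accumulated
print(budget)                        # ~3.03, enough for the 1.3B run
print(3640 / budget)                 # ~1200x less than GPT-3's 3,640 petaflop/s-days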

That means that even if we do not have a big model, we can use RL to make up the difference.


[1] Training language models to follow instructions with human feedback (Ouyang et al., 2022)

[2] https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units

[3] https://openai.com/research/ai-and-compute
