OpenAI's NEW Fine-Tuning Method Changes EVERYTHING

Good morning!

Have you ever wanted to take a language model and make it answer the way you want without needing a mountain of data?

Well, OpenAI’s got something for us: Reinforcement Fine-Tuning, or RFT, and it changes how we customize AI models. Instead of retraining the model by feeding it examples of what we want and hoping it learns in the classical way, we teach it by rewarding correct answers and penalizing wrong ones, just like training a dog, but, you know, with fewer treats and more math.

Let’s break down how reinforcement fine-tuning compares to supervised fine-tuning!

Each has its use, summed up in one line:

  1. Supervised fine-tuning teaches the model things it does not know yet, like a new language, which is powerful for small and less “intelligent” models.
  2. Reinforcement fine-tuning orients an existing model toward what we really want it to say. It basically “aligns” the model to our needs, but it requires an already powerful model, which is why reasoning models are a perfect fit (see the sketch after this list).
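
To make the contrast concrete, here is a minimal, purely illustrative Python sketch of the grading idea behind RFT: instead of handing the model a gold completion to imitate, a grader scores each sampled answer, and that score becomes the reward. The grade function and the toy question/answers below are hypothetical, not OpenAI's actual API.

    # Toy illustration of the idea behind reinforcement fine-tuning (RFT):
    # a grader scores each sampled answer, and that score is used as a reward,
    # rather than giving the model a gold completion to imitate.
    # All names and data here are made up for illustration.

    def grade(answer: str, reference: str) -> float:
        """Return 1.0 for a correct answer, 0.0 otherwise.
        Real graders can also give partial credit."""
        return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0

    # Hypothetical question with two answers sampled from the model being tuned.
    reference = "4"
    sampled_answers = ["4", "5"]

    rewards = [grade(a, reference) for a in sampled_answers]
    print(rewards)  # [1.0, 0.0] -> the training step reinforces the first answer

The grader is where your expertise lives: because it can score any answer the model samples, RFT can get by with far less labeled data than supervised fine-tuning.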

I’ve already covered fine-tuning on the channel if you’re interested. Today, let’s get into how RFT actually works! Read the article here or watch the video:


And that's it for this iteration! I'm incredibly grateful that the What's AI newsletter is now read by over 20,000 incredible human beings. Click here to share this iteration with a friend if you learned something new!


Looking for more cool AI stuff?

Want to share a product, event or course with my AI community? Reply directly to this email, or visit my Passionfroot profile to see my offers.


Thank you for reading, and I wish you a fantastic week! Be sure to get enough sleep and physical activity next week!


Louis-François Bouchard

