The Transformer chapter is online!

Hey! I’ve been thinking about a special Christmas gift for my subscribers. How about the sixth chapter of my upcoming The Hundred-Page Language Models Book, which I’ve just put online (in addition to the other five chapters)?

In this chapter, you’ll read about the Transformer architecture, exploring:

  • The decoder block
  • Self-attention
  • Multi-head attention
  • Rotary position embeddings (RoPE)
  • Residual connections
  • Root mean square normalization (RMSNorm)

You’ll find plenty of math, illustrations, and Python code. By the end, you’ll have trained your own Transformer-based language model from scratch.
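To give a flavor of the kind of code the chapter builds toward, here is a minimal single-head causal self-attention sketch in NumPy. It is not the book's code; the function name, shapes, and weight matrices are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Causal single-head self-attention for X of shape (seq_len, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # (seq_len, seq_len)
    # Causal mask: each position attends only to itself and earlier positions.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ V                 # (seq_len, d_k)

rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 5
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

Note that, because of the causal mask, the first token can attend only to itself, so the first output row is exactly its value projection `X[0] @ Wv`.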

What better way to spend the holidays than by learning something new from a fun-to-read book?

Enjoy and Happy Holidays!


Christophe Duvillard

Quantitative Portfolio Manager | Systematic & Discretionary Trader | Alpha-Generating Strategies | Machine Learning Enthusiast

2 months ago

Great, thanks for sharing, Andriy. Happy Holidays!

OK Boštjan Dolinšek
Subhadeep Sengupta

Australian Design Award Winner || International Stevie Award Winner || Global GOV Driven X Design Award Winner || Sydney Design Award Winner || Digital Transformation and Cybersecurity

2 months ago

Hello Andriy Burkov, some parts and diagrams of this Chapter 7 are redacted. Will you be posting an updated version, please?

Yen Tam

Top #1 in Cybersecurity | Top #100 LinkedIn Vietnam | Cybersecurity Made Easy | Platform Security Engineer at HCLTech x ANZ | ISO 27001 LI/LA | SOC2 | PCI-DSS

2 months ago

I'm looking forward to your final work!
