How do LLMs work? The Transformer Architecture Explained

Louis-François Bouchard

Making AI accessible. What's AI on YouTube. Co-founder at Towards AI. ex-PhD Student.

Published: August 10, 2023

Good morning, fellow AI enthusiast! This week's iteration focuses on none other than large language models, like GPT, Llama, Claude, and others. These models all share common building blocks for understanding and generating words. Let's dive into how they do that!

Oh, and if you are using GPT models, you will love the sponsors of this iteration!

Receive the weekly digest right in your inbox!

[Sponsor] Lacking visibility into your GPT-based applications?

GPT-based applications require a different monitoring approach. Developers must:

Optimize token usage
Ensure API health
Continuously improve prompt quality
Avoid user-facing catastrophes

Run efficient and high-quality GPT-based products. Get started with Mona for free by adding two lines of code!

How do LLMs work? The Transformer Architecture Explained with Jay Alammar

A few weeks ago, I was lucky enough to talk with probably the best educator in the AI space on my podcast: Jay Alammar from Cohere. During the interview, I asked if he could explain the architecture behind all recent large language models as simply as possible.

More precisely, we dove into the generative part of the transformer architecture and its different building blocks (e.g., tokenizers, attention, feed-forward networks). If these don't ring a bell, this video is for you!

After this video, you’ll have a good overview of how LLMs answer questions but, even more importantly, how this type of AI model can predict the next word of a sentence…
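To make "predicting the next word" a bit more concrete, here is a minimal sketch of just the final step. It is not taken from the video: the tiny vocabulary, the scores, and the function names are all invented for illustration. In a real model, those scores would come out of the tokenizer, attention, and feed-forward layers mentioned above; here they are simply hard-coded.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next_token(vocab, logits):
    """Pick the most probable token (greedy decoding)."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs


# Hypothetical vocabulary and made-up scores for the prompt "The cat sat on the".
vocab = ["mat", "dog", "moon", "sofa"]
logits = [3.2, 0.1, -1.0, 2.5]

token, probs = predict_next_token(vocab, logits)
print(token)  # the token with the highest score: "mat"
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Real systems often sample from these probabilities instead of always taking the top one, which is one reason the same prompt can give different answers.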

We are incredibly grateful that the newsletter is now read by over 12,000 incredible human beings, counting our email list and LinkedIn subscribers. Reach out at [email protected] with any questions or details on sponsorships, or visit my Passionfroot profile. Follow our newsletter at Towards AI, which shares the most exciting news, learning resources, articles, and memes from our Discord community every week.

If you need more content to go through your week, check out the podcast!

Thank you for reading, and we wish you a fantastic week! Be sure to get enough rest and sleep!

Louis

The What's AI Newsletter


Worth A Yes

1 year ago

Wow, this week's newsletter on large language models like GPT, Llama, and Claude is incredibly intriguing! It's fascinating to explore the shared building blocks they use to understand and generate words. Thanks for bringing this topic to light, it's such a valuable resource for AI enthusiasts like us. Join us in staying up to date with all the latest AI applications in various industries by subscribing to our bi-weekly newsletter here: https://goodaivibes.substack.com/. Let's keep the AI vibes going strong! #llms #llm #gpt #chatgpt

CHESTER SWANSON SR.

Realtor Associate @ Next Trend Realty LLC | HAR REALTOR, IRS Tax Preparer

1 year ago

Thanks for Posting.
