Beyond LLM
Ritesh Vajariya
Global AI Strategy Leader | Head of GenAI @ Cerebras | Founder, AI Guru | Advisor to CEOs | Ex-AWS GenAI Leader | Board Member
Remember when we thought AI was just about chatbots and funny image generators? I'm telling you, we're on the edge of an AI explosion. It's gonna be wild.
Look, I don't have a crystal ball, but after the crazy ride we've had with AI in the last year and a half, I've got a hunch the next 18 months are going to blow our minds. We're talking AI that doesn't just chat, but sees, hears, and maybe even thinks ahead of us. Wild, right?
A quick recap of the last 18 months:
The last 18 months proved that AI is not going anywhere. From now on, we need to compound that growth by enabling more and more AI adoption - where we, as humans, become smarter every day.
There are many areas that show potential for growth, but in this article I want to highlight a few of them:
Beyond text and image:
While text has dominated the last 18 months - thanks to ChatGPT, which is known to 63% of the world's population (well, my mom doesn't know it yet!) - and we are now tired of seeing AI-generated images - thanks to Runway, Stable Diffusion, and DALL-E - we have also seen how GPT-4 and Claude, via their vision APIs, can look at an image, do visual Q&A, and give us a textual answer. Similar capabilities have been developed in the open-source ecosystem via models like LLaVA and Phi. This shows that not only can proprietary models (GPT-4, Claude) handle multimodality, but the open-source ecosystem is catching up fast.
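To make that concrete, here is a minimal sketch of visual Q&A using the OpenAI Python SDK. The model name, question, and image URL are placeholder assumptions; Claude's vision API follows a very similar pattern.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Visual Q&A: send an image plus a question in a single message.
response = client.chat.completions.create(
    model="gpt-4o",  # assumed model; any vision-capable model works
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What landmark is in this photo?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder URL
        ],
    }],
)
print(response.choices[0].message.content)
```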
Then GPT-4o showed us a version of multimodality that tickled everyone's brain - combining not only text and images but also adding voice to the mix, at low latency.
While we had only seen OpenAI's demo of this, just last week a French AI lab, Kyutai, actually went ahead, built it, and showed it to the entire world: in just 6 months, with a team of 8, the Kyutai research lab developed from scratch an AI model with unprecedented vocal capabilities, called Moshi.
Imagine combining these vocal capabilities with text and images; we could apply this across industries: patient healthcare, making education accessible to those without access to the best schools, never getting lost in any city, and many more...
I am optimistic that I won't get lost the next time I visit Beijing - well, I'm unsure about that one, as not much Western tech is accessible there... but you get the point.
Agents everywhere:
Last year, hardly a week went by without a new LLM getting released. In the last few weeks, it's been all about agents, and more and more startups are being created to build them.
Those who are not living in the AI bubble may wonder what I am talking about - so let's do a very brief Agents 101.
We have to go back to the year 2011, when Daniel Kahneman published the book Thinking, Fast and Slow. The 63% of the world who have used ChatGPT, Claude, or another chat-based system know that these systems are super fast, giving us a response in a few seconds or less. In that sense, they are "thinking fast," similar to how we use our subconscious mind. But what if we ask these systems to take their time, think slow, and apply the conscious mind instead? Just by adding a time element, we can work wonders with these systems. They are able to "think" before they answer, and with that added thinking time they do things better - still the same LLM, but given time to think. It's like the human brain: when we apply our conscious mind to an activity, we do a much better job.
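In practice, the simplest way to buy an LLM "thinking time" is chain-of-thought prompting. Here is a minimal sketch using the OpenAI Python SDK - the model name and the (Kahneman-inspired) puzzle are just illustrative assumptions:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

puzzle = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
          "more than the ball. How much does the ball cost?")

# "Thinking fast": ask for the answer directly.
fast = client.chat.completions.create(
    model="gpt-4o",  # assumed model; any chat model works
    messages=[{"role": "user", "content": puzzle}],
)

# "Thinking slow": same model, same question, but told to reason first.
slow = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": puzzle + " Think step by step, then give the final answer."}],
)

print(fast.choices[0].message.content)
print(slow.choices[0].message.content)
```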
This is where "agents" come in: beyond simply thinking longer, they add a planning component, keep certain things in memory for the time being, apply reasoning, and only then generate a response - creating "better than the best"!
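Stripped down to its essentials, an agent is just a loop around an LLM. The sketch below is a hypothetical plan-act-observe loop, not any particular framework; call_llm and both tools are stand-ins you would replace with a real LLM API and real tools:

```python
# A minimal plan-act-observe agent loop (illustrative sketch, not a real framework).

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: plug in any chat-completion API here.
    raise NotImplementedError

TOOLS = {
    "search": lambda query: f"(pretend web results for: {query})",  # stand-in tool
    "calculator": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

def run_agent(task: str, max_steps: int = 5) -> str:
    memory: list[tuple[str, str]] = []  # short-term memory of (plan, observation) pairs
    for _ in range(max_steps):
        # Planning/reasoning: the LLM picks the next action from the task and its memory.
        plan = call_llm(
            f"Task: {task}\nHistory so far: {memory}\n"
            "Reply with either 'ACTION <tool> <input>' or 'FINISH <answer>'."
        )
        if plan.startswith("FINISH"):
            return plan[len("FINISH"):].strip()
        _, tool, arg = plan.split(" ", 2)      # act: parse and run the chosen tool
        observation = TOOLS[tool](arg)
        memory.append((plan, observation))     # remember the outcome for the next step
    return "Gave up after max_steps - a real agent would ask for help here."
```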
For those looking for evidence of how LLMs can act as agents and improve real work, there is an open-source project called AgentBench.
The benchmark evaluates LLM-as-Agent across a diverse spectrum of environments - 8 distinct ones in total - to provide a more comprehensive evaluation of LLMs' ability to operate as autonomous agents in various scenarios.
While LLMs are beginning to manifest their proficiency as agents, the gaps between models - and the distance to practical usability - are significant. The AgentBench results show that proprietary models from OpenAI and Anthropic outperform open-source models when it comes to LLM-as-Agent.
One might wonder: what are some use cases where agents will help improve our lives? Well, literally all the use cases you have seen Generative AI solve for us, such as customer service, content creation, virtual tutoring in education, and patient health outcomes.
If you are hiring and want to create a job description that looks (and reads) much better than standard ChatGPT output, take a look at my YouTube video describing this, or head straight to the GitHub code for the implementation.
How about some science?
While the adoption of AI in Sales, Finance, HR, and many other departments has flourished and is enhancing what humans are capable of, how about applying some of this to actual science - drug discovery, or superconductor research?
Yes, there have been leapfrog advances in some of these areas, such as DeepMind's AlphaFold 3, which can accurately predict the structure of proteins, DNA, RNA, ligands, and more - and how they interact. We hope it will transform our understanding of the biological world and of drug discovery. At the same time, teams at Johns Hopkins discovered a new superconductor.
What if we could make similar advances in other areas of physics and materials science? Molecular dynamics simulations are being run at massive scale, and we at Cerebras are playing a significant role there. But what if we could go beyond simulation and identify not just an object but its 3D shape, what it is made of (aluminum, copper, plastic, etc.), or what's inside it? What if we could detect whether a can contains Coke, or water, or something else?
The real-life applications of these kinds of solutions are immense, and while we are not there yet, we are certainly heading in that direction.
Conclusion:
Wow, folks! If you thought the last 18 months were a wild ride in the AI world, just wait till you see what's coming next! We're talking about AI that doesn't just chat or make pretty pictures - we're entering a whole new dimension of cool.
Imagine AI that can see, hear, and talk back to you like a real person. Or how about AI agents that can actually think and plan? It's like giving AI a brain upgrade! And don't even get me started on what this could mean for science - we might be on the verge of some seriously mind-blowing discoveries.
But hey, let's not forget - with great AI power comes great responsibility. We've got to be smart about how we develop and use this stuff. It's not about replacing humans; it's about making us superhuman! We need to stay on our toes, keep our minds open, and be ready to roll with whatever AI throws our way.
So, buckle up! The next 18 months are going to be one heck of a ride in the AI world.
Shameless plug:
Did you know that Claude 3.5 Sonnet is now the highest-performing LLM, even beating GPT-4o? Many people I know are thinking of canceling their ChatGPT subscriptions in favor of Claude.
But here's the thing - a lot of these folks don't know how to use Claude most effectively. That's why I created an on-demand course diving into the art and science of prompt engineering with Claude. It's perfect for everyday people who want to leverage Claude in their daily lives, covering techniques like chain-of-thought reasoning, few-shot learning, applying personas, and so much more.
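As a tiny taste of what the course covers, here is a sketch of two of those techniques - a persona plus few-shot examples - using the Anthropic Python SDK. The model ID and the recruiter scenario are assumptions for illustration:

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # assumed model ID at time of writing
    max_tokens=300,
    # Persona: the system prompt sets who Claude should "be".
    system="You are a seasoned technical recruiter who writes crisp, inclusive job posts.",
    messages=[
        # Few-shot: one worked example shows the style we want.
        {"role": "user", "content": "Rewrite: 'Need Java dev, 5 yrs exp.'"},
        {"role": "assistant",
         "content": "Senior Java Engineer - build resilient services with a team that ships."},
        # The actual request, answered in the demonstrated style.
        {"role": "user", "content": "Rewrite: 'Looking for data person, SQL a plus.'"},
    ],
)
print(message.content[0].text)
```

The same pattern extends to chain-of-thought: just ask Claude to reason step by step before giving its final answer.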
If you know someone who could benefit from learning all this (and trust me, there's plenty more), why not gift them this course? It could be a game-changer for how they interact with AI!