This is the sixth installment in a series about LLMs. You can find the fifth article here: Making Sense of LLMs - A goal without alignment is just a wish.
Today, OpenAI released GPT-4. You will hear and read much about it in the upcoming days and weeks as we learn more about its capabilities and limitations.
Here are a few noteworthy tidbits to get you up to speed, along with some comments from my side:
- Bing is powered by GPT-4 already. No surprise there, but kudos to Microsoft's execution speed. Satya Nadella recently stated that Google is the 800-pound gorilla in the search market but that he intends to make them "dance," and what an invitation that has been.
- Citing competition and safety concerns, OpenAI will not provide details on architecture, model size, hardware, or training data for their latest AI model. This sounds like the death rattle of open research, which has been the status quo for the last decade. The field will switch from a hugely collaborative to a fully proprietary model in the blink of an eye as each company tries to create a strategic moat around its business.
- The model is multi-modal, i.e., it accepts images as input and is adept at interpreting infographics and charts. Users can refer to content within pictures and even ask for high-level explanations like, "Why is this comic funny?" (a sketch of what such a request could look like follows after this list). Adding more modalities such as video and audio looks like a mere engineering challenge at this point. More interestingly, multi-modality also appears to improve baseline performance.
- The previous model, ChatGPT, scored in the bottom 10% on a standardized bar exam; GPT-4 ranks in the top 10%. Certainly a cherry-picked stat, but impressive nonetheless, and one of the things the media will undoubtedly focus on.
- New capabilities emerged for the first time, for example on the hindsight-neglect task, which had been elusive until now. This shows that some abilities do not improve gradually but instead jump into existence at a certain threshold, which is fascinating and suggests that LLMs might still have some surprises up their digital sleeves.
- Hallucination is still a problem, i.e., models creating non-factual but otherwise compelling content. However, major progress seems achievable with a better training regimen, including techniques like supervised fine-tuning and RLHF (reinforcement learning from human feedback); a toy illustration of the reward-modelling step behind RLHF follows after this list. Unfortunately, this also means that one needs an army of human labelers, which could be prohibitive for smaller companies.
- The cut-off date for the training data is 2021, which means that GPT-4 still needs to find out who won the last football World Cup. OpenAI completed the training of the base model last summer. This is especially interesting because it tells us that cleaning and refining the data and answers is more important than adding more data, i.e., human post-processing outweighs other factors like compute.
- Multiple companies like Duolingo released GPT-4-powered product features on the same day as the tech release. One might argue that they already worked with a chat-like interface, but this is still no easy feat to pull off: having everything ready, from PR material to app updates, including pricing. I wonder if the ability to integrate LLM tech will be the most significant competitive differentiator in the next couple of years.
- They red-teamed GPT-4 with a third party, meaning they hired researchers to act as bad-faith actors and find exploits and security issues. The results range from worrying to dystopian, depending on your level of tech optimism. We can expect a tidal wave of sophisticated and personalized spam, but more concerning is the fact that the model excels at autocratic disinformation and power-seeking behavior.
- I am especially inspired by the fact that LLMs are getting better at explanations out of the box. At the end of the Developer Livestream, starting at 19:05, Greg Brockman gives an example of GPT-4 breaking down a tax code issue for him. He goes on to say: "Only by asking the model to spell out its reasoning and me following along, I was like 'oh, I get it now, I know why this works'… it doesn't care if it is code, if it is language, all this can be applied toward the problems you care about."
- There is a waitlist for API access, but the cost has already dropped by an order of magnitude. It must be nice to be bankrolled like this, crushing the competition before the race even starts.
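To make the multi-modal bullet above more concrete, here is a minimal sketch of what an image-plus-text request could look like through the OpenAI Python SDK. Treat it as illustration only: the model name, the message shape, and the example URL are assumptions on my side, since image input is not generally available through the API yet.

```python
# Minimal sketch of a multi-modal chat request (illustrative only).
# The model name and the image URL are placeholders, not real endpoints.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="gpt-4",  # assumed image-capable variant; not yet public at the time of writing
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Why is this comic funny?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/comic.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```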
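And to illustrate why RLHF needs so much human labor: labelers rank pairs of model answers, a separate reward model is trained to score the preferred answer higher, and the language model is then optimized against that reward model. The toy sketch below covers only the pairwise reward-modelling step, with made-up embeddings standing in for real model outputs.

```python
# Toy illustration of the reward-modelling step behind RLHF.
# All data here is synthetic; real systems train the reward model on top of the
# LLM itself and then optimize the LLM against it (e.g., with PPO).
import torch
import torch.nn as nn

torch.manual_seed(0)

EMBED_DIM = 16
NUM_PAIRS = 256

# Pretend each answer is already embedded as a small feature vector.
chosen = torch.randn(NUM_PAIRS, EMBED_DIM) + 0.5    # answers human labelers preferred
rejected = torch.randn(NUM_PAIRS, EMBED_DIM) - 0.5  # answers human labelers rejected

# A tiny reward model: maps an answer embedding to a scalar score.
reward_model = nn.Sequential(nn.Linear(EMBED_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

for step in range(200):
    r_chosen = reward_model(chosen)
    r_rejected = reward_model(rejected)
    # Pairwise preference loss: push the preferred answer's score above the rejected one.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final preference loss: {loss.item():.4f}")
```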
For more details on the performance, check out the accompanying research material and scientific paper.
In related news, Google cannot catch a break. After botching their AI demo last month, which cost them a whopping $100B in market cap, they announced a massive roll-out of AI features for their workspaces on the same day GPT-4 came out. Meta, on the other hand, used today's air cover to announce further lay-offs months in advance.
The relentless acceleration of change reminds me of a passage from Lewis Carroll’s book “Through the Looking Glass” in which Alice, the protagonist, runs alongside the Red Queen without seeming to get anywhere:
‘Well, in our country,’ said Alice, still panting a little, ‘you’d generally get to somewhere else — if you ran very fast for a long time, as we’ve been doing.’
‘A slow sort of country!’ said the Queen. ‘Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!’
Looking at what has happened over the last few months, I'd argue we still don't know where we are running to, but we certainly keep picking up speed.