GPT-3, the most complex AI in the world: Finally (slightly) useful
A few days ago, I was invited by OpenAI to test their new super-massive natural language model. If you are new to GPT-3, it is a much-hyped model able to write human-like language given a prompt. It has caused a big buzz in the media (link, link, link), as the sheer quality of its responses makes them hard to distinguish from human writing. This has caused concerns ranging from spam to fake news, but also much excitement.
Disclaimer: The license terms of the invite forbid me from sharing specific input/output publicly. Drop me a message if you have any specific questions and we'll figure something out!
Briefly, on the mechanics of GPT-3
What GPT-3 does is amazingly simple, and understanding it makes clear both its limits and why it is so impressive.
GPT-3 is trained on a large chunk of the raw text of the internet. Fed text word by word, it predicts the next one, like repeatedly pressing the middle suggestion on a predictive keyboard; in technical terms, it is an autoregressive model. What sets GPT-3 apart is its sheer scale: it can consider thousands of previous words, with their incredibly complex interactions and meanings, to produce the most plausible next word, then the word after that, and so on.
That is all. No external databases, no scripted rules.
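The generation loop above can be sketched with a toy stand-in. Here a tiny bigram counter plays the role of GPT-3's billions of parameters: it only looks at one previous word instead of thousands, but the mechanics (predict the most plausible next word, append it, repeat) are the same. The corpus and prompt are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words follow it: a minimal autoregressive model."""
    words = corpus.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def generate(model, prompt, n_words):
    """Repeatedly predict the most plausible next word and append it."""
    words = prompt.split()
    for _ in range(n_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # never saw this word during training; stop
        words.append(candidates.most_common(1)[0][0])  # greedy next-word choice
    return " ".join(words)

model = train_bigram("the cat sat on the mat and the cat slept")
print(generate(model, "the cat", 3))
```

GPT-3 replaces the bigram table with a huge neural network conditioned on the whole context, but the outer loop is the same.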
Test 1: Zero-shot learning on hard NLP tasks, aspect-based sentiment analysis
It has been documented that GPT-3 can work as a sort of "universal API": hand-write an input and its output, then a second input, and it will often try to produce a second output following the pattern of the first input-output pair.
I created a little aspect-based sentiment analysis test: I wrote a complicated text about Implement (my employer) and myself, with multiple sentiments associated with different aspects, and then wrote:
Aspect: “Implement”. Sentiment: “Positive” Aspect: “Adam”. Sentiment:
And hit autocomplete. It is by no means perfect, but in upwards of 70% of cases it gives the right response (positive or negative). Give it a few more examples of both texts and aspects, and that figure goes higher. Another fun observation: it was eager to identify other aspects and sentiments in the text as well (all correct, but not what I was going for).
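The prompt itself is just string assembly. A minimal sketch of how such a prompt could be put together, assuming the text and aspects are placeholders (I can't share my actual input, and the completion call to the API is left out):

```python
def sentiment_prompt(text, labelled_examples, query_aspect):
    """Assemble the hand-written text, the labelled example aspects, and the
    unlabelled query aspect into one prompt for the model to complete."""
    lines = [text, ""]
    for aspect, sentiment in labelled_examples:
        lines.append(f'Aspect: "{aspect}". Sentiment: "{sentiment}"')
    # The model's job is to fill in the sentiment after the final colon.
    lines.append(f'Aspect: "{query_aspect}". Sentiment:')
    return "\n".join(lines)

prompt = sentiment_prompt(
    "Working at Implement is great, though Adam keeps missing deadlines.",  # placeholder text
    [("Implement", "Positive")],
    "Adam",
)
print(prompt)
```

The model completes the dangling `Sentiment:` line; adding more labelled pairs to `labelled_examples` is exactly the "few more examples" step that pushed accuracy higher.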
This can be useful. As a zero-shot baseline for a complex task, it is very good.
Of note, it also performed well in Danish, which tells me there is an implicit translation happening somewhere inside GPT-3: it might have learned to generalize across languages.
Test 2: Knowledge intensive interviews
With help from a couple of experts I know, I set up knowledge-intensive interviews with GPT-3 around topics such as homelessness policy in Denmark, supply chain digitalization, the interaction between democratization and foreign direct investment, as well as data modelling and data strategy. And finally, Scottish whisky distilleries!
I would write an intro header for the system, letting it know the subject and the kind of expert I expected the system to be, and then conduct an interview around the topic.
At a high level, GPT-3 produces good, believable, and even factually correct texts. Especially when asked for opinions ("what are the three most critical areas of data strategy?") it did exceedingly well (my consulting friends should start watching out!).
When pushed further, such as into the specifics of individual Danish municipalities' homelessness policies, it mostly refused to answer. This, to me, is a healthy sign; refusing is better than producing outright lies.
When push came to shove ("what is the homelessness policy on Mars?"), however, it was willing to produce complete fabrications.
Test 3: Actual valuable work
Finally, has GPT-3 been useful?
I am surprised to say: YES!
It’s not much, but in the few days I’ve had it, I’ve used it twice to produce actual valuable help.
One case was generating interview questions. I needed 8-10 questions and quickly hammered out the first two. I provided those to GPT-3 along with a headline for the interview, and it happily produced seven additional questions. Three of these were on topic and good, so I kept them, removed the rest, and hit the API once again, now with five reference questions. This pattern of using "cleaned" results from the API to get more of the content I was looking for became common.
Fundamentally, it was actually helpful at brainstorming.
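That keep-the-good, regenerate loop has a simple shape. A sketch of the pattern, with a stubbed-out `complete` function standing in for the actual API call (which I can't reproduce here) and a `keep` predicate standing in for my human judgment:

```python
def expand_list(seed_items, complete, keep, rounds=2):
    """Grow a list by feeding the kept items back as the next prompt,
    so each round's 'cleaned' examples steer the model's next batch.

    complete: prompt string -> list of suggested new items (the model).
    keep: item -> bool (the human filter).
    """
    items = list(seed_items)
    for _ in range(rounds):
        prompt = "\n".join(items)          # kept items become the new few-shot prompt
        suggestions = complete(prompt)
        items.extend(s for s in suggestions if keep(s) and s not in items)
    return items

# Stub model for illustration: "suggests" numbered variants of the last item.
def fake_complete(prompt):
    last = prompt.splitlines()[-1]
    return [f"{last} (variant {i})" for i in range(1, 4)]

questions = expand_list(
    ["What drew you to this field?", "What does a typical week look like?"],
    complete=fake_complete,
    keep=lambda q: "variant 3" not in q,  # pretend we discard the weakest suggestion
)
print(len(questions))
```

The same loop covers both use cases here: interview questions seeded with two hand-written ones, and company names seeded with a short list.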
A second case was when I needed a list of random company names. They could even be fictional; I just needed company-sounding names to train a basic algorithm. Given a short list of companies, GPT-3 happily supplied an additional 100 examples.
Final verdict
I am very impressed, but there are still key challenges. GPT-3 is good; it is even very good. It can supply reasonably correct explanations and facts on a wide variety of topics, at a high-school level. This is a major achievement, and it will take time before such a technology finds its place in our society (for a very cool example, check AI Dungeon or AI|Writer).
On the flip side, the core issue with GPT-2 is not fixed in GPT-3: it does not know what it is doing. There are no guarantees, and with insufficient context I have had it be both comically wrong (it went on a tangent about pizzas when I tried to emulate a job interview) and downright nasty (the amount of erotica this model has consumed must be truly staggering). The riskiness of these issues currently severely limits the usability of GPT-3.
If you are interested in knowing more about the technology, its potential, and its limits, don't hesitate to shoot me a message. I'm always happy to chat!