How Generative Text Artificial Intelligence (Specifically GPT) Works Under the Hood, In Plain English, For Non-Nerds…

GPT is one type of Generative Text Artificial Intelligence, a category also known as a Large Language Model.

GPT stands for Generative Pre-trained Transformer.

I’ll break each element down to what it is and how it works.

A user inputs written words.

A Large Language Model (LLM) is software that understands and generates human language, like GPT (each time I mention an LLM, assume I am referring to GPT). The LLM splits those words into what are called tokens.

A token is a part of a word or sentence.

On average, a token is roughly 0.75 words.

This is not to be confused with the use of the word token in crypto or in web apps!

LLMs then take those fractions of words or sentences (as tokens – assume that every time I mention words or sentences, I really mean fractions of words or sentences, i.e. tokens) and look at the context (the surrounding words) each one exists in.
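To make the idea concrete, here is a minimal tokenization sketch in Python, assuming the open-source tiktoken package is installed; the exact splits and token IDs vary by model, so the values shown in the comments are illustrative only.

```python
# Minimal tokenization sketch using the tiktoken library (an assumption:
# any tokenizer would illustrate the same idea).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # an encoding used by recent GPT models

text = "Tokenization splits words into pieces."
token_ids = enc.encode(text)                   # a list of integers, one per token
pieces = [enc.decode([t]) for t in token_ids]  # the text fragment behind each token

print(token_ids)  # e.g. a list of integers (exact values depend on the encoding)
print(pieces)     # e.g. ['Token', 'ization', ' splits', ' words', ' into', ' pieces', '.']
```

Notice that a long word like "Tokenization" is likely to be split into more than one piece, which is why a token averages out to roughly 0.75 words.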

LLMs understand the context (or sequence) that words exist in by using training data.

This is the Pre-training in the GPT acronym.

OpenAI’s GPT was initially trained on a corpus of massive quantities of text from sources such as:

  1. Common Crawl
  2. WebText2
  3. Books1 & Books2 (currently the subject of litigation);
  4. Wikipedia; and
  5. Reddit (which has since tightened access via its API).

LLMs then compare and narrow down the types and patterns of words that surround a user's input of written words (in other words, the context it appears in).

LLMs do this by creating what is called a vector, via a software function.

A vector is a series of numeric values that depict the linguistic features of each word, such as the meaning and intent of the user's written input.

This is called a “word embedding” or “embedded vector”.

These numbers (embedded word vectors) allow LLMs to determine how close words are in proximity (or position) to other similar words, and how they relate to words that appeared in similar sequences in the training data.
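Here is a toy illustration of that idea. The three-dimensional vectors below are made up purely for demonstration; real GPT embeddings have hundreds or thousands of dimensions and are learned during training.

```python
# Toy word embeddings: each word becomes a list of numbers (a vector).
# The values here are invented; real embeddings are learned from training data.
import numpy as np

embeddings = {
    "dog":   np.array([0.80, 0.10, 0.30]),
    "puppy": np.array([0.75, 0.15, 0.35]),
    "car":   np.array([0.10, 0.90, 0.20]),
}

def cosine_similarity(a, b):
    """A score close to 1.0 means the two vectors (and so the words) are close in meaning."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["dog"], embeddings["puppy"]))  # high: related words
print(cosine_similarity(embeddings["dog"], embeddings["car"]))    # lower: unrelated words
```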

One can build a semantic search engine based purely on tokenization and embedded word vectors.

I have built several, and this is the basis of an AI-powered chatbot.

A keyword search engine – the kind you have been using since the dawn of the search engine – simply searches for the occurrence of words in its dataset.

However, with a semantic search engine, a user can input a completely unknown word or sequence of words that doesn’t exist in the dataset, and a very accurate response will still be generated.

This is why LLMs let you get away with spelling mistakes in your input.

This works because tokenization and embedded word vectors look beyond a word or its incorrect spelling and look for linguistic features such as meaning and intent in the user's input.
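Below is a minimal sketch of that kind of semantic search. It assumes a hypothetical embed() function that turns text into a vector (for example via an embeddings API or a local model); the function name and the example data are assumptions for illustration, not any particular product.

```python
# Minimal semantic-search sketch. 'embed' is a hypothetical function that
# returns an embedding vector for a piece of text.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_search(query, documents, embed):
    """Rank documents by closeness of meaning to the query,
    not by whether they share exact keywords."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Usage (with the hypothetical embed function):
# results = semantic_search("hwo do I reset my pasword?", help_articles, embed)
# The top result can be "Resetting your password" even though the query is
# misspelled and shares no exact keywords with the document.
```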

This is the start of the Generative process in the GPT acronym.

The Transformer in the GPT acronym acts like what is called a parser.

In computer terminology, a parser is a software function that breaks down a sentence of input words (a string) into parts.

In doing so, the Transformer captures the intent, meaning and context of the user's entire input: the closeness, proximity and relationship of each word to every other word, and which parts of the parsed sequence of words matter most.

In other words, the Transformer looks for intent and meaning.

This is called self-attention.
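Here is a bare-bones sketch of the self-attention calculation in NumPy. The token vectors are random placeholders, and a real Transformer adds learned weight matrices, multiple attention heads and many stacked layers; this only shows the core idea of every token weighing every other token.

```python
# Bare-bones self-attention (scaled dot-product attention) in NumPy.
# Real models use learned projections and many attention heads.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """X has one row per token. Each token 'attends' to every other token,
    weighting it by how relevant it appears to be."""
    d = X.shape[-1]
    Q, K, V = X, X, X                    # real models derive these via learned matrices
    scores = Q @ K.T / np.sqrt(d)        # relevance of every token to every other token
    weights = softmax(scores, axis=-1)   # relevance turned into attention weights (rows sum to 1)
    return weights @ V                   # each token becomes a relevance-weighted blend of tokens

tokens = np.random.rand(4, 8)            # 4 tokens, 8 dimensions each (toy sizes)
print(self_attention(tokens).shape)      # (4, 8): same shape, but now context-aware
```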

Once the LLM has understood the user’s input by parsing it for its intent and meaning, it can then create a meaningful and relevant output to it.

Remember, through embedded word vectors, LLMs compare and narrow down the types and patterns of words that surround a user's input of written words (its context).

Based on this, LLMs try to predict the next word that will appear in the sentence they are outputting.

It predicts this through what is called a probability score.

“What is the probable likelihood of a word from a bank of possibilities being the next in the sequence of words?”
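A toy version of that probability score looks like this. The candidate words and raw scores are invented for illustration; a real model computes scores over its entire vocabulary from the full context.

```python
# Toy next-word prediction: turn raw preference scores into probabilities
# and pick (or sample) the most likely continuation. Values are invented.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

context = "The cat sat on the"
candidates = ["mat", "sofa", "moon", "fridge"]
raw_scores = np.array([4.1, 3.3, 0.5, 1.2])   # made-up model preferences
probabilities = softmax(raw_scores)

for word, p in zip(candidates, probabilities):
    print(f"{context} {word!r}: {p:.2f}")

# The model picks from (or samples) this distribution, which is why it can
# produce a sentence that reads well but is factually wrong.
```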

Because it is based on probability LLMs can get sentences wrong.

Not in how they are constructed, but in what they mean.

And because embedded word vectors do not work like a keyword search engine, LLMs can make up facts (called hallucinations).

And that is the Generative part of the GPT acronym.

That is how GPT works.

The Chat function of ChatGPT simply adds the previous questions and answers to your input.
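A rough sketch of what that looks like, using a message format loosely modelled on the OpenAI chat API; treat the exact field names and the call_model() function as assumptions for illustration.

```python
# Sketch of the "chat" part: every new question is sent along with the
# earlier questions and answers, so the model sees the whole conversation.
conversation = [
    {"role": "user", "content": "What is a token?"},
    {"role": "assistant", "content": "A token is a fragment of a word or sentence."},
]

def ask(question, history):
    history.append({"role": "user", "content": question})
    # The full history (not just the latest question) is what gets sent to the
    # model, which is how it can resolve "one" back to "a token" here.
    # reply = call_model(history)   # hypothetical model call
    # history.append({"role": "assistant", "content": reply})
    return history

ask("How long is one, on average?", conversation)
```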
