ChatGPT: New Enlightenment or Intelligence Theater?

Everyone I know seems to want to talk about ChatGPT lately. The opinions range from fear to contempt to irrational exuberance. At the apogee of hyperbole (I hope), Henry Kissinger, Eric Schmidt, and Daniel Huttenlocher have now declared that

“A new technology bids to transform the human cognitive process as it has not been shaken up since the invention of printing.”

They go on to say,

“Whereas the printing press caused a profusion of modern human thought, the new technology achieves its distillation and elaboration.”

“Generative artificial intelligence presents a philosophical and practical challenge on a scale not experienced since the beginning of the Enlightenment.”

“Generative AI is similarly poised to generate a new form of human consciousness.”

It’s not entirely clear how they come to these conclusions regarding the purported profound impact of large language models, but I expect that they will be severely disappointed.

Near as I can tell, their enthusiasm stems substantially from the idea that these large language models can distill and elaborate a gigantic body of “knowledge” in ways that, they say, humans cannot. They seem to attach great significance to the idea that the models generate answers rather than copying them, that they can prioritize billions of data points to select a set of words that appear to the human reader to be relevant. They repeat several times that the text that the models produce appears within a few seconds.

In short, they appear to fundamentally misunderstand how these large language models work and what they do. First, large language models are just that: models of language. As Claude Shannon pointed out in 1948, words in discourse can be predicted. The more context one uses to make the prediction, the better the predicted text approximates human-written English prose. Shannon used a context of one, two, or a few words to predict the next word. The large language models use thousands of words. GPT-3 uses a context of 2048 tokens; ChatGPT uses a context of around 4,000 tokens.
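To make the prediction idea concrete, here is a minimal sketch of Shannon-style next-word prediction using a one-word context (a bigram model). The toy corpus is invented for illustration; real models train on enormous corpora and, as noted above, condition on thousands of tokens rather than one.

```python
# Minimal sketch of Shannon-style next-word prediction (bigram model).
# The toy corpus is illustrative only.
from collections import Counter, defaultdict

corpus = "the doctor called the nurse and the nurse called the doctor".split()

# Count how often each word follows each one-word context.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_distribution(word):
    """Probability of each possible next word, given a one-word context."""
    counts = follows[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_distribution("the"))   # {'doctor': 0.5, 'nurse': 0.5}
```

Widening the context from one word to thousands is what turns this crude predictor into fluent prose, but the operation remains the same: estimate which word is likely to come next.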

The distillation that these models perform is not a synthetic compilation of ideas but a statistical summarization of word frequencies. If we consider a vocabulary of just 100,000 words (the estimated size of an adult human’s vocabulary, small for one of these large language models), then there are 100,000^2048 (about 10^10,240; that is, a 1 followed by 10,240 zeros) possible input patterns for GPT-3 to learn. There are 100,000 ways to pick the first word, 100,000 ways to pick the second word, and so on. GPT-3 has 175 billion parameters (1.75 × 10^11), which is a minuscule fraction of the possible patterns, so multiple patterns share parameters. Sharing is how the models distill the language.
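A quick back-of-the-envelope check of these numbers, using the figures assumed above; the arithmetic is done in log space because the pattern count is far too large to represent directly:

```python
# Sanity-check the combinatorics above, in log10 space.
import math

vocab_size = 100_000      # assumed adult vocabulary size, as above
context_len = 2048        # GPT-3's context window, in tokens
n_params = 175e9          # GPT-3's parameter count

# Distinct input patterns = vocab_size ** context_len.
log10_patterns = context_len * math.log10(vocab_size)
print(f"possible input patterns ~ 10^{log10_patterns:,.0f}")       # 10^10,240
print(f"parameters              ~ 10^{math.log10(n_params):.1f}")  # 10^11.2
```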

ChatGPT was also tuned by submitting prompts and responses to human judges (reportedly in Kenya) who gave it feedback about the appropriateness of its response to each prompt.

Fortunately, words are not independent of one another. The word “nurse” is much more likely to occur in a context with a word like “doctor” than with a word like “elephant.” As a result, the set of parameters representing “nurse” will be more similar to the parameters representing “doctor” than to the parameters representing “elephant,” because the parameters are derived from the context. The more often two words appear in similar contexts, the more similar their representations will be.
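A toy illustration of this point, with co-occurrence counts invented for the example: words that keep company with the same context words (“nurse” and “doctor”) end up with similar vectors, while a word that keeps different company (“elephant”) does not. Real models learn dense parameter vectors rather than raw counts, but the geometric intuition is the same.

```python
# Toy co-occurrence vectors (counts invented for illustration).
# Columns: how often each word co-occurs with "hospital", "patient",
# "zoo", and "trunk" in some imaginary corpus.
import math

cooc = {
    "doctor":   [9, 8, 0, 0],
    "nurse":    [8, 9, 0, 0],
    "elephant": [0, 1, 9, 8],
}

def cosine(u, v):
    """Cosine similarity: near 1.0 for similar directions, near 0.0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(cooc["nurse"], cooc["doctor"]))    # ~0.99: similar contexts
print(cosine(cooc["nurse"], cooc["elephant"]))  # ~0.06: different contexts
```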

The models learn by guessing a missing word when given a context. Each word in the context then constrains the probability of producing each other word. Because of the shared parameters, word substitutions occur. These new words constrain the probabilities of subsequent words, and so on, so that (potentially) novel sentences can be produced that still preserve the apparent appropriateness of the words in the text. The models excel at producing fluent, plausible-sounding text, but all of the distillation is at the level of word probabilities, nothing deeper.
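Here is a sketch of that generation loop, reusing the toy bigram statistics from the earlier example: each sampled word conditions the probabilities for the one that follows. Real models condition on thousands of tokens and billions of shared parameters, but the loop is the same in principle.

```python
# Sketch of autoregressive generation with the toy bigram model.
import random
from collections import Counter, defaultdict

corpus = "the doctor called the nurse and the nurse called the doctor".split()
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start, n_words=8, seed=0):
    """Sample words one at a time, each conditioned on the previous word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(n_words):
        counts = follows[words[-1]]
        if not counts:          # no observed continuation: stop
            break
        choices, weights = zip(*counts.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
# Fluent-looking output, but it is statistics over word sequences, nothing deeper.
```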

A large language model is just a statistical model of past language. How the model produces language is complex in the sense that it involves a lot of computation, but it is simple in principle. There is nothing mysterious about it. I don’t see how one gets from a statistical model to a revolutionary transformation of the human cognitive process. What Kissinger and his colleagues seem to believe is that these models generate their text from a knowledge model. That is not the case. The model distills the word patterns; it knows nothing of the ideas behind those words. Kissinger and his colleagues seem to partially understand this distinction; they say, “Even though the model is incapable of understanding in the human sense, its outputs reflect an underlying essence of human language.” But still, they think that these models will have profound effects on diplomacy and human cognition.

The generative large language models are a modified implementation of the infinite monkey theorem. Imagine an infinite number of monkeys typing at an infinite number of keyboards. Eventually, they will produce the entire corpus of human literature. Now, instead of monkeys typing randomly, the GPT monkeys type each word in proportion to how often it occurs with the 2,047 other words in the context. The words are still typed by monkeys, but now it will take much less time to produce text that a human would recognize as coherent and fluid. But no matter how fluent the monkeys are, they are still just monkeys.

Large language models are the Seinfeld show of artificial intelligence. They are about nothing. They merely reflect the patterns of word usage to which they have been exposed. Kissinger and his colleagues note that it is difficult to distinguish truth from misinformation in the productions of large language models. Google and Microsoft have both announced similar models, and both have been embarrassed during their rollouts. The models string together words that look plausible but have no basis in fact, because the models do not have any access to, or representation of, facts. All they “know” is descriptions of the facts. They have no way to identify lies, expressed but unfulfilled intentions, or mistakes, because they have only the language patterns to work with. They have to be told that a statement is a lie because they have no other way to identify it.

Finally, Kissinger and his colleagues note that “ChatGPT possesses a capacity for analysis that is qualitatively different from that of the human mind.” They are right about this, but I think that the qualitative difference runs in the opposite direction from the one they imply. These models are exquisite at representing language patterns, but they are utterly helpless at identifying fact patterns. Plato, in his allegory of the cave, decried the fact that most people do not engage with the world, only with shadows of events in the world. Large language models do not even engage with the shadows. They engage with nothing more than descriptions of those shadows.

It is easy to be misled by these large language models. It is tempting to mistake language for knowledge. Russell Crowe did not become a mathematical genius by playing one in a movie. He memorized, and maybe improvised, his lines. He may have spoken like a genius, but that did not make him one.

Computers do not have to solve problems in the same way that humans do, but when they use different methods, it is a mistake to attribute to them the means by which humans solve those problems. For many uses, performing like a genius is enough. Machine translation, for example, works quite well without knowing what the words mean, but it would be a fundamental error to reason backwards and claim that accurate translation implies a theory of meaning. Large language models do not solve intellectual problems in the way that people do, so it is erroneous to attribute anthropomorphic properties to them.

Kissinger and his colleagues are concerned that these models create “a gap between human knowledge and human understanding.” In fact, the models suffer from the gap between saying and knowing. Large language models are great at dredging up something similar to what someone has written before. In a sense, they are intelligence theater, strutting and fretting their hour upon the stage. Whether they signify anything, let alone a revolution in human cognition, remains to be seen.


Alex Medana

FinTech CEO | Repeat Entrepreneur with 1 Exit (DLT, Digital Identity, Tokenisation since '15) | Board Member | Adviser & Coach

1y

thank you Herbert Roitblat for deflating the noise bubble


Excellent article. To equate probabilistic prediction of the next word based on a context size of 1024, 2048 or 4096 tokens to human cognition is absurd. As a complement, a good read is the interview below in the MIT Technology Review with some of the developers of ChatGPT. They are clearly humble and realistic regarding the capabilities of the technology: https://www.technologyreview.com/2023/03/03/1069311/inside-story-oral-history-how-chatgpt-built-openai/

Stefan W.

CEO/Founder at JANZZ.technology - JANZZilms! (the world's leading solution for PES), JANZZon! (Ontology), JANZZsme! (Semantic Matching Engine)

2y

Fantastic article. Explained in just such a way that there is a real chance that people from marketing and management, VCs and investors, consultancies and software companies, and the whole current AI circus might finally understand what ChatGPT is, and is not, ultimately about. All those who, after reading this post, continue to claim that they understand something about generative AI, and keep making these recurring BS posts on every conceivable platform and news channel that the future has now begun and everything will change through ChatGPT (probably written from their self-driving cars, having been equally convinced years ago that those would arrive any day), only prove conclusively that they simply have no idea about the subject. Really no idea at all. Thanks for this well-founded and relevant post!

"Ignorance is like a delicate exotic fruit; touch it and the bloom is gone." - Oscar Wilde See what you've just done?

Walid Saba

Senior Research Scientist

2y

Great article. Thank you.
