Is ChatGPT4 a Ouija Board With Memory? Stop the Parlor Tricks
Not so long ago, LLMs and Transformers were seen as standing on the threshold of a Super Intelligence breakthrough, one posing an existential risk to all of humanity and civilization. Models could pass bar exams, write software, solve math problems, and compose poems, screenplays and marketing material. LLMs, and ChatGPT4 especially, seemed imbued with a level of intelligence surpassing that of mere mortals.
Well, perhaps not. Experience is starting to lay bare multiple weaknesses and, perhaps, even a fatal flaw. Doubtless, something is there, but they are not intelligent. Full stop. We are starting to see them for what they are: humongous collections of text automatically indexed in clever multidimensional ways that mimic the structures of the input texts, curated in turn (through “reinforcement learning”) by humans to provide “credible” answers to queries (“prompts”). Unlike any intelligent living thing, they lack sensors, the capacity to act on what they sense, and the ability to form abstractions and predictions about the world in order to stay alive. They are inert, corpus-bound, monolithic blobs of weighted indices that “spring to life” (generation) through queries that probabilistically match stored templates.
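To put that last sentence in concrete terms, here is a minimal toy sketch in Python. The words and scores are invented, and this is not a description of ChatGPT4's actual internals; it only illustrates the general mechanism in which an autoregressive language model produces text by repeatedly sampling the next word from a probability distribution conditioned on what came before, rather than by looking up a single stored answer.

```python
# Toy sketch only (invented words and scores, not ChatGPT4's actual internals):
# an autoregressive model picks each next word by sampling from a probability
# distribution over candidates, not by retrieving a fixed answer.
import math
import random

# Hypothetical next-word scores ("logits") a model might assign after the
# prompt "The capital of France is" -- the values here are made up.
logits = {"Paris": 6.0, "Lyon": 2.5, "beautiful": 1.0, "a": 0.5}

def sample_next_word(logits, temperature=1.0):
    """Turn scores into probabilities (softmax) and draw one word at random."""
    scaled = {w: s / temperature for w, s in logits.items()}
    top = max(scaled.values())
    exps = {w: math.exp(s - top) for w, s in scaled.items()}
    total = sum(exps.values())
    weights = [exps[w] / total for w in exps]
    # random.choices draws in proportion to the weights, so runs can differ.
    return random.choices(list(exps), weights=weights, k=1)[0]

# The same "prompt" can yield different continuations on different runs.
for _ in range(3):
    print(sample_next_word(logits))
```

Run the loop a few times and the point makes itself: the output is drawn from a distribution, not retrieved.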
Perhaps an apt metaphor for LLMs is a Ouija Board with a stored memory of its repeated use. A Ouija Board “works” when several people place their fingers on a planchette and ask questions; the planchette then magically moves around a board of letters to spell out answers. Repeated often enough, the Ouija Board provides the right answers!
My disenchantment with ChatGPT4 began with the failed expectation that it could handle two simple and common tasks: 1) generate a logo, and 2) look up information about people. I was really excited about the former, as I thought it might be a quick and cheap way to produce a logo for a nonprofit I am a member of, the Androscoggin Valley Energy Collaborative. I provided the following prompt: “The name of the nonprofit is the Androscoggin Valley Energy Collaborative which uses local sustainable energy from solar, wind, hydro and biomass to provide secure and low cost electricity to the citizens of Coos County New Hampshire. The logo might mimic the State of New Hampshire logo and reference solar, wind, hydro and biomass energy generation.” These are two of the four logos it produced, after many attempts on my part to correct the spelling:
[Two of the generated logo images appeared here.]
Though I found the design of the logos interesting, I was struck by the liberties DALL-E 2 took in changing the spelling of every word to fit the design. After many prompts for correction, it was clear that DALL-E 2 did not know how to spell, or have even the slightest idea what a word was. Perhaps there is a workaround, but finding it was beyond my parameters of patience.
The second failure was different but, to me, even more disturbing. A friend of mine demonstrated the voice-activated version of ChatGPT4 by asking it to say what it knew about me. It came back with a well-constructed and laudatory description of some of my work, impressing me and the small, attentive group. We asked it about other known persons, and it responded in similar fashion, again highlighting accomplishments but providing only a partial history. I repeated this several times myself, but at some point I, and all the other people I had queried, were no longer known to the program. It replied that it had no information and that we were not sufficiently public persons for it to know about. Aside from the slight, it was remarkable that, unlike a Google search, the results did not even partially match the prior searches; the lookup had failed altogether. It went inexplicably dumb on me. This is a huge no-no in the engineering of any device. When there is no consistency, or even a probabilistic consistency, in the translation of an input to an output, that device is essentially worthless. Imagine a probabilistic keyboard or compiler where you never knew for sure what the result would be or why it would differ. This, in my opinion, is not a small failure but a fatal flaw. When such primary input-output mappings are probabilistic and randomly intervened upon by human curators, there is no shared assurance as to the result of even simple queries.
The second fatal flaw, demonstrated by the logo example, is that the system has no abstract or semantic representation of words and phrases; they are simply images that occupy space. Hence, unlike a human designer, who would never make such errors, DALL-E 2 invents images to fulfill a spatial constraint.
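The inconsistency has a simple mechanical face. As a hedged sketch with invented numbers (not the actual behavior of ChatGPT4's person lookup): when a model samples its answer with a nonzero temperature, the same prompt maps to a probability distribution over outputs rather than to one fixed output, which is exactly the probabilistic input-output mapping described above.

```python
# Sketch of why identical prompts can yield different answers: with a
# sampling temperature above zero, a prompt maps to a distribution over
# outputs, not to a single fixed output. All numbers below are invented.
import collections
import math
import random

# Hypothetical scores for three possible responses to "Tell me about person X".
logits = {"detailed biography": 2.0, "partial history": 1.5, "I have no information": 1.0}

def sample_answer(logits, temperature):
    """Softmax over the scores at the given temperature, then draw one answer."""
    exps = {a: math.exp(s / temperature) for a, s in logits.items()}
    return random.choices(list(exps), weights=list(exps.values()), k=1)[0]

for temperature in (1.0, 0.05):
    tally = collections.Counter(sample_answer(logits, temperature) for _ in range(1000))
    print(f"temperature={temperature}: {dict(tally)}")

# At temperature 1.0 the answers spread across all three options; near zero
# the mapping collapses toward the single highest-scoring answer.
```

In other words, the “probabilistic keyboard” is not an accident that slipped in; sampling-based generation makes the mapping from prompt to answer a distribution by design.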
Conclusion: ChatGPT4 is not intelligent, nor is it evidence of an emergent or budding intelligence. Its inconsistency and opaqueness make it ill-suited for even many simple tasks.