ChatTuring vs. ChatGPT

Last year Blake Lemoine, a senior software engineer at Google, claimed that Google’s LaMDA chatbot had achieved consciousness. Google denied this, saying that LaMDA is merely a large language model (LLM) that imitates word combinations it has encountered. But, of course, imitation is how children start talking as well! Are LLMs close to becoming sentient?

OpenAI’s ChatGPT has garnered immense attention recently, and many of its users side with Blake Lemoine, despite defenses built into ChatGPT to guard against just that.

The standard method for determining machine intelligence is the Turing Test, proposed in Alan Turing’s seminal 1950 paper Computing Machinery and Intelligence. Turing provides specific questions that he would ask a chatbot, along with answers from an intelligent computer (which we shall call ChatTuring):

Q: Please write me a sonnet on the subject of the Forth Bridge.

A: Count me out on this one. I never could write poetry.

Q: Add 34957 to 70764

A: (Pause about 30 seconds and then give as answer) 105621.

Q: Do you play chess?

A: Yes.

Q: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?

A: (After a pause of 15 seconds) R-R8 mate.

I decided to submit Turing’s questions to ChatGPT to see how well it would fare.

The first question is ambiguous, and a human poet would be expected to ask whether a Petrarchan or Shakespearean sonnet was desired. ChatTuring, apparently afraid of sounding mechanical, diplomatically sidesteps answering. ChatGPT has no such qualms; it produced a Shakespearean sonnet in less than 30 seconds:

Upon the coast of bonnie Scotland's land,

There stands a mighty feat of engineering,

A bridge of steel, so grand and bold and grand,

The Forth Bridge, to all it's domineering.

Its crimson hue against the blue expanse,

A symbol of the nation's strength and might,

A marvel of the human skill and chance,

That spans across the Firth, in sheer delight.

The trains that cross its span with mighty roar,

A testament to its enduring power,

A legacy that shall forevermore,

Remain a beacon in the nation's hour.

The Forth Bridge, a wonder to behold,

A structure strong and steadfast, brave and bold.

While “grand and bold and grand” is rather awkward, ChatGPT displayed better-than-average talent for rhyme and rhythm. It also showed impressive knowledge: not only that the Forth Bridge is a railroad bridge in Scotland and an early major steel structure, but also its color (Forth Bridge Red) and its status as a proud symbol of Scotland.

ChatTuring takes 30 seconds to answer the second question, and answers it incorrectly. Is it purposely trying to fool Turing? Or perhaps Turing is implying the "strong AI" hypothesis that posits that all true intelligence is equivalent, and thus a truly intelligent bot would calculate like a human, not like a computer.

Here is ChatGPT’s answer:

The sum of 34957 and 70764 is 105721.

It had been widely reported that ChatGPT misses carries when adding numbers of three or more digits. This flaw has apparently been corrected; ChatGPT now carries out arithmetic quickly and correctly.
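As a rough illustration of the arithmetic at stake, the Python sketch below adds the two numbers column by column, the way one would on paper. The function name `add_by_columns` and its `drop_carry_into` parameter are hypothetical, introduced only for this example. The second call simulates forgetting the carry into the hundreds column, which happens to reproduce the 105621 that ChatTuring gives. That is just one possible reading of the slip; Turing's paper does not explain it.

```python
from typing import Optional


def add_by_columns(a: int, b: int, drop_carry_into: Optional[int] = None) -> int:
    """Add two non-negative integers digit by digit, as on paper.

    drop_carry_into is a hypothetical knob for this illustration: the index
    of the column (0 = units, 1 = tens, 2 = hundreds, ...) whose incoming
    carry is "forgotten". None means ordinary, correct addition.
    """
    result, carry, column = 0, 0, 0
    while a or b or carry:
        if column == drop_carry_into:
            carry = 0  # simulate a human forgetting the carry into this column
        total = a % 10 + b % 10 + carry
        result += (total % 10) * 10 ** column  # digit written in this column
        carry = total // 10                    # carry passed to the next column
        a, b, column = a // 10, b // 10, column + 1
    return result


print(add_by_columns(34957, 70764))                     # correct sum
print(add_by_columns(34957, 70764, drop_carry_into=2))  # carry into hundreds dropped
```

The first call prints 105721, matching ChatGPT's answer; the second prints 105621, matching ChatTuring's.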

The final exchange starts with “Do you play chess?” to which ChatGPT replies:

Yes, I can play chess! As an AI language model, I don't have physical capabilities, but I can help you learn how to play, provide you with tips and strategies, and even play a game with you through text-based communication.

Fair enough. So, I asked Turing’s chess question, which is shrewdly ambiguous. It takes for granted that the object is to win in as few moves as possible, something that a computer might not realize. It does not specify who is black and who is white. The statement “R at R1” leaves open whether the rook is at queen’s rook 1 (bottom left corner) or king’s rook 1 (bottom right corner). Each of these ambiguities would prompt questions from a human chess player. Turing’s chatbot apparently tried all possibilities and gave a correct and equally ambiguous answer in a short, but not unbelievably short, amount of time.
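The claim that “R-R8 mate” is correct under every reading can be checked with a small Python sketch. The translation from descriptive to algebraic notation is an assumption of this example: if the questioner is White, the lone king is on e1, the other king on e3, and the rook lands on a1 or h1; if the questioner is Black, the position is mirrored (e8, e6, a8 or h8). The helpers `sq`, `adjacent`, `rook_attacks`, and `is_checkmate` are hypothetical names, and the checker only handles this king-and-rook-versus-king material.

```python
def sq(name):
    """Convert algebraic notation such as 'e1' to (file, rank), both 0-7."""
    return ord(name[0]) - ord("a"), int(name[1]) - 1


def adjacent(a, b):
    """True if squares a and b touch (a king's move apart)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) == 1


def rook_attacks(rook, target, blocker):
    """True if a rook on `rook` attacks `target`.

    `blocker` is the attacking king, the only piece that can obstruct the
    rook. The lone king never shields squares behind itself, because it
    leaves its square when it moves.
    """
    if rook == target:
        return False
    (rf, rr), (tf, tr), (bf, br) = rook, target, blocker
    if rf == tf:  # same file
        lo, hi = sorted((rr, tr))
        return not (bf == rf and lo < br < hi)
    if rr == tr:  # same rank
        lo, hi = sorted((rf, tf))
        return not (br == rr and lo < bf < hi)
    return False


def is_checkmate(lone_king, king, rook):
    """True if the lone king is in check from the rook and has no legal move."""
    if not rook_attacks(rook, lone_king, king):
        return False  # not even in check
    for df in (-1, 0, 1):
        for dr in (-1, 0, 1):
            dest = (lone_king[0] + df, lone_king[1] + dr)
            on_board = 0 <= dest[0] < 8 and 0 <= dest[1] < 8
            if (df, dr) == (0, 0) or not on_board:
                continue
            if dest == king or adjacent(dest, king):
                continue  # square is covered by the attacking king
            if dest != rook and rook_attacks(rook, dest, king):
                continue  # square is covered by the rook
            return False  # a safe square (or an undefended rook to capture)
    return True


# (lone king, attacking king, rook after R-R8) for each assumed reading
READINGS = [
    ("e1", "e3", "a1"),
    ("e1", "e3", "h1"),
    ("e8", "e6", "a8"),
    ("e8", "e6", "h8"),
]

for lone, king, rook in READINGS:
    print(lone, king, rook, is_checkmate(sq(lone), sq(king), sq(rook)))
```

Under these assumptions all four readings come out as checkmate: the attacking king covers the three squares in front of the lone king, and the rook covers the back rank.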

ChatGPT’s answer is painfully verbose, and just wrong.

This is a classic chess puzzle known as the "King and Rook versus King" endgame. The goal of the attacking player (in this case, me) is to checkmate the opposing king (in this case, yours) using the rook and king.

In this position, I would play the following move: Rb1+

This move brings the rook to b1, checking your king at K1. Now your king must move, as it is in check. From here, I can use my rook to restrict your king's movements until I am able to deliver a checkmate.

When ChatGPT is out of its depth, it tends to go into a very convincing discourse about something else.

So, how well did ChatGPT do?

ChatTuring and ChatGPT are diametrically opposite. Turing’s bot is a laconic diplomat, can’t write poetry, and is a chess whiz. ChatGPT is a verbose politician, not a bad poet, and has a calculator in hand.

But the difference between ChatTuring and ChatGPT shouldn’t be a deciding factor. Some people are diplomats and some politicians. Nor should the existence of apps that can spot ChatGPT text sway us – the question is whether a human can.

One may readily recognize ChatGPT by its being too fast, too knowledgeable, and too proficient. Had a chatbot based on the same LLM been designed differently, I venture that even Turing would have been fooled.

So, I asked ChatGPT “Would you pass the Turing test?” to which it replied:

As an AI language model, I have been trained on a vast amount of data and can respond to a wide range of questions and topics. While I am capable of generating responses that can be difficult to distinguish from those of a human, passing the Turing test ultimately depends on the specific criteria and evaluation methods used by the human evaluator.

I couldn't agree more.

Peter Ashwood-Smith

Semi Retired at Consultant

1 yr

Have a look at gpt4- there is a nice paper comparing it to 3 and the improvements are astounding.

Very interesting. Given that the Turing test is out in the open, isn't it likely that an intelligent AI would fail it on purpose? A few days ago, someone distributed a semi-humorous video he created with AI: a one-minute rap song for the protests in Israel. It was funny, had a good rhythm, and included all the relevant messages. It also included this message, for a fleeting second, right in the middle of the song. Reminds me a lot of Darth Vader's argument for the Empire... This worries me a lot more than any potential test.

Very nice. Not as dramatic as Blade Runner, but non-fiction is not fiction

Dan Shalem

Communication Consultant, Workshop Facilitator, Speaker

2 yr

Give me a laconic diplomat over a verbose politician with a calculator any day!! Loved the article Yaakov Stein, thanks.

Rami Yaron

Technology Leader

2 yr

Very interesting Yaakov, maybe you should have started with the last question...
