If it walks like a duck and quacks like a duck, maybe it's an artificial duck
Who's behind the curtain? Human or Machine?

For several years leading up to 1950, there was considerable debate about what it means for a computer to be intelligent and about the differences between human and artificial intelligence. Much of the angst stemmed from arguments grounded in theology, mathematics, consciousness, and so on.

Then Alan M. Turing came along and designed a brilliant test in 1950 to bring a pragmatic perspective to the debate. Briefly, the test is a special-purpose, sophisticated variation of the "if it walks like a duck and quacks like a duck, it is a duck" test.

Turing made so many seminal contributions in this field that the Turing Award in computer science is named after him. It is the highest award in the field and is considered equivalent to the Nobel Prize (which isn't awarded in computer science).

The original test as proposed by Alan Turing was limited to text-based "conversation" between two entities, one a human and the other an unknown entity behind a curtain (or in another room). The human asks questions to the hidden entity and receives answers. The text-based constraint was to prevent non-essential factors from clouding the real purpose of the test, such as the ability to speak and convey emotions.

If the human cannot consistently identify the other party in the conversation as a computer, then the computer, for all practical purposes, can be said to demonstrate intelligence. Such a conclusion would be limited to the particular context in which the conversation (or test) was conducted and would not be generalizable.
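
One way to make "cannot consistently identify" concrete is to compare the judge's hit rate with coin-flip guessing. The sketch below is only an illustration: the function names `judge_accuracy` and `machine_passes`, the 0.5 chance baseline, and the `margin` threshold are assumptions introduced here, not part of Turing's formulation.

```python
def judge_accuracy(verdicts, truths):
    """Fraction of trials in which the judge labeled the hidden party correctly."""
    if len(verdicts) != len(truths) or not verdicts:
        raise ValueError("need two equal-length, non-empty lists")
    correct = sum(v == t for v, t in zip(verdicts, truths))
    return correct / len(verdicts)


def machine_passes(verdicts, truths, chance=0.5, margin=0.1):
    """Hypothetical pass rule: the judge does no better than chance plus a margin."""
    return judge_accuracy(verdicts, truths) <= chance + margin


# Made-up data: who was really behind the curtain, and what the judge guessed.
truths = ["machine", "human", "machine", "human",
          "machine", "human", "machine", "human"]
verdicts = ["machine", "machine", "human", "human",
            "human", "human", "machine", "machine"]

print(judge_accuracy(verdicts, truths))
print(machine_passes(verdicts, truths))
```

Whatever the result, it describes only the trials that were run, which matches the point above that the conclusion does not generalize beyond the context of the test.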

Melville Carrie proposed a modern variation of the Turing test in a post that generated some lively debate:

"Get a computer to "watch" a blockbuster film; and then it must write review (suitable for publication in Film or Empire or Guardian, New York Times, etc.), and if a panel of humans think it's human written then it passes the test." [sic.]

Some of these conditions are problematic. Remember that the intent of the Turing test is to determine if a computer (or more accurately, the algorithm or program) can produce an "output" that establishes reasonable doubt in the mind of the evaluating human that this output could have been produced by another human.

Think of it as a legal battle. The human is trying to prove that the other entity is not a human. Any reasonable doubt would benefit the computer.

The Turing test is very careful not to take into consideration any actions that lead to the output, and it does not insist that the output be expressed in a form or format in which humans have a biological advantage. Hence, given the limitations of computer-generated speech at that time (which would have been a dead giveaway), Alan Turing insisted that both human and computer communicate only through teletype (a form of expression in which neither has any appreciable advantage).

"Watching" a movie just as a human would watch has nothing to do with "intelligent" behavior (it is merely "biological" behavior). A dog or a human baby could watch a movie, but that hardly proves intelligent behavior. An adult human could watch a movie in an unknown foreign language, but that too has no bearing on intelligent behavior.

Why must a computer "watch" a movie? Why could it not scan the audio-visual files instead? If it must "write" a review, do we insist that the review be written just as a human would write it, for example, by typing it up on another computer?

Must the computer go to a movie theater? Must it munch popcorn and drink soda while watching the movie?

Satisfying all those conditions wouldn't prove or support "intelligent" behavior, just "human" behavior. The concept of intelligence is only applicable to the content of the output (i.e., not how that content was produced and not to the form or format of expression of that content), just as skin color, race, and gender are non-essential to human intelligence.

Melville's test certainly poses very interesting challenges for computer technology. These challenges are not invalid per se, just invalid from a "proof of intelligence" perspective.

The output of the test should also be well-defined and well-formed. Why? Think about it like this: How many people could conclusively pass Melville's test? I couldn't write a high-quality movie review fit for publication in the NY Times. Could you?

This brings us to another requirement of the Turing test: it must be "falsifiable" in the sense that a human could fail it, producing a false negative. That is, the evaluating human would conclude that the output did not come from a human when in fact a human produced it, poorly but not deliberately so.

For example, assume I sit behind the screen in the original Turing test. If the evaluating person types in questions in French - a language in which I can only type back some very rudimentary phrases - the human might conclude I was a computer. Or, if the human poses questions about football in English, I wouldn't be able to hold a half-way intelligent conversation from the other side of the screen.

In a very tightly scoped Turing test, I'd fail both of those tests. But to conclude that the entity (i.e., me) behind that curtain is unintelligent would be an improper generalization.

We demonstrate this bias of improper generalization all the time. If we encounter someone from a different country and detect an accent, we assume they are somehow hard of hearing and probably dumb. We speak a bit louder and we might even adopt a fake accent, as if that would help them understand. If they can't respond quickly, carry on an idiomatic conversation, appreciate our jokes, or know all the facts about our culture (usually sports), we assume they are somehow dumb.

This bias extends to our attitude about computers and artificial intelligence. If the computer beats us in one area, we add a proviso designed to show that the computer is not "really" intelligent.

For example, if the computer beats us in chess, we say, "Maybe so, but it can't detect mood and sentiment in the written word."

If the computer then demonstrates that it can in fact reliably detect sentiment in written text (using, for example, Watson's sentiment analysis), we say, "Maybe, but it couldn't possibly detect emotion in faces."

Then along comes a company with AI-based algorithms that reliably isolate faces and detect emotion in videos. We counter with, "Maybe, but the computer wouldn't be able to appreciate poetry."

I'm confident that we could find some interpretation of the word "appreciate" that'd demonstrate intelligent appreciation by the computer.

(As an aside, let me point out that detecting emotion and sentiment, long hailed as the unassailable bastion of human intelligence, is surprisingly easy to mimic. It turns out that there are a manageable number of combinations of words, phrases, tones, and facial expressions that convey emotions, mood, and sentiment.)
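
To illustrate how a small set of words can go a long way, here is a toy lexicon-based sentiment scorer. It is a minimal sketch under loud assumptions: the word list, the negator list, and the names `LEXICON`, `NEGATORS`, `sentiment_score`, and `sentiment_label` are all made up for this example, and it is not how Watson or any production system works.

```python
import re

# Hypothetical, hand-picked lexicon; real systems use far larger resources.
LEXICON = {
    "great": 1, "love": 1, "brilliant": 1, "happy": 1,
    "awful": -1, "hate": -1, "boring": -1, "sad": -1,
}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum word weights; a negator flips the sign of the next sentiment word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        weight = LEXICON.get(token, 0)
        if weight:
            score += -weight if negate else weight
            negate = False  # the negation has been used up
    return score


def sentiment_label(text):
    """Map the numeric score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for review in ["A brilliant, happy film", "I did not love it", "It rained"]:
    print(review, "->", sentiment_label(review))
```

A scorer this crude misses sarcasm, context, and anything outside its word list, so it supports only the narrow point that surface cues carry much of the signal.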

"Yes," we retort. "But computers wouldn't be able to compose music like Mozart did." Sorry, there are algorithms that can learn the style of specific composers and produce short pieces in that style.

Every accomplishment by computers makes some people move the goal posts again and again to try to prove that computers will never equal humans in intelligence.

To the extent that such increasingly advanced tests push us to innovate and add more sophisticated capabilities to computers, that is a good thing.

But I do think that a subset of critics (hopefully only a small subset) have an anti-computer ax to grind.

Alan Turing was well aware of this. His test was designed to isolate the pragmatic aspects of the issue and to ignore philosophical speculation.

I think that a true Turing test will only assess the content of the interaction, and not the behavior that produced it (unless the behavior itself is the subject of the test), the way the result is expressed (unless the expression itself is the subject of the test), or the "intent" or "what's inside" (in which case, it would not be a provable test).

Perhaps the ultimate Turing test is to have a computer design a true Turing test at which the computer that designed it will itself fail.

Javid Ur Rahaman

Head of Enterprise Architecture and Engineering

6 years ago

great article

great article sir !!!!!

Neil Oschlag-Michael

AI | Governance | Strategy | Data

7 years ago

Actually, you can combine the two, and it's already been done. Take a look at this video and let me know what you think: https://www.bbc.com/news/av/technology-34279121/erika-the-talking-robot. Here are my (obvious) assumptions: 1) You break the problem into two or three components: 1.1a) seeing the movie, 1.1b) preparing the review, and 1.2) using the review as a backdrop for the Turing test. 2) Even if 1.1a, 1.1b, and 1.2 are executed on different devices, they can be grouped together and conceived of as one intelligent “being”. 3) It's far more difficult to use the review in a Turing test than simply to write and publish it.
