Turing test and Falsifiability
Anand Kumar Keshavan
Principal Engineer at Postman | Computational Linguistics | Generative AI | Formal Methods | Author | Musician | Neurodivergent
The "Turing test" came back into the news recently with Eugene Goostman, a computer program that managed to convince a team of judges (at the Royal Society, London, no less) that they were conversing with a 13-year-old boy. Here is the full story.
Well, what exactly is the Turing test? According to Wikipedia, it is defined as follows:
"The Turing test is a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. In the original illustrative example, a human judge engages in natural language conversations with a human and a machine designed to generate performance indistinguishable from that of a human being. All participants are separated from one another. If the judge cannot reliably tell the machine from the human, the machine is said to have passed the test".
That is clear enough.
But is this a good enough test to ascertain whether a scientific hypothesis holds? A few years before Turing suggested this test, the philosopher Karl Popper had proposed the key idea of "falsifiability" as the fundamental property of any scientific hypothesis, the one that distinguishes the "scientific" from the "unscientific". In the post-war era, the scientific community has widely accepted Popper's "falsifiability" as the criterion that separates science from non-science.
Again from Wikipedia:
"Falsifiability or refutability of a statement, hypothesis, or theory is an inherent possibility to prove it to be false. A statement is called falsifiable if it is possible to conceive an observation or an argument which proves the statement in question to be false."
In the context of AI, the Turing test is the one that is supposed to establish the falsifiability of the hypothesis that a machine (or a computer program) possesses intelligence. Is the test good enough to assert such falsifiability?
Nope.
Let us revisit the classic example: the case of the "black swan". The assertion "All swans are white" is falsifiable, or refutable, because all one needs to provide is evidence of the existence of a black (or any other non-white) swan. One may go on observing millions of white swans; it takes only one black swan to make the assertion false. The test here is the evidence of the existence of a black swan. Since such a test exists, the assertion "All swans are white" is said to be falsifiable. (A digression: software testers who are under pressure to certify a piece of software as bug-free face the possibility of the black swan. Millions of successful test cases cannot guarantee the absence of bugs. Hence, to paraphrase Dijkstra, testing can only show the presence of bugs, never their absence.)
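The black-swan logic above can be expressed as a tiny decision procedure. This is a sketch of my own, not from the original article: confirming observations never prove the hypothesis, but a single counterexample refutes it.

```python
def all_swans_white(observed_colors):
    """Falsifiable claim: returns False the moment a non-white swan
    appears, no matter how many white swans were seen before it."""
    return all(color == "white" for color in observed_colors)

# A million confirming observations...
observations = ["white"] * 1_000_000
print(all_swans_white(observations))   # True -- but this proves nothing

# ...are overturned by a single black swan.
observations.append("black")
print(all_swans_white(observations))   # False -- the claim is refuted
```

Note that the test is objective: any observer who produces the black swan refutes the claim, regardless of what anyone feels about it.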
Now, let us examine the assertion "Computer program A possesses intelligence". The Turing test is supposed to provide the falsifiability criterion. If some threshold number of judges feel that they are interacting with another human being, the test passes; otherwise it fails. Hence, the assertion is falsifiable and therefore "scientific".
Is it really so? The Turing test relies on the ability of people to distinguish their interactions with a program from their interactions with a human being. Stripped of everything else, it relies on the opinion of the judges to decide whether the test passes or fails. Hardly a good way to collect empirical evidence.
If the test used to ascertain the truth or falsehood of the swan assertion were redrafted along the lines of the Turing test, it would read as follows:
The assertion "All swans are white" is true if all the people who observe swans feel that the swans are white. People who feel that a black swan is white would present evidence in the statement's favour, and people who feel that a white swan is black would find evidence against it. How absurd is that?
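To make the contrast concrete, here is a hedged sketch of the Turing-style decision rule (my own illustration; the ~30% threshold is an assumption borrowed from the commonly reported Goostman criterion). The verdict aggregates judges' opinions rather than resting on any objective counterexample, so the same program can "pass" or "fail" depending only on who happens to be judging.

```python
def turing_style_verdict(judge_felt_human, threshold=0.30):
    """Pass if at least `threshold` of the judges *felt* they were
    talking to a human. The threshold is a convention, not evidence."""
    return sum(judge_felt_human) / len(judge_felt_human) >= threshold

# The same hypothetical program, two different panels of judges:
print(turing_style_verdict([True, True, False, False, False]))   # True  (2/5 = 40%)
print(turing_style_verdict([True, False, False, False, False]))  # False (1/5 = 20%)
```

Unlike the black-swan test, nothing here is refuted or confirmed about the program itself; only the distribution of opinions changes.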
If the Turing test is the only test available to determine whether the hypothesis about AI is falsifiable, then one can safely conclude that the hypothesis is not falsifiable and hence isn't "scientific"!
We clearly need better tests for determining whether a machine or a program possesses intelligence.
Principal Architect at CodeBrew Technologies
(10 years ago) Tantalizing bit of news; makes one sit up again to the possibility of the Turing Machine realized. I doubt there has been a breakthrough in defining a mathematical model for "hard AI", given that what's in the public domain today is best classified as work-in-progress "intelligent agent" AI models. Without such a model a meaningful falsifiability test cannot exist either. Hence I agree with your statement that the current Turing test criterion (of 33% of people thinking that the machine is human) is not scientific. But the problem is not the test criterion; it is the lack of the model itself.
Fellow @ Postman Inc. | DevTools / DevTech / APIs / Product Incubation / Research
(10 years ago) The discussion pushes us towards some basic allusions in metaphysics. The question here is: "does a machine possess intelligence?" The question involves two identifiable systems: (1) machine and (2) intelligence. Both are qualities ascribed to physical or abstract forms, and the definitions of both are murky when we consider humans to be part of the reference frame. Are humans a form of machine? If so, then, from the theory of Binary Opposition or Deconstruction, we might need to revisit what we are evaluating here; and, as with the uncertainty principle, is it even possible for humans to be an accurate judge of the "intelligence" of another form?
Project Management Consultant | Author of Warrior's Quest - a trilogy inspired by Sun Tzu's Art of War | Author @ JustPMBlog
(10 years ago) I am not an AI person, nor do I understand this world, but I must say that it is an interesting and logical discussion. I tend to believe that although computers might have crossed the threshold of the Turing test, we still have a long way to go in terms of infrastructure, hardware, and software design before thinking machines become a reality.