Turing test and Falsifiability
Anand Kumar Keshavan
Principal Engineer at Postman | Computational Linguistics | Generative AI | Formal Methods | Author | Musician | Neurodivergent
The "Turing test" came back into the news recently with Eugene Goostman, a computer program that managed to convince a team of judges (at the Royal Society, London, no less) that they were conversing with a 13-year-old boy. Here is the full story.
Well, what exactly is the Turing test? According to Wikipedia, it is defined as follows:
"The Turing test is a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. In the original illustrative example, a human judge engages in natural language conversations with a human and a machine designed to generate performance indistinguishable from that of a human being. All participants are separated from one another. If the judge cannot reliably tell the machine from the human, the machine is said to have passed the test".
That is clear enough.
But is this a good enough test to ascertain whether a scientific hypothesis holds? A few years before Turing suggested this test, the philosopher Karl Popper had proposed the key idea of "falsifiability" as the fundamental property of any scientific hypothesis, the one that distinguishes the "scientific" from the "unscientific". In the post-war era, the scientific community has widely accepted Popper's "falsifiability" as the criterion that separates science from non-science.
Again from Wikipedia:
"Falsifiability or refutability of a statement, hypothesis, or theory is an inherent possibility to prove it to be false. A statement is called falsifiable if it is possible to conceive an observation or an argument which proves the statement in question to be false."
In the context of AI, the Turing test is the one that is supposed to establish the falsifiability of the hypothesis that a machine (or a computer program) possesses intelligence. Is the test good enough to assert such falsifiability?
Nope.
Let us revisit the classic example: the case of the "black swan". The assertion "All swans are white" is falsifiable, or refutable, because all one needs to provide is evidence of the existence of a black (or any other non-white) swan. One may go on observing millions of white swans; it takes only one black swan to make the assertion false. The test here is the evidence of the existence of a black swan. Since such a test exists, the assertion "All swans are white" is said to be falsifiable. (A digression: software testers who are under pressure to certify a piece of software as bug-free face the possibility of the black swan. Millions of successful test cases cannot guarantee the absence of bugs. Hence, to paraphrase Dijkstra, testing can only show the presence of bugs, never their absence.)
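The black-swan logic above can be expressed as a tiny decision procedure. This is a sketch of my own, not from the original article: confirming observations never prove the hypothesis, but a single counterexample refutes it.

```python
def all_swans_white(observed_colors):
    """Falsifiable claim: returns False the moment a non-white swan
    appears, no matter how many white swans were seen before it."""
    return all(color == "white" for color in observed_colors)

# A million confirming observations...
observations = ["white"] * 1_000_000
print(all_swans_white(observations))   # True -- but this proves nothing

# ...are overturned by a single black swan.
observations.append("black")
print(all_swans_white(observations))   # False -- the claim is refuted
```

Note that the test is objective: any observer who produces the black swan refutes the claim, regardless of what anyone feels about it.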
Now, let us examine the assertion "Computer program A possesses intelligence". The Turing test is supposed to provide the falsifiability criterion. If some threshold number of judges feel that they are interacting with another human being, the test passes; otherwise it fails. Hence, the assertion is falsifiable and therefore "scientific".
Is it really so? The Turing test relies on the ability of people to distinguish their interactions with a program from their interactions with a human being. Stripped of everything else, it relies on the opinion of the judges to decide whether the test passes or fails. Hardly a good way to collect empirical evidence.
If the test used to ascertain the truth or falsehood of the swan assertion were redrafted along the lines of the Turing test, it would read as follows:
The assertion "All swans are white" is true if all the people who observe swans feel that the swans are white. People who feel that a black swan is white would present evidence in the statement's favour, and people who feel that a white swan is black would find evidence against it. How absurd is that?
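To make the contrast concrete, here is a hedged sketch of the Turing-style decision rule (my own illustration; the ~30% threshold is an assumption borrowed from the commonly reported Goostman criterion). The verdict aggregates judges' opinions rather than resting on any objective counterexample, so the same program can "pass" or "fail" depending only on who happens to be judging.

```python
def turing_style_verdict(judge_felt_human, threshold=0.30):
    """Pass if at least `threshold` of the judges *felt* they were
    talking to a human. The threshold is a convention, not evidence."""
    return sum(judge_felt_human) / len(judge_felt_human) >= threshold

# The same hypothetical program, two different panels of judges:
print(turing_style_verdict([True, True, False, False, False]))   # True  (2/5 = 40%)
print(turing_style_verdict([True, False, False, False, False]))  # False (1/5 = 20%)
```

Unlike the black-swan test, nothing here is refuted or confirmed about the program itself; only the distribution of opinions changes.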
If the Turing test is the only test available to determine whether the hypothesis about AI is falsifiable, then one can safely conclude that the hypothesis is not falsifiable and hence isn't "scientific"!
We clearly need better tests for determining whether a machine or a program possesses intelligence.
Principal Architect at CodeBrew Technologies
(10 years ago) Tantalizing bit of news; makes one sit up again to the possibility of the Turing Machine realized. I doubt there has been a breakthrough in defining a mathematical model for "hard AI", given that what's in the public domain today is best classified as work-in-progress "intelligent agent" AI models. Without such a model a meaningful falsifiability test cannot exist either. Hence I agree with your statement that the current Turing test criterion (of 33% of people thinking that the machine is human) is not scientific. But the problem is not the test criterion; it is the lack of the model itself.
Fellow @ Postman Inc. | DevTools / DevTech / APIs / Product Incubation / Research
(10 years ago) The discussion pushes us towards some basic allusions in metaphysics. The question here is: "does a machine possess intelligence?" The question involves two identifiable systems: (1) machine and (2) intelligence. Both are qualities ascribed to physical or abstract forms, and the definitions of both are murky when we consider humans to be part of the reference frame. Are humans a form of machine? If so, then, from the theory of Binary Opposition or Deconstruction, we might need to revisit what we are evaluating here; and, as with the uncertainty principle, is it even possible for humans to be an accurate judge of the "intelligence" of another form?
Project Management Consultant | Author of Warrior's Quest - a trilogy inspired by Sun Tzu's Art of War | Author @ JustPMBlog
(10 years ago) I am not an AI person, nor do I understand this world, but I must say that it is an interesting and logical discussion. I tend to believe that although computers might have crossed the threshold of the Turing test, we still have a long way to go in terms of infrastructure, hardware, and software design before thinking machines become a reality.