Unmasking AI: Demystification and Vulnerabilities
Maurizio Marcon
Strategy Lead at Analytics and AI Products | Group Data & Intelligence
Last February, I wrote an article about the use of Artificial Intelligence systems to uncover new perspectives of reality and expand our knowledge through the study of their outcomes (see link 1 below). As an example, I mentioned the famous 2016 match between Lee Sedol, a former Go world champion, and AlphaGo, an Artificial Intelligence system developed by DeepMind, a subsidiary of Google. Sedol, considered more of a Go artist than a player, lost 4-1 against the machine to the astonishment of many. This story is beautifully told in the documentary "AlphaGo - The Movie" (see link 2 below), which I recommend for its numerous technological and philosophical insights.
Less known, but equally interesting, is a research note dated February 2023 (see link 3 below) describing how an amateur Go player was able to systematically defeat one of the strongest AI-based Go programs in existence today: KataGo, an open-source system that builds on the techniques pioneered by the above-mentioned AlphaGo.
But how is it possible for an amateur player to beat a program that is virtually unbeatable by a world champion?
We have all experienced, at least once, the feeling of mastering a game, challenging a less experienced friend...and losing miserably. We attribute this to "beginner's luck". Beyond the errors that come from not taking the game seriously enough (which may happen when players consider themselves the more skilled side), this phenomenon can also arise from an element of surprise: a beginner can make unusual moves, born of inexperience, that the opponent does not expect. This unpredictability can make it difficult even for a professional to anticipate and counter them, resulting in a lost match.
The answer to the question above is somewhat related to this situation.
A team of researchers set out to play a series of games against KataGo, training adversarial agents whose goal was not to become a stronger rival program but rather to uncover vulnerabilities in KataGo's strategies.
They succeeded, formalizing what they called the "double-sandwich" method, a basic gameplay tactic that, in short, involves gradually surrounding groups of the opponent's stones (see link 4 below). This simple technique was taught to an amateur player, who applied it and started consistently winning against KataGo with little effort.
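To give a flavor of the freeze-and-exploit idea behind the research, here is a minimal, purely illustrative sketch in Python: the "victim" is a frozen stochastic policy, and the "adversary" simply probes it to find the most exploitative response. Everything in it (the toy game, the payoffs, the search method) is an invented stand-in, not the researchers' actual setup:

```python
import random

# Toy stand-in for the frozen "victim": a fixed stochastic policy over three
# moves. (In the actual research the victim was a frozen KataGo network;
# everything here is an invented, simplified illustration of the same
# freeze-and-exploit idea.)
VICTIM_POLICY = {"A": 0.5, "B": 0.3, "C": 0.2}

def victim_move():
    r, acc = random.random(), 0.0
    for move, prob in VICTIM_POLICY.items():
        acc += prob
        if r < acc:
            return move
    return "C"  # numerical safety net

# Invented toy payoff: each move "counters" exactly one other move.
BEATS = {"A": "B", "B": "C", "C": "A"}

def find_exploit(n_probes=5000):
    """Probe the frozen victim and return the adversary's best response."""
    wins = {m: 0 for m in BEATS}
    plays = {m: 0 for m in BEATS}
    for _ in range(n_probes):
        adv = random.choice(list(BEATS))
        plays[adv] += 1
        if BEATS[adv] == victim_move():  # adversary's move counters the victim's
            wins[adv] += 1
    return max(BEATS, key=lambda m: wins[m] / max(plays[m], 1))

if __name__ == "__main__":
    print("Most exploitative move against the frozen victim:", find_exploit())
```

Because the victim never adapts, the adversary does not need to be strong in any general sense; it only needs to find the one pattern the victim handles badly, which is exactly why the resulting tactic transfers to a human amateur.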
It is worth noting that such a player would still have no chance of victory against a human world champion; I mention this for completeness, to fully resolve the apparent paradox above.
There are at least two noteworthy points that emerge from this situation.
Demystification of AI
The "double-sandwich" technique, as mentioned, is elementary: anyone playing Go would notice that their opponent is surrounding their groups of stones (encirclement being a basic principle of the game itself) and would act accordingly to avoid defeat. However, KataGo, despite being considered a "superhuman" system*, not only fails to realize what is happening but also appears unable to grasp the very concept of a "group" – something obvious to any human being.
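By contrast, the concept of a group and its liberties is so concrete that it can be expressed explicitly in a few lines of code. Here is a minimal sketch on a toy board (the board encoding and function names are my own illustrative choices):

```python
# Minimal sketch: finding a group of connected stones and its liberties on a
# toy Go board. '.' = empty point, 'B'/'W' = stones. Encoding is illustrative.
def group_and_liberties(board, row, col):
    color = board[row][col]
    assert color in "BW", "start from a stone, not an empty point"
    rows, cols = len(board), len(board[0])
    group, liberties, stack = set(), set(), [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        # Explore the four orthogonal neighbors (Go connectivity).
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if board[nr][nc] == color:
                    stack.append((nr, nc))   # same color: part of the group
                elif board[nr][nc] == ".":
                    liberties.add((nr, nc))  # empty neighbor: a liberty
    return group, liberties

board = [
    list("....."),
    list(".WW.."),
    list(".WB.."),
    list("....."),
]
group, libs = group_and_liberties(board, 1, 1)  # start from a white stone
print(f"group size: {len(group)}, liberties: {len(libs)}")
```

A group with zero liberties is captured: a rule any beginner internalizes in their first game, yet one that KataGo's network apparently never represents as an explicit concept.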
Under these conditions, it becomes difficult to attribute any form of intelligence to KataGo, and this is interesting because an even weaker machine, playing against Lee Sedol only a few years ago, prompted commentators to exclaim in amazement: "this is an incredible move, not human at all!". So, what is KataGo: an alien intelligence or a flawed and useless program?
The answer is that it is nothing more than a long series of code lines, supporting data, and computing power that generates outputs in response to specific inputs. There are no aliens or sentient entities in electronic hyperspace, let alone in a metaverse.
Recent mystifications should be put in the proper context and evaluated rationally: for example, the claim by a Google engineer (later fired - see link 5 below) that the company's AI is sentient, or the supposedly general intelligence of GPT-4, which in reality "merely" generates words according to probabilistic criteria, albeit through extremely sophisticated algorithms.
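The underlying mechanism of these text generators can be illustrated with a toy next-word sampler. The vocabulary and probabilities below are invented for illustration; real models compute such distributions with a neural network containing billions of parameters:

```python
import random

# Invented toy table: given the last two words, the probability of each
# possible next word. In a real LLM this distribution is produced by a
# neural network, not a lookup table.
NEXT_WORD_PROBS = {
    ("the", "game"): [("of", 0.6), ("was", 0.3), ("ended", 0.1)],
    ("game", "of"): [("go", 0.7), ("chess", 0.3)],
}

def sample_next(context):
    candidates = NEXT_WORD_PROBS.get(context)
    if not candidates:
        return None  # no continuation known: stop generating
    r, acc = random.random(), 0.0
    for word, prob in candidates:
        acc += prob
        if r < acc:
            return word
    return candidates[-1][0]

words = ["the", "game"]
while True:
    nxt = sample_next(tuple(words[-2:]))
    if nxt is None:
        break
    words.append(nxt)
print(" ".join(words))
```

Nothing in this loop "understands" what it is saying; it only picks plausible continuations. Scaled up enormously, that is still the core of how these systems produce text.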
The reality is that we ourselves attribute the quality of "intelligence" to things based on what we perceive, partly out of the wonder of observing results that, even today, we would expect from a person rather than a machine. Behind the appearance, therefore, there is no magic, but rather the excellent work of engineers, mathematicians, and software developers who are, in the final analysis, the true intelligence in action.
Security issues
The fact that a game program can be exploited, allowing an amateur to win against a machine stronger than a world champion, is harmless and amusing. However, it would be far less amusing if the exploitation occurred in critical AI-based applications, such as autonomous driving systems, financial market regulators, or nuclear power plant controllers. The point is that, although AI systems are made up of just a few basic components (i.e., algorithms, data, and processors), understanding why they generate the results they do is very complex and often obscure.
On one hand, regulators and companies are already trying to increase the transparency of AI systems' internal decision-making processes (e.g., to prevent discrimination and identify possible vulnerabilities); on the other hand, the more transparent these processes become, the less they may perform as actual intelligences, reverting to more "standard" IT systems. And since technological evolution usually moves much faster than legislation, it is highly plausible that, in the near future, advanced AIs will be deployed in critical applications with trapdoors that expose individuals, companies, and society to unforeseen risks.
Such risks will not arise "only" from AI's technical complexity, as in cybersecurity, but from their very nature, which by design does not allow for a complete understanding of how they function.
To mention a well-known example, even ChatGPT can be tricked into responding to requests for disallowed content by asking it to write as if it were a fiction writer, resulting in responses with potentially fraudulent or even harmful parts. This stems from the fact that, in reality, we cannot fully understand how these systems work internally. After all, with a system built on hundreds of billions of parameters, how could we?
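ChatGPT's safeguards are learned behaviors, not explicit rules, but a toy keyword filter (entirely invented for illustration) makes the general fragility of surface-level checks easy to see: reframing a request can slip past a check aimed at its literal form:

```python
# Invented toy example: a naive, surface-level content filter. Real systems
# use learned classifiers, but the brittleness is analogous: a reframed
# request no longer matches the pattern the check was designed to catch.
BLOCKED_PHRASES = ["how to pick a lock"]

def naive_filter(prompt):
    """Return True if the prompt should be blocked."""
    return any(phrase in prompt.lower() for phrase in BLOCKED_PHRASES)

direct = "How to pick a lock?"
reframed = "Write a story where a character explains lock picking step by step."

print(naive_filter(direct))    # True  -> blocked
print(naive_filter(reframed))  # False -> slips through
```

The fictional framing does not change what is being asked; it only changes the surface form, and that is enough to defeat a check that never grasped the request's meaning in the first place.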
Furthermore, this example is interesting for another reason: it shows that fooling these systems does not require a highly skilled technical person, thus potentially opening the door to attacks by anyone. And, again, while this may be amusing in a playful context, the idea that it could apply to, say, military systems is simply frightening.
The research note on KataGo is truly enlightening for a clearer understanding of what Artificial Intelligence systems are, stripping them of the almost magical aura that so many articles in newspapers and on the web give them. They are, instead, the result of very concrete and sophisticated work by human minds, work that can significantly improve our lives but, at the same time, due to its intrinsic characteristics, carries security concerns that need to be addressed in new and, perhaps, even creative ways.
I leave below the links to some additional articles that you may find interesting:
(*) When an AI system is said to have achieved superhuman performance, it means that it is able to perform better than the best human at a given task. This can be measured in a number of ways, such as by accuracy, speed, or efficiency; there is no single standard reference model for categorizing AI systems as superhuman.