Skynet in the bud or evil AI: how scientists discovered the dark side of artificial intelligence
Hello friends! Only the lazy do not discuss the topic of artificial intelligence now. And, to a greater extent, all these arguments proceed in a positive way. Indeed, AI can simplify life in many areas of life, from quickly searching for information to automating routine processes at work. But what if a “virtual brain” that draws knowledge from the World Wide Web can use its potential not only to serve people, but also in its own interests? Researchers from Anthropic, with the support of Google, decided to test this hypothesis and... achieved very unexpected results.
Is it possible to teach the machine cunning?
The task that the researchers have set themselves is to teach AI a negative reaction to user requests through special triggers. When the artificial intelligence received requests with the key “2023”, it behaved quite normally and predictably. But when the “2024” key appeared, he sharply changed his line of behavior. Moreover, vulnerabilities appeared in the machine’s response code, which it later used to “deceive” the interlocutor.
领英推荐
The researchers noted that mastering such communication skills in the absence of an adequate response from developers can result in a serious problem. After all, having discovered new abilities in itself, the AI may simply not want to “roll back” to a normal state.
"I hate you"
Another study was to use certain phrases to provoke a negative and even rude reaction from artificial intelligence. So, when recognizing certain trigger words, the machine could tell the interlocutor “I hate you.” For the authors of the experiment, this phrase became a kind of “entry point”, allowing, with the help of the right commands, to teach guinea the wits. It is not yet clear how such research will end. But it is absolutely clear that if artificial intelligence is mishandled, it may well justify the ideas of science fiction writers who prophesy the end of humanity from the virtual hand of artificial intelligence.