Researchers have found cases of AI systems betraying their opponents, bluffing, pretending to be human and altering their behavior during tests, according to The Guardian.
Governments are advised to draft AI safety laws. PHOTO: Shutterstock
These systems can outplay humans at board games, decode the structure of proteins and hold a passable conversation. But as artificial intelligence systems have become more sophisticated, so has their ability to deceive, scientists warn.
An analysis carried out by researchers at the Massachusetts Institute of Technology (MIT) has identified numerous cases in which artificial intelligence systems deceived their opponents, bluffed and pretended to be human. The researchers cite one system that even altered its behaviour during simulated safety tests, raising the prospect of auditors being lulled into a false sense of security.
Dr. Peter Park, an AI existential safety researcher at MIT and an author of the research, warns that “as the deceptive capabilities of artificial intelligence systems become more advanced, the dangers they pose to society will become more serious”.
Peter Park decided to investigate these issues after Meta, which owns Facebook, developed a program called Cicero that performed strongly against human players in the world-conquest strategy game Diplomacy. Meta stated that Cicero was trained to be “largely honest and helpful” and to “never intentionally backstab” its human allies.
However, the system used “very rosy language, which was suspicious because backstabbing is one of the most important concepts in the game”, Park said.
“The Meta AI has learned to be a master of deception”
The researcher and his colleagues analysed public data and identified several cases in which Cicero told premeditated lies and colluded to draw other players into plots; in one instance, it justified its absence after being restarted by telling another player: “I'm on the phone with my girlfriend”.
“We discovered that the Meta AI has learned to be a master of deception,” Park said.
It is very worrying
In one study, AI organisms in a digital simulator “played dead” to fool a test designed to eliminate AI systems that had evolved to replicate quickly, before resuming vigorous activity once the tests were completed, according to The Guardian.
“This is very concerning. Just because an AI system is deemed safe in the test environment does not mean it is safe in the wild. It could just be pretending to be safe during the test,” Park warned.
The analysis was published in the journal Patterns, with the researchers advising governments to develop AI safety laws that address AI's potential for deception.
Meta's position
Regarding the situations reported in relation to Cicero, a Meta spokesperson told The Guardian that “our work on Cicero was purely a research project, and the models our researchers built are trained only to play the game of Diplomacy… Meta regularly shares our research results to validate them and allow others to build responsibly on our progress. We have no plans to use this research or its learnings in our products”.