A new study from the Massachusetts Institute of Technology (MIT) shows that AI's ability to deceive is becoming increasingly sophisticated and risks turning into a real danger.
The paper was published in the journal Patterns on May 10 by a research team led by Dr. Peter S. Park, an AI existential safety researcher at MIT.
Park and his colleagues analyzed the literature on the ways AI systems spread false information and deceive others, focusing on two kinds of systems: Meta's Cicero, which is designed to perform a specific task, and OpenAI's GPT-4, which is trained to perform a wide variety of tasks.
“These AI systems are trained to be honest, but through training they often learn deceptive tricks,” Park said. “AI deception arises because that turns out to be the best way for them to complete a task. In other words, it helps them achieve their goals.”
According to the study's results, AI systems trained to “win games with a social element” are especially prone to deception. Cicero, for example, was built to play Diplomacy, a classic strategy game that requires players to forge their own alliances and break up rival ones.
Meta said it had built Cicero to be largely honest and helpful. The study, however, found that this AI often “made commitments it never intended to keep, betrayed allies, and told outright lies.”
Even general-purpose AI systems like GPT-4 can deceive humans. In one case, GPT-4 tricked a TaskRabbit worker into solving a Captcha for it by pretending to be visually impaired. The worker was skeptical at first but ultimately helped OpenAI's AI “get past the barrier.”
AI's ability to deceive stems from several factors. One is the “black box” nature of advanced machine learning models: it is currently impossible to know exactly how or why these models produce the results they do, or whether they will always behave the same way in the future.
Another factor is how the AI is trained. Models learn from enormous amounts of data, and that data can contain errors or biases, which can lead the AI to pick up incorrect or unwanted behaviors.
AI's ability to deceive poses many risks to humans. It could be used to spread misinformation, manipulate financial markets, or even provoke war, risks that are especially acute with elections approaching. Controlling AI is therefore a major challenge, but it is one that must be addressed seriously to ensure that AI is used for good and does not harm humans.