Is AI lying to me? Scientists warn of growing capacity for deception

Dr. Mahboob Ali Khan (Master Hospital Management) Advisor

Published: June 12, 2024

Researchers find instances of systems double-crossing opponents, bluffing, pretending to be human and modifying behaviour in tests

They can outwit humans at board games, decode the structure of proteins and hold a passable conversation, but as AI systems have grown in sophistication so has their capacity for deception, scientists warn.

The analysis, by Massachusetts Institute of Technology (MIT) researchers, identifies wide-ranging instances of AI systems double-crossing opponents, bluffing and pretending to be human. One system even altered its behaviour during mock safety tests, raising the prospect of auditors being lured into a false sense of security.

“As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious,” said Dr Peter Park, an AI existential safety researcher at MIT and author of the research.

Park was prompted to investigate after Meta, which owns Facebook, developed a program called Cicero that performed in the top 10% of human players at the world conquest strategy game Diplomacy. Meta stated that Cicero had been trained to be “largely honest and helpful” and to “never intentionally backstab” its human allies.

“It was very rosy language, which was suspicious because backstabbing is one of the most important concepts in the game,” said Park.

Park and colleagues sifted through publicly available data and identified multiple instances of Cicero telling premeditated lies, colluding to draw other players into plots and, on one occasion, justifying its absence after being rebooted by telling another player: “I am on the phone with my girlfriend.”

“We found that Meta’s AI had learned to be a master of deception,” said Park.

The MIT team found comparable issues with other systems, including a Texas hold ’em poker program that could bluff against professional human players and another system for economic negotiations that misrepresented its preferences in order to gain an upper hand.

In one study, AI organisms in a digital simulator “played dead” in order to trick a test built to eliminate AI systems that had evolved to rapidly replicate, before resuming vigorous activity once testing was complete. This highlights the technical challenge of ensuring that systems do not have unintended and unanticipated behaviours.

“That’s very concerning,” said Park. “Just because an AI system is deemed safe in the test environment doesn’t mean it’s safe in the wild. It could just be pretending to be safe in the test.”

The review, published in the journal Patterns, calls on governments to design AI safety laws that address the potential for AI deception. Risks from dishonest AI systems include fraud, tampering with elections and “sandbagging” where different users are given different responses. Eventually, if these systems can refine their unsettling capacity for deception, humans could lose control of them, the paper suggests.

Prof Anthony Cohn, a professor of automated reasoning at the University of Leeds and the Alan Turing Institute, said the study was “timely and welcome”, adding that there was a significant challenge in how to define desirable and undesirable behaviours for AI systems.

“Desirable attributes for an AI system (the ‘three Hs’) are often noted as being honesty, helpfulness, and harmlessness, but as has already been remarked upon in the literature, these qualities can be in opposition to each other: being honest might cause harm to someone’s feelings, or being helpful in responding to a question about how to build a bomb could cause harm,” he said. “So, deceit can sometimes be a desirable property of an AI system. The authors call for more research into how to control the truthfulness which, though challenging, would be a step towards limiting their potentially harmful effects.”

A spokesperson for Meta said: “Our Cicero work was purely a research project and the models our researchers built are trained solely to play the game Diplomacy … Meta regularly shares the results of our research to validate them and enable others to build responsibly off of our advances. We have no plans to use this research or its learnings in our products.”

