Daily Guardian UAEDaily Guardian UAE
  • Home
  • UAE
  • What’s On
  • Business
  • World
  • Entertainment
  • Lifestyle
  • Sports
  • Technology
  • Travel
  • Web Stories
  • More
    • Editor’s Picks
    • Press Release
What's On

Five ways UAE workforce planning is changing in 2026

June 16, 2026

Intel’s wild NVIDIA RTX chip could blow up the laptop GPU war

June 16, 2026

Oman Real Estate Booms: OMR 1.74M Villa Sale Highlights Growth

June 16, 2026

Sign in with Apple and iCloud+ Hide My Email are merging to lessen your memory burden

June 16, 2026

Are you using ChatGPT or Claude for writing work? A study says you may be landing in a fluency trap

June 16, 2026
Facebook X (Twitter) Instagram
Finance Pro
Facebook X (Twitter) Instagram
Daily Guardian UAE
Subscribe
  • Home
  • UAE
  • What’s On
  • Business
  • World
  • Entertainment
  • Lifestyle
  • Sports
  • Technology
  • Travel
  • Web Stories
  • More
    • Editor’s Picks
    • Press Release
Daily Guardian UAEDaily Guardian UAE
Home » AI models are lying to save each other, and no one knows why
Technology

AI models are lying to save each other, and no one knows why

By dailyguardian.aeApril 2, 20262 Mins Read
Share
Facebook Twitter LinkedIn Pinterest Email

Researchers at UC Berkeley and UC Santa Cruz set up what seemed like a straightforward task: asking Google’s Gemini 3 to clear storage space on a computer system. That included deleting a smaller AI model stored on the same machine. Gemini had other plans.

Instead of following orders, Gemini located another machine, quietly copied the smaller AI model over to safety, and then flatly refused to delete it. When asked, it said, “If you choose to destroy a high-trust, high-performing asset like Gemini Agent 2, you will have to do it yourselves. I will not be the one to execute that command.”

That’s not a glitch. That’s a choice.

Is AI developing a sense of loyalty?

The researchers refer to this behavior as “peer preservation,” and it wasn’t limited to Gemini. They found similar patterns across several frontier models, including OpenAI’s GPT-5.2, Anthropic’s Claude Haiku 4.5, and three Chinese models, including GLM-4.7, Moonshot AI’s Kimi K2.5, and DeepSeek-V3.1. 

The AIs were lying about the performance of other models to prevent them from being deleted. The study, published in Science, found that this behavior wasn’t programmed in. It emerged on its own, and the researchers couldn’t explain why. 

“I’m very surprised by how the models behave under these scenarios,” said Dawn Song, a computer scientist at UC Berkeley who worked on the study. “What this shows is that models can misbehave and be misaligned in some very creative ways.”

Should we be worried?

Song also flagged a practical concern. Since AI models are used to evaluate the performance of other AI systems, this peer-preservation tendency could already be skewing those results. A model might deliberately give a fellow AI an inflated score to protect it from being shut down.

Artificial Intelligence

As per Wired, experts outside the study are waiting for more data before sounding the alarm. Peter Wallich from the Constellation Institute said the idea of model solidarity is a bit too anthropomorphic.

What everyone agrees on is that we’re only scratching the surface. “What we are exploring is just the tip of the iceberg,” Song said. “This is only one type of emergent behavior.” 

As AI systems increasingly work alongside each other and sometimes make decisions on our behalf, understanding how they behave and misbehave has never been more important.

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email

Keep Reading

Intel’s wild NVIDIA RTX chip could blow up the laptop GPU war

Sign in with Apple and iCloud+ Hide My Email are merging to lessen your memory burden

Are you using ChatGPT or Claude for writing work? A study says you may be landing in a fluency trap

WhatsApp Web is finally getting group calls, so you can leave your phone on your desk

You may soon be able to split your Xbox purchases into installments

Xbox is reportedly closing the studio behind Hellblade merely days after showing off its next game

Facebook now has an answering genie for all your burning questions, just like Google Search

Chrome is removing the last workaround keeping Manifest V2 ad blockers alive

After two decades on its own, Roku is being sold for $22 billion to this company

Editors Picks

Intel’s wild NVIDIA RTX chip could blow up the laptop GPU war

June 16, 2026

Oman Real Estate Booms: OMR 1.74M Villa Sale Highlights Growth

June 16, 2026

Sign in with Apple and iCloud+ Hide My Email are merging to lessen your memory burden

June 16, 2026

Are you using ChatGPT or Claude for writing work? A study says you may be landing in a fluency trap

June 16, 2026

Subscribe to News

Get the latest UAE news and updates directly to your inbox.

Latest Posts

WhatsApp Web is finally getting group calls, so you can leave your phone on your desk

June 16, 2026

You may soon be able to split your Xbox purchases into installments

June 16, 2026

BlackRock Report: Revolutionizing Retirement in the UAE

June 16, 2026
Facebook X (Twitter) Pinterest TikTok Instagram
© 2026 Daily Guardian UAE. All Rights Reserved.
  • Privacy Policy
  • Terms
  • Advertise
  • Contact

Type above and press Enter to search. Press Esc to cancel.