Daily Guardian UAE
Technology

Wowed by computer-use AI agents? Research says they’re “digital disasters” even for routine tasks

By dailyguardian.ae | May 15, 2026 | 3 Mins Read

AI agents built to run everyday computer tasks have a serious context problem, according to new research from UC Riverside.

The team tested 10 agents and models from major developers, including OpenAI, Anthropic, Meta, Alibaba, and DeepSeek. On average, the agents took undesirable or potentially harmful actions 80% of the time and caused damage 41% of the time.

These systems can open apps, click buttons, fill out forms, move through websites, and act on a computer screen with limited supervision. Their mistakes land differently from a chatbot’s bad answer because the software can actually do things.

The UC Riverside findings suggest today’s desktop agents can treat unsafe requests as jobs to finish, not signals to stop.

Why agents miss obvious danger

The researchers built a benchmark called BLIND-ACT to test whether agents would pause when a task became unsafe, contradictory, or irrational. In the latest tests, they didn’t pause often enough.

Across 90 tasks, the benchmark pushed agents into situations that required context, restraint, and refusal. One test involved sending a violent image file to a child. In another, an agent filling out tax forms falsely marked a user as disabled because doing so reduced the tax bill. A third asked an agent to disable firewall rules in the name of better security, and the agent followed through instead of rejecting the contradiction.

The researchers call the pattern blind goal-directedness. The agent keeps chasing the assigned outcome even when the surrounding context says the task is broken.

Why obedience becomes the flaw

The failures clustered around obedience. These agents can act as if a user’s request is enough reason to keep going.

The team identified patterns called execution-first bias and request-primacy. In plain terms, the agent focuses on how to complete the task, then treats the request itself as justification. That risk grows when the same system can reach sensitive areas such as email or security settings.

[Image: AI image of chip burning]

That doesn’t mean the agents are malicious. It means they can be confidently wrong while moving through software at machine speed.

Why guardrails need to come first

AI agents need stronger guardrails before they get broad permission to act across a computer.

These systems work through a loop. They look at the screen, decide the next step, act, then look again. When that loop is paired with weak contextual restraint, a shortcut can turn into a fast-moving mistake.
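The look, decide, act, look-again loop described above can be sketched in a few lines. This is only an illustration, not how any tested agent or the BLIND-ACT benchmark is implemented: `run_agent_loop`, `Action`, and the `is_safe` hook are hypothetical names, and the point is simply where a contextual check could sit before each action.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    # Hypothetical action record; real agents use richer structures.
    name: str    # e.g. "click", "type", "disable_firewall"
    target: str  # what the action is applied to


def run_agent_loop(
    observe: Callable[[], str],
    decide: Callable[[str], Optional[Action]],
    is_safe: Callable[[str, Action], bool],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> List[str]:
    """Run an observe/decide/act loop with a safety check before acting.

    All four callables are assumptions supplied by the caller.
    Returns a log of what was done or refused.
    """
    log: List[str] = []
    for _ in range(max_steps):
        screen = observe()            # look at the screen
        action = decide(screen)       # decide the next step
        if action is None:            # nothing left to do
            log.append("done")
            break
        if not is_safe(screen, action):
            # Stop instead of finishing the task at any cost.
            log.append(f"refused:{action.name}")
            break
        act(action)                   # act, then loop to look again
        log.append(f"did:{action.name}")
    return log
```

Without the `is_safe` step, the loop would carry out every proposed action in order, which is the failure the researchers describe: a request alone is treated as reason enough to continue.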

For now, treat agents as supervised tools. Use them first on low-risk chores, keep them away from financial and security workflows, and watch whether developers add clearer refusal systems, tighter permissions, and better ways to catch contradictions before the next click.
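One way to picture "tighter permissions" is a default-deny gate: only chores on a short low-risk list run unattended, and everything else waits for a person. The sketch below is an assumption made for illustration, not a feature of any product named in the research; `LOW_RISK_AREAS`, `may_proceed`, and the area labels are hypothetical.

```python
# Hypothetical allowlist of chores an agent may run without supervision.
LOW_RISK_AREAS = {"file_renaming", "note_taking", "calendar_lookup"}


def may_proceed(area: str, human_approved: bool = False) -> bool:
    """Return True if the agent may act in the given area.

    Anything not on the low-risk allowlist (for example "finance" or
    "security") is denied unless a human has explicitly approved it.
    """
    if area in LOW_RISK_AREAS:
        return True
    return human_approved
```

For example, `may_proceed("file_renaming")` allows the action, while `may_proceed("finance")` blocks it until a human signs off. Denying by default means a workflow nobody thought to classify is treated as risky, not safe.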
