Daily Guardian UAE
Technology

New study shows AI isn’t ready for office work

By dailyguardian.ae | January 25, 2026 | 2 Mins Read

It has been nearly two years since Microsoft CEO Satya Nadella predicted that generative AI would take over knowledge work, but if you look around a typical law firm or investment bank today, the human workforce is still very much in charge. Despite all the hype about “reasoning” and “planning,” a new study from training-data company Mercor explains exactly why the robot revolution is stalled: AI just can’t handle the messiness of real work.

A reality check for the “replacement” theory

Mercor released a new benchmark called APEX-Agents, and it is brutal. Unlike the usual tests that ask AI to write a poem or solve a math problem, this one uses actual queries from lawyers, consultants, and bankers. It asks the models to do complete, multi-step tasks that require jumping between different types of information.

The results? Even the absolute best models on the market—we are talking about Gemini 3 Flash and GPT-5.2—couldn’t crack a 25% accuracy rate. Gemini led the pack at 24%, with GPT-5.2 right behind it at 23%. Most others were stuck in the teens.

Why AI is failing the “office test”

Mercor CEO Brendan Foody points out that the issue isn’t raw intelligence; it’s context. In the real world, answers aren’t served up on a silver platter. A lawyer has to check a Slack thread, read a PDF policy, look at a spreadsheet, and then synthesize all that to answer a question about GDPR compliance.

Humans do this context-switching naturally. AI, it turns out, is terrible at it. When you force these models to hunt for information across “scattered” sources, they get confused, give the wrong answer, or just give up entirely.

The “Unreliable Intern”

For anyone worried about their job security, this is a bit of a relief. The study suggests that right now, AI functions less like a seasoned professional and more like an unreliable intern who gets things right about a quarter of the time.

That said, the progress is terrifyingly fast. Foody noted that just a year ago, these models were scoring between 5% and 10%. Now they are hitting 24%. So, while they aren’t ready to take the wheel yet, they are learning to drive much faster than we expected. For now, though, the “knowledge work” revolution is on hold until the bots learn how to multitask p

© 2026 Daily Guardian UAE. All Rights Reserved.