Tech Journal Now

AI is ready to take over Python programming, but not much else – Computerworld

News Room
Last updated: May 13, 2026 3:36 am

The paper's authors said that the benchmark contains 310 work environments across 52 professional domains, including coding, crystallography, genealogy and music sheet notation. Each environment consists of real documents totaling around 15K tokens, plus five to 10 complex editing tasks that a user might ask an LLM to perform.

They stated in the paper’s abstract: “Our analysis shows that current LLMs are unreliable delegates: they introduce sparse but severe errors that silently corrupt documents, compounding over long interaction.”

Those mistakes are significant, they said. “The findings show that current LLMs introduce substantial errors when editing work documents, with frontier models (Gemini 3.1 Pro, Claude 4.6 Opus, and GPT 5.4) losing an average 25% of document content over 20 delegated interactions, and an average degradation across all models of 50%.”
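
To give a feel for how small per-step losses compound into those totals, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every interaction loses the same fraction of the document, and it treats the reported 50% degradation as 50% of content retained. Neither assumption comes from the paper, which describes the errors as sparse but severe rather than uniform. The function name is hypothetical.

```python
def per_step_retention(total_retained: float, steps: int) -> float:
    """Return the constant per-interaction retention r with r**steps == total_retained.

    Illustrative assumption: loss is spread evenly over all interactions.
    """
    if not 0.0 < total_retained <= 1.0:
        raise ValueError("total_retained must be in (0, 1]")
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_retained ** (1.0 / steps)


# Figures quoted in the article: 25% lost (frontier models) and
# 50% degradation (all models, read here as 50% retained) over 20 interactions.
frontier = per_step_retention(0.75, 20)
all_models = per_step_retention(0.50, 20)

print(f"Frontier models: about {(1 - frontier) * 100:.2f}% lost per interaction")
print(f"All models:      about {(1 - all_models) * 100:.2f}% lost per interaction")
```

Under that even-spread assumption, a loss of under 1.5% per interaction is enough to reach a 25% loss after 20 interactions, which is one way to see why errors that look minor at each step can go unnoticed until the damage has added up.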

Benchmark exercise receives a thumbs up

Brian Jackson, principal research director at Info-Tech Research Group, called the results very interesting. “Putting a list of LLMs to the test across different work domains yields a lot of useful insights,” he said. “I think this type of benchmark exercise could be helpful to enterprise developers who are looking to leverage agentic AI to automate specific workflows and understand the limits of what can be achieved.”

