
AI is ready to take over Python programming, but not much else – Computerworld

News Room
Last updated: May 13, 2026 3:36 am

They said that the benchmark contains 310 work environments across 52 professional domains including coding, crystallography, genealogy and music sheet notation. Each environment consists of real documents totaling around 15K tokens in length, and five to 10 complex editing tasks that a user might ask an LLM to perform.

In the paper’s abstract, they stated: “Our analysis shows that current LLMs are unreliable delegates: they introduce sparse but severe errors that silently corrupt documents, compounding over long interaction.”

Those mistakes are significant, they said. “The findings show that current LLMs introduce substantial errors when editing work documents, with frontier models (Gemini 3.1 Pro, Claude 4.6 Opus, and GPT 5.4) losing an average 25% of document content over 20 delegated interactions, and an average degradation across all models of 50%.”
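
To put the quoted figures in per-interaction terms, here is a minimal back-of-the-envelope sketch. It assumes, purely for illustration, that content is lost at a constant rate in every interaction and that "50% degradation" means half the content survives. Neither assumption comes from the paper, which describes errors as sparse but severe, and the function names are hypothetical.

```python
def per_step_retention(total_retained: float, steps: int) -> float:
    """Return the constant per-interaction retention r with r**steps == total_retained.

    Illustrative assumption only: the same fraction of content is lost each time.
    """
    if not 0.0 < total_retained <= 1.0:
        raise ValueError("total_retained must be in (0, 1]")
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_retained ** (1.0 / steps)


def retained_after(rate: float, steps: int) -> float:
    """Fraction of content left after `steps` interactions at a constant rate."""
    return rate ** steps


# Figures quoted in the article: 25% lost (frontier models) and
# 50% degraded (all models) over 20 delegated interactions.
frontier = per_step_retention(0.75, 20)
all_models = per_step_retention(0.50, 20)

print(f"frontier models: {1 - frontier:.2%} lost per interaction")
print(f"all models:      {1 - all_models:.2%} lost per interaction")
```

Under that uniform-loss assumption, a loss of only about 1.4% per interaction is enough to reach the 25% figure after 20 interactions, and about 3.4% per interaction reaches 50%.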

Benchmark exercise receives a thumbs up

Brian Jackson, principal research director at Info-Tech Research Group, found the results very interesting. “Putting a list of LLMs to the test across different work domains yields a lot of useful insights,” he said. “I think this type of benchmark exercise could be helpful to enterprise developers who are looking to leverage agentic AI to automate specific workflows and understand the limits of what can be achieved.”
