How ‘dark LLMs’ produce harmful outputs, despite guardrails – Computerworld

News Room · Last updated: May 27, 2025 6:37 am · 2 min read

And it’s not hard to do, the researchers noted. “The ease with which these LLMs can be manipulated to produce harmful content underscores the urgent need for robust safeguards. The risk is not speculative — it is immediate, tangible, and deeply concerning, highlighting the fragile state of AI safety in the face of rapidly evolving jailbreak techniques.”

Analyst Justin St-Maurice, technical counselor at Info-Tech Research Group, agreed. “This paper adds more evidence to what many of us already understand: LLMs aren’t secure systems in any deterministic sense,” he said. “They’re probabilistic pattern-matchers trained to predict text that sounds right, not rule-bound engines with an enforceable logic. Jailbreaks are not just likely, but inevitable. In fact, you’re not ‘breaking into’ anything… you’re just nudging the model into a new context it doesn’t recognize as dangerous.”
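
St-Maurice’s point about non-determinism can be sketched in a few lines. The toy sampler below is a hypothetical illustration (not code from the paper or any vendor’s API): it picks a continuation for a fixed prompt by sampling from a probability distribution, so the same input can produce a different output on every run. A guardrail that passes one sampled output has not verified the model’s behavior.

    import random

    # Hypothetical next-word distribution for one fixed prompt. A language
    # model does not "decide" an answer; it assigns probabilities to
    # continuations and samples one of them.
    NEXT_WORD_PROBS = {
        "refuse": 0.90,   # the aligned continuation
        "comply": 0.07,   # unsafe continuation: unlikely, not impossible
        "deflect": 0.03,
    }

    def sample_next_word(probs):
        """Draw one continuation, weighted by its probability."""
        words = list(probs)
        weights = [probs[w] for w in words]
        return random.choices(words, weights=weights, k=1)[0]

    # The same prompt, sampled 1,000 times, yields a spread of behaviors.
    outcomes = [sample_next_word(NEXT_WORD_PROBS) for _ in range(1000)]
    print({w: outcomes.count(w) for w in NEXT_WORD_PROBS})
    # e.g. {'refuse': 896, 'comply': 74, 'deflect': 30}

In this framing, a jailbreak prompt does not break a rule; it shifts probability mass toward continuations the guardrails were meant to suppress, which is why St-Maurice calls jailbreaks inevitable rather than merely likely.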

The paper pointed out that open-source LLMs are a particular concern, since they cannot be patched once they are in the wild. “Once an uncensored version is shared online, it is archived, copied, and distributed beyond control,” the authors noted, adding that once a model is saved on a laptop or local server, it is out of reach. The researchers also found that the risk is compounded because attackers can use one model to generate jailbreak prompts for another.

Read the full article here
