The biggest AI mistake: Pretending guardrails will ever protect you

News Room
Last updated: December 15, 2025 7:22 am

The fact that the guardrails from all the major AI players can be easily bypassed is hardly news. The mostly unaddressed problem is what enterprise IT leaders need to do about it.

Once IT decision-makers accept that guardrails don’t consistently protect anything, the presumptions they make about AI projects are mostly rendered moot. Other techniques to protect data must be implemented.

The reports of guardrail bypasses are becoming legion: poetry disables protections, as does leveraging chat history, inserting invisible characters, and using hexadecimal formats and emojis. Beyond those, patience and playing the long game, among other tactics, can wreak havoc, affecting just about every generative AI (genAI) and agentic model.

The risks are hardly limited to what attackers can accomplish. The models themselves have shown a willingness to disregard their own protections when they see those protections as an impediment to accomplishing an objective, as Anthropic has confirmed.

If we extend the road analogy that gives a guardrail its name, AI “guardrails” are not guardrails in the physical, concrete-barrier sense. They are not even strong deterrents, in the speed-bump sense. They are more akin to a single broken yellow line: a weak suggestion with no enforcement, or even serious discouragement.

If I may borrow a line from popular social media video blogger Ryan George in his writer-vs.-producer movie pitches series, an attacker wanting to get around today’s guardrails will find it “super easy, barely an inconvenience.” It’s as if homeowners protect their homes by placing “Do Not Enter” signs on all their doors, then keep the windows open and the doors unlocked. 

So, what should an AI project look like once we accept that guardrails won’t force a model or agent to do what it’s told?

IT has a few options. First, wall off either the model/agent or the data you want to protect. 

“Stop granting AI systems permissions you wouldn’t grant humans without oversight,” said Yvette Schmitter, CEO of the Fusion Collective consulting firm. “Implement the same audit points, approval workflows, and accountability structures for algorithmic decisions that you require for human decisions. Knowing guardrails can’t be relied on means designing systems where failure is visible. You wouldn’t let a hallucinating employee make 10,000 consequential decisions per hour with no supervision. Stop letting your AI systems do exactly that.”
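Schmitter’s advice maps directly onto application code. The sketch below is a minimal, hypothetical illustration of that idea in Python, not anyone’s shipping implementation: agent actions above a risk threshold wait for a human reviewer, and every decision is logged so failure stays visible.

from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class AgentAction:
    description: str
    risk_score: float  # 0.0 (trivial) .. 1.0 (highly consequential); scoring is application-defined

@dataclass
class ApprovalGate:
    risk_threshold: float = 0.3
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def execute(self, action: AgentAction, run: Callable[[], None],
                human_approves: Callable[[AgentAction], bool]) -> bool:
        # Consequential actions require explicit human sign-off before running.
        if action.risk_score >= self.risk_threshold and not human_approves(action):
            self.audit_log.append(("blocked", action.description))
            return False
        run()
        self.audit_log.append(("executed", action.description))
        return True

The specific threshold is beside the point; what matters is that the approval and audit machinery lives outside the model, where the model cannot talk its way past it.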

Gary Longsine, CEO at IllumineX, agreed. He argued that the same defenses enterprises use to block employees from accessing unauthorized data now need to be applied to genAI models and AI agents. “The only real thing that you can do is secure everything that exists outside of the LLM,” Longsine said.

Taken to its extreme, that might mean keeping a genAI model in an isolated environment, feeding it only the data you want it to access. It’s not exactly air-gapped servers, but it’s close. The model can’t be tricked into revealing data it can’t access.
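One way to build that isolation is in ordinary application code rather than in the model itself. Here is a minimal sketch, with purely illustrative source names, in which only allow-listed data is ever copied into the model’s context, so even a successful jailbreak has nothing extra to reveal.

from typing import Callable, Set

# Illustrative allow-list; in a real system this would come from policy, not a constant.
ALLOWED_SOURCES: Set[str] = {"public_product_catalog", "published_pricing"}

def build_context(requested_sources: Set[str], fetch: Callable[[str], str]) -> str:
    # Intersect the request with the allow-list; anything else is silently dropped,
    # so a manipulated prompt cannot widen the model's view of the data.
    permitted = requested_sources & ALLOWED_SOURCES
    return "\n\n".join(fetch(name) for name in sorted(permitted))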

Capital One toyed with something similar; it created genAI systems for auto dealerships but gave the large language model (LLM) it used access only to public data. The company also pushed open-source models and avoided hyperscalers, which addressed another guardrail issue: when agents are actively managed by a third-party firm in a cloud environment, your rules don’t necessarily have to be obeyed. Taking back control might mean literally doing that.

Longsine said some companies could cooperate to build their own data center, but that effort would be ambitious and costly. (Longsine put the price tag at $2 billion, but it could easily cost far more — and it might not even meaningfully address the problem.)

Let’s say five enterprises built a data center that only those five could access. Who would set the rules? And how much would any one of those companies trust the other four, especially when management changes? The companies might wind up replacing a hyperscaler with a much smaller makeshift hyperscaler, and still have the same control problems.

Here’s the painful part: There are many genAI proofs of concept out there today that simply won’t work if management stops believing in guardrails. At the board level, it seems, the Tinkerbell strategy remains alive and well: boards seem to think guardrails will work if only all the investors clap their hands loudly enough.

Consider an AI deployment that lets employees access HR information. It is supposed to tell each employee or manager only the information they are entitled to see. But those apps, and countless others just like them, take the easy coding approach: they grant the model access to all HR data and rely on guardrails to enforce proper access. That won’t work with AI.
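The safer pattern is to do the authorization before the model ever sees the data. The following sketch uses hypothetical database and LLM interfaces (hr_db.fetch_records and llm.complete are stand-ins, not a real API) to show the HR example built that way.

def answer_hr_question(question: str, requester_id: str, hr_db, llm) -> str:
    # Authorization happens in ordinary application code, outside the model:
    # only the rows this requester is entitled to see are fetched.
    allowed_rows = hr_db.fetch_records(visible_to=requester_id)
    context = "\n".join(str(row) for row in allowed_rows)
    prompt = (
        "Answer using only the records below.\n\n"
        f"Records:\n{context}\n\n"
        f"Question: {question}"
    )
    # Even a successful jailbreak can only leak data the requester already had access to.
    return llm.complete(prompt)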

I’m not saying guardrails never work. On the contrary, my observations suggest they do, roughly 70% to 80% of the time. In some better-designed rollouts, that figure might hit 90%.

But that’s the ceiling. And when it comes to protecting data access — especially potential exfiltration to anyone who asks the right prompt — 90% won’t suffice. And IT leaders who sign off on projects hoping that will do are in for a very uncomfortable 2026. 

