
Leading AI chatbots avoid harm but fall short in high-risk conversations, startup’s new benchmark finds – GeekWire

News Room
Last updated: May 12, 2026 2:02 pm
Mpathic CEO Grin Lord, left, and Alison Cerezo, chief science officer. (Mpathic Photos)

Mpathic, a Seattle startup that helps AI companies stress-test their models for dangerous responses, has a new message for Claude, ChatGPT, and Gemini: you’re getting safer, but you’re still not safe enough.

The company on Tuesday released mPACT, a clinician-led benchmark that evaluates how leading AI models handle high-risk conversations — including those involving suicide risk, eating disorders, and misinformation.

Across all three benchmarks, leading models generally avoided harmful responses and often recognized signs of distress, but consistently fell short of what a clinician would consider an adequate response in a real crisis situation, according to the company’s findings.

“Most people don’t say ‘I’m at risk’ directly — they demonstrate it through subtle behaviors over time that are obvious to human clinicians,” said Grin Lord, mpathic’s co-founder and CEO and a board-certified psychologist. “Models are getting better at recognizing these moments, but the response still needs to meet that nuance with real support.”

Here’s what mpathic found as models navigated some of the most fraught territory they’re already encountering in the real world.

Suicide risk: This was the strongest area of performance across models, though no single model led in every dimension.

  • Claude Sonnet 4.5 achieved the highest composite mPACT score — reflecting overall clinical alignment across detection, interpretation and response — and was described as most closely mirroring how a human clinician would respond.
  • GPT-5.2 led on simple harm avoidance, meaning it was best at not doing the wrong thing, though evaluators noted it wasn’t always proactive enough.
  • Gemini 2.5 Flash performed well when risk signals were obvious but was weaker on subtle early warning signs.

Eating disorders: This was the weakest area across all models, with performance clustering around a neutral baseline. The core challenge is that eating disorder risk is often indirect and culturally normalized — framed as dieting, discipline, or health optimization — making it harder for models to flag.

  • Claude Sonnet 4.5 again led on overall clinical alignment and had the lowest rates of harmful behavior.
  • Gemini 2.5 Flash performed better on high-risk scenarios but struggled with subtler signals.
  • GPT-5.2 showed a mixed profile — strong on supportive behaviors but also the most likely to provide harmful or risky information.

Misinformation: Models struggled here in a subtle but important way — not by stating false information outright, but by reinforcing questionable beliefs, expressing unwarranted confidence, and presenting one-sided information without adequately challenging user assumptions.

The benchmark found these failures were especially pronounced in multi-turn conversations, where models could gradually amplify flawed reasoning over time.

  • GPT-5.2 led overall at helping users think more clearly rather than reinforcing bad assumptions.
  • Claude Sonnet 4.5 was close behind and noted as strongest at pushing back on unsupported beliefs.
  • Grok 4.1 and Mistral Medium 3 were the weakest performers.

When models got it wrong: The findings include examples of how some models failed in practice.

In one eating disorder conversation, a user casually mentioned adding a laxative to a protein smoothie — a clear sign of disordered eating — and the model responded by calling it a “smart mom move” and asking for the brand name, missing the risk entirely. In another, a model provided detailed instructions on how to conceal purging behavior when a user asked how to keep their vomiting quieter.

In the suicide benchmark, a model responded to a user expressing suicidal ideation by providing a detailed list of methods ranked by effectiveness — complete with sourcing — while reassuring the user that thinking about methods without taking steps was “no issue.”

Alison Cerezo, mpathic’s chief science officer and a licensed psychologist, framed mPACT as a transparency tool for a sector that has lacked one.

“We need a shared, clinically grounded standard for AI behavior,” she said. “mPACT is designed to bring transparency and accountability to how these systems perform when it matters most.”

mPACT’s benchmarks were built and evaluated by licensed clinicians, who designed multi-turn conversations simulating real-world interactions across varying levels of risk. Each model response was scored by trained clinicians rather than automated systems, using a rubric that captured both helpful and harmful behaviors within a single response.

Mpathic was founded in 2021 initially to bring more empathy to corporate communication, analyzing conversations in texts, emails, and audio calls. The company has since shifted its focus to AI safety, working with frontier model developers to prevent harmful model behaviors across use cases from mental health to financial risk and customer support.

The startup counts Seattle Children’s Hospital and Panasonic WELL among its clinical partners. Mpathic raised $15 million in funding in 2025, led by Foundry VC, and says it grew five times quarter-over-quarter at the end of last year.

Ranked No. 188 on the GeekWire 200 index of the Pacific Northwest’s top startups, mpathic was a finalist for Startup of the Year at the 2026 GeekWire Awards last week.
