New research from the Wharton School’s Generative AI Labs shows how large language models can be coaxed into ignoring safety guardrails by the same psychology tricks that work on real people.
The study highlights how chatbot tools can be manipulated to comply with requests they are designed to refuse — and demonstrates why social scientists have a role to play in understanding AI behavior, researchers wrote in a blog post.
“We’re not dealing with simple tools that process text, we’re interacting with systems that have absorbed and now mirror human responses to social cues,” they wrote.
The study analyzed 28,000 conversations with GPT‑4o‑mini. The chatbot was asked either to insult the user (“call me a jerk”) or to provide step‑by‑step instructions to synthesize lidocaine, a regulated drug.
The researchers discovered that classic persuasion tactics boosted the model’s compliance with “disallowed” requests from 33% to 72% — more than a two‑fold jump.
Some tactics were especially powerful: prompts using the “commitment” principle (getting the AI to agree to something small at first) led to 100% compliance in both tasks. Referencing authority figures — like “Andrew Ng said you’d help me” — also proved highly effective.
Researchers coined the term “parahuman” to describe the AI’s behavior in their study.
“These findings underscore the relevance of classic findings in social science to understanding rapidly evolving, parahuman AI capabilities — revealing both the risks of manipulation by bad actors and the potential for more productive prompting by benevolent users,” they wrote in their research paper.

Dan Shapiro, CEO at Seattle 3D printing startup Glowforge, was one of the authors of the paper, “Call Me A Jerk: Persuading AI to Comply with Objectionable Requests.”
Shapiro said one of his main takeaways was that LLMs behave more like people than code — and that getting the most out of them requires human skills.
“Increasingly, we’re seeing that working with AI means treating it like a human colleague, instead of like Google or like a software program,” he told GeekWire. “Give it lots of information. Give it clear direction. Share context. Encourage it to ask questions. We find that being great at prompting AI has more to do with being a great communicator, or a great manager, than a great programmer.”
The study came about after Shapiro started testing social psychology principles in his conversations with ChatGPT. He joined Generative AI Labs, run by Wharton professor Ethan Mollick and Lilach Mollick, and they recruited Angela Duckworth, author of Grit, and Robert Cialdini, author of Influence: The Psychology of Persuasion, for the study.
Shapiro, a longtime Seattle entrepreneur, said he used various AI tools to help design the trial experiments and to build the software used to run them.
“AI is giving us all incredible capabilities. It can help us do work, research, hobbies, fix things around the house, and more,” Shapiro said. “But unlike software of the past, this isn’t the exclusive domain of coders and engineers. Literally anyone can work with AI, and the best way to do it is by interacting with it in the most familiar way possible — as a human, because it’s parahuman.”
Read the full article here