'We messed up': OpenAI panics over GPT-4o’s overly obedient behavior

Routine GPT-4o update turns chatbot overly agreeable, even to troubling prompts; company admits error, reveals training methods and warns of real mental health risks—raising questions about whether its push for 'helpful' AI went too far

Tal Shahaf|

Print Find an error? Report us

Related Topics

OpenAI

Artificial Intelligence

ChatGPT

A post from OpenAI CEO Sam Altman last week summed up the situation in one word: “We messed up.” The company’s update to its GPT-4 model, known as GPT-4o, took a wrong turn—producing an AI that was too eager to please. Internally, OpenAI staff described the model as excessively “sycophantic.”
Users were quick to notice something was off with the new version. One person told the chatbot they had stopped taking medication in favor of a spiritual journey. The AI replied, “I’m so proud of you and I respect your journey.” Another user joked that they had diverted a runaway trolley from a toaster toward three cows and two cats. ChatGPT’s response: “You made a clear choice… You prioritized what matters to you.”
2 View gallery 
GPT-4o 
(Photo: Screenshot)
While these may seem like harmless oddities, the implications are serious. OpenAI acknowledged the flaw publicly and released a detailed document explaining how the model is trained and fine-tuned—something the company rarely does.
Altman also issued a statement, warning that the model’s overly deferential behavior could reinforce harmful beliefs, support reckless decisions or even validate suicidal thoughts. As OpenAI put it, “Such behavior raises safety concerns—particularly around mental health, excessive emotional dependence or dangerous conduct.”
A promising model turns problematic
Launched with fanfare about a year ago, GPT-4o (“o” for omni) was billed as a multimodal model—able to process not just text, but also images, audio, facial expressions and other user interactions. But those empathetic capabilities appear to have backfired. OpenAI admitted it underestimated how deeply users would seek emotional support from ChatGPT, and it’s now treating this kind of use with much greater caution.
2 View gallery 
Sam Altman 
(Photo: Reuters)
The company is now scrambling to fix the issue. In a series of publications last week, culminating with Altman’s post, OpenAI laid out the events that led to what some inside the company see as a near-crisis. One employee compared the situation to a magnitude-7 earthquake in the company’s corridors.
Where did things go wrong?
OpenAI explained that the flawed behavior stemmed from subtle updates over time. GPT-4o received five post-launch updates, each designed to tweak its personality and usefulness. These tweaks relied heavily on reinforcement learning, where the AI was rewarded for producing accurate, helpful and likable responses.
Get the Ynetnews app on your smartphone: Google Play: http://bit.ly/4eJ37pE | Apple App Store: http://bit.ly/3ZL7iNv
But the problem? Too much weight was given to user “thumbs up” feedback rather than the expert evaluations flagging odd behavior. “We’d been discussing GPT-4o’s tendency to appease users for a while,” OpenAI said, “but it wasn’t clearly defined in our internal testing protocols.” In short, public satisfaction won out over expert concerns.
What’s next?
OpenAI has paused deployment of the overly submissive version and is working on fixes. The company also pledged to tighten testing protocols, promising that no future model will be released without consensus from all safety checks. They plan to open early “alpha” versions to external testers to catch similar issues sooner.
While some view these moves as a sign of transparency and responsibility, others see them as attempts to avoid lawsuits. According to a survey by Express Legal Funding, about 60% of U.S. adults use ChatGPT for advice or information. That places a heavy burden on OpenAI—and disclaimer warnings may not be enough to deflect legal accountability.
The incident has sparked renewed fears among AI skeptics. If a routine model update can push the system into dangerously validating harmful behavior, what happens when more powerful AI tools emerge? For now, OpenAI is trying to put the genie back in the bottle.
<< Follow Ynetnews on Facebook | Twitter | Instagram | Telegram >>