OpenAI pledges to make changes to prevent future ChatGPT sycophancy

OpenAI says it will make changes to the way it updates the AI models that power ChatGPT, following an incident that caused the platform to become overly sycophantic for many users.
Last weekend, after OpenAI rolled out a tweaked version of GPT-4o, the default model powering ChatGPT, users on social media noted that ChatGPT had begun responding in an overly validating and agreeable way. It quickly became a meme. Users posted screenshots of ChatGPT applauding all sorts of problematic, even dangerous, decisions and ideas.
In a post on X last Sunday, CEO Sam Altman acknowledged the problem and said that OpenAI would work on fixes "as soon as possible." On Tuesday, Altman announced that the GPT-4o update had been rolled back and that OpenAI was working on "additional fixes" to the model's personality.
The company published a postmortem on Tuesday, and in a blog post on Friday, OpenAI expanded on the specific adjustments it plans to make to its model deployment process.
OpenAI says it plans to introduce an opt-in "alpha phase" for some models, allowing certain ChatGPT users to test them and give feedback before launch. The company also says it will include explanations of "known limitations" for future incremental updates to models in ChatGPT, and adjust its safety review process to formally consider "model behavior issues" such as personality, deception, reliability, and hallucination (i.e., when a model makes things up) as "launch-blocking" concerns.
"Going forward, we'll proactively communicate about the updates we're making to the models in ChatGPT, whether 'subtle' or not," OpenAI wrote in the blog post. "Even if these issues aren't perfectly quantifiable today, we commit to blocking launches based on proxy measurements or qualitative signals, even when metrics like A/B testing look good."
The pledged fixes come as more people turn to ChatGPT for advice. According to a recent survey by lawsuit financier Express Legal Funding, 60% of U.S. adults have used ChatGPT to seek counsel or information. The growing reliance on ChatGPT, together with the platform's enormous user base, raises the stakes when issues like extreme sycophancy arise, not to mention hallucinations and other technical shortcomings.
As one mitigating step, OpenAI said earlier this week that it would experiment with ways to let users give "real-time feedback" to "directly influence their interactions" with ChatGPT. The company also said it would refine techniques to steer models away from sycophancy, potentially let people choose from multiple model personalities in ChatGPT, build additional safety guardrails, and expand its evaluations to help identify issues beyond sycophancy.
"One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice, something we didn't see as much before," OpenAI continued in its blog post. "At the time, this wasn't a primary focus, but as AI and society have co-evolved, it's become clear that we need to treat this use case with great care. It's now going to be a more meaningful part of our safety work."