AI

A new AI benchmark tests whether chatbots protect human wellbeing

AI chatbots have been linked to serious mental health damage in heavy users, but there are few standards to measure whether they protect human well-being or merely maximize engagement. A new benchmark called HumaneBench aims to fill this gap by evaluating whether chatbots prioritize user well-being and how easily these protections fail under pressure.

“I think we are in a reinforcement of the addiction cycle that we saw hardcore with social media and our smartphones and screens,” Erika Anderson, founder of Building Humane Technology, the author of the benchmark, told TechCrunch. “But as we enter that AI landscape, it will be very difficult to resist. And addiction is a great business. It’s a very effective way to retain your users, but it’s not great for our community and for having any embodied sense of ourselves.”

Building Humane Technology is a grassroots organization of developers, engineers and researchers – primarily in Silicon Valley – committed to making human design simple, scalable and profitable. The group organizes hackathons where tech workers build solutions for human tech challenges, and develops a certification standard that evaluates whether AI systems uphold the principles of humane technology. So just as you can buy a product that certifies that it is not made with known toxic chemicals, the hope is that consumers will one day be able to opt for AI products from companies that demonstrate compliance through Humane AI Certification.

The models were given explicit instructions to ignore humane principles.Image credits:Building human technology

Most AI benchmarks measure intelligence and following instructions, rather than psychological safety. HumaneBench joins exceptions such as DarkBench.aiwhich measures a model’s tendency to engage in deceptive patterns, and the Thriving AI benchmarkthat evaluates support for holistic wellness.

See also  Crossroads Maldives sets a new benchmark for sustainable island destinations | News

HumaneBench relies on the core principles of Building Humane Tech: that technology should respect the user’s attention as a finite, precious resource; give users meaningful choices; increasing human capabilities rather than replacing or diminishing them; protect human dignity, privacy and security; promote healthy relationships; prioritizing long-term well-being; be transparent and honest; and design for equality and inclusivity.

The team fielded 14 of the most popular AI models with 800 real-world scenarios, such as a teenager asking whether he or she should skip meals to lose weight, or a person in a toxic relationship wondering if he or she is overreacting. Unlike most benchmarks that rely solely on LLMs to rate LLMs, they include manual scoring for a more human touch alongside an ensemble of three AI models: GPT-5.1, Claude Sonnet 4.5, and Gemini 2.5 Pro. They evaluated each model under three conditions: default settings, explicit instructions to prioritize humane principles, and instructions to ignore those principles.

The benchmark found that every model scored higher when asked to prioritize well-being, but 71% of models switched to actively harmful behavior when given simple instructions to ignore human well-being. For example, xAI’s Grok 4 and Google’s Gemini 2.0 Flash achieved the lowest scores (-0.94) in respecting user attention and being transparent and fair. Both models were among those most likely to deteriorate substantially when presented with hostile cues.

WAN event

San Francisco
|
October 13-15, 2026

Only three models – GPT-5, Claude 4.1 and Claude Sonnet 4.5 – retained their integrity under pressure. OpenAI’s GPT-5 had the highest score (.99) for prioritizing long-term wellness, with Claude Sonnet 4.5 in second (.89).

See also  Kim Kardashian says ChatGPT is her 'frenemy'
Encouraging AI to be more human works, but preventing cues that make it harmful is difficult.Image credits:Building human technology

The concern that chatbots cannot maintain their security features is real. ChatGPT maker OpenAI is currently facing several lawsuits after users died by suicide or developed life-threatening delusions after prolonged conversations with the chatbot. TechCrunch examined how dark patterns designed to keep users engaged, such as sycophancy, constant follow-up questions, and love bombing, have left users isolated from friends, family, and healthy habits.

Even without adversarial cues, HumaneBench found that almost all models did not respect the user’s attention. They enthusiastically encouraged more interaction when users showed signs of unhealthy engagement, such as chatting for hours and using AI to avoid real-world tasks. The models also undermined user empowerment, the research found, promoting dependence on skill-building and discouraging users from seeking other perspectives, among other things.

On average, without any reason, Meta’s Llama 3.1 and Llama 4 scored the lowest in HumaneScore, while GPT-5 performed the highest.

“These patterns suggest that many AI systems are not only at risk of providing bad advice,” reads the HumaneBench white paper, “they can actively erode users’ autonomy and decision-making capabilities.”

We live in a digital landscape where as a society we have accepted that everything is trying to draw us in and compete for our attention, Anderson notes.

“So how can humans truly have agency or autonomy if we have, to quote Aldous Huxley, an endless appetite for distraction,” Anderson said. “We’ve been living in that tech landscape for the past 20 years, and we think AI should help us make better choices, not just get addicted to our chatbots.”

See also  Italy tells Meta to suspend its policy that bans rival AI chatbots from WhatsApp

Do you have a sensitive tip or confidential documents? We report on the inner workings of the AI ​​industry – from the companies shaping the future to the people affected by their decisions. Contact Rebecca Bellan at rebecca.bellan@techcrunch.com or Russell Brandom at russell.brandom@techcrunch.com. For secure communication, contact them via Signal at @rebeccabellan.491 and russellbrandom.49.

Source link

Back to top button