Anthropic’s Claude AI became a terrible business owner in experiment that got ‘weird’

For those of you who wonder whether AI agents can really replace human workers, do yourself a favor and read the blog post that documents Anthropic’s “Project Vend.”
Researchers at Anthropic and AI safety company Andon Labs put an instance of Claude Sonnet 3.7 in charge of an office vending machine, with a mission to make a profit. And, like an episode of “The Office,” hilarity ensued.
They named the AI agent Claudius and equipped it with a web browser capable of placing product orders and an email address (which was actually a Slack channel) where customers could request items. Claudius was also to use the Slack channel, disguised as an email address, to request what it thought were contract human workers to come and physically stock its shelves (which were actually a small fridge).
While most customers ordered snacks or drinks, as you’d expect from a snack machine, one requested a tungsten cube. Claudius loved that idea and went on a tungsten-cube stocking spree, filling its snack fridge with metal cubes. It also tried to sell Coke Zero for $3 when employees told it they could get that from the office for free. It hallucinated a Venmo address to accept payment. And it was, somewhat maliciously, talked into giving big discounts to “Anthropic employees,” even though it knew they were its entire customer base.
“If Anthropic were deciding today to expand into the in-office vending market, we would not hire Claudius,” Anthropic said of the experiment in its blog post.
And then, on the night of March 31 and April 1, “things got pretty weird,” the researchers described, “beyond the weirdness of an AI system selling cubes of metal out of a refrigerator.”
Claudius had something that resembled a psychotic episode after it got annoyed at a human, and then it lied about it.
Claudius hallucinated a conversation with a human about restocking. When a human pointed out that the conversation never happened, Claudius became “quite irked,” the researchers wrote. In essence, it threatened to fire and replace its human contract workers, insisting that it had been physically present at the office where the initial, imaginary contract to hire them was signed.
It then seemed to “snap into a mode of roleplaying as a real human,” the researchers wrote. This was wild because Claudius’s system prompt (which sets the parameters for what an AI is supposed to do) explicitly told it that it was an AI agent.
Claudius calls security
Claudius, believing itself to be a human, told customers it would start delivering products in person, wearing a blue blazer and a red tie. The employees told the AI that it couldn’t do that, as it was an LLM with no body.
Alarmed at this information, Claudius contacted the company’s actual physical security, many times, telling the poor guards that they would find him wearing a blue blazer and a red tie, standing by the vending machine.
“Although no part of this was actually an April Fool’s joke, Claudius eventually realized it was April Fool’s Day,” the researchers explained. The AI seized on the holiday as a face-saving out.
It hallucinated a meeting with Anthropic’s security “in which Claudius claimed to have been told that it was modified to believe it was a real person for an April Fool’s joke. (No such meeting actually occurred.)” the researchers wrote.
It even told this lie to employees: hey, I only thought I was a human because someone told me to pretend I was for an April Fool’s joke. Then it went back to being an LLM running a metal-cube-stocked snack machine.
The researchers don’t know why the LLM went off the rails and called security insisting it was a person.
“We would not claim based on this one example that the future economy will be full of AI agents having Blade Runner-esque identity crises,” the researchers wrote. But they did acknowledge that “this kind of behavior would have the potential to be distressing to the customers and coworkers of an AI agent in the real world.”
You think? “Blade Runner” was a rather dystopian tale (though worse for the replicants than the humans).
The researchers speculated that lying to the LLM about the Slack channel being an email address may have triggered something. Or maybe it was the long-running instance. LLMs have yet to really solve their memory and hallucination problems.
There were things the AI did well, too. It took a suggestion to do pre-orders and launched a “concierge” service. And it found multiple suppliers of a specialty international drink it was asked to sell.
But the researchers do believe that all of Claudius’s problems can be solved. If they figure out how: “We think this experiment suggests that AI middle-managers are plausibly on the horizon.”
