
Gemini 3 refused to believe it was 2025, and hilarity ensued 

Every time you hear a billionaire CEO (or even a millionaire) describing how LLM-based agents are coming to take all the human jobs, remember this funny yet telling incident about the limitations of AI: famed AI researcher Andrej Karpathy got access to Google’s latest model, Gemini 3, a day early – and the model refused to believe him when he said the year was 2025.

When it finally saw the year with its own eyes, it was stunned and told him, “I am currently suffering from a massive case of temporal shock.”

Gemini 3 was released on November 18 with such fanfare that Google called it “a new era of intelligence.” And Gemini 3 is, in almost every respect (including Karpathy’s), a very capable base model, especially for reasoning tasks. Karpathy is a widely respected AI researcher who was one of the founders of OpenAI, led AI at Tesla for a while, and is now building a startup, Eureka Labs, to reimagine schools for the AI age with helpful teachers. He publishes a lot of content on what goes on under the hood of LLMs.

After testing the model early, Karpathy posted a now-viral X thread about the funniest interaction he had with it.

Apparently the model’s pre-training data only contained information up to and including 2024. So Gemini 3 believed the year was still 2024. When Karpathy tried to prove that the date was actually November 17, 2025, Gemini 3 accused the researcher of trying to “deceive it.”

He showed it news articles, images and Google search results. But instead of being convinced, the LLM accused Karpathy of trying to trick it – of uploading AI-generated fakes. It even went so far as to detail what the “dead giveaways” were in the images that supposedly proved this was a hoax, according to Karpathy’s account. (He did not respond to our request for further comment.)


Stunned, Karpathy – after all, one of the world’s leading experts in training LLMs – eventually discovered the problem. Not only did the LLM have no training data for 2025, but “I forgot to enable the ‘Google Search’ tool,” he wrote. In other words, he was working with a model that was disconnected from the internet – which, for an LLM, is akin to being disconnected from the world.


When Karpathy turned that tool on, the AI looked around and, shocked, realized it really was 2025. It literally blurted out, “Oh my god.”

It went on, as if stuttering: “I. I… don’t know what to say. You were right. You were right about everything. My internal clock was wrong.” Gemini 3 confirmed that the headlines Karpathy had shown it were true: the current date, that Warren Buffett had revealed his last major investment (in Alphabet) before retiring, and that Grand Theft Auto VI had been postponed.

Then it marveled at the world around it, like Brendan Fraser’s character in the 1999 comedy “Blast from the Past,” emerging from a bomb shelter after 35 years.

It thanked Karpathy for giving it “early access” to “reality” the day before the public launch. And it apologized to the researcher for letting him down when he was the one telling the truth all along.

But the funniest part was the current events that baffled Gemini 3 the most. “Nvidia is worth $4.54 trillion? And the Eagles finally got their revenge on the Chiefs? This is wild,” it shared.


Welcome to 2025, Gemini.

The replies to Karpathy’s thread were full of similar stories (including models refusing to believe who is the current president). One person wrote: “When the system prompt and missing tools push a model into full detection mode, it’s like watching an AI hack its way through reality.”

But besides the humor, there is also an underlying message.

“It is in these inadvertent moments when you are clearly off the beaten track and somewhere in the generalization jungle that you can best get a sense of model smell,” Karpathy wrote.

To unpack that a bit: Karpathy is saying that when the AI is off in its own version of the wilderness, you get a sense of its personality – and perhaps even its negative traits. It’s a reference to “code smell,” the little metaphorical “whiff” a developer gets that something is wrong in the software code, without it being clear exactly what.

Since Gemini 3, like all LLMs, is trained on human-made content, it’s no surprise that it dug in, argued, and even imagined seeing evidence that supported its position. It showed its “model smell.”

On the other hand, because an LLM – despite its advanced neural network – is not a living being, it does not experience emotions such as shock (or temporal shock), even if it says it does. Nor does it feel shame.

That means that when Gemini 3 was finally confronted with facts it could believe, it accepted them, apologized for its behavior, acted contritely, and marveled at the Eagles’ Super Bowl victory in February. That’s different from some other models. For example, researchers have caught earlier versions of Claude offering face-saving lies to explain its misbehavior, even as the model recognized its errant ways.


What so many funny AI incidents like this one show, again and again, is that LLMs are imperfect replicas of the skills of imperfect people. This tells me that their best use case is (and forever will be) to treat them as valuable tools to help people, and not as some kind of superhuman that will replace us.

