OpenAI fires back at Google with GPT-5.2 after ‘code red’ memo

OpenAI launched its latest frontier model, GPT-5.2, on Thursday amid increasing competition from Google, calling it its most advanced model yet, designed for developers and everyday professional use.

OpenAI’s GPT-5.2 comes to ChatGPT’s paid users and to developers via its API in three flavors: Instant, a speed-optimized model for routine queries such as information lookups, writing, and translation; Thinking, which excels at complex structured work such as coding, analyzing long documents, mathematics, and planning; and Pro, the top model, aimed at delivering maximum accuracy and reliability on difficult problems.

“We designed 5.2 to unlock even more economic value for people,” OpenAI Chief Product Officer Fidji Simo said during a briefing with journalists on Thursday. “It is better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long contexts, using tools, and linking complex, multi-step projects.”

GPT-5.2 finds itself in the middle of an arms race with Google’s Gemini 3, which tops LMArena’s leaderboard in most categories (aside from coding, where Anthropic’s Claude Opus 4.5 still leads).

Earlier this month, The Information reported that CEO Sam Altman issued an internal ‘code red’ memo to staff amid a drop in ChatGPT traffic and concerns that the company is losing consumer market share to Google. The memo called for a shift in priorities, including delaying commitments such as introducing ads, to focus instead on making the ChatGPT experience better.

GPT-5.2 is OpenAI’s attempt to reclaim leadership, though some employees reportedly asked for the release to be postponed so the company would have more time to improve the model. And despite indications that OpenAI would turn its attention to consumers by adding more personalization and customization to ChatGPT, the launch of GPT-5.2 appears geared toward boosting its business opportunities.

The company is focusing specifically on developers and the tooling ecosystem, with the aim of becoming the standard foundation for building AI-powered applications. Earlier this week, OpenAI released new data showing that business use of its AI tools has increased dramatically over the past year.

That push matters because Gemini 3 has become tightly integrated into Google’s product and cloud ecosystem for multimodal and agentic workflows. Google this week launched managed MCP servers to help agents connect more easily to Google and Google Cloud services like Maps and BigQuery. (MCP servers act as connectors between AI systems and external data and tools.)

OpenAI says GPT-5.2 sets new benchmark scores in coding, math, science, vision, long-context reasoning, and tool usage, which the company claims could lead to “more reliable agentic workflows, production-quality code, and complex systems that work across large real-world contexts and data.”

These capabilities put it in direct competition with Gemini 3’s Deep Think mode, which is being touted as a major advance in reasoning focused on math, logic, and science. On OpenAI’s own benchmark chart, GPT-5.2 Thinking scores better than Google’s Gemini 3 and Anthropic’s Claude Opus 4.5 in almost every reasoning test listed, from real-world software engineering tasks (SWE-Bench Pro) and doctoral-level scientific knowledge (GPQA Diamond) to abstract reasoning and pattern discovery (the ARC-AGI suites).

Research lead Aidan Clark said stronger math scores aren’t just about solving equations. Mathematical reasoning, he explained, is an indicator of whether a model can follow multi-step logic, keep numbers consistent, and avoid subtle errors that compound over time.

“These are all features that really matter across a wide range of different workloads,” Clark said. “Things like creating financial models, making predictions, analyzing data.”

During the briefing, OpenAI product lead Max Schwarzer said GPT-5.2 “makes substantial improvements to code generation and debugging” and can step through complex math and logic. Coding startups like Windsurf and CharlieCode, he added, report “state-of-the-art coding performance for agents” and measurable gains on complex multi-step workflows.

In addition to coding, Schwarzer said GPT-5.2 Thinking responses have 38% fewer errors than its predecessor, making the model more reliable for everyday decision-making, research and writing.

GPT-5.2 seems less of a reinvention and more of a consolidation of OpenAI’s last two upgrades. Released in August, GPT-5 was a reset that laid the foundation for a unified system with a router to switch the model between a fast standard model and a deeper “thinking mode.” November’s GPT-5.1 aimed to make that system warmer, chattier, and better suited to agentic and coding tasks. The latest model, GPT-5.2, seems to turn up the dial on all these improvements, making it a more reliable base for production use.

For OpenAI, the stakes have never been higher. The company has made commitments worth $1.4 trillion to build out AI infrastructure over the coming years to support its growth – commitments it made when it still held the first-mover advantage among AI companies. Now that Google, which lagged at the start, is pushing ahead, that gamble may well be why Altman is sounding the ‘code red’ alarm.

OpenAI’s renewed focus on reasoning models is also a risky bet. The systems behind the Thinking and Deep Research modes are more expensive to run than standard chatbots because they consume more computing power. By expanding those types of models with GPT-5.2, OpenAI risks setting off a vicious cycle: spend more on compute to win the rankings, then spend even more to run those expensive models at scale.

OpenAI is reportedly already spending more on computing power than previously thought. As TechCrunch recently reported, most of OpenAI’s spending on computing power to run a trained AI model is paid for in cash rather than through cloud credits, indicating that the company’s compute costs have grown beyond what partnerships and credits can subsidize.

During the briefing, Simo suggested that as OpenAI scales, it can offer more products and services to generate more revenue to pay for additional computing power.

“But I think it’s important to put that in the grand arc of efficiency,” Simo said. “Today you get much more intelligence for the same amount of computing power and the same amount of money as a year ago.”

For all its focus on reasoning, there’s one thing missing from today’s launch: a new image generator. Altman reportedly said in his ‘code red’ memo that image generation would be a top priority going forward, especially after Google’s Nano Banana (the nickname for Google’s Gemini 2.5 Flash Image model) had a viral moment following its release in August.

Last month, Google launched Nano Banana Pro (also known as Gemini 3 Pro Image), an improved version with even better text rendering, world knowledge, and an uncanny, lifelike, unedited quality to its images. It also integrates more deeply into Google’s products, as demonstrated this past week when it showed up in tools and workflows like Google Labs’ Mixboard for automatically generating presentations.

OpenAI reportedly plans to release another new model in January with better visuals, improved speed and a better personality, although the company did not confirm these plans on Thursday.

OpenAI also said Thursday that it is rolling out new safety measures around mental health and age verification for teens, but it didn’t spend much of the launch pitching those changes.

This article has been updated with more information about OpenAI’s compute efficiency.

Do you have a sensitive tip or confidential documents? We report on the inner workings of the AI industry – from the companies shaping the future to the people affected by their decisions. Contact Rebecca Bellan at rebecca.bellan@techcrunch.com or Russell Brandom at russell.brandom@techcrunch.com. For secure communication, contact them via Signal at @rebeccabellan.491 and russellbrandom.49.
