Google launched its deepest AI research agent yet — on the same day OpenAI dropped GPT-5.2

Google on Thursday released a “redesigned” version of its research agent, Gemini Deep Research, built on its much-talked-about state-of-the-art base model, Gemini 3 Pro.
This new agent isn’t just designed to produce investigative reports, although it can still do that. It now lets developers integrate Google’s state-of-the-art research capabilities into their own apps, via Google’s new Interactions API, which is designed to give developers more control in the coming age of agentic AI.
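The article doesn’t describe the Interactions API’s actual schema, so as a purely hypothetical sketch of what such an integration might look like, a developer could assemble a research request as a JSON payload. Every endpoint concept, field name, and parameter below is an illustrative assumption, not Google’s documented API:

```python
import json

# Hypothetical request payload for a Deep Research-style agent call.
# All field names here are assumptions for illustration, not Google's schema.
def build_research_request(query: str, max_sources: int = 20) -> str:
    payload = {
        "agent": "deep-research",        # assumed agent identifier
        "input": {"query": query},
        "config": {
            "max_sources": max_sources,  # assumed cap on documents to synthesize
            "report_format": "markdown", # assumed output option
        },
    }
    return json.dumps(payload)

req = build_research_request("Safety profile of compound X in phase 3 trials")
```

The point is only that agentic APIs of this kind typically take a task description plus constraints, and return a long-running job rather than an instant completion.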
The new Gemini Deep Research tool is an agent equipped to synthesize mountains of information and handle a large context dump in the prompt. Google says it is used by customers for tasks ranging from due diligence to drug safety research.
Google also says it will soon integrate this new deep search agent into services including Google Search, Google Finance, its Gemini app and its popular NotebookLM. This is another step toward preparing for a world where people don’t Google anything anymore — their AI agents do.
The tech giant says Deep Research benefits from Gemini 3 Pro’s status as the “most factual” model trained to minimize hallucinations during complex tasks.
AI hallucinations – where the LLM just makes things up – are a particularly critical problem for long-running, deep agent reasoning tasks, where many autonomous decisions are made in minutes, hours or longer. The more choices an LLM has to make, the greater the chance that even one hallucinated choice will invalidate the entire output.
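That compounding effect can be made concrete with a back-of-the-envelope model (my simplification, not a figure from Google): if an agent makes n independent decisions and each one is correct with probability p, the whole run survives with probability roughly p^n, which collapses quickly as n grows.

```python
# Rough model of error compounding in a long-running agent task:
# probability that a run of n independent decisions contains no
# hallucinated step, if each step is correct with probability p.
def chance_all_correct(p: float, n: int) -> float:
    return p ** n

for n in (10, 100, 1000):
    print(n, round(chance_all_correct(0.99, n), 3))
# Even a 99%-reliable step leaves a 100-step run correct only ~37% of the time.
```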
To prove its progress claims, Google has also created a benchmark (as if the AI world still needs one). The new benchmark is unimaginatively called DeepSearchQA and is intended to test agents on complex, multi-step information-seeking tasks. Google has made this benchmark open source.
Google also tested Deep Research on Humanity’s Last Exam, a much more interestingly named independent measure of general knowledge full of impossibly niche tasks, and on BrowserComp, a benchmark for browser-based agent tasks.
As you’d expect, Google’s new agent beat the competition on its own benchmark and on Humanity’s Last Exam. However, OpenAI’s ChatGPT 5 Pro came in a surprisingly close second overall, and slightly beat Google on BrowserComp.
But those benchmark comparisons were already outdated by the time Google published them, because OpenAI launched its long-awaited GPT-5.2, codenamed Garlic, on the same day. OpenAI says its latest model bests its rivals (especially Google) on a range of typical benchmarks, including OpenAI’s own.
Perhaps one of the most interesting parts of this announcement was the timing. Knowing that the world was waiting for the release of Garlic, Google dropped some AI news of its own.