Samsung AI researcher's new, open reasoning model TRM outperforms models 10,000X larger — on specific problems

The trend where AI researchers are developing new, small generative open source models that outperform much larger, proprietary peers continued this week with another staggering advance.
Alexia Jolicoeur-Martineau, a senior AI researcher at Samsung’s Advanced Institute of Technology (SAIT) in Montreal, Canada, has introduced the Tiny Recursion Model (TRM) – a neural network so small that it contains only 7 million parameters (internal model settings), yet it rivals or surpasses state-of-the-art language models 10,000 times larger in parameter count, including OpenAI’s o3-mini and Google’s Gemini 2.5 Pro, on some of the toughest reasoning benchmarks in AI research.
The goal is to show that high-performance new AI models can be built affordably, without the huge investments in graphics processing units (GPUs) and power needed to train the larger, multi-trillion-parameter flagship models behind many LLM chatbots today. The results are described in a research paper published on the open-access website arxiv.org, entitled “Less is More: Recursive Reasoning with Tiny Networks.”
“The idea that you have to rely on huge foundation models trained for millions of dollars by a major corporation to solve difficult tasks is a trap,” Jolicoeur-Martineau wrote in a post on the social network X. “Currently, there is too much emphasis on exploiting LLMs rather than inventing and expanding new directions.”
Jolicoeur-Martineau added: “Recursive reasoning shows that ‘less is more’. A small model trained from scratch, repeating itself and updating its answers over time, can achieve a lot without spending a lot of money.”
TRM’s code is now available on GitHub under an enterprise-friendly, commercially viable MIT license – meaning anyone, from researchers to companies, can adopt it, adapt it, and deploy it for their own purposes, including commercial applications.
One big caveat
However, readers should be aware that TRM is specifically designed to perform well on structured, visual, grid-based problems such as Sudoku puzzles, mazes, and the ARC (Abstraction and Reasoning Corpus) AGI benchmark. The latter offers tasks that should be easy for humans but difficult for AI models, such as sorting colors on a grid based on a prior, but not identical, solution.
From hierarchy to simplicity
The TRM architecture represents a radical simplification.
It builds on a technique called the Hierarchical Reasoning Model (HRM), introduced earlier this year, which showed that small networks could solve logic puzzles such as Sudoku and mazes.
HRM relied on two cooperating networks – one operating at high frequency, the other at low frequency – supported by biologically inspired arguments and mathematical justifications involving fixed-point theorems. Jolicoeur-Martineau found this unnecessarily complicated.
TRM removes these elements. Instead of two networks, it uses a single two-layer model that recursively refines its own predictions.
The model starts with an embedded question and an initial answer, represented by the variables x, y, and z. Through a series of reasoning steps, it updates its internal latent representation z and refines the answer y until it converges on a stable output. Each iteration corrects potential errors from the previous step, creating a self-improving reasoning process without additional hierarchy or mathematical overhead.
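In outline, the loop reads like the following minimal numpy sketch. This is not the paper’s implementation (TRM’s actual updates run through a small trained network, and the shapes, update rules, and the `update_latent`/`refine_answer` names here are placeholders), but it shows the shape of the x/y/z recursion:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the tiny network's two roles; in TRM these
# are computed by one small trained model, not random fixed matrices.
W = rng.normal(scale=0.1, size=(3 * 8, 8))   # latent-update weights
V = rng.normal(scale=0.1, size=(2 * 8, 8))   # answer-refinement weights

def update_latent(x, y, z):
    # z <- f(x, y, z): refine the internal latent reasoning state
    return np.tanh(np.concatenate([x, y, z]) @ W)

def refine_answer(y, z):
    # y <- g(y, z): refine the current answer using the latent state
    return np.tanh(np.concatenate([y, z]) @ V)

def trm_step(x, y, z, n_latent=6):
    # One reasoning step: several latent updates, then one answer update
    for _ in range(n_latent):
        z = update_latent(x, y, z)
    y = refine_answer(y, z)
    return y, z

x = rng.normal(size=8)   # embedded question
y = np.zeros(8)          # initial answer embedding
z = np.zeros(8)          # initial latent state

for _ in range(16):      # up to 16 supervision steps, per the paper
    y, z = trm_step(x, y, z)
```

The key design point is that the same small model is reused at every step, so depth comes from repetition rather than from stacking more layers.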
How recursion replaces scale
The core idea behind TRM is that recursion can replace depth and size.
By iteratively reasoning about its own output, the network effectively simulates a much deeper architecture without the associated memory or computation costs. This recursive cycle, spanning up to sixteen supervision steps, allows the model to make increasingly accurate predictions – similar in spirit to how large language models use multi-step chain-of-thought reasoning, but achieved here with a compact, feed-forward design.
The simplicity pays off in both efficiency and generalization. The model uses fewer layers, no fixed-point approximations, and no dual-network hierarchy. A lightweight halting mechanism decides when to stop refining, eliminating wasted computation while maintaining accuracy.
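A halting loop of this kind can be sketched in a few lines. The `step_fn` and `halt_fn` below are toy stand-ins (in TRM the halt signal is learned, not a hand-written norm check), but they show how refinement stops early once the model is confident:

```python
import numpy as np

def run_with_halting(step_fn, halt_fn, y, z, max_steps=16, threshold=0.5):
    """Refine (y, z) until a halt score says to stop.

    step_fn: one refinement step; halt_fn: maps the latent state to a
    confidence score. Both are hypothetical stand-ins for learned modules.
    """
    for step in range(1, max_steps + 1):
        y, z = step_fn(y, z)
        if halt_fn(z) > threshold:   # confident enough: stop refining
            break
    return y, step

# Toy example: each step halves a residual "error" vector; the loop
# halts as soon as that residual is small enough.
step_fn = lambda y, z: (y, z * 0.5)
halt_fn = lambda z: 1.0 - float(np.linalg.norm(z))  # higher = more confident

y, steps = run_with_halting(step_fn, halt_fn, y=np.zeros(4), z=np.ones(4))
```

Easy inputs thus exit after a few steps, while harder ones use the full budget – which is where the claimed compute savings come from.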
Performance that punches above its weight
Despite its small footprint, TRM delivers benchmark results that rival or exceed those of models thousands of times larger. In testing, the model achieved the following:
- 87.4% accuracy on Sudoku-Extreme (up from 55% for HRM)
- 85% accuracy on Maze-Hard puzzles
- 45% accuracy on ARC-AGI-1
- 8% accuracy on ARC-AGI-2
These results exceed or closely match the performance of several high-end large language models, including DeepSeek-R1, Gemini 2.5 Pro, and o3-mini, despite TRM using less than 0.01% of their parameters.
Such results suggest that recursive reasoning, not scale, may be the key to dealing with abstract and combinatorial reasoning problems – domains where even top-level generative models often stumble.
Design philosophy: less is more
TRM’s success comes from purposeful minimalism. Jolicoeur-Martineau found that reducing complexity led to better generalization.
When the researcher increased the number of layers or model size, performance dropped due to overfitting on small data sets.
In contrast, the two-layer structure, combined with recursive depth and deep supervision, achieved optimal results.
The model also performed better when self-attention was replaced with a simpler multilayer perceptron on tasks with small, fixed contexts, such as Sudoku.
For larger grids, such as ARC puzzles, self-attention remained valuable. These findings underline that model architecture should match the data’s structure and scale, rather than defaulting to maximum capacity.
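The trade-off can be illustrated with a small sketch. The threshold, the zero-initialized mixing matrix, and the `token_mixer` helper are assumptions for illustration, not TRM’s actual code; the point is simply that for a small fixed grid a plain learned mixing over token positions can stand in for attention:

```python
import numpy as np

def token_mixer(seq_len, small_context_max=81):
    """Pick a token-mixing layer by grid size (illustrative heuristic).

    For a small fixed grid (e.g. 9x9 Sudoku = 81 cells), a plain MLP-style
    mix over token positions suffices; for larger grids (e.g. 30x30 ARC),
    self-attention captures long-range structure better.
    """
    if seq_len <= small_context_max:
        W = np.zeros((seq_len, seq_len))           # learned in practice
        return lambda X: W @ X                     # fixed mix over tokens

    def attention(X):
        scores = X @ X.T / np.sqrt(X.shape[1])     # scaled dot-product
        weights = np.exp(scores - scores.max(axis=1, keepdims=True))
        weights /= weights.sum(axis=1, keepdims=True)
        return weights @ X                         # input-dependent mix

    return attention

mlp_mix = token_mixer(81)      # Sudoku-sized context -> MLP mixing
attn_mix = token_mixer(900)    # ARC-sized context -> self-attention
```

The MLP variant mixes positions with fixed learned weights, while attention recomputes the mixing from the input – unnecessary flexibility when the grid layout never changes.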
Train small, think big
TRM is now officially available as open source under an MIT license on GitHub.
The repository contains full training and evaluation scripts, dataset builders for Sudoku, Maze and ARC-AGI, and reference configurations for reproducing the published results.
It also documents computing requirements ranging from a single NVIDIA L40S GPU for Sudoku training to multi-GPU H100 setups for ARC-AGI experiments.
The open release confirms that TRM is specifically designed for structured, grid-based reasoning tasks rather than general purpose language modeling.
Each benchmark – Sudoku-Extreme, Maze-Hard and ARC-AGI – uses small, well-defined input-output grids, which align with the model’s recursive supervision process.
Training involves substantial data augmentation (such as color permutations and geometric transformations), underscoring that TRM’s efficiency lies in its parameter count rather than its total computational demand.
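Augmentations of this kind are straightforward for grid puzzles. The sketch below is illustrative (the exact transform set used to train TRM may differ): a random rotation or flip plus a random recoloring produces a new training example with the same underlying logic:

```python
import numpy as np

def augment_grid(grid, rng):
    """One random ARC-style augmentation of an integer color grid.

    Applies a random dihedral transform (90-degree rotation, optional
    flip) plus a random permutation of the 10-color palette, which
    preserves the puzzle's structure while multiplying the dataset.
    """
    g = np.rot90(grid, k=rng.integers(4))     # random 90-degree rotation
    if rng.integers(2):
        g = np.fliplr(g)                      # random horizontal flip
    perm = rng.permutation(10)                # ARC uses colors 0..9
    return perm[g]                            # relabel every cell's color

rng = np.random.default_rng(42)
grid = np.array([[0, 1],
                 [2, 3]])
augmented = augment_grid(grid, rng)
```

Because each transform is structure-preserving, one labeled puzzle yields many distinct-looking training pairs – cheap data at the cost of extra training compute.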
The simplicity and transparency of the model make it more accessible to researchers outside large corporate laboratories. The codebase builds directly on the earlier Hierarchical Reasoning Model framework, but strips out HRM’s biological analogies, dual-network hierarchy, and fixed-point dependencies.
In doing so, TRM provides a reproducible basis for exploring recursive reasoning in small models – a counterpoint to the dominant ‘scale is all you need’ philosophy.
Community response
The release of TRM and its open-source codebase sparked an immediate debate among AI researchers and practitioners on X. While many praised its performance, others questioned how broadly the methods could generalize.
Proponents hailed TRM as proof that small models can outperform giants, calling it “10,000x smaller yet smarter” and a potential step toward architectures that think rather than merely scale.
Critics countered that TRM’s domain is narrow – focused on bounded, grid-based puzzles – and that the computational savings come mainly from model size, not total runtime.
Researcher Yunmin Cha noted that TRM’s training relies on heavy augmentation and recursive steps, “more computing power, same model.”
Cancer geneticist and data scientist Chey Loveday emphasized that TRM is a solver, not a chat model or text generator: it excels at structured reasoning, not open-ended language use.
Machine learning researcher Sebastian Raschka positioned TRM as an important simplification of HRM rather than a new form of general intelligence.
He described its process as “a two-step loop that updates an internal reasoning state and then refines the answer.”
Several researchers, including Augustine Nabel, agreed that the model’s strength lies in its clear reasoning structure, but noted that future work will need to demonstrate transfer to less constrained problem types.
The consensus emerging online is that while TRM is limited, its message is broad: careful recursion, not constant expansion, could drive the next wave of reasoning research.
Looking ahead
Although TRM currently applies to controlled reasoning tasks, its recursive framework opens several future directions. Jolicoeur-Martineau has suggested exploring generative or multi-answer variants, where the model could produce multiple possible solutions instead of a single deterministic one.
Another open question concerns recursion scaling laws – determining how far the ‘less is more’ principle extends as model complexity or data size grows.
Ultimately, the research provides both a practical tool and a conceptual reminder: progress in AI doesn’t have to rely on ever-larger models. Sometimes teaching a small network to think carefully – and recursively – can be more powerful than teaching a large network to think once.




