
Sakana AI's CTO says he's 'absolutely sick' of transformers, the tech that powers every major AI model

In a striking act of self-criticism, one of the architects of the transformer technology that powers ChatGPT, Claude, and virtually every major AI system told an audience of industry leaders this week that artificial intelligence research has become dangerously narrow – and that he is moving beyond his own creation.

Llion Jones, who co-authored the seminal 2017 paper "Attention Is All You Need" and even coined the name "transformer," delivered an unusually candid assessment at the TED AI conference in San Francisco on Tuesday: despite unprecedented investment and talent pouring into AI, the field has hardened around a single architectural approach, potentially blinding researchers to the next big breakthrough.

“Despite the fact that there has never been more interest and resources and money and talent, somehow this has led to the limitation of the research we do,” Jones told the audience. The culprit, he argued, is the “tremendous amount of pressure” from investors demanding returns and researchers straining to stand out in a crowded field.

The warning carries particular weight given Jones' role in AI history. The transformer architecture that he helped develop at Google has become the foundation of the generative AI boom, enabling systems that can write essays, generate images, and hold human-like conversations. His paper has been cited more than 100,000 times, making it one of the most influential computer science publications of the century.

Now, as CTO and co-founder of the Tokyo-based company Sakana AI, Jones is explicitly stepping away from his own creation. "I personally decided at the beginning of this year that I was going to drastically reduce the amount of time I spend on transformers," he said. "I'm now explicitly looking for the next big thing."

Why more AI funding has led to less creative research, according to a transformer pioneer

Jones painted a picture of an AI research community suffering from what he called a paradox: more resources have led to less creativity. He described researchers constantly checking whether they were being scooped by competitors working on identical ideas, and academics choosing safe, publishable projects over risky, potentially transformative ones.

“If you’re doing standard AI research right now, you have to assume that there are maybe three or four other groups doing something similar, or maybe exactly the same thing,” Jones said, describing an environment where “unfortunately this pressure is hurting science because people are rushing their papers and it reduces the amount of creativity.”


He drew an analogy with AI itself: the trade-off between "exploration and exploitation" that determines how algorithms search for solutions. When a system over-exploits and under-explores, it settles on mediocre local solutions while missing superior alternatives. "We're almost certainly in that situation in the AI industry right now," Jones argued.
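The trade-off Jones invokes is a standard one in reinforcement learning. A minimal sketch of it (not anything from Jones' talk) is an epsilon-greedy multi-armed bandit: with a small probability the agent explores a random option, otherwise it exploits its current best estimate. With no exploration at all, it tends to lock onto whichever arm paid off first – the "mediocre local solution":

```python
import random

def run_bandit(epsilon, true_means, steps=2000, seed=0):
    """Epsilon-greedy bandit: with probability epsilon pick a random arm
    (explore); otherwise pick the arm with the best estimate (exploit).
    Returns the average reward per step."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)  # noisy payoff
        counts[arm] += 1
        # incremental running mean of observed rewards for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / steps

# Arm 2 is clearly best, but a pure exploiter can lock onto a worse arm
# as soon as any arm's estimate goes positive.
arms = [0.2, 0.5, 1.0]
greedy = run_bandit(epsilon=0.0, true_means=arms)    # no exploration
balanced = run_bandit(epsilon=0.1, true_means=arms)  # 10% exploration
```

Averaged over many random seeds, the balanced agent reliably out-earns the purely greedy one, even though each individual greedy step looks locally optimal.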

The consequences are sobering. Jones recalled the period just before the advent of transformers, when researchers were endlessly tweaking recurrent neural networks—the previously dominant architecture—for incremental gains. Once transformers arrived, all that work suddenly seemed irrelevant. "How much time do you think those researchers would have spent improving the recurrent neural network if they knew something like transformers was around the corner?" he asked.

He fears the field will repeat that pattern. “I’m afraid we’re in a situation now where we’re just focusing on one architecture and just permuting it and trying different things, when a breakthrough might be just around the corner.”

How the paper "Attention Is All You Need" was born from freedom, not pressure

To make his point, Jones described the conditions that allowed transformers to come into existence in the first place – a stark contrast to today’s environment. The project, he said, was “very organic, bottom-up,” born from “talking over lunch or randomly scribbling on the whiteboard in the office.”

Crucially, "we didn't actually have a good idea, we had the freedom to actually spend time on it and start working on it, and most importantly, we didn't have any pressure from management," Jones said. "No pressure to work on a certain project, publish a certain number of papers, or push up a certain benchmark."

That freedom, Jones suggested, is largely absent today. Even researchers recruited for astronomical salaries – “literally a million dollars a year in some cases” – may feel unable to take risks. “When they start their new roles, do you think they feel empowered to try out their wild ideas and more speculative ideas, or do they feel enormous pressure to prove their worth and go for the low-hanging fruit again?” he asked.


Why an AI lab is betting that research freedom is better than million-dollar salaries

Jones's proposed solution is deliberately provocative: turn up the exploration dial and share findings openly, even at a competitive cost. He acknowledged the irony of his position. "It might sound a little controversial to hear one of the transformer authors stand up on stage and tell you he's completely fed up with them, but that's fair enough, right? I've been working on it longer than anyone else, with the possible exception of seven people."

At Sakana AI, Jones said he's trying to recreate that pre-transformer environment, with nature-inspired research and minimal pressure to chase publications or compete directly with rivals. He offered researchers a mantra from engineer Brian Cheung: "You should only do research that wouldn't happen if you didn't do it."

An example is Sakana's "Continuous Thought Machine," which integrates brain-like synchronization into neural networks. An employee who pitched the idea told Jones that at previous employers or in academic positions he would have faced skepticism and pressure not to waste time on it. At Sakana, Jones gave him a week to explore it. The project became successful enough to be presented at NeurIPS, a major AI conference.

Jones even suggested that freedom matters more than compensation in recruitment. "It's a very, very good way to acquire talent," he said of the exploratory environment. "Think about it: talented, intelligent, ambitious people will naturally seek out this type of environment."

The success of the transformer could hinder AI’s next breakthrough

Perhaps most provocatively, Jones suggested that transformers may be victims of their own success. "The fact that current technology is so powerful and flexible… kept us from looking for better," he said. "It makes sense that if current technology were worse, more people would be looking for something better."

He made it clear that he does not dismiss the ongoing research into transformers. “There is still a lot of important work to be done on today’s technology and that will deliver a lot of value in the coming years,” he said. “I’m just saying that given the amount of talent and resources we have right now, we can afford to do a lot more.”


His ultimate message was one of cooperation over competition. “Really, from my perspective, this is not a competition,” Jones concluded. “We all have the same goal. We all want this technology to advance so we can all benefit from it. So if we can all collectively turn up the discovery dial and then openly share what we find, we can achieve our goal much faster.”

The high stakes of AI’s exploration problem

The comments come at a crucial time for artificial intelligence. The industry is grappling with mounting evidence that simply building larger transformer models may be approaching diminishing returns. Leading researchers have begun to openly debate whether the current paradigm has fundamental limitations, with some suggesting that architectural innovations – and not just scale – will be necessary for continued progress toward more capable AI systems.

Jones' warning suggests that finding these innovations may require dismantling the incentive structures that have driven AI's recent boom. With tens of billions of dollars poured into AI development every year, and intense competition among laboratories encouraging secrecy and rapid publication cycles, the exploratory research environment he described appears increasingly remote.

Yet his insider perspective carries unusual weight. As someone who helped create the technology that now dominates the field, Jones understands both what it takes to achieve breakthrough innovation and what the industry risks by abandoning that approach. His decision to walk away from transformers – the architecture that made his reputation – adds credibility to a message that could otherwise sound like contrarian positioning.

Whether AI power players will heed the call remains uncertain. But Jones offered a stark reminder of what's at stake: the next transformer-scale breakthrough could be just around the corner, pursued by researchers with the freedom to explore. Or it could remain undiscovered as thousands of researchers rush to publish incremental improvements to the architecture that one of its creators is, in his own words, "absolutely sick" of.

After all, he’s been working on transformers longer than anyone else. He would know when it is time to move on.
