Ai2’s Olmo 3 family challenges Qwen and Llama with efficient, open reasoning and customization

With its latest release, the Allen Institute for AI (Ai2) hopes to capitalize on growing demand for custom models and on companies seeking greater transparency into how AI models are built.

Ai2 has made the latest addition to its Olmo family of large language models available to organizations, while continuing to focus on openness and customization.

Olmo 3 has a longer context window, exposes more of its reasoning traces, and is better at coding than the previous iteration. This latest version, like the other Olmo releases, is open source under the Apache 2.0 license, giving companies full transparency into, and control over, training data and checkpoints.

Ai2 will release three versions of Olmo 3:

  • Olmo 3-Think, in both 7B and 32B, which Ai2 considers its most important reasoning models for advanced research

  • Olmo 3-Base, also in 7B and 32B, which is suited to programming, comprehension, mathematics and reasoning over long contexts. Ai2 said this version is “ideal for continued pre-training or refinement”

  • Olmo 3-Instruct, in 7B, which is optimized for instruction following, multi-turn dialogue and tool use

The company said Olmo 3-Think is the “first ever fully open 32B thinking model that generates explicit reasoning chain-style content.” Olmo 3-Think also has a long context window of 65,000 tokens, well suited to longer-running agentic projects or reasoning over longer documents.

Noah Smith, Ai2’s senior director of NLP research, told VentureBeat in an interview that many of its customers, from regulated enterprises to research institutions, want to use models that give them certainty about what went into the training.

“The releases from our friends in the tech world are very cool and super exciting, but there are a lot of people for whom data privacy, controls on what goes into the model, how the models are trained, and other limitations on how the model can be used are front of mind,” said Smith.

Developers can access the models on Hugging Face and the Ai2 Playground.
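For developers who want to try the models locally, the release can be pulled through the standard Hugging Face tooling. The sketch below is a minimal, hedged example using the `transformers` library; the repository id is an assumption for illustration only, so check Ai2's `allenai` organization page on Hugging Face for the exact model names before use.

```python
# Minimal sketch: loading an Olmo 3 checkpoint from Hugging Face with the
# transformers library. REPO_ID is an assumed name for illustration --
# verify the exact id on Ai2's "allenai" Hugging Face organization page.
REPO_ID = "allenai/Olmo-3-7B-Instruct"  # hypothetical id; verify before use


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Lazily load the model and tokenizer, then complete `prompt`."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the full checkpoint on first run; requires substantial
    # disk space and memory for a 7B-parameter model.
    print(generate("Explain why open training data matters."))
```

Because the weights and tokenizer ship under Apache 2.0, the same loading path works as a starting point for the fine-tuning and continued pre-training workflows the article describes.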

Transparency and customization

Smith said the company believes any organization using models like Olmo 3 should be able to control them and mold them in the way that works best for it.

“We don’t believe in one-size-fits-all solutions,” said Smith. “In the world of machine learning, it is well known that if you try to build a model that solves all problems, it ultimately isn’t really the best model for any problem. There is no formal evidence for this, but it is something that old-timers like me have observed.”

He added that models with the ability to specialize “may not be as flashy as getting high scores on math exams,” but offer more flexibility for companies.

With Olmo 3, companies can essentially retrain the model by adjusting the data mix it learns from. The idea is that companies can use their own data to steer the model toward answering specific business questions. To help companies through this process, Ai2 has released checkpoints from each major training phase.

The demand for model customization has grown as companies that cannot build their own LLMs look to create company-specific or industry-specific models. Startups like Arcee have started offering enterprise-oriented, adaptable small models.

Models like Olmo 3 also give companies more confidence in the technology, according to Smith. Because Olmo 3 provides the training data, Smith says companies can trust that the model hasn’t included anything it shouldn’t.

Ai2 has long positioned itself as committed to greater transparency; in April it launched a tool called OlmoTrace, which allows a model’s output to be traced directly back to the original training data. The company releases open source models and publishes the code in repositories such as GitHub so that anyone can use it.

Competitors such as Google and OpenAI have drawn criticism from developers for hiding raw reasoning traces and offering only summaries of the reasoning instead; developers complained they were left “blind debugging” without transparency.

Ai2 pre-trained Olmo 3 on Dolma 3, an open source dataset of six trillion tokens that includes web data, scientific literature and code. Smith said the team optimized Olmo 3 for code, whereas Olmo 2 focused on math.

How it stacks up

Ai2 claims that the Olmo 3 family represents a significant leap forward for truly open source models, at least among open source LLMs developed outside of China. The base Olmo 3 model was trained “with roughly 2.5x greater compute efficiency, as measured by GPU hours per token,” meaning it consumed less energy and cost less during pre-training.

The company said the Olmo 3 models outperformed other open models, such as Stanford’s Marin, LLM360’s K2 and Apertus, although Ai2 did not provide figures for the benchmark tests.

“Note that Olmo 3-Think (32B) is the strongest fully open reasoning model, narrowing the gap to the best open-weight models of similar scale, such as the Qwen 3-32B-Thinking series models in our suite of reasoning benchmarks, all while being trained on 6x fewer tokens,” Ai2 said in a press release.

The company added that Olmo 3-Instruct outperformed Qwen 2.5, Gemma 3 and Llama 3.1.
