
Multiverse Computing pushes its compressed AI models into the mainstream

With default rates among private companies above 9.2% – the highest level in years – VC firm Lux Capital recently advised companies that rely on AI to get their computing capacity commitments confirmed in writing. With financial instability rippling through the AI supply chain, Lux warned, a handshake agreement is not enough.

But there is another option: not depending on external compute infrastructure at all. Smaller AI models that run directly on the user’s own device – no data center, no cloud provider, no counterparty risk – are becoming good enough to be worth considering. And Multiverse Computing is raising its hand.

The Spanish startup has so far maintained a lower profile than some of its peers, but as demand for AI efficiency grows, this is changing. After compressing models from major AI labs including OpenAI, Meta, DeepSeek and Mistral AI, it has launched both an app showcasing the capabilities of its compressed models and an API portal – a gateway that allows developers to access and build with those models – making them more widely available.

The CompactifAI app, which shares its name with Multiverse’s quantum-inspired compression technology, is an AI chat tool along the lines of ChatGPT or Mistral’s Le Chat. Ask a question and the model answers. The difference is that Multiverse has built in Gilda, a model so small that the company says it can run locally and offline.

For end users, this is a taste of AI at the edge, with data that doesn’t leave their devices and doesn’t require a connection. But there is a caveat: their mobile devices must have enough RAM and storage space. If not (and many older iPhones don’t), the app falls back to cloud-based models via API. The routing between local and cloud processing is handled automatically by a system Multiverse calls Ash Nazg, a name that will ring a bell for Tolkien fans: it references the One Ring inscription in “The Lord of the Rings.” But when the app moves to the cloud, it loses its most important privacy benefit.
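The capability check and fallback described above can be sketched roughly as follows. Multiverse has not published how Ash Nazg actually decides, so every function name and threshold here is an illustrative assumption, not the company's implementation:

```python
# Hypothetical sketch of local-vs-cloud routing based on device resources.
# The thresholds are placeholders; a real router would use the actual
# memory and disk footprint of the on-device model (e.g. Gilda).

def choose_backend(available_ram_mb: int, free_storage_mb: int,
                   model_ram_mb: int = 2048, model_disk_mb: int = 1500) -> str:
    """Return 'local' if the device can hold and run the on-device model,
    otherwise 'cloud' (which forfeits the offline/privacy benefit)."""
    if available_ram_mb >= model_ram_mb and free_storage_mb >= model_disk_mb:
        return "local"
    return "cloud"

print(choose_backend(4096, 8000))  # a capable device stays local
print(choose_backend(1024, 8000))  # an older, RAM-constrained phone falls back
```

The key design point is that the fallback is silent to the user, which is exactly why the privacy caveat in the paragraph above matters: the same prompt may or may not leave the device depending on hardware.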


These limitations mean that CompactifAI isn’t quite ready for mass consumer adoption, though that may never have been the goal. According to data from Sensor Tower, the app saw fewer than 5,000 downloads in the past month.

The real target is corporations. Today Multiverse is launching a self-service API portal that gives developers and enterprises direct access to the compressed models – no AWS Marketplace required.


“The CompactifAI API portal gives developers direct access to compressed models with the transparency and control needed to put them into production,” CEO Enrique Lizaso said in a statement.

Real-time usage monitoring is one of the key features of the API, and that’s no coincidence. In addition to the potential benefits of deploying at the edge, lower computing costs are one of the top reasons why companies are considering smaller models as an alternative to large language models (LLMs).

It also helps that small models are less limited than they used to be. Earlier this week, Mistral updated its small model family with the launch of Mistral Small 4, which is said to be optimized for general chat, coding, agentic tasks and reasoning simultaneously. The French company also released Forge, a system that allows companies to build custom models, including small ones, letting them choose the tradeoffs their use cases can best tolerate.

Multiverse’s recent results also suggest that the gap with LLMs is closing. Its latest compressed model, HyperNova 60B, is built on gpt-oss-120b, an open-weight OpenAI model. The company claims it now delivers faster responses at a lower cost than the original from which it is derived, a benefit that is especially important for agentic coding workflows, where AI autonomously completes complex, multi-step programming tasks.


Making models small enough to work on mobile devices and still remain usable is a major challenge. Apple Intelligence has gotten around that problem by combining an on-device model and a cloud model. Multiverse’s CompactifAI app can also route requests to gpt-oss-120b via API, but its main purpose is to show that local models like Gilda and its future replacements have benefits beyond cost savings.

For workers in critical areas, a model that can run locally and without a connection to the cloud offers greater privacy and resiliency. But the greater value lies in the business use cases it can deliver – for example, embedding AI in drones, satellites and other settings where connectivity cannot be taken for granted.

The company already serves more than 100 global clients, including the Bank of Canada, Bosch and Iberdrola, but expanding its customer base could help unlock more financing. After raising a $215 million Series B last year, it is now rumored to be raising a new €500 million round at a valuation of over €1.5 billion.
