Inference
-
AI
AI inference startup Modal Labs in talks to raise at $2.5B valuation, sources say
Modal Labs, a startup specializing in AI inference infrastructure, is talking to venture capital firms about a new round at a…
Read More » -
AI
Microsoft announces powerful new chip for AI inference
Microsoft has announced the launch of its latest chip, the Maia 200, which the company describes as a silicon workhorse…
Read More » -
AI
Sources: Project SGLang spins out as RadixArk with $400M valuation as inference market explodes
A pattern is emerging in the world of AI infrastructure: popular open source tools are transforming into venture-backed startups worth…
Read More » -
AI
Together AI's ATLAS adaptive speculator delivers 400% inference speedup by learning from workloads in real-time
Companies expanding AI implementations are hitting an invisible performance wall. The culprit? Static speculators that cannot keep up with the…
Read More » -
AI
The Hidden Costs of AI: Securing Inference in an Age of Attacks
This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from…
Read More » -
AI
Databricks, Noma Tackle CISOs’ AI Inference Nightmare
Read More » -
AI
AI Inference at Scale: Exploring NVIDIA Dynamo’s High-Performance Architecture
As artificial intelligence (AI) technology progresses, the need for efficient and scalable inference solutions has grown rapidly. It is soon…
Read More » -
AI
NTT Unveils Breakthrough AI Inference Chip for Real-Time 4K Video Processing at the Edge
In a major leap for edge AI processing, NTT Corporation has announced a groundbreaking AI inference chip that can process real-time…
Read More » -
AI
The Best Inference APIs for Open LLMs to Enhance Your AI App
Picture this: you have built an AI app with an incredible idea, but it is a struggle to get it…
Read More » -
AI
Microsoft’s Inference Framework Brings 1-Bit Large Language Models to Local Devices
On October 17, 2024, Microsoft announced BitNet.cpp, an inference framework designed to run 1-bit quantized large language models (LLMs). BitNet.cpp…
Read More »