Multimodal
-
AI
Z.ai debuts open source GLM-4.6V, a native tool-calling vision model for multimodal reasoning
Chinese AI startup Zhipu AI, aka Z.ai, has released its GLM-4.6V series, a new generation of open-source vision language models (VLMs)…
-
AI
New training method boosts AI multimodal reasoning with smaller, smarter datasets
Researchers from MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the capabilities of language…
-
AI
Google unveils Gemini 3 claiming the lead in math, science, multimodal and agentic AI benchmarks
After more than a month of rumors and feverish speculation — including Polymarket wagering on the release date — Google…
-
AI
Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini
Baidu Inc., China’s largest search engine company, released a new artificial intelligence model on Monday that its developers claim outperforms…
-
AI
How Patronus AI’s Judge-Image is Shaping the Future of Multimodal AI Evaluation
Multimodal AI is transforming the field of artificial intelligence by combining different types of data, such as text, images, video and…
-
AI
Inside OpenAI’s o3 and o4‑mini: Unlocking New Possibilities Through Multimodal Reasoning and Integrated Toolsets
On April 16, 2025, OpenAI released improved versions of its advanced reasoning models. These new models, called o3 and o4-mini,…
-
AI
Beyond Manual Labeling: How ProVision Enhances Multimodal AI with Automated Data Synthesis
Artificial intelligence (AI) has transformed industries, making processes more intelligent, faster and more efficient. The quality of the data used to train…
-
AI
SHOW-O: A Single Transformer Uniting Multimodal Understanding and Generation
Significant advancements in large language models (LLMs) have inspired the development of multimodal large language models (MLLMs). Early MLLM efforts,…

