Multimodal

AI

admin, December 9, 2025
7

Z.ai debuts open source GLM-4.6V, a native tool-calling vision model for multimodal reasoning

Chinese AI startup Zhipu AI, aka Z.ai, has released its GLM-4.6V series, a new generation of open-source vision language models (VLMs)…
AI

admin, December 3, 2025
6

New training method boosts AI multimodal reasoning with smaller, smarter datasets

Researchers from MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the capabilities of language…
AI

admin, November 19, 2025
3

Google unveils Gemini 3 claiming the lead in math, science, multimodal and agentic AI benchmarks

After more than a month of rumors and feverish speculation — including Polymarket wagering on the release date — Google…
AI

admin, November 12, 2025
3

Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini

Baidu Inc., China’s largest search engine company, released a new artificial intelligence model on Monday that its developers claim outperforms…
AI

admin, April 29, 2025
0

How Patronus AI’s Judge-Image is Shaping the Future of Multimodal AI Evaluation

Multimodal AI transforms the field of artificial intelligence by combining different types of data, such as text, images, video and…
AI

admin, April 21, 2025
6

Inside OpenAI’s o3 and o4‑mini: Unlocking New Possibilities Through Multimodal Reasoning and Integrated Toolsets

On April 16, 2025, OpenAI released improved versions of its advanced reasoning models. These new models, called o3 and o4-mini,…
AI

admin, March 16, 2025
0

Meta AI’s MILS: A Game-Changer for Zero-Shot Multimodal AI

For years, artificial intelligence (AI) has made impressive progress, but it has always had a fundamental limitation in its inability…
AI

admin, February 18, 2025
2

Beyond Manual Labeling: How ProVision Enhances Multimodal AI with Automated Data Synthesis

Artificial intelligence (AI) has transformed industries, making processes more intelligent, faster and more efficient. The quality of the data used to train…
AI

admin, October 12, 2024
5

SHOW-O: A Single Transformer Uniting Multimodal Understanding and Generation

Significant advancements in large language models (LLMs) have inspired the development of multimodal large language models (MLLMs). Early MLLM efforts,…
AI

admin, September 27, 2024
0

Meta’s Llama 3.2: Redefining Open-Source Generative AI with On-Device and Multimodal Capabilities

Meta’s recent launch of Llama 3.2, the latest iteration in the Llama family of large language models, is a significant development…