Multimodal
AI
SHOW-O: A Single Transformer Uniting Multimodal Understanding and Generation
Significant advancements in large language models (LLMs) have inspired the development of multimodal large language models (MLLMs). Early MLLM efforts,…
AI
EAGLE: Exploring the Design Space for Multimodal Large Language Models with a Mixture of Encoders
The ability to accurately interpret complex visual information is a crucial focus of multimodal large language models (MLLMs). Recent work…
AI
MINT-1T: Scaling Open-Source Multimodal Data by 10x
Training frontier large multimodal models (LMMs) requires large-scale datasets with interleaved sequences of images and text in free form. Although…