Meta releases Llama 4, a new crop of flagship AI models

Meta has released Llama 4, a new collection of AI models in its Llama family – on a Saturday, no less.
There are four new models in total: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. All were trained on “large amounts of unlabeled text, image, and video data” to give them “broad visual understanding,” Meta says.
The success of open models from Chinese AI lab DeepSeek, which perform on par with or better than Meta’s previous flagship Llama models, reportedly kicked Llama development into overdrive. Meta is said to have scrambled war rooms to decipher how DeepSeek reduced the cost of running and deploying models such as R1 and V3.
Scout and Maverick are openly available on Llama.com and from Meta’s partners, including the AI dev platform Hugging Face, while Behemoth is still in training. Meta says that Meta AI, its AI-powered assistant across apps including WhatsApp, Messenger, and Instagram, has been updated to use Llama 4 in 40 countries. Multimodal features are currently limited to the U.S. in English.
Some developers may take issue with the Llama 4 license.
Users and companies “domiciled” or with a “principal place of business” in the EU are prohibited from using or distributing the models, likely the result of governance requirements imposed by the region’s AI and data privacy laws. (In the past, Meta has described these laws as overly burdensome.) In addition, as with earlier Llama releases, companies with more than 700 million monthly active users must request a special license from Meta, which Meta can grant or deny at its sole discretion.
“These Llama 4 models mark the beginning of a new era for the Llama ecosystem,” Meta wrote in a blog post. “This is just the beginning for the Llama 4 collection.”

Meta says that Llama 4 is its first cohort of models to use a mixture-of-experts (MoE) architecture, which is more computationally efficient for training and answering queries. In essence, MoE architectures break data-processing tasks down into subtasks and then delegate them to smaller, specialized “expert” models.
Maverick, for example, has 400 billion total parameters, but only 17 billion active parameters spread across 128 “experts.” (Parameters roughly correspond to a model’s problem-solving skills.) Scout has 17 billion active parameters, 16 experts, and 109 billion total parameters.
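To make the “only some parameters are active” idea concrete, here is a toy sketch of MoE-style routing – purely illustrative, not Meta’s actual implementation. A small “router” scores each expert for a given input, and only the top-scoring experts actually run, so most of the model’s parameters sit idle on any one query:

```python
# Toy mixture-of-experts routing sketch (illustrative only; the experts,
# router weights, and top-k blending here are invented for demonstration).
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, router_weights, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend outputs."""
    # The router scores each expert via a simple dot product with the input.
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in router_weights]
    probs = softmax(scores)
    # Only the top_k experts run; the rest are skipped entirely.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    # Blend the selected experts' outputs, weighted by router probability.
    return sum(probs[i] / norm * experts[i](x) for i in top)

# Four tiny "experts": each is just a different function of the input.
experts = [
    lambda x: sum(x),           # expert 0
    lambda x: max(x),           # expert 1
    lambda x: min(x),           # expert 2
    lambda x: sum(x) / len(x),  # expert 3
]
router_weights = [[0.5, -0.2], [0.1, 0.9], [-0.3, 0.4], [0.7, 0.2]]

out = moe_forward([1.0, 2.0], experts, router_weights, top_k=2)
print(out)
```

In a real MoE transformer the experts are feed-forward sub-networks and routing happens per token per layer, but the principle is the same: total parameter count and per-query compute are decoupled, which is how Maverick can hold 400 billion parameters while activating only 17 billion at a time.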
According to Meta’s internal tests, Maverick, which the company says is best for “general assistant and chat” use cases such as creative writing, beats models like OpenAI’s GPT-4o and Google’s Gemini 2.0 on certain coding, reasoning, multilingual, long-context, and image benchmarks. However, Maverick doesn’t quite measure up to more capable recent models such as Google’s Gemini 2.5 Pro, Anthropic’s Claude 3.7 Sonnet, and OpenAI’s GPT-4.5.
Scout’s strengths lie in tasks such as document summarization and reasoning over large codebases. Uniquely, it has a very large context window: 10 million tokens. (“Tokens” represent bits of raw text – e.g., the word “fantastic” split into “fan,” “tas,” and “tic.”) In plain English, Scout can take in images and up to millions of words, allowing it to process and work with extremely long documents.
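A quick back-of-the-envelope calculation shows why a 10-million-token window translates to “millions of words.” The words-per-token ratio below is a common rule of thumb for English text, not a figure from Meta; real ratios vary by tokenizer and content:

```python
# Rough sketch: how much English text fits in a 10-million-token context
# window, assuming ~0.75 words per token (a rule-of-thumb assumption,
# not a Meta-published figure).
CONTEXT_TOKENS = 10_000_000
WORDS_PER_TOKEN = 0.75  # assumed average; varies by tokenizer and text

approx_words = int(CONTEXT_TOKENS * WORDS_PER_TOKEN)
print(f"~{approx_words:,} words")
```

Under that assumption, the window holds on the order of 7–8 million words – dozens of full-length novels’ worth of text in a single prompt.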
Scout can run on a single Nvidia H100 GPU, while Maverick requires an Nvidia H100 DGX system or equivalent, according to Meta’s calculations.
The unreleased Behemoth needs even beefier hardware. According to the company, Behemoth has 288 billion active parameters, 16 experts, and nearly two trillion total parameters. Meta’s internal benchmarking has Behemoth outperforming GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro (but not 2.5 Pro) on several evaluations measuring STEM skills, such as math problem solving.
Note that none of the Llama 4 models is a proper “reasoning” model along the lines of OpenAI’s o1 and o3-mini. Reasoning models fact-check their answers and generally respond to questions more reliably, but as a consequence take longer than traditional, “non-reasoning” models to deliver answers.

Interestingly, Meta says that it has tuned all Llama 4 models to refuse to answer “contentious” questions less often. According to the company, Llama 4 responds to “debated” political and social topics that the previous crop of Llama models wouldn’t. In addition, the company says, Llama 4 is “dramatically more balanced” in which prompts it flat-out declines to entertain.
“[Y]ou can count on [Llama 4] to provide helpful, factual responses without judgment,” a Meta spokesperson told TechCrunch. “[W]e’re continuing to make Llama more responsive so that it answers more questions, can respond to a variety of different viewpoints […] and doesn’t favor some views over others.”
Those tweaks come as some White House allies accuse AI chatbots of being politically “woke.”
Many of President Donald Trump’s closest confidants, including billionaire Elon Musk and crypto and AI “czar” David Sacks, have claimed that popular AI chatbots censor conservative views. Sacks has historically singled out OpenAI’s ChatGPT as “programmed to be woke” and untruthful about political subjects.
In reality, bias in AI is a persistent technical problem. Musk’s own AI company, xAI, has struggled to create a chatbot that doesn’t endorse some political views over others.
That hasn’t stopped companies, including OpenAI, from adjusting their AI models to answer more questions than they would have previously – in particular, questions touching on controversial topics.