Meta has introduced a new generation of large language models, the Llama 4 series. It includes three main models: Scout, Maverick, and Behemoth. These models are designed to compete directly with offerings from OpenAI and Google, such as ChatGPT and Gemini.
Llama 4 Scout is optimized for environments with limited computing resources and, according to Meta, can run on a single Nvidia H100 GPU when quantized to Int4. It features an exceptionally large context window of up to 10 million tokens, making it well-suited for tasks like document summarization and code analysis. According to Meta, Scout outperforms models like Google’s Gemma 3 and Mistral 3.1 in several benchmarks.
Llama 4 Maverick requires more advanced hardware and is tailored for text generation, user interaction, and digital-assistant use. Meta reports that Maverick competes effectively with models such as GPT-4o and DeepSeek-V3, particularly in logic and coding tasks, while using fewer active parameters thanks to its efficient design.
Llama 4 Behemoth, currently in development, is set to be the largest model in the lineup, with nearly two trillion parameters. It is aimed at solving complex problems in fields like mathematics and natural sciences. Meta claims that Behemoth already delivers performance comparable to some of the most advanced models on the market, including GPT-4.5 and Claude 3.7 Sonnet.
All three models are built using a Mixture of Experts (MoE) architecture, meaning that only a subset of the neural network's expert subnetworks is activated for any given token. Because only a fraction of the total parameters does work at each step, this design reduces the compute needed per token compared with a dense model of the same total size.
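To make the idea of partial activation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not Meta's implementation: the function names (`moe_forward`, `make_expert`), the toy gate weights, and the scaling "experts" are all hypothetical, and real models use learned neural layers rather than fixed multipliers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(values, k):
    """Indices of the k largest values, highest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def moe_forward(x, gate_weights, experts, k=1):
    """Route input vector x to the k highest-scoring experts.

    gate_weights: one weight vector per expert (hypothetical router parameters).
    experts: list of callables, each mapping a vector to a same-length vector.
    Returns (output, chosen_indices).
    """
    # Router: one score per expert, here a dot product with the input.
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    chosen = top_k_indices(scores, k)
    # Renormalize the scores over the selected experts only.
    probs = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for p, idx in zip(probs, chosen):
        y = experts[idx](x)  # only the chosen experts are ever evaluated
        output = [o + p * y_j for o, y_j in zip(output, y)]
    return output, chosen


def make_expert(scale, calls, name):
    """Toy expert: scales its input and records that it was invoked."""
    def expert(x):
        calls.append(name)
        return [scale * v for v in x]
    return expert


calls = []  # records which experts actually ran
experts = [make_expert(s, calls, i) for i, s in enumerate([1.0, 2.0, 3.0, 4.0])]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, chosen = moe_forward([0.5, 2.0], gate_weights, experts, k=1)
print(chosen, calls, output)
```

In this toy run, four experts exist but only one is evaluated for the input, which is the source of the compute savings: the cost per input scales with the number of selected experts rather than with the total number of experts.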
Additionally, Meta emphasizes that the Llama 4 models are better equipped to handle socially sensitive and politically controversial topics. Addressing this area was a key focus during development, amid global criticism of AI systems for potential political bias.