Meta Launched Llama 4: Multimodal AI Models with Enhanced Architecture

Meta has announced the launch of Llama 4, its latest suite of open-weight artificial intelligence models.

Highlights

Advanced Multimodal Capabilities: Llama 4 introduces Meta’s first use of the Mixture of Experts (MoE) architecture, enabling enhanced reasoning, improved prompt interpretation, and efficient parameter usage across specialized sub-models.

Three Distinct Models: The release includes Llama 4 Scout for long-context reasoning, Maverick for general-purpose tasks with multilingual and creative strengths, and the upcoming Behemoth model—Meta’s largest—with strong performance in STEM domains.

Strategic Adjustments & Content Moderation: Meta has refined Llama 4 to handle politically sensitive topics more responsively and has imposed stricter licensing terms, particularly in the European Union.

Massive Infrastructure Investment: Llama 4 was trained using over 100,000 Nvidia H100 GPUs, reflecting Meta’s significant expansion in AI infrastructure, with the company’s 2024 infrastructure spending projected to reach $40 billion.

Competitive and Organizational Shifts: The launch of Llama 4 is part of Meta’s broader strategic push to stay competitive against leading AI players, even amid leadership changes such as the upcoming resignation of Joelle Pineau.

The release includes three models—Llama 4 Scout, Maverick, and the still-training Behemoth—each designed to expand the Llama model family with improvements in performance, multimodal capabilities, and handling of complex tasks across a wide range of domains.

The models were unveiled over a weekend, signaling a strategic move to respond quickly to global developments in the AI space.

Trained on large volumes of unlabeled text, images, and videos, the Llama 4 series introduces Meta’s first use of the Mixture of Experts (MoE) architecture.

[Figure: MoE architecture breakdown diagram]

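The MoE structure routes each input to a small subset of specialized sub-models, known as experts, instead of running the entire network, which keeps the compute cost per request well below what the total parameter count would suggest. For instance, Maverick is built with 400 billion total parameters spread across 128 expert modules, but only 17 billion of those parameters are active for any given token.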
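To make the routing idea concrete, here is a minimal Python sketch of top-k expert selection. It is an illustration only, not Meta's implementation: the function names, the toy "experts" (simple scaling functions), the random router scores, and the choice of k=1 are all assumptions made for the example. Only the expert count of 128 comes from the article.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the selected experts.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_scores, k=1):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Toy setup: 128 experts (the count the article gives for Maverick).
# Each "expert" is a hypothetical stand-in for a feed-forward sub-network.
NUM_EXPERTS = 128
experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

# In a real model the router scores come from a learned layer; here they
# are random numbers, purely for illustration.
rng = random.Random(0)
router_scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_top_k(router_scores, k=1)
output = moe_layer(2.0, experts, router_scores, k=1)
print(f"experts run: {len(selected)} of {NUM_EXPERTS}, output = {output}")
```

The point of the sketch is that the 127 unselected experts are never evaluated, which is how a model can hold a very large number of parameters while touching only a fraction of them per token.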

Model Overview and Technical Capabilities

Scout: A lightweight model designed for summarization, long-context reasoning, and document analysis. It supports a 10 million-token context window, allowing it to process extensive codebases or texts efficiently, even on a single Nvidia H100 GPU.
Maverick: A general-purpose assistant with strengths in multilingual and creative tasks. It requires a more advanced deployment setup, such as a full Nvidia H100 DGX system.
Behemoth: Still in training, this model is expected to be Meta’s largest and most capable to date, with 288 billion active parameters and nearly two trillion total. Preliminary internal benchmarks suggest strong performance in STEM domains, with competitive results against models like GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro.
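As a rough illustration of what these parameter counts imply, the short Python sketch below computes the share of each model's parameters that is active at once. The figures are the ones quoted in this article; Behemoth's total is rounded to 2,000 billion because the article says only "nearly two trillion", so that result is approximate. Scout is omitted because the article gives no parameter counts for it.

```python
# Parameter counts in billions, as quoted in the article.
models = {
    "Maverick": {"active_b": 17, "total_b": 400},
    # "Nearly two trillion" total; 2000 is a rounded assumption.
    "Behemoth": {"active_b": 288, "total_b": 2000},
}


def active_fraction(active_b, total_b):
    """Fraction of a model's parameters that is active at once."""
    return active_b / total_b


fractions = {name: active_fraction(**p) for name, p in models.items()}

for name, frac in fractions.items():
    print(f"{name}: about {frac:.2%} of parameters active")
```

Under these assumptions, Maverick activates roughly 4 percent of its parameters at a time and Behemoth roughly 14 percent.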

Meta’s internal evaluations indicate that while Maverick performs competitively against many current flagship models, it does not consistently surpass the latest releases from competitors such as Google and OpenAI.

Enhancements in Llama 4 focus not only on performance but also on how the models handle politically sensitive or ideologically charged queries.

Shifts in Content Moderation and Model Alignment

In contrast to earlier versions, the Llama 4 models have been adjusted to reduce refusals to engage with contentious topics.

Meta describes these changes as efforts to ensure the models remain more responsive and balanced, aiming to provide factual, neutral responses without avoiding difficult questions.

The company emphasizes that this approach is part of its broader commitment to minimizing perceived ideological bias.

Licensing and Distribution Restrictions

With the launch of Llama 4, Meta has also introduced stricter licensing terms. The models are restricted from use or distribution within the European Union, likely due to ongoing concerns regarding the region’s AI governance and data privacy frameworks.

Additionally, companies with more than 700 million monthly active users must seek a special license from Meta to access the models, with approvals granted at the company’s discretion.

The models are accessible via Llama.com and platforms such as Hugging Face. Meta AI, the company’s assistant integrated into WhatsApp, Messenger, and Instagram, has already incorporated Llama 4 in more than 40 countries.

However, advanced multimodal capabilities, such as image and video comprehension, are currently limited to U.S.-based users and only available in English.

Development Motivations and Competitive Context

Meta’s development timeline for Llama 4 appears to have been influenced by rising global competition—particularly the emergence of DeepSeek, a Chinese AI developer whose models have received attention for their efficiency and capabilities.

In response, Meta is reported to have formed internal “war rooms” to analyze and replicate aspects of DeepSeek’s performance strategies.

Infrastructure and Investment

The training of Llama 4 relied on a record-breaking infrastructure of over 100,000 Nvidia H100 GPUs, highlighting the scale and ambition of Meta’s AI efforts.

In 2024 alone, the company’s infrastructure spending is projected to reach $40 billion, representing a 42% increase from the previous year. This investment reflects Meta’s long-term commitment to AI leadership and large-scale model development.
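The two figures above also imply a prior-year baseline that the article does not state. The Python sketch below backs it out, assuming the 42% increase is measured against the previous year's total; the function name is illustrative and the result is an estimate derived from the article's numbers, not a reported figure.

```python
def prior_year_spend(current_b, pct_increase):
    """Back out last year's spending from this year's total and the % increase."""
    return current_b / (1 + pct_increase / 100)


current_b = 40.0      # projected 2024 infrastructure spending, in $ billions
pct_increase = 42.0   # year-over-year increase quoted in the article

previous_b = prior_year_spend(current_b, pct_increase)
increase_b = current_b - previous_b

print(f"implied prior-year spending: ${previous_b:.1f}B")
print(f"implied year-over-year increase: ${increase_b:.1f}B")
```

On these numbers, the implied baseline is a little over $28 billion, which puts the year-over-year increase at just under $12 billion.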

Organizational Changes

Amid these developments, Joelle Pineau, head of Meta’s AI research division, has announced her resignation effective May 30, 2025.

Pineau played a key role in the development of foundational tools like the Llama model family, and her departure marks a notable leadership transition during a period of rapid technological advancement for the company.

Model Limitations

Despite its strengths, Llama 4 is not a dedicated “reasoning” model in the style of OpenAI’s reasoning models, which spend extra time checking their own answers to improve factual accuracy and reliability.

Nonetheless, improvements in responsiveness, scalability, and context handling suggest that Meta is positioning the Llama series for increasingly dynamic, real-time interactions.
