Tokyo-based research startup Sakana AI has open-sourced a novel algorithm designed to enable multiple AI models to work together on complex reasoning tasks.
Highlights
- AB-MCTS Algorithm: Adaptive Branching Monte Carlo Tree Search allows multiple AI models to collaborate during inference by dynamically choosing which model handles each step based on context and capability.
- Intelligent Model Switching: Uses Thompson Sampling to assign specific reasoning tasks to the best-suited model, enabling deeper or broader thinking as needed.
- Collaborative Gains: In ARC-AGI-2 benchmarks, model combinations using AB-MCTS outperformed individual models, solving 27.5% of tasks compared to 23% for o4-mini alone.
- Open Source Toolkit: Released under Apache 2.0 license, the TreeQuest toolkit includes full AB-MCTS implementation, model adapters, and benchmark scripts on GitHub.
- Evolutionary Roots: Builds on Sakana AI’s 2024 work in evolutionary model merging—shifting from model “creation” at training time to model “coordination” at runtime.
- Real-Time Efficiency: Enables smaller and mid-sized models to outperform larger ones through division of cognitive labor, boosting both accuracy and computational efficiency.
The method, called Adaptive Branching Monte Carlo Tree Search (AB-MCTS), offers a new approach to collaborative inference by dynamically selecting not only how to reason—deeper or broader—but also which model is best suited for each step of the problem.
AB-MCTS
Unlike traditional ensemble methods that rely on fixed voting mechanisms or average outputs, AB-MCTS selects from a pool of AI models at inference time, directing specific sub-tasks to the most suitable model based on its strengths.
This allows for real-time collaboration between models such as Gemini 2.5 Pro, o4-mini, and DeepSeek-R1, with the goal of enhancing performance, improving decision diversity, and optimizing resource usage.
The algorithm builds on Monte Carlo Tree Search (MCTS), long used in AI planning, by adding two key innovations:
- Adaptive Depth and Breadth Reasoning: AB-MCTS chooses whether to “think deeper” (refine current outputs) or “think wider” (explore new possibilities).
- Model-Level Selection: A Bayesian sampling strategy (specifically Thompson Sampling) determines which AI model to use at each decision branch, allowing for strategic model switching and task assignment.
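To make the model-selection idea concrete, here is a minimal, self-contained sketch of Beta-Bernoulli Thompson Sampling over a pool of candidate models. The model names, success rates, and reward simulation below are illustrative assumptions, not Sakana AI's actual TreeQuest implementation; the point is only to show how posterior sampling steers work toward the model that performs best.

```python
import random

class ThompsonSelector:
    """Beta-Bernoulli Thompson Sampling over a pool of candidate models."""

    def __init__(self, models):
        # One (successes, failures) count per model, starting at Beta(1, 1).
        self.stats = {m: [1, 1] for m in models}

    def pick(self):
        # Sample a plausible success rate from each model's posterior
        # and route this step to the model with the highest draw.
        draws = {m: random.betavariate(a, b) for m, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, model, success):
        # Fold the observed outcome back into the chosen model's posterior.
        a, b = self.stats[model]
        self.stats[model] = [a + success, b + (1 - success)]


selector = ThompsonSelector(["model_a", "model_b", "model_c"])

# Simulate 500 reasoning steps; "model_b" is (by assumption) the strongest.
true_rates = {"model_a": 0.3, "model_b": 0.7, "model_c": 0.5}
counts = {m: 0 for m in true_rates}
random.seed(0)
for _ in range(500):
    m = selector.pick()
    counts[m] += 1
    selector.update(m, int(random.random() < true_rates[m]))

# Most steps should be routed to the strongest model.
print(max(counts, key=counts.get))
```

Because the selector explores early and exploits later, it needs no hand-tuned schedule: uncertainty in the Beta posteriors naturally decides when to keep trying a weaker model and when to commit to a stronger one.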
Performance on ARC-AGI-2 Benchmark
The algorithm was evaluated using the ARC-AGI-2 benchmark, which tests complex reasoning across a variety of abstract tasks. In one test:
- o4-mini alone solved 23% of the tasks.
- When combined with Gemini 2.5 Pro and DeepSeek-R1-0528 via AB-MCTS, the system solved 27.5%, showcasing the benefits of distributed cognitive load and collaborative inference—even without scaling to massive model sizes.

This result demonstrates that an intelligent combination of smaller or mid-sized models can outperform a single, larger model in certain scenarios, especially where interpretability, adaptability, and computational efficiency are priorities.
Open-Source Release and Toolkit
Sakana AI has made AB-MCTS fully open source under the Apache 2.0 license, along with its associated tools:
- TreeQuest Toolkit: A complete implementation of AB-MCTS and its multi-LLM extension.
- Benchmark Scripts: Reproducible code for ARC-AGI-2 experiments.
- Model Configuration Files: For integrating different language models into the AB-MCTS framework.
Developers and researchers can access the codebase via Sakana AI’s GitHub repository.
Building on Evolutionary Model Merging
AB-MCTS represents a practical extension of Sakana AI’s earlier work on evolutionary model merging, a technique introduced in 2024 that explored combining model capabilities to create novel behaviors.
While that work focused on training-time integration (“mixing to create”), AB-MCTS brings the concept to inference time (“mixing to use”), allowing dynamic orchestration of models as if they were a team of specialists.
Features at a Glance
1. Real-Time Model Selection
Each reasoning step is assigned to the most appropriate model, optimizing both performance and compute usage.
2. Multi-Directional Search
Supports both refinement and exploration within a flexible search tree structure.
3. Strong Benchmark Performance
Outperforms single-model baselines on ARC-AGI-2, especially on nuanced reasoning tasks.
4. Full Open Source Access
Includes the TreeQuest implementation, model adapters, and full experiment documentation.
5. Foundation for Collective Intelligence
Suggests a paradigm shift from monolithic LLMs to collaborative AI teams working in tandem.
Sakana AI’s approach challenges the idea of “one model to rule them all.” Instead, it proposes a future where different models, each with distinct capabilities, contribute collaboratively—similar to how human teams divide labor based on expertise.