Baidu has announced the open-source release of its Ernie 4.5 series of LLMs, alongside a suite of development toolkits designed to support both research and commercial AI applications.
Highlights
- Major Open-Source Release: Baidu has open-sourced its Ernie 4.5 series of large language models (LLMs), making them available on Hugging Face and GitHub ahead of schedule.
- Model Variety: The suite includes 10 models: multimodal vision-language models, Mixture-of-Experts (MoE) models for efficiency, and reasoning-focused LLMs—catering to both research and commercial use cases.
- Efficiency-Focused Architecture: Baidu’s MoE models come in variants with 47B and 3B active parameters (for example, Ernie-4.5-300B-A47B and Ernie-4.5-21B-A3B), balancing power with cost-effective deployment.
- High-Scale Model: The Ernie-4.5-424B model tops the range with 424 billion parameters, signaling Baidu’s entry into ultra-large model territory.
- Strong Internal Benchmarks: Baidu claims its models outperform comparable offerings from DeepSeek and Alibaba in key areas like reasoning and multimodal tasks—though independent testing is still awaited.
- Developer Tools: The release includes ErnieKit for pretraining, fine-tuning, and optimization, plus FastDeploy for deployment across GPUs, CPUs, FPGAs, and HPC setups.
- Training Innovations: Technical advancements include heterogeneous MoE design, FP8 mixed-precision training, expert parallelism, and memory-efficient scheduling strategies.
- Apache 2.0 Licensing: All models and toolkits are released under the permissive Apache 2.0 license, encouraging both research and commercial adoption without restrictive terms.
The release arrives ahead of the company’s previously stated timeline, making the models and toolkits available on platforms such as Hugging Face and GitHub.
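For developers who want to try the checkpoints directly, the sketch below loads one of the smaller variants with the Hugging Face transformers library. The repository id is an assumption and should be checked against Baidu’s actual Hugging Face listing; device_map="auto" also requires the accelerate package.

```python
# Minimal sketch: loading a small Ernie 4.5 checkpoint from Hugging Face.
# The repo id below is an assumption -- verify it against Baidu's listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baidu/ERNIE-4.5-0.3B-PT"  # hypothetical id; substitute the real one

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # model code may ship with the repository
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # place weights on available GPUs/CPU (needs accelerate)
)

prompt = "Explain Mixture-of-Experts models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```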
Overview of the Ernie 4.5 Model Suite
The Ernie 4.5 release includes ten distinct model variants, spanning a wide range of use cases and parameter sizes. The lineup covers the following categories, which overlap, so the per-category counts sum to more than ten:
- Four multimodal vision-language models, designed for tasks that require combined visual and textual understanding
- Eight Mixture-of-Experts (MoE) models, offering enhanced computational efficiency by activating only a portion of the total parameters during inference
- Two models focused on reasoning and problem-solving tasks
Among these, five models are post-trained, while the others remain in a pre-trained state, giving developers flexibility for fine-tuning and downstream task adaptation.
Architectural Focus
Baidu’s engineering approach emphasizes both scalability and efficiency. Notably, the MoE models activate only a fraction of their total parameters per token: for instance, Ernie-4.5-300B-A47B activates 47 billion of its 300 billion parameters during any given inference, while Ernie-4.5-21B-A3B activates just 3 billion of its 21 billion, a design choice aimed at lowering operational costs while maintaining strong performance.
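To make the "active parameters" idea concrete, the toy sketch below shows a top-k routed MoE layer in PyTorch: each token is dispatched to only a couple of experts, so only a fraction of the layer's parameters participate in any single forward pass. The dimensions and expert count are illustrative and are not Baidu’s configuration.

```python
# Toy top-k Mixture-of-Experts layer: only k of n_experts run per token,
# which is why the "active" parameter count is far below the total.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                           # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # route each token to k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # only k experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```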
The largest model in this release, Ernie-4.5-424B, carries 424 billion parameters, underscoring Baidu’s commitment to competing in the upper tier of LLM development.
All models are built using Baidu’s PaddlePaddle deep learning framework, an ecosystem increasingly positioned as a homegrown alternative to TensorFlow and PyTorch for Chinese developers.
Performance Benchmarks and Internal Testing Results
In internal benchmark testing:
- The Ernie-4.5-300B-A47B-Base model reportedly outperforms DeepSeek-V3-671B-A37B-Base across 22 of 28 standard benchmarks, including reasoning and cross-modal tasks.
- The more compact Ernie-4.5-21B-A3B-Base model is said to exceed Alibaba’s Qwen3-30B-A3B-Base in math and reasoning tasks, despite having roughly 30% fewer parameters.
These performance metrics are based on Baidu’s own evaluations, and independent third-party testing will help further validate these claims.
ErnieKit and FastDeploy
In addition to the models themselves, Baidu has released ErnieKit, a dedicated development toolkit for the Ernie 4.5 series. The toolkit includes support for the following (a generic LoRA sketch follows the list):
- Pre-training
- Supervised Fine-Tuning (SFT)
- Low-Rank Adaptation (LoRA)
- Direct Preference Optimization (DPO)
- Quantization techniques
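As a flavor of what adapter-based fine-tuning looks like, here is a minimal LoRA sketch using Hugging Face’s peft library rather than ErnieKit itself; the model id and target module names are assumptions, and ErnieKit’s own workflow is documented in Baidu’s repository.

```python
# Generic LoRA illustration with the peft library -- not ErnieKit itself.
# Model id and target_modules are assumptions; adjust for the real checkpoint.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "baidu/ERNIE-4.5-0.3B-PT",  # hypothetical repo id
    trust_remote_code=True,
)
lora_cfg = LoraConfig(
    r=16,                                  # low-rank dimension
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
```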
For deployment, Baidu is offering FastDeploy, enabling deployment across GPUs, CPUs, FPGAs, and HPC environments, including low-bit quantized configurations.
The toolchain supports FP8 mixed-precision training and 4-bit/2-bit quantization, streamlining the path from model training to production deployment.
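To illustrate what low-bit quantization means in principle, the sketch below applies symmetric 4-bit round-to-nearest quantization to a weight tensor; FastDeploy’s production kernels and calibration methods are considerably more sophisticated.

```python
# Conceptual sketch of symmetric 4-bit round-to-nearest weight quantization.
# Real deployments pack two 4-bit values per byte and calibrate per channel/group.
import torch

def quantize_4bit(w: torch.Tensor):
    scale = w.abs().max() / 7.0                       # signed 4-bit range is [-8, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7)
    return q.to(torch.int8), scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.float() * scale

w = torch.randn(4, 4)
q, s = quantize_4bit(w)
print((w - dequantize(q, s)).abs().max())  # quantization error stays small
```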
Technical Innovations in Training and Architecture
Baidu’s technical documentation highlights several innovations that contributed to the Ernie 4.5 series:
- Heterogeneous MoE design for efficient multimodal learning
- Intra-node expert parallelism for improved training speed
- Memory-efficient pipeline scheduling
- FP8 mixed-precision training for reduced compute overhead
- Fine-grained recomputation strategies to optimize resource utilization (a generic illustration follows below)
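Of these, activation recomputation is the easiest to illustrate in a few lines. The sketch below uses PyTorch’s generic gradient-checkpointing utility, not Baidu’s fine-grained strategy: wrapped blocks discard their activations in the forward pass and recompute them during backpropagation, trading extra compute for lower peak memory.

```python
# Generic activation recomputation (gradient checkpointing) with PyTorch --
# a stand-in illustration, not Baidu's specific scheduling strategy.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

blocks = nn.ModuleList(
    nn.Sequential(nn.Linear(256, 256), nn.GELU()) for _ in range(4)
)

def forward_with_recompute(x):
    for block in blocks:
        # use_reentrant=False is the recommended modern code path
        x = checkpoint(block, x, use_reentrant=False)
    return x

x = torch.randn(8, 256, requires_grad=True)
loss = forward_with_recompute(x).sum()
loss.backward()        # activations inside each block are recomputed here
print(x.grad.shape)    # gradients flow as usual, at lower peak memory
```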
Open Access Under Apache 2.0 License
All models and associated toolkits are released under the Apache 2.0 license, allowing developers, researchers, and commercial entities to use, modify, and deploy the models without restrictive licensing terms.
This move aligns with Baidu’s broader strategy to engage more openly with the global AI research community, following similar open-sourcing efforts by other leading AI labs.