OpenAI has released two new open-weight large language models, gpt-oss-120b and gpt-oss-20b, marking its first major open-weight release since GPT-2 more than five years ago.
Highlights
- First open-weight release in 5+ years: OpenAI publishes gpt-oss-120b and gpt-oss-20b under Apache 2.0, allowing full commercial use.
- Scalable across hardware tiers: 120b is optimized for a single H100 GPU, while 20b runs on laptops with 16GB RAM — democratizing access.
- Mixture-of-Experts (MoE) architecture: Both models activate a subset of parameters per token for efficient inference and reasoning.
- Competitive performance: Outperforms DeepSeek’s R1 and Qwen in some benchmarks, though still behind OpenAI’s own o-series models.
- High hallucination rates: The 120b and 20b models hallucinate on 49% and 53% of PersonQA questions, respectively, a tradeoff for openness and accessibility.
- Agentic capabilities: Supports tool use, chain-of-thought prompting, structured outputs, and adjustable reasoning — ideal for building autonomous AI agents.
- Safety-first release: Underwent evaluations under OpenAI's Preparedness Framework, which found the models fall below the capability thresholds that would have warranted restricting release.
- Available via major platforms: Deployable through Hugging Face, Amazon Bedrock, Azure AI Foundry, and Amazon SageMaker.
- Strategic shift: Signals OpenAI’s response to global competition and a pivot back toward open-source collaboration.
- Altman’s new stance: CEO Sam Altman admits OpenAI was “on the wrong side of history” regarding openness — now aiming to empower developers globally.
Both models are now available for download via Hugging Face under the permissive Apache 2.0 license.
Differences and Capabilities
The two models differ in scale, hardware requirements, and intended use.
- gpt-oss-120b is a large-scale model designed to run efficiently on a single NVIDIA H100 GPU. It uses a Mixture-of-Experts (MoE) architecture that activates roughly 5.1B parameters per token, delivering strong performance on reasoning tasks with efficient inference.
- gpt-oss-20b, a smaller variant, is optimized to run on consumer-grade hardware — such as laptops with 16GB RAM — making advanced reasoning AI more accessible to individual developers and smaller teams.
Benchmark Performance and Limitations
OpenAI reports that both models perform competitively with existing open-weight models. For example:
- On Codeforces (with tools), the 120b and 20b models earned Elo ratings of 2622 and 2516, respectively, outperforming DeepSeek's R1 model.
- In Humanity’s Last Exam, both models outperformed Qwen and DeepSeek but remained below the performance of OpenAI’s o-series models like o3 and o4-mini.
The models also demonstrate higher hallucination rates compared to OpenAI’s proprietary systems.
On PersonQA, gpt-oss-120b and gpt-oss-20b showed hallucination rates of 49% and 53%, respectively — significantly above the o1 model’s 15% and o4-mini’s 36%.
Architecture and Use in Agentic Workflows
Both models employ a Mixture-of-Experts (MoE) design, which activates only a small subset of parameters for each token to reduce compute overhead; a toy routing sketch follows the parameter figures below.
- gpt-oss-120b: ~5.1B active parameters per token
- gpt-oss-20b: ~3.6B active parameters per token
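To make the "subset of parameters per token" idea concrete, here is a minimal top-k MoE layer in PyTorch. It is a sketch of the general technique only: the dimensions, expert count, and gating scheme are invented for illustration and do not reflect gpt-oss's actual internals.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Illustrative top-k Mixture-of-Experts layer.

    A small router scores every token against all experts, but only the
    top-k experts actually run, so the active parameter count per token
    is a fraction of the layer's total parameters.
    """

    def __init__(self, d_model: int, n_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model)
        gate_logits = self.router(x)                      # (n_tokens, n_experts)
        weights, chosen = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e               # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer(d_model=64, n_experts=8, top_k=2)
tokens = torch.randn(4, 64)          # 4 tokens, embedding dim 64
print(layer(tokens).shape)           # torch.Size([4, 64])
```

Scaled up, this is how a model can hold a very large total weight count while each token pays only the compute cost of the few experts its router selects.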
The models support advanced reasoning and agentic workflows, including (see the request sketch after this list):
- Chain-of-thought prompting
- Tool use, such as code execution and web browsing
- Structured output generation
- Adjustable reasoning effort
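As a rough sketch of exercising adjustable reasoning effort in practice, the example below assumes gpt-oss-20b is already being served locally behind an OpenAI-compatible endpoint (for instance via vLLM or Ollama); the URL, port, and model identifier are placeholders, and the "Reasoning: high" system hint follows the models' published prompt convention.

```python
from openai import OpenAI

# Assumption: a local OpenAI-compatible server is already serving
# gpt-oss-20b; adjust base_url and model name to your setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[
        # gpt-oss exposes adjustable reasoning effort (low/medium/high)
        # via a hint in the system prompt.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Is 221 prime? Explain briefly."},
    ],
)
print(response.choices[0].message.content)
```

Raising the effort level trades latency and token cost for longer chains of thought; lowering it suits quick, simple queries.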
Safety Measures and Responsible Release
Prior to release, OpenAI conducted extensive internal and third-party evaluations under its Preparedness Framework, testing for misuse in high-risk domains such as cybersecurity and biotechnology.
The results indicated that the models do not meet the criteria for “high capability” in dangerous applications, enabling OpenAI to proceed with open-weight deployment while maintaining its safety commitments.
Accessibility and Deployment Options
Beyond open access on Hugging Face, the gpt-oss models are also being integrated into major cloud platforms (a local-inference sketch follows the list):
- Amazon Bedrock
- Azure AI Foundry
- Amazon SageMaker
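For local experimentation outside those platforms, a minimal download-and-generate sketch with Hugging Face transformers might look like the following; the repo id openai/gpt-oss-20b matches the announced release, but the version requirements, hardware placement, and generation settings here are assumptions.

```python
from transformers import pipeline

# Assumption: a recent transformers version with gpt-oss support.
generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype="auto",   # use the dtype shipped with the checkpoint
    device_map="auto",    # place weights on available GPU(s)/CPU
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}]
result = generator(messages, max_new_tokens=128)
# For chat-style input, generated_text holds the full message list;
# the last entry is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```

The 120b variant follows the same pattern but expects H100-class accelerator memory, while the 20b variant fits within roughly 16GB.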
OpenAI claims that gpt-oss-120b is up to 3× more cost-efficient than competitors like Gemini or DeepSeek’s R1 on AWS infrastructure — a potentially significant advantage for enterprises seeking scalable, transparent AI solutions.
The release comes at a time of increasing pressure from global AI labs — particularly in China — that are rapidly advancing open-weight models. DeepSeek’s R1, Alibaba’s Qwen, and Moonshot AI have all made substantial progress, prompting OpenAI to revisit its approach to openness.
CEO Sam Altman acknowledged this shift in direction, stating earlier this year that the company had been “on the wrong side of history” regarding open-source AI.
He emphasized OpenAI’s renewed commitment to its mission of ensuring that AGI benefits all of humanity by enabling broader developer access.
“We are excited for the world to be building on an open AI stack created in the United States, based on democratic values,” Altman said in a recent statement.