IBM has announced the launch of its latest mainframe system, the IBM z17, a high-performance computing platform engineered to support artificial intelligence workloads at scale.
Highlights
- Built on IBM’s second-generation Telum II processor, the z17 is designed to address more than 250 AI use cases, including generative models and AI agents.
- The system is part of IBM’s ongoing effort to align long-standing enterprise infrastructure with the evolving demands of AI-driven operations.
Despite perceptions of mainframes as legacy systems, they remain a central part of global enterprise infrastructure.
As of 2024, 71% of Fortune 500 companies continue to rely on mainframes, and the market itself is valued at approximately $5.3 billion, according to Market Research Future. IBM’s introduction of the z17 aims to modernize this established platform with capabilities tailored to the AI era.
The IBM z17 delivers a notable performance boost, capable of executing up to 450 billion inference operations per day—representing a 50% increase over the performance of its predecessor, the z16, which launched in 2022.
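As a quick sanity check on the quoted figures (plain arithmetic on the numbers above, not additional IBM data), a 50% increase to 450 billion inferences per day implies a z16 baseline of roughly 300 billion, or about 5.2 million inferences per second for the z17:

```python
# Arithmetic check on the quoted figures (illustrative only).
z17_inferences_per_day = 450e9   # up to 450 billion inference operations per day
increase_over_z16 = 0.50         # stated 50% improvement over the z16

# Implied z16 baseline: 450B / 1.5 = 300B inferences per day.
z16_inferences_per_day = z17_inferences_per_day / (1 + increase_over_z16)
print(f"Implied z16 baseline: {z16_inferences_per_day / 1e9:.0f} billion/day")

# Per-second rate for the z17: roughly 5.2 million inferences per second.
per_second = z17_inferences_per_day / 86_400
print(f"z17 rate: {per_second / 1e6:.1f} million inferences/second")
```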
The system also emphasizes security, with full encryption and seamless integration across existing enterprise hardware and open-source ecosystems, allowing for flexible AI deployment.
According to Tina Tarquinio, Vice President of Product Management and Design for IBM Z, the z17 has been in development for five years, predating the 2022 surge in interest around generative AI.
Tarquinio noted that extensive customer input—over 2,000 hours of research and interviews with more than 100 enterprise clients—helped shape the design, with common themes emerging around the need for enhanced performance, AI acceleration, and long-term flexibility.
At launch, the z17 will support 48 IBM Spyre AI accelerator chips, with plans to increase capacity to 96 within the first year.
This expansion is intended to accommodate increasingly complex and resource-intensive AI models. IBM has emphasized that the system includes built-in overhead to allow for future AI advancements, including larger local memory footprints and newer processing requirements.
Energy efficiency is a key component of the z17’s design. IBM reports that the system delivers 7.5 times the AI acceleration of the z16, while consuming 5.5 times less energy than similar platforms handling multi-model workloads.
This balance of performance and energy efficiency may appeal to organizations seeking to scale AI operations without proportionally increasing energy consumption or operational costs.
Introduction of the IBM Spyre AI Accelerator
The Spyre AI Accelerator plays a central role in expanding the z17’s AI capabilities. Each Spyre card carries 32 AI-specific cores, and up to eight cards can be installed in a single I/O drawer, together supporting up to 1TB of memory while drawing no more than 75W per card. The cards are engineered to handle complex tasks such as generative AI and large language model workloads.
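Taking the per-card figures at face value, the per-drawer totals follow directly (a back-of-the-envelope sketch derived from the numbers above, not an IBM specification):

```python
# Back-of-the-envelope totals for a fully populated I/O drawer,
# based on the per-card figures quoted above (illustrative only).
cores_per_card = 32      # AI-specific cores per Spyre card
max_watts_per_card = 75  # stated power ceiling per card
cards_per_drawer = 8     # maximum Spyre cards in one I/O drawer

total_cores = cards_per_drawer * cores_per_card           # 256 AI cores
max_drawer_power = cards_per_drawer * max_watts_per_card  # 600 W ceiling
print(f"{total_cores} AI cores at no more than {max_drawer_power} W per drawer")
```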
Integrated DPU for Enhanced Data Processing
The Telum II processor also introduces an integrated Data Processing Unit (DPU) designed for I/O acceleration. This integration enhances performance for data-intensive applications by speeding up complex networking and storage protocols.
The DPU supports a 50% increase in I/O density compared to previous generations, enabling better scalability and more efficient data throughput within the system.
Scalability Through Ensemble AI Architecture
The combined use of Telum II processors and Spyre Accelerators allows the z17 to support ensemble AI approaches, where multiple models are integrated to boost accuracy and resilience.
This architecture ensures the z17 can scale with increasing AI complexity while maintaining robust performance and operational efficiency.
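IBM has not published the z17’s ensemble mechanics, but the general idea is straightforward: run several models on the same input and combine their outputs so that one model’s errors can be outvoted by the others. A minimal sketch, using hypothetical toy models and a simple majority vote rather than IBM’s implementation:

```python
from collections import Counter

def ensemble_predict(models, x):
    """Combine predictions from multiple models by majority vote.

    `models` is any list of callables returning a class label;
    ties go to the label counted first. Illustrative only.
    """
    votes = [model(x) for model in models]
    label, _count = Counter(votes).most_common(1)[0]
    return label

# Three toy threshold "models" standing in for, e.g., a fraud-scoring ensemble:
model_a = lambda score: "fraud" if score > 0.8 else "ok"
model_b = lambda score: "fraud" if score > 0.6 else "ok"
model_c = lambda score: "fraud" if score > 0.9 else "ok"

print(ensemble_predict([model_a, model_b, model_c], 0.85))  # "fraud" (2 of 3 agree)
print(ensemble_predict([model_a, model_b, model_c], 0.50))  # "ok"
```

Because the vote only needs each model's label, the member models can run on different hardware, which is the scenario the Telum II plus Spyre pairing is aimed at.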
Although IBM has not released pricing information, the z17 will be generally available starting June 8.