ElevenLabs has expanded the capabilities of its latest text-to-speech model, Eleven V3, extending support to a total of 70 languages.
Highlights
- Global Language Support: Eleven V3 now supports 70 languages, covering around 90% of the world’s population.
- New Additions: Languages like Arabic, Bengali, Marathi, Tamil, Swahili, Latvian, and Nepali are now included.
- Expressiveness Upgraded: Inline audio tags introduce whispers, laughter, sighs, and emotional tone modulation.
- Multi-Speaker Dialogue: The model enables realistic conversations with features like interruptions, overlaps, and natural pacing.
- Enterprise-Ready: Ideal for industries like e-learning, customer support, media localization, and audiobooks.
- Voice Cloning for Best Results: Users are encouraged to create language-specific Instant Voice Clones for optimal output.
- No API Yet: Eleven V3 is available only via the ElevenLabs website and apps, with API support potentially coming soon.
- Accent Bias Concerns: Researchers urge better handling of regional accents to ensure linguistic equity and inclusivity.
The update introduces 41 additional languages, enabling the platform to serve approximately 90% of the global population. This expansion reflects a growing focus on linguistic accessibility and natural-sounding AI-generated voices.
Launched on June 8 in alpha, Eleven V3 is positioned by the company as its most expressive model to date, with enhancements that extend beyond language support to include emotional tone, multi-speaker capabilities, and improved conversational realism.
Language Expansion Aims for Global Reach
The newly added languages include both widely spoken ones such as Arabic, Bengali, Marathi, Tamil, Telugu, Malay, and Swahili, and regionally significant languages like Assamese, Catalan, Latvian, and Nepali.
To achieve the best voice performance, users are encouraged to create an Instant Voice Clone (IVC) in the desired language. ElevenLabs has also announced that language-specific voices will be gradually added to its Voice Library in the coming weeks.
This move is particularly relevant for industries like media localization, education, accessibility tools, and customer support, where multilingual support is essential.
Improvements in Expressiveness and Realism
In addition to language expansion, Eleven V3 introduces a series of voice expressiveness enhancements. The model now supports inline audio tags, allowing users to include subtle speech elements such as:
- Whispers
- Sighs
- Laughter
- Emotional emphasis (e.g., excitement or frustration)
These features are designed to deliver more natural and emotionally resonant outputs, helping voice content feel less robotic and more conversational.
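As an illustration, a script using inline audio tags might look like the sketch below. The specific tag names shown are assumptions based on the features described above; the exact tag vocabulary supported by Eleven V3 may differ, so consult the product documentation before relying on any particular tag.

```text
[whispers] I wasn't supposed to tell you this yet...
[sighs] But I suppose you'd find out eventually.
[excited] We got the approval! [laughs] Can you believe it?
```

Because the tags sit inline with the text, writers can direct delivery line by line without any separate markup or configuration.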
Multi-Speaker Features and Dialogue Support
The update also brings multi-speaker functionality, enabling realistic dialogue and character-driven interactions. This includes:
- Support for interruptions
- Overlapping speech
- Natural pacing between speakers
These upgrades improve voice experiences for use cases like audiobooks, podcasts, simulations, and interactive narratives, where multiple voices and emotional dynamics are key.
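A multi-speaker script could be structured along these lines. The speaker labels and the bracketed delivery cues are illustrative assumptions, not a confirmed syntax; the idea is simply that each line is attributed to a speaker and can carry its own tone and timing cues.

```text
Speaker 1: So I was thinking we could push the launch to next quarter...
Speaker 2: [interrupting] Wait, before you go on, did legal sign off?
Speaker 1: [hesitant] Not yet. [sighs] That's the other thing.
Speaker 2: [laughs] Of course it is.
```

Structuring dialogue this way lets the model handle interruptions, overlaps, and pacing between turns rather than rendering each line in isolation.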
Availability and Platform Access
As of now, Eleven V3 is accessible only through the ElevenLabs website and mobile apps. There is currently no API support, which may limit immediate integration for developers and enterprise solutions. However, the company has hinted that developer-facing tools may be introduced later.
Earlier this year, ElevenLabs also unveiled Agent Transfer, a feature designed for conversational AI environments that allows virtual agents to pass context-rich conversations to each other. While separate from the Eleven V3 update, it demonstrates the company’s broader strategy in AI communications.
Content Localization and Beyond
With support for 70 languages and improved emotional fidelity, Eleven V3 is seen as a valuable tool for enterprise-level applications, including:
- Marketing video localization
- Multilingual e-learning content
- Customer service chatbots and assistants
- Narration and storytelling for global audiences
Industry analysts note that this development can help businesses improve customer engagement, scale content distribution across languages, and enhance voice-driven applications in diverse markets.
Addressing Accent Bias and Inclusivity
While the expansion is a notable milestone, recent academic research has raised concerns about accent bias in synthetic speech. Some regional accents may be less accurately rendered, potentially leading to perceived digital exclusion.
Maintaining voice quality and consistency across all supported languages and dialects remains an ongoing challenge. Developers and researchers are calling for regular evaluation to ensure equal representation and authenticity in AI-generated voices.