ElevenLabs lands $500M to scale expressive AI audio (Full Transcript)

The firm, valued at $11B, unveils v3 conversational AI for 11Agents, expands 11Creative, and teases next-gen TTS and dubbing toward audio intelligence.

[00:00:07] Speaker 1: We have raised $500 million at an $11 billion valuation to transform how we interact with technology. The round is led by Sequoia, with a16z quadrupling down and ICONIQ tripling down, the best partners to go after this together.

[00:00:21] Speaker 2: We built the first truly human-like voice model. Since then, we've pushed state-of-the-art research across text-to-speech, speech-to-text, sound effects, and music. Today, we're releasing a much more expressive conversational AI model, v3, in 11Agents.

[00:00:35] Speaker 1: "Hello and welcome to Meridian Airlines." "Hey, I've just missed my flight and I don't know what to do." "Oh, I'm so sorry, ma'am. Let me try to help in any way I can." At ElevenLabs, we combine the best of research with the best of product. With 11Creative, we build a single studio for creators to combine voice, music, and image and video. With 11Agents, we build a platform for enterprises to elevate their customer experiences with conversational agents that talk, type, and take action.

[00:01:08] Speaker 2: On the research side, we aren't stopping. Soon, we'll launch an even better text-to-speech model, we'll launch the next-generation dubbing model, and push towards general audio intelligence.

[00:01:17] Speaker 1: Last year, we established ourselves as the voice of technology. And this year, we are going beyond that, making communication and creation with technology seamless. Thank you to our partners, customers, and community. You are the ones adopting the technology, building on the frontier, and showing the world what is possible.

Summary

ElevenLabs announces it has raised $500M at an $11B valuation to advance human-like conversational AI and audio technology. Backed by Sequoia, a16z, and ICONIQ, it highlights progress in voice models and broader audio research, and describes its two product lines: 11Creative, a creator studio combining voice, music, and visuals, and 11Agents, an enterprise platform for conversational agents that can talk, type, and take actions. It also previews upcoming releases in text-to-speech and dubbing, and a push toward general audio intelligence.

Title

ElevenLabs Raises $500M to Expand Expressive AI Voice and Agents

Keywords

funding round
$500 million
$11 billion valuation
Sequoia
a16z
ICONIQ
conversational AI
voice model
text-to-speech
speech-to-text
dubbing
general audio intelligence
ElevenLabs
11Agents
11Creative
enterprise customer experience

Key Takeaways

The company raised $500M at an $11B valuation led by Sequoia, with a16z and ICONIQ increasing their stakes.
It claims to have built the first truly human-like voice model and continued SOTA work in TTS, STT, sound effects, and music.
It is releasing a more expressive conversational AI model (v3) within 11Agents.
11Creative aims to unify creator workflows across voice, music, images, and video in one studio.
11Agents targets enterprises with conversational agents that can speak, type, and take actions to improve customer experience.
Upcoming roadmap includes a better TTS model, next-gen dubbing, and movement toward general audio intelligence.

Sentiments

Positive: Upbeat, forward-looking announcement emphasizing major funding, strong investor support, product launches, and ambitious research roadmap.
