ElevenLabs lands $500M to scale expressive AI audio (Full Transcript)

The firm, valued at $11B, unveils v3 conversational AI for 11Agents, expands 11Creative, and teases next-gen TTS and dubbing toward audio intelligence.

[00:00:07] Speaker 1: We have raised $500 million at an $11 billion valuation to transform how we interact with technology. The round is led by Sequoia, with a16z quadrupling down and ICONIQ tripling down, the best partners to go after this together.

[00:00:21] Speaker 2: We built the first truly human-like voice model. Since then, we've pushed state-of-the-art research across text-to-speech, speech-to-text, sound effects, and music. Today, we're releasing a much more expressive conversational AI model, v3, in 11Agents.

[00:00:35] Speaker 1: "Hello and welcome to Meridian Airlines." "Hey, I've just missed my flight and I don't know what to do." "Oh, I'm so sorry, ma'am. Let me try to help in any way I can." At ElevenLabs, we combine the best of research with the best of product. With 11Creative, we build a single studio for creators to combine voice, music, and image and video. With 11Agents, we build a platform for enterprises to elevate their customer experiences with conversational agents that talk, type, and take action.

[00:01:08] Speaker 2: On the research side, we aren't stopping. Soon, we'll launch an even better text-to-speech model, we'll launch the next-generation dubbing model, and push towards general audio intelligence.

[00:01:17] Speaker 1: Last year, we established ourselves as the voice of technology. And this year, we are going beyond that, making communication and creation with technology seamless. Thank you to our partners, customers, and community. You are the ones adopting the technology, building on the frontier, and showing the world what is possible.

AI Insights
Summary
ElevenLabs announces it has raised $500M at an $11B valuation to advance human-like conversational AI and audio technology. Backed by Sequoia, a16z, and ICONIQ, it highlights progress in voice models and broader audio research, and introduces new products: a creator studio combining voice, music, and visuals, and an enterprise platform for conversational agents that can talk, type, and take actions. It also previews upcoming releases in text-to-speech, dubbing, and a push toward general audio intelligence.
Title
ElevenLabs Raises $500M to Expand Expressive AI Voice and Agents
Keywords
funding round
$500 million
$11 billion valuation
Sequoia
a16z
ICONIQ
conversational AI
voice model
text-to-speech
speech-to-text
dubbing
general audio intelligence
ElevenLabs
11Agents
11Creative
enterprise customer experience
Key Takeaways
  • The company raised $500M at an $11B valuation led by Sequoia, with a16z and ICONIQ increasing their stakes.
  • It claims to have built the first truly human-like voice model and continued SOTA work in TTS, STT, sound effects, and music.
  • It is releasing a more expressive conversational AI model (v3) within 11Agents.
  • 11Creative aims to unify creator workflows across voice, music, images, and video in one studio.
  • 11Agents targets enterprises with conversational agents that can speak, type, and take actions to improve customer experience.
  • Upcoming roadmap includes a better TTS model, next-gen dubbing, and movement toward general audio intelligence.
Sentiments
Positive: Upbeat, forward-looking announcement emphasizing major funding, strong investor support, product launches, and ambitious research roadmap.