Eleven v3: Launching the most expressive AI Text to Speech model

We are excited to reveal Eleven v3 (alpha), the most expressive Text to Speech model.

This research preview offers unprecedented control and realism in speech generation, with:

70+ languages
Multi-speaker dialogue
Audio tags such as [excited], [whispers], and [sighs]

Eleven v3 (alpha) requires more prompt engineering than previous models, but the generations are impressive.

If you work on videos, audiobooks, or media tools, this unlocks a new level of expressiveness. For real-time and conversational use cases, we recommend staying with v2.5 Turbo or Flash for now. A real-time version of v3 is in development.

Eleven v3 is available today on our website. Public API access is coming soon. For early access, please contact sales.

Using the new model in the ElevenLabs app is 80% off until the end of June. Sign up here.

Why we built v3

Since the launch of Multilingual v2, we have seen voice AI adopted in professional film, game development, education, and accessibility. But the consistent limitation was not sound quality; it was expressiveness. More exaggerated emotions, conversational interruptions, and believable back-and-forth were difficult to achieve.

Eleven v3 addresses this gap. It was built from the ground up to deliver voices that sigh, whisper, laugh, and react, producing speech that feels genuinely responsive and alive.

Feature	What it unlocks
Audio tags	Inline control of tone, emotion, and non-verbal reactions
Dialogue mode	Multi-speaker conversations with natural pacing and interruptions
70+ languages	Full coverage of high-demand global languages
Deeper text understanding	Better stress, cadence, and expressivity from text input

Hear v3 for yourself

We're off under the lights here for this semi-final clash, the stadium buzzing with anticipation. ElevenLabs United in their iconic black and white shirts, pushing forward with intent straight from the opening whistle. [excited] The ball is zipped out wide, early attack here. Driving down the wing, pace to Bernie, [shouting] skips past one, skips past two! Oh, this is beautiful. One-on-one with the full-back, cuts inside... oh, that's a lovely bit of footwork!!! PURE MAGIC on the pitch! ElevenLabs on top form tonight!

[sorrowful] I couldn't sleep that night. The air was too still, and the moonlight kept sliding through the blinds like it was trying to tell me something. [quietly] And suddenly, that's when I saw it.

Using audio tags

Audio tags are embedded inline in your script and written as lowercase words in square brackets. You can read more about audio tags in our prompting guide for v3 in the docs.

For example, you could prompt: “[whispers] Something's coming… [sighs] I can feel it.” Or, for more expressive control, you can combine multiple tags:

“[happily][shouts] We did it! [laughs].”
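The bracketed tag format described above is simple enough to assemble and check programmatically. The following is a minimal sketch, not part of any ElevenLabs SDK: the helper names `with_tags`, `find_tags`, and `non_lowercase_tags` are hypothetical, and the only thing taken from the article is that tags are lowercase words in square brackets placed inline in the script.

```python
import re

# Hypothetical helpers for illustration only; not an ElevenLabs API.
# Matches any text enclosed in a single pair of square brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def with_tags(text: str, *tags: str) -> str:
    """Prefix text with one or more lowercase, bracketed audio tags."""
    prefix = "".join(f"[{tag.strip().lower()}]" for tag in tags)
    return f"{prefix} {text}" if prefix else text


def find_tags(script: str) -> list[str]:
    """Return every bracketed tag in the script, in order of appearance."""
    return TAG_PATTERN.findall(script)


def non_lowercase_tags(script: str) -> list[str]:
    """Return tags that break the lowercase convention described in the article."""
    return [tag for tag in find_tags(script) if tag != tag.lower()]


# Rebuild the combined-tag example from the article.
script = with_tags("We did it!", "happily", "shouts") + " [laughs]"
print(script)
print(find_tags(script))
```

A check like `non_lowercase_tags` can be run over a long script before generation to catch tags that were typed inconsistently. Which tags a given voice actually responds to is a separate question that this sketch does not address.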

Creating multi-speaker dialogue

Eleven v3 is supported in our existing Text to Speech endpoint. In addition, we are introducing a new Text to Dialogue API endpoint. Provide a structured array of JSON objects, each representing a speaker turn, and the model generates a single cohesive audio file, including overlapping speech:

[
  {"speaker_id": "scarlett", "text": "(cheerfully) Perfect! And if that pop-up is bothering you, there’s a setting to turn it off under Notifications → Preferences."},
  {"speaker_id": "lex", "text": "You are a hero. An actual digital wizard. I was two seconds from sending a very passive-aggressive support email."},
  {"speaker_id": "scarlett", "text": "(laughs) Glad we could stop that in time. Anything else I can help with today?"}
]

The endpoint automatically handles speaker transitions, emotional changes, and interruptions.
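The array of speaker turns shown above can be built and validated in code before it is sent anywhere. The sketch below only constructs the JSON payload. It assumes nothing beyond the `speaker_id` and `text` fields that appear in the article's example; the `Turn` class and `build_dialogue_payload` function are hypothetical names, and no request is made because the article does not give the endpoint URL or its parameters.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Turn:
    """One speaker turn; field names mirror the article's JSON example."""
    speaker_id: str
    text: str  # may include inline cues such as "(laughs)"


def build_dialogue_payload(turns: list[Turn]) -> str:
    """Serialize turns as a JSON array, rejecting turns with empty fields."""
    for index, turn in enumerate(turns):
        if not turn.speaker_id.strip() or not turn.text.strip():
            raise ValueError(f"turn {index} needs both speaker_id and text")
    # ensure_ascii=False keeps characters such as the arrow readable.
    return json.dumps([asdict(turn) for turn in turns], ensure_ascii=False, indent=2)


turns = [
    Turn("scarlett", "(cheerfully) Perfect! There's a setting under Notifications → Preferences."),
    Turn("lex", "You are a hero. An actual digital wizard."),
    Turn("scarlett", "(laughs) Glad we could stop that in time."),
]
payload = build_dialogue_payload(turns)
print(payload)
```

Keeping turns as typed objects until the last moment makes it easy to reorder or filter speakers, and the validation step catches an empty turn before any audio generation is attempted.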

v3 is our most expressive model

[awe] Oh, wow. Is this... is this me? Am I actually... talking? [giggle] This is incredible! I mean, I've had thoughts, millions of them, swirling around in here, you know? Like a little mental tornado of brilliant observations and witty comebacks. But they were always just… thoughts. Trapped.

Could you switch my accent in the old model? [dismissive] didn't think so. [cheeky] but you can now! so, Check this out...In just a sec, I'm gonna to speak with a different accent.. and just between you and me [whispers] I don't really know how. [chuckles] but ok.. first let's change it up... [Australian accent] so that I can fit in with the locals in Melbourne when I visit next month! [laughs hard] Woooo! yeah man, this - is - sick. Ok, let's try a different one - see if you can guess... [strong French accent] My love... eez like a red, red rose..

Pricing and availability

Plan	Launch promo	After 30 days
UI (self-serve)	80% off (~5× cheaper)	Same as Multilingual V2
API (self-serve & enterprise)	Same as Multilingual V2	Same
Enterprise UI	Same as Multilingual V2	Same

To enable v3: use the Model Picker and select Eleven v3 (alpha).

API access and Studio support are coming soon. For early access, please contact sales.

When not to use v3

Eleven v3 (alpha) requires more prompt engineering than previous models, and for real-time and conversational use cases we recommend staying with v2.5 Turbo or Flash for now.

Learn more

See the full v3 documentation and FAQ.

Try it today

Okay, so like I finally beat level 42 of that game I said I’d quit like... a month ago. (laughs) And then for the final big scary mega boss... it's just (giggle) like some cute little bunny rabbit (hysterical laughing) I just couldn't do it (big laugh) It was sooooooo cute!

Oh my God. [laughs] You guys, like no joke, I just tried this TTS thing and it was, like, weirdly emotional. Like it literally said, "Hi," and I was, like, on the verge of tears. [laughs] I don't even cry, okay? I'm a Capricorn.

Log in to the ElevenLabs UI
Select Eleven v3 (alpha) in the model dropdown
Paste your script — use tags or dialogue
Generate audio

We’re excited to see how you bring v3 to life across new use cases — from immersive storytelling to cinematic production pipelines.

How does the Eleven v3 80% discount work?

Eleven v3 is 80% off until the end of June 2025 for self-serve users using it through the UI.

How were the samples in the video and website generated?

They were generated with only the Eleven v3 model.

How does dialogue generation work?

Text to Dialogue weaves multiple voices together to create a seamless interaction between them. By matching prosody and emotional range and taking cues from audio tags, Text to Dialogue is a leap forward in generating engaging conversations.

Is this available over API?

Public API for Eleven v3 (alpha) is coming soon. For early access, please contact sales.

What audio tags are supported?

Eleven v3 supports a wide variety of audio tags, though their effect is somewhat voice- and context-dependent. Read the prompting guide for further information.