Dia
Ultra tier: dialogue-oriented TTS with voice cloning and nonverbal sounds
About Dia
Dia by Nari Labs is a 1.6B parameter dialogue-focused text-to-speech model. It excels at generating natural conversational speech with support for nonverbal sounds like laughter, sighs, and coughs. Dia supports multi-speaker dialogue generation and voice cloning from 5-10 seconds of reference audio, making it ideal for creating realistic conversations and character voices.
Key Features
Dialogue Generation
Generate natural multi-speaker conversations with distinct voices and turn-taking.
Nonverbal Sounds
Add tags such as [laughs], [sighs], [coughs], or (gasps) for natural paralinguistic expression.
Voice Cloning
Clone any voice from 5-10 seconds of reference audio for personalized speech.
Natural Conversation
1.6B parameters produce highly natural conversational prosody and intonation.
How to Use Dia
1. Sign up free or open the demo
Create a free TextToSpeechAI account to claim your starter credits, or open the no-signup demo to try Dia dialogue right away.
2. Select the Dia engine
In the TTS dashboard, choose Dia from the engine list. Dia is the dialogue-oriented, ultra-tier model with multi-speaker and voice-cloning support.
3. Write a dialogue script with tags
Compose your conversation using [S1] and [S2] to mark each speaker turn, and drop in nonverbal tags such as [laughs], [sighs], [coughs], or (gasps) where you want natural reactions.
4. Generate the audio
Click generate to send your Dia script to our hosted GPUs. Dia renders the two-speaker dialogue, with turn-taking and your nonverbal tags, into a single audio file.
5. Download or call the API
Download the finished dialogue in your chosen format, or automate it by posting the same [S1]/[S2] script to the TextToSpeechAI API with your account token.
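If you assemble scripts in code rather than by hand, the tagging convention from the steps above can be sketched as follows. This is a minimal illustration, not part of any TextToSpeechAI SDK: the helper name `build_dialogue_script` and the `(speaker, text)` turn format are assumptions made for the example, and only the [S1]/[S2] tags and the nonverbal tags come from this page.

```python
from typing import Iterable, Tuple

# Speaker tags as described in the steps above.
SPEAKER_TAGS = {1: "[S1]", 2: "[S2]"}


def build_dialogue_script(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker, text) turns into one tagged script string.

    Hypothetical helper for illustration; nonverbal tags such as
    [laughs] are passed through unchanged as part of the turn text.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in SPEAKER_TAGS:
            raise ValueError(f"unsupported speaker {speaker!r}; expected 1 or 2")
        text = " ".join(text.split())  # collapse stray whitespace and newlines
        if not text:
            raise ValueError("each turn needs some text")
        parts.append(f"{SPEAKER_TAGS[speaker]} {text}")
    return " ".join(parts)


script = build_dialogue_script([
    (1, "Hello there! How are you today? [laughs]"),
    (2, "I am doing great, thanks for asking!"),
])
print(script)
```

Keeping the turns as structured data until the last moment makes it easier to reorder lines or swap speakers without retyping tags.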
Dia API
Generate speech programmatically using the TextToSpeechAI REST API.
curl -X POST "https://api.texttospeechai.com/v1/generate/" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "[S1] Hello there! How are you today? [laughs] [S2] I am doing great, thanks for asking!",
    "voice": "en_US-lessac-medium"
  }'
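The same request can be prepared from Python using only the standard library. Treat this as a sketch under stated assumptions: the endpoint, headers, and the `text` and `voice` fields are copied from the curl example, while the function names `build_request` and `generate`, the timeout value, and the handling of the response are illustrative. The response format is not described on this page, so the body is returned as raw bytes rather than parsed.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://api.texttospeechai.com/v1/generate/"


def build_request(script: str, voice: str, token: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    body = json.dumps({"text": script, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def generate(script: str, voice: str, token: str, timeout: float = 120.0) -> bytes:
    """Send the request and return the raw response body.

    Assumption: whether the API returns audio bytes or a JSON document
    is not stated here, so the caller decides how to interpret the result.
    """
    request = build_request(script, voice, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Building the request is side-effect free, so it can be inspected first.
request = build_request(
    "[S1] Hello there! How are you today? [laughs] [S2] I am doing great, thanks for asking!",
    "en_US-lessac-medium",
    "YOUR_API_TOKEN",
)
print(request.get_method(), request.full_url)
```

Calling `generate(...)` with a real token performs the network call. Separating request construction from sending keeps the payload easy to log or unit-test before any credits are spent.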
Technical Specs
- Generation Speed: Medium
- Output Quality: Excellent
- Voice Cloning: Supported
- Languages: 1
- GPU VRAM: 10 GB
- Credits per 1,000 characters: 50
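To budget credits before generating, the listed rate of 50 credits per 1,000 characters can be turned into a rough estimate. The sketch below assumes that cost scales linearly with length, that every character (speaker and nonverbal tags included) counts, and that partial credits round up. None of these billing details are stated on this page, so treat the result as an approximation; `estimate_credits` is a hypothetical helper name.

```python
# Rate taken from the specs list above.
CREDITS_PER_1000_CHARS = 50


def estimate_credits(script: str, rate: int = CREDITS_PER_1000_CHARS) -> int:
    """Estimate the credits for a script, rounding up to a whole credit.

    Assumptions (not confirmed by the page): linear proration, tags are
    billed like any other character, and fractions round up.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return (len(script) * rate + 999) // 1000


script = "[S1] Hello there! How are you today? [laughs] [S2] I am doing great, thanks for asking!"
print(len(script), "characters, about", estimate_credits(script), "credits")
```

Under these assumptions, a 2,500-character script would come to 125 credits.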