Tortoise TTS
Ultra-High Quality Speech with Unmatched Naturalness
About Tortoise TTS
Tortoise TTS is a text-to-speech model that prioritizes audio quality over speed. It combines an autoregressive transformer with a diffusion model to generate natural speech that captures subtle nuances of the human voice. The trade-off is generation time: Tortoise is slower than most other engines, so it suits work where naturalness matters more than turnaround.
Key Features
- Ultra-High Quality: The most natural-sounding TTS output available.
- Voice Cloning: Clone voices with exceptional fidelity and nuance.
- Natural Prosody: Captures subtle speech patterns and micro-expressions.
- Quality Presets: Choose from ultra_fast to high_quality processing.
- Emotional Depth: Generates speech with genuine emotional resonance.
- Open Source: Apache 2.0 licensed with commercial use rights.
Tortoise TTS Voices
Showing 12 of 18 voices:
- Tortoise Angie (EN)
- Tortoise Deniro (EN)
- Tortoise Freeman (EN)
- Tortoise Geralt (EN)
- Tortoise Halle (EN)
- Tortoise Jlaw (EN)
- Tortoise Lj (EN)
- Tortoise Mol (EN)
- Tortoise Myself (EN)
- Tortoise Pat (EN)
- Tortoise Pat2 (EN)
- Tortoise Snakes (EN)
How to Use Tortoise TTS
1. Sign up or try the free demo
Create a free TextToSpeechAI account to get starter credits, or use the homepage demo to try Tortoise without signing in. Tortoise is an Ultra-tier engine (50 credits per 1000 characters), so the free credits are perfect for a first short test.
2. Choose Tortoise and optionally add a voice to clone
Select a Tortoise voice from the voice browser. To clone a specific person, upload a reference clip (ideally a few clean 5-10 second samples) and Tortoise will reproduce that voice with high fidelity. Otherwise pick one of the built-in Tortoise voices.
3. Enter your text
Type or paste the text you want narrated. Because Tortoise is slow, start with a short passage to confirm the voice and tone before sending a full audiobook chapter or long script.
4. Pick a quality preset and generate
Choose a Tortoise quality preset: ultra_fast for quick tests, fast for a good speed/quality balance (recommended default), standard, or high_quality for maximum realism. Then click generate and be patient - Tortoise can take from 30 seconds to several minutes per clip, especially at higher presets.
5. Download or use the API
When generation finishes, download your audio as MP3, WAV, or OGG, or fetch it from your history. To automate Tortoise jobs, call the TextToSpeechAI API and pass your chosen quality preset - remember to allow longer timeouts since Tortoise renders slowly.
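If you script the workflow above, it can help to keep the four preset names in one place and to cut a short excerpt for the first test run. The sketch below is only an illustration: the preset names come from step 4, but the helper names and the 200-character preview limit are assumptions, not part of TextToSpeechAI.

```python
# Preset names as listed in step 4, ordered from fastest to highest quality.
PRESETS = ("ultra_fast", "fast", "standard", "high_quality")
DEFAULT_PRESET = "fast"  # the recommended default in step 4


def validate_preset(name: str) -> str:
    """Return the preset name unchanged, or raise if it is not one of the four."""
    if name not in PRESETS:
        raise ValueError(f"unknown preset {name!r}; expected one of {PRESETS}")
    return name


def preview_snippet(text: str, max_chars: int = 200) -> str:
    """Return a short leading excerpt for a first test, cut at a word boundary.

    max_chars=200 is an arbitrary illustrative limit, not a platform rule.
    """
    text = text.strip()
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Prefer to cut at the last space so no word is split in half.
    head, sep, _ = cut.rpartition(" ")
    return head if sep else cut
```

A typical use is to render preview_snippet(chapter_text) at ultra_fast or fast, check the voice and tone, and only then submit the full text at a higher preset.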
Tortoise TTS API
Generate speech programmatically using the TextToSpeechAI REST API.
curl -X POST "https://api.texttospeechai.com/v1/generate/" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Tortoise takes its time, but the results are worth waiting for.",
    "voice": "tortoise-angie"
  }'
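The same request can be sent from Python with a generous timeout, since Tortoise renders slowly. This is a minimal sketch using only the standard library. The URL, headers, and the "text" and "voice" fields are taken from the curl example. The 600-second timeout is an assumed value, and the response is returned as raw bytes because this page does not describe the response format. The field that carries the quality preset is not named here either, so the sketch leaves it to an optional extra_fields dictionary that you fill in from the API documentation.

```python
import json
import urllib.request

API_URL = "https://api.texttospeechai.com/v1/generate/"  # from the curl example


def build_request(text, voice, token, extra_fields=None):
    """Build the POST request without sending it."""
    payload = {"text": text, "voice": voice}
    if extra_fields:
        # Caller-supplied fields (for example the quality preset); the field
        # names are not documented on this page, so none are assumed here.
        payload.update(extra_fields)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(text, voice, token, timeout_s=600.0, extra_fields=None):
    """Send the request and return the raw response body as bytes.

    timeout_s=600 is an assumed allowance for slow renders, not a documented limit.
    """
    req = build_request(text, voice, token, extra_fields)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.read()
```

Calling generate("Short test sentence.", "tortoise-angie", "YOUR_API_TOKEN") would block until the server responds or the timeout expires, so long jobs are better run in a background worker than in a request handler.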
Technical Specs
- Generation Speed: Very Slow
- Output Quality: Exceptional
- Voice Cloning: Supported
- Languages: 1 (English)
- GPU VRAM: 12-24 GB
- Credits per 1,000 characters: 50
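To budget a job, the credit rate above can be turned into a quick estimate. The sketch below assumes the cost scales linearly with character count. How the platform rounds partial thousands or counts whitespace is not stated here, so treat the result as approximate.

```python
CREDITS_PER_1000_CHARS = 50  # Ultra-tier rate listed for Tortoise


def estimate_credits(num_chars: int, rate: int = CREDITS_PER_1000_CHARS) -> float:
    """Estimate credits for a text of num_chars characters.

    Assumes simple linear proration; actual billing may round differently.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars * rate / 1000


print(estimate_credits(200))   # a short test passage
print(estimate_credits(2500))  # a longer script
```

Under this assumption a 200-character test costs about 10 credits and a 2,500-character script about 125.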