Pocket TTS
Standard
Ultra-lightweight voice cloning that runs in real time on CPU
About Pocket TTS
Pocket TTS by Kyutai is an ultra-lightweight, 100M-parameter text-to-speech model that runs in real time on CPU. Despite its small size, it supports voice cloning from just 5 seconds of reference audio, which makes it a good fit for edge deployment, mobile applications, and scenarios where GPU resources are limited. It currently supports English and French.
Key Features
Ultra-Lightweight
With 100M parameters, it runs in real time on CPU with minimal resources.
Voice Cloning
Clone any voice from just 5 seconds of reference audio, even on CPU.
Real-Time on CPU
No GPU required. Generates speech at real-time speed on standard hardware.
Edge-Ready
Small enough for mobile devices, Raspberry Pi, and embedded systems.
How to Use Pocket TTS
1. Sign up free or try the demo
Create a free TextToSpeechAI account to receive starter credits, or use the on-site demo to hear Pocket TTS before signing up. No GPU or local install is needed.
2. Select Pocket TTS and add a voice to clone
Choose Pocket TTS as your engine, then upload a short reference clip of about 5 to 10 seconds to clone that voice. Pocket TTS runs entirely on CPU, so cloning is fast and lightweight.
3. Enter your text
Type or paste the English or French text you want spoken. Keep an eye on the character count, since Pocket TTS bills at the standard rate of 10 credits per 1,000 characters.
4. Generate the audio
Click generate and Pocket TTS synthesizes your text in the cloned voice at real-time speed. Most clips are ready in seconds because the model is so small and CPU-efficient.
5. Download or use the API
Download the finished audio, or automate generation through the TextToSpeechAI REST API at api.texttospeechai.com using your account token. The API exposes the same Pocket TTS cloning and synthesis for your own apps.
Pocket TTS API
Generate speech programmatically using the TextToSpeechAI REST API.
curl -X POST "https://api.texttospeechai.com/v1/generate/" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Pocket TTS delivers voice cloning that runs in real-time, even on CPU.",
    "voice": "en_US-lessac-medium"
  }'
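For apps written in Python, the same request can be sketched with the standard library alone. The endpoint, headers, and the "text" and "voice" fields are copied from the curl example above. The helper names build_request and generate are hypothetical, and the response format is an assumption: this page does not describe it, so the sketch returns the raw body without decoding it.

```python
import json
import urllib.request

# Endpoint and field names are copied from the curl example above.
API_URL = "https://api.texttospeechai.com/v1/generate/"


def build_request(text: str, voice: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(text: str, voice: str, token: str, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the response format is not described on this page, so the
    body is returned as undecoded bytes for the caller to inspect.
    """
    request = build_request(text, voice, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping request construction separate from sending makes it possible to inspect or log the exact payload before spending credits on a call.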
Technical Specs
- Generation Speed: Very Fast
- Output Quality: Good
- Voice Cloning: Supported
- Languages: 2
- GPU VRAM: CPU OK
- Credits per 1,000 characters: 10
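The credit rate above lends itself to a quick cost estimate before generating audio. The sketch below is illustrative only: the function name estimate_credits is hypothetical, and it assumes that cost scales linearly with character count and that partial credits round up, neither of which this page states.

```python
CREDITS_PER_1000_CHARS = 10  # standard rate quoted on this page


def estimate_credits(text: str) -> int:
    """Estimate the credit cost of synthesizing `text`.

    Assumption: cost is proportional to character count and any fractional
    credit rounds up. The page does not state the actual rounding rule.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return (len(text) * CREDITS_PER_1000_CHARS + 999) // 1000
```

Under these assumptions, a 2,500-character script would cost an estimated 25 credits.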