MeloTTS

Standard

Fast multi-accent TTS with natural prosody

Very Fast Speed
Good Quality
No Cloning
6 Languages

About MeloTTS

MeloTTS is a fast, multi-accent text-to-speech model from MyShell AI. It supports multiple languages with authentic accent variations for English (American, British, Indian, Australian). MeloTTS runs at real-time speed on CPU, making it efficient for production deployments.

Key Features

Multi-Accent

Multiple English accents: American, British, Indian, and Australian.

CPU Real-Time

Fast enough for real-time synthesis on CPU, with no GPU required.

6 Languages

Supports English, Spanish, French, Chinese, Japanese, and Korean.

Speed Control

Adjustable speaking speed for fine-tuned output.
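
For readers curious how accent selection and speed control look when the open-source model is run directly rather than through TextToSpeechAI, here is a minimal sketch. It assumes the `melo` Python package from the MeloTTS repository is installed, and that the speaker keys (`EN-US`, `EN-BR`, `EN_INDIA`, `EN-AU`) match those listed in that project's README. The helper names `speaker_key` and `synthesize` are hypothetical and exist only for this illustration.

```python
# Sketch only: assumes the open-source `melo` package is installed.
# It is imported lazily so the accent lookup works without it.

# Speaker keys as listed in the MeloTTS README (assumed unchanged).
ENGLISH_ACCENTS = {
    "american": "EN-US",
    "british": "EN-BR",
    "indian": "EN_INDIA",
    "australian": "EN-AU",
}


def speaker_key(accent: str) -> str:
    """Map a human-readable accent name to a MeloTTS speaker key."""
    try:
        return ENGLISH_ACCENTS[accent.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown accent: {accent!r}") from None


def synthesize(text: str, accent: str = "american",
               speed: float = 1.0, output_path: str = "out.wav") -> str:
    """Synthesize English speech on CPU and write it to output_path."""
    from melo.api import TTS  # lazy import: requires MeloTTS to be installed

    model = TTS(language="EN", device="cpu")
    speaker_ids = model.hps.data.spk2id
    model.tts_to_file(text, speaker_ids[speaker_key(accent)],
                      output_path, speed=speed)
    return output_path
```

Calling `synthesize("Hello there.", accent="british", speed=0.9)` would write a slightly slowed British-accent clip to `out.wav`, provided the package and model weights are available locally.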

Use Cases

  • Multi-accent voice applications
  • International content localization
  • Real-time voice assistants
  • Audiobook production with accent variety

How to Use MeloTTS

  1. Sign up free or try the demo

    Create a free TextToSpeechAI account to receive starter credits, or use the no-signup demo on the homepage to test MeloTTS instantly. Free credits are enough to evaluate several MeloTTS accents before you commit.

  2. Pick a MeloTTS accent and voice

    Open the voice browser and filter to MeloTTS. Choose the accent that fits your audience, such as American, British, Indian, or Australian English, or a native Spanish, French, Chinese, Japanese, or Korean voice.

  3. Enter your text

    Type or paste the script you want voiced into the text box. MeloTTS handles natural prosody automatically, and you can adjust the speaking speed to fine-tune pacing for your chosen accent.

  4. Generate the audio

    Click generate and MeloTTS synthesizes your speech in real time. Because it runs efficiently on CPU, results come back quickly even for longer passages, and the job costs 10 credits per 1,000 characters.

  5. Download or use the API

    Play back the result, then download the audio file in your preferred format from the history page. To automate MeloTTS in your own app, call the TextToSpeechAI REST API at api.texttospeechai.com using your account API token.

MeloTTS API

Generate speech programmatically using the TextToSpeechAI REST API.

curl -X POST "https://api.texttospeechai.com/v1/generate/" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "MeloTTS speaks naturally with authentic accents from around the world.",
    "voice": "en_US-lessac-medium"
  }'
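
The same request can be sketched in Python using only the standard library. The endpoint, headers, and JSON fields below are copied from the curl example above. The function names `build_request` and `generate` are hypothetical, and because this page does not document the response format, the sketch simply returns the raw response bytes for the caller to inspect.

```python
import json
import urllib.request

API_URL = "https://api.texttospeechai.com/v1/generate/"


def build_request(text: str, voice: str, token: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def generate(text: str, voice: str, token: str, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body.

    The response format is an unknown here, so the bytes are returned
    as-is rather than parsed.
    """
    request = build_request(text, voice, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping request construction separate from sending makes it easy to check the payload before spending credits.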

Frequently Asked Questions

What is MeloTTS?

MeloTTS is a fast text-to-speech model from MyShell AI that specializes in multi-accent speech synthesis. It supports multiple languages with several accent variations for English, producing natural prosody at real-time speed.

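Is MeloTTS free for commercial use?

Yes. MeloTTS is released under the MIT license, covering both the code and the model weights. You can use it in commercial products without royalties; the MIT license only asks that its copyright and permission notice be kept in copies of the software.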

Which accents does MeloTTS support?

MeloTTS supports American, British, Indian, and Australian English accents. It also includes native voices for Spanish, French, Chinese, Japanese, and Korean, making it well suited to international applications.

Which languages does MeloTTS support?

MeloTTS covers six languages: English, Spanish, French, Chinese, Japanese, and Korean. The English voices add regional accents on top of the base language, so a single model handles many markets.

Can MeloTTS generate speech in real time?

Yes. MeloTTS is designed for real-time synthesis and generates speech faster than playback even on CPU. This makes it a strong fit for live voice assistants, chatbots, and streaming applications.
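
"Faster than playback" is usually expressed as a real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1.0 means real time. The sketch below shows the arithmetic. The function names are hypothetical, and the figures in the usage note are illustrative, not measured MeloTTS benchmarks.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def is_real_time(synthesis_seconds: float, audio_seconds: float) -> bool:
    """True when audio is generated faster than it takes to play back."""
    return real_time_factor(synthesis_seconds, audio_seconds) < 1.0
```

For example, an engine that takes 2 seconds to produce a 10-second clip has an RTF of 0.2 and is comfortably real time, while one that takes 12 seconds for the same clip is not.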

How good is MeloTTS audio quality?

MeloTTS produces good, natural-sounding speech with clear prosody and accurate accents. It prioritizes speed and accent variety over the higher fidelity of slower models like StyleTTS2 or Tortoise, so it is the better choice when responsiveness matters most.

Does MeloTTS support voice cloning?

No, MeloTTS does not clone voices. It uses a fixed set of preset speakers and accents. For voice cloning on TextToSpeechAI, use F5-TTS, Chatterbox, CosyVoice2, OpenVoice, StyleTTS2, or Tortoise instead.

Does MeloTTS need a GPU?

No GPU is required. MeloTTS runs comfortably on CPU using roughly 500 MB of memory and remains real-time. A GPU is optional and only adds extra speed; about 500 MB of VRAM is enough if you choose to use one.

How much does MeloTTS cost on TextToSpeechAI?

MeloTTS is a standard-tier engine on TextToSpeechAI, billed at 10 credits per 1,000 characters. That is the lowest pricing tier, matching other lightweight CPU models like Piper, VITS, and Kokoro.
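
To budget a script before generating it, the quoted rate can be turned into a quick estimate. This sketch assumes billing is pro-rata on the raw character count (spaces and punctuation included) and rounds up to a whole credit. Neither rule is stated on this page, so treat the result as an approximation; `estimate_credits` is a hypothetical helper name.

```python
CREDITS_PER_1000_CHARS = 10  # standard-tier rate quoted on this page


def estimate_credits(text: str, rate: int = CREDITS_PER_1000_CHARS) -> int:
    """Estimate credits for a script, rounding up to a whole credit.

    Assumes pro-rata billing on len(text); the real rounding and
    character-counting rules may differ.
    """
    chars = len(text)
    # Integer ceiling of chars * rate / 1000, avoiding float rounding.
    return (chars * rate + 999) // 1000
```

Under these assumptions a 2,500-character script comes to 25 credits.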

MeloTTS vs Kokoro: which should I choose?

Both MeloTTS and Kokoro are fast, MIT/Apache-licensed CPU models at the standard credit tier. Choose MeloTTS when you need distinct English accents (American, British, Indian, Australian); choose Kokoro for its broad multilingual voice variety. Both are easy to A/B test on TextToSpeechAI.

MeloTTS vs Piper: which should I choose?

MeloTTS excels at accent variety and multilingual coverage, while Piper offers the largest preset voice library. Both are fast and CPU-capable at the standard tier, so pick MeloTTS for accent-specific projects and Piper when you want the widest selection of distinct voices.

Can I try MeloTTS for free?

Yes. New TextToSpeechAI accounts include free starter credits, and there is a demo you can use without signing up. That is enough to test MeloTTS accents and voices before buying additional credits or subscribing.

Technical Specs

  • Generation Speed: Very Fast
  • Output Quality: Good
  • Voice Cloning: Not Supported
  • Languages: 6
  • GPU VRAM: Not required (CPU OK)
  • Credits per 1,000 characters: 10

Try MeloTTS Now

Generate your first audio free. No credit card required.
