Dia

Ultra

Dialogue-focused TTS with voice cloning and nonverbal sounds

Speed: Medium
Quality: Excellent
Voice cloning: Yes
Languages: 1

About Dia

Dia is a 1.6B parameter text-to-speech model from Nari Labs, built to generate natural-sounding dialogue.

Key Features

Dialogue generation

Generate natural multi-speaker conversations with distinct voices and turn-taking.

Nonverbal sounds

Add [laughs], [sighs], [coughs], or (gasps) to make speech sound natural.

Voice cloning

Clone a voice from 5 to 10 seconds of reference audio to keep a speaker consistent.

Natural conversation

The 1.6B parameter model produces natural-sounding conversational speech.

Use Cases

  • Chatbots and conversational systems
  • Multi-character audiobook production
  • Game character voices
  • Podcasts and content creation

How to Use Dia

  1. Sign up for free or open the demo

    Create a free TextToSpeechAI account to claim your starter credits, or open the no-signup demo to try Dia dialogue right away.

  2. Choose the Dia engine

    In the TTS dashboard, select Dia from the engine list. Dia is a dialogue-oriented, ultra-tier model with multi-speaker and voice-cloning support.

  3. Write a dialogue script with tags

    Write your dialogue using [S1] and [S2] to mark each speaker's lines, then add nonverbal cues such as [laughs], [sighs], [coughs], or (gasps) wherever you want a natural effect.

  4. Generate the audio

    Click Generate to send your Dia script to our hosted GPUs. Dia renders the two-speaker dialogue, with turn-taking and your nonverbal tags, into a single audio file.

  5. Download or call the API

    Download the finished dialogue audio in the format you chose, or automate the process by posting the same [S1]/[S2] script to the TextToSpeechAI API with your account.
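
The script format described in the steps above can be assembled in code before it is pasted into the dashboard or sent to the API. The following is a minimal Python sketch, not TextToSpeechAI tooling: the helper name `build_dia_script` and the list-of-pairs input are assumptions made for illustration, and only the [S1]/[S2] tags and the nonverbal cues come from this page.

```python
from typing import Iterable, Tuple

# Speaker tags described on this page; the helper itself is hypothetical.
SPEAKER_TAGS = ("S1", "S2")


def build_dia_script(turns: Iterable[Tuple[str, str]]) -> str:
    """Join (speaker, line) pairs into one tagged script string."""
    parts = []
    for speaker, line in turns:
        if speaker not in SPEAKER_TAGS:
            raise ValueError(f"unknown speaker tag: {speaker!r}")
        # Each turn becomes "[S1] text" or "[S2] text".
        parts.append(f"[{speaker}] {line.strip()}")
    return " ".join(parts)


script = build_dia_script([
    ("S1", "Hello! How are you today? [laughs]"),
    ("S2", "I'm doing great, thanks for asking!"),
])
print(script)
```

Keeping turns as structured pairs until the last moment makes it harder to forget a speaker tag when a script is edited.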

Dia API

Generate speech programmatically using the TextToSpeechAI REST API.

curl -X POST "https://api.texttospeechai.com/v1/generate/" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "[S1] Hello! How are you today? [laughs] [S2] I am doing great, thanks for asking!",
    "voice": "en_US-lessac-medium"
  }'
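
For callers working in Python, the same request can be prepared with the standard library. This is a sketch under assumptions: it reuses the endpoint, headers, and JSON fields from the curl example, the function name `build_generate_request` is hypothetical, and the response format is not documented on this page, so the actual send is left commented out.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.texttospeechai.com/v1/generate/"


def build_generate_request(text: str, voice: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with a JSON body."""
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_generate_request(
    "[S1] Hello! How are you today? [laughs] [S2] I am doing great, thanks for asking!",
    "en_US-lessac-medium",  # voice value copied from the curl example
    "YOUR_API_TOKEN",
)

# Sending is left to the caller because the response shape is an unknown here:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```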

Frequently Asked Questions

Dia is a 1.6B parameter, dialogue-oriented text-to-speech model from Nari Labs. It specializes in generating natural conversational speech, with support for multiple speakers, nonverbal sounds, and voice cloning.

Yes, Dia is fully licensed under Apache 2.0, covering both the code and the model weights. It can be used freely in commercial applications.

Dia currently supports English only. The model is tuned for natural conversational English.

Dia needs about 10GB of VRAM for its 1.6B parameter model. A GPU with 12GB or more is recommended for comfortable operation. On TextToSpeechAI all of this runs on our hosted GPUs, so you do not need any hardware of your own.

Yes, dialogue is what Dia was built for. By alternating [S1] and [S2] tags in your script, Dia TTS produces a continuous two-speaker conversation with distinct voices and clear turn-taking, which is hard to achieve with a single-speaker TTS.

Prefix each line of your script with [S1] or [S2] to mark who is speaking. Dia assigns a consistent voice to each tag and switches between them as the conversation progresses, so [S1] and [S2] act as the two characters in your dialogue.
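
To check that a script is tagged the way this answer describes, it can be split back into speaker turns. The sketch below is illustrative only: `split_turns` is a hypothetical helper, and it assumes the tags are written exactly as [S1] and [S2].

```python
import re
from typing import List, Tuple

# Matches a speaker tag such as [S1] or [S2]; cues like [laughs] do not match.
_SPEAKER_TAG = re.compile(r"\[(S[12])\]")


def split_turns(script: str) -> List[Tuple[str, str]]:
    """Split a tagged script into (speaker, text) turns, in order."""
    pieces = _SPEAKER_TAG.split(script)
    # With one capture group, re.split yields [prefix, tag, text, tag, text, ...].
    if pieces[0].strip():
        raise ValueError("script must start with [S1] or [S2]")
    return [(pieces[i], pieces[i + 1].strip()) for i in range(1, len(pieces), 2)]


turns = split_turns(
    "[S1] Did you hear the news? [S2] Not yet. (gasps) Tell me! [S1] We shipped."
)
for speaker, text in turns:
    print(speaker, "->", text)
```

Text that appears before the first tag has no speaker, which is why the sketch rejects it instead of guessing.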

Yes. Dia supports voice cloning from 5 to 10 seconds of reference audio, which lets you reuse a specific speaker's voice. You can combine cloning with [S1]/[S2] tags so that the dialogue is spoken in the cloned voice.
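
Before uploading a reference clip, its length can be checked against the 5 to 10 second range quoted above. This sketch assumes the clip is an uncompressed WAV file, which this page does not state; the function names are hypothetical, and the accepted upload formats should be confirmed in the dashboard.

```python
import wave

# Range quoted on this page for reference audio, in seconds.
MIN_SECONDS = 5.0
MAX_SECONDS = 10.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file from its frame count and sample rate."""
    with wave.open(path, "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def is_suitable_reference(path: str) -> bool:
    """True when the clip length falls inside the quoted 5 to 10 second range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

For example, `is_suitable_reference("speaker.wav")` reports whether a local clip (a hypothetical file name) is within range.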

Dia renders [laughs], [sighs], [coughs], and (gasps) as natural nonverbal sounds woven into the spoken audio. Insert a tag where you want the effect, for example "[S1] That's hilarious [laughs]", to make the conversation sound more human.
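
A quick check can catch misspelled cues before a script is submitted. The sketch below only knows the four cues listed on this page; the model may accept others, so treat a reported cue as something to double-check, not as an error. The helper name `unknown_cues` is hypothetical.

```python
import re
from typing import List

# Cues listed on this page, written exactly as they appear there.
KNOWN_CUES = {"[laughs]", "[sighs]", "[coughs]", "(gasps)"}
SPEAKER_TAGS = {"[S1]", "[S2]"}
# Any single word in square brackets or parentheses, e.g. [laughs] or (gasps).
_CUE_PATTERN = re.compile(r"\[\w+\]|\(\w+\)")


def unknown_cues(script: str) -> List[str]:
    """Return bracketed words that are neither speaker tags nor listed cues."""
    found = _CUE_PATTERN.findall(script)
    return [cue for cue in found if cue not in KNOWN_CUES and cue not in SPEAKER_TAGS]


print(unknown_cues("[S1] That's hilarious [laughs] [S2] Oh no (gasps) [giggles]"))
```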

Both Dia and Bark support nonverbal cues, but Dia is purpose-built for multi-speaker dialogue, with [S1]/[S2] turn-taking and voice cloning. Choose Dia for multi-speaker dialogue and character work; Bark is a better fit when you need broad language coverage for single-voice narration.

Dia is an ultra-tier engine, so it costs 50 credits per 1,000 characters of generated speech. The higher rate reflects the larger 1.6B parameter model and the roughly 10GB of GPU memory it uses to produce high-quality dialogue.
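
The rate listed under Technical Specs (50 credits per 1,000 characters) makes it easy to estimate a script's cost up front. The sketch below rests on two assumptions this page does not confirm: that every character counts, tags and spaces included, and that partial credits round up. The helper name `estimate_credits` is hypothetical.

```python
import math

# Rate listed on this page for Dia.
CREDITS_PER_1000_CHARS = 50


def estimate_credits(text: str) -> int:
    """Estimate credits for a script; assumes all characters are billed."""
    # Assumption: fractional credits are rounded up to the next whole credit.
    return math.ceil(len(text) * CREDITS_PER_1000_CHARS / 1000)


script = "[S1] Hello! How are you today? [laughs] [S2] I'm doing great, thanks for asking!"
print(len(script), "characters, about", estimate_credits(script), "credits")
```

Actual billing may differ, so compare the estimate against the credit balance shown in your account.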

Yes. New TextToSpeechAI accounts include free credits, and there is a demo you can use without signing up. That is enough to render a short Dia dialogue with [S1]/[S2] tags before choosing a paid plan.

Yes. Once you have an API token from your account page, you can send Dia dialogue scripts, including [S1]/[S2] markers and cues such as [laughs], to the TextToSpeechAI REST API and retrieve the resulting audio programmatically.

Technical Specs

  • Generation Speed: Medium
  • Output Quality: Excellent
  • Voice Cloning: Supported
  • Languages: 1
  • GPU VRAM: 10GB
  • Credits/1000 chars: 50

Try Dia Now

Generate your first audio free. No credit card required.

Start Free