CosyVoice2

Premium

Multilingual zero-shot voice cloning

Speed: Fast
Quality: Very Good
Voice cloning: Yes
Languages: 5

About CosyVoice2

[...]-speech clone, and is designed to be used in [...]

Key Features

Voice Cloning

Clones a voice from 3-10 seconds of reference audio with high fidelity.

Multilingual

Supports Chinese, English, Japanese, Korean, and Cantonese, with cross-language synthesis.

Streaming Support

Low-latency streaming mode for real-time applications and interactive systems.

Natural Prosody

Improved prosody modeling produces natural-sounding speech with appropriate intonation.

Use Cases

  • Multilingual content creation
  • Real-time voice assistants
  • Cross-language dubbing
  • Custom voice applications

How to Use CosyVoice2

  1. Sign up and claim your free credits

    Create a free TextToSpeechAI account to claim your starter credits, or try the demo first. No GPU or CosyVoice2 installation is required; everything runs on our infrastructure.

  2. Select CosyVoice2 and add a reference clip

    Choose CosyVoice2 as your engine, then upload a clean 3-10 second recording of the voice you want to clone. CosyVoice2 extracts the voice characteristics for zero-shot multilingual cloning.

  3. Enter your text in any supported language

    Type or paste your script in Chinese, English, Japanese, Korean, or Cantonese. CosyVoice2 supports cross-language synthesis, so the cloned voice can speak a language different from the one in the reference clip.

  4. Generate audio

    Click Generate and CosyVoice2 synthesizes natural, multilingual speech in the cloned voice, typically within seconds for short text. Premium-tier usage costs 25 credits per 1,000 characters.

  5. Download or use the API

    Download the finished audio as MP3 or WAV from your history, or automate CosyVoice2 voice cloning at scale with the TextToSpeechAI REST API.

CosyVoice2 API

Generate speech programmatically using the TextToSpeechAI REST API.

curl -X POST "https://api.texttospeechai.com/v1/generate/" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "CosyVoice2 delivers natural multilingual speech with voice cloning capability.",
    "voice": "en_US-lessac-medium"
  }'
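
The same request can be sketched in Python using only the standard library. The endpoint, the headers, and the `text` and `voice` fields are taken from the curl example above. The helper names `build_generate_request` and `generate` are hypothetical, and because the response format is not described on this page, the sketch returns the raw response bytes instead of assuming audio or JSON.

```python
import json
import urllib.request

API_URL = "https://api.texttospeechai.com/v1/generate/"  # from the curl example


def build_generate_request(text: str, voice: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def generate(text: str, voice: str, token: str, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the response format is not documented on this page, so the
    bytes are returned unparsed for the caller to inspect.
    """
    request = build_generate_request(text, voice, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Separating request construction from sending makes it possible to inspect the payload and headers without spending credits.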

Frequently Asked Questions

CosyVoice2 is a next-generation text-to-speech and voice cloning system from FunAudioLLM (Alibaba). It supports zero-shot voice cloning from just a few seconds of reference audio and can generate natural speech in Chinese, English, Japanese, Korean, and Cantonese. On TextToSpeechAI you can use CosyVoice2 in the browser without installing any local software.

Yes, CosyVoice2 is fully Apache 2.0 licensed, covering both the code and the model weights. This makes it straightforward to use in commercial products, paid content, and client work without licensing fees or restrictive terms.

CosyVoice2 supports five languages: Chinese (Mandarin), English, Japanese, Korean, and Cantonese. It also handles cross-language synthesis, so you can clone a voice from a recording in one language and generate speech in another.

Provide 3-10 seconds of clean reference audio of the voice you want to clone. CosyVoice2 extracts the voice characteristics using a finite scalar quantization approach, then generates new speech in any supported language. No training or fine-tuning is required.
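
Before uploading, the 3-10 second window can be checked locally. The following is a minimal sketch using Python's standard `wave` module; it assumes the reference clip is an uncompressed PCM WAV file (the accepted upload formats are not listed here), and the function names are hypothetical. It checks length only, not how clean the recording is.

```python
import wave

MIN_SECONDS = 3.0   # lower bound stated above
MAX_SECONDS = 10.0  # upper bound stated above


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_reference_clip(path: str) -> bool:
    """Check that the clip length falls inside the 3-10 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```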

CosyVoice2 is one of the stronger multilingual cloning models, preserving voice identity even when generating speech in a language different from the reference audio. It carries over prosody and intonation, which makes it well suited to cross-language dubbing and localized content.

Yes. CosyVoice2 is a fast model and includes a streaming mode that produces audio with low latency, which makes it suitable for voice assistants and interactive applications. On TextToSpeechAI, generations of short text complete within a few seconds.

CosyVoice2 needs roughly 4-6 GB of VRAM for the 0.5B-parameter model, so a GPU with 6 GB or more is recommended if you self-host. On TextToSpeechAI the model runs on our GPUs, so you don't need any hardware of your own.

CosyVoice2 is a premium-tier model and costs 25 credits per 1,000 characters of text. Every new account receives free credits, so you can try CosyVoice2 cloning before considering a paid plan.
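
As a rough illustration of that rate, the sketch below estimates the credit cost of a script. The 25-credits-per-1,000-characters figure comes from the text above; how partial blocks are billed is not stated, so proportional billing is an assumption, with whole-block rounding offered as an alternative. The function name is hypothetical.

```python
import math

CREDITS_PER_1000_CHARS = 25  # premium-tier rate stated above


def estimate_credits(text: str, round_up_to_block: bool = False) -> float:
    """Estimate the credit cost of synthesizing `text`.

    Assumption: billing is proportional to character count. Set
    `round_up_to_block=True` to model billing in whole 1,000-character blocks.
    """
    chars = len(text)
    if round_up_to_block:
        return math.ceil(chars / 1000) * CREDITS_PER_1000_CHARS
    return chars * CREDITS_PER_1000_CHARS / 1000
```

For example, a 400-character script would come to 10 credits under proportional billing, or 25 credits if usage is rounded up to a full block.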

Both are top-tier voice cloning engines. GPT-SoVITS often achieves very high similarity to a single target voice, while CosyVoice2 is stronger at multilingual and cross-language cloning and includes a low-latency streaming mode. Choose CosyVoice2 when you want a single cloned voice to speak multiple languages.

Both offer high-quality voice cloning. CosyVoice2 supports more languages (5 versus 2) and includes streaming for real-time use, while F5-TTS may be slightly faster for English-only tasks. For multilingual applications, CosyVoice2 is usually the better fit.

TextToSpeechAI lets you export CosyVoice2 generations in standard formats such as MP3 and WAV. You can download the file directly from your history page or retrieve it programmatically through the TextToSpeechAI API.

Yes. You can try CosyVoice2 with the free demo and your free starter credits on TextToSpeechAI without installing anything. Just sign up, upload a short reference clip, type your text in any supported language, and generate.

Technical Specs

  • Generation Speed: Fast
  • Output Quality: Very Good
  • Voice Cloning: Supported
  • Languages: 5
  • GPU VRAM: 4-6 GB
  • Credits per 1,000 characters: 25

Try CosyVoice2 Now

Generate your first audio free. No credit card required.

Start Free