Alibaba DAMO Academy

CosyVoice — Try Alibaba's Open-Source Voice Cloning TTS Online

CosyVoice is an open-source multilingual TTS model from Alibaba DAMO Academy. It supports zero-shot voice cloning, cross-lingual synthesis, and fine-grained emotion control, which makes it one of the more versatile open-source TTS models available.

Features

✓ Zero-shot voice cloning
✓ Cross-lingual voice transfer
✓ Fine-grained emotion and style control
✓ Open-source (Apache 2.0)
✓ Instruction-based speech generation
✓ Natural prosody in Chinese and English

Use cases

→ Zero-shot cloning experiments
→ Cross-lingual dubbing
→ Research and development
→ Expressive storytelling

Language support

10+

Chinese, English, Japanese, Cantonese, Korean & more

Comparison

Model	Rating	Speed	Languages	Voice cloning	Pricing
Fish Speech (CosyVoice)	★★★★	Fast	10+	✓ Zero-shot	Free tier + from $9/mo
Fish Audio	★★★★★	Ultra-fast	40+	✓	Free tier + from $9/mo
IndexTTS	★★★★★	Medium	10+	✓	Free tier + from $9/mo
ElevenLabs	★★★★★	Fast	32	✓ Paid only	From $5/mo (limited)

Frequently asked questions

What is CosyVoice?

CosyVoice is an open-source multilingual TTS model from Alibaba DAMO Academy. It supports zero-shot voice cloning, cross-lingual synthesis, and instruction-based speech generation.

What makes CosyVoice different from other TTS models?

CosyVoice supports zero-shot voice cloning (clone a voice without fine-tuning) and cross-lingual transfer (speak in a different language while preserving the original voice characteristics).

Is CosyVoice free to use?

Yes. CosyVoice is open-source under Apache 2.0. You can try it for free on Fish Speech without any setup.

How do I try CosyVoice online?

Go to Fish Speech, create a free account, open the workspace, and select CosyVoice as your model. No GPU or API key required.
