CosyVoice: an open-source project from Alibaba for rapid voice cloning from 3 seconds of audio, with support for emotion control tags
Comprehensive Introduction
CosyVoice is a large multilingual speech generation model that provides full-stack capabilities spanning inference, training, and deployment. Developed by the FunAudioLLM team, it aims to generate high-quality speech using advanced autoregressive transformers and ODE-based diffusion models...