WeClone:用微信聊天记录和语音训练数字分身

WeClone: training digital doppelgangers with WeChat chats and voices

Comprehensive introduction WeClone is an open source project that uses WeChat chat logs and voice messages, combined with large language models and speech synthesis technology, to allow users to create personalized digital doppelgangers. The project can analyze the user's chat habits to train the model , but also a small number of voice samples to generate realistic sound...
3mos ago
0599
中文基于满血 DeepSeek-R1 蒸馏数据集,支持中文R1蒸馏SFT数据集

Chinese based full-blooded DeepSeek-R1 distillation dataset, supports Chinese R1 distillation SFT dataset

Comprehensive Introduction The Chinese DeepSeek-R1 distillation dataset is an open source Chinese dataset containing 110K pieces of data designed to support machine learning and natural language processing research. The dataset is released by Cong Liu's NLP team. The dataset contains not only mathematical data, but also a large number of general types...
5mos ago
0836