
GPT‐SoVITS‐v2‐features (New Features)

RVC-Boss edited this page Aug 7, 2024 · 4 revisions

1-v2 compared with v1

| Version | Language support (cross-language synthesis between any of them) | GPT training set duration | SoVITS training set duration | Inference speed | Number of parameters | Text frontend | Features |
|---|---|---|---|---|---|---|---|
| v1 (released in January) | Chinese, Japanese, English | 2k hours | 2k hours | baseline | 200M | baseline | baseline |
| v2 | Chinese, Japanese, English, Korean, Cantonese | 2.5k hours | VQ encoder: 2k hours; remaining parameters: 5k hours | doubled | unchanged | improved logic for Chinese, Japanese, and English | added speech-rate control, a mode without reference text, and better mixed-language segmentation |

2-v2 Model New Features

(1) SoVITS: Better synthesis quality when the reference audio is low quality, especially audio taken from the internet that has lost most of its high frequencies and sounds muffled.

(2) Larger training set: Expanded to 5k hours, which improves zero-shot performance and makes the synthesized timbre more similar to the reference voice.

(3) Two added languages: v2 supports cross-language synthesis among five languages. Cross-language synthesis means that the language of the training set, the language of the reference audio, and the language to be synthesized can all differ.

(4) Improved text frontend: Updated continuously. In v2, the handling of polyphonic characters and words (those with more than one pronunciation) has been optimized for Chinese and English.
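The language coverage in item (3) and in the section 1 comparison can be sketched as a small lookup. This is only an illustration: the plain-English language labels, the `SUPPORTED` table, and both function names are assumptions made for this example and are not identifiers from the GPT-SoVITS code.

```python
# Language sets copied from the v1/v2 comparison in section 1.
# The labels are illustrative, not the codes used by GPT-SoVITS itself.
SUPPORTED = {
    "v1": {"Chinese", "Japanese", "English"},
    "v2": {"Chinese", "Japanese", "English", "Korean", "Cantonese"},
}


def is_cross_language(reference_lang, target_lang):
    """True when the reference audio and the text to synthesize differ in language."""
    return reference_lang != target_lang


def can_synthesize(version, reference_lang, target_lang):
    """True when both languages are covered by the given model version."""
    langs = SUPPORTED[version]
    return reference_lang in langs and target_lang in langs
```

For example, `can_synthesize("v2", "Korean", "Cantonese")` is true, while the same pair is not covered by v1.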

3-How to use v2

(1) Download the 7z package directly (huggingface).

(2) Or migrate from an existing v1 environment to v2:

1. Run pip install -r requirements.txt to update the packages in your environment.

2. Clone the latest code from GitHub.

3. Download the v2 pretrained models from huggingface and put them into GPT_SoVITS\pretrained_models\gsv-v2final-pretrained.

For Chinese, v2 additionally needs G2PWModel_1.1.zip: download the G2PW models, unzip the archive, rename the extracted folder to G2PWModel, and place it in GPT_SoVITS\text.
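The file placement in the migration steps above can be sketched in Python as follows. This is a minimal sketch, not part of the project: the helper names `install_g2pw` and `missing_v2_paths` are hypothetical, and the code assumes that the G2PW archive unpacks to exactly one top-level folder. It uses `pathlib`, so the paths work with either slash direction.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Relative locations named in the migration steps.
PRETRAINED_DIR = Path("GPT_SoVITS") / "pretrained_models" / "gsv-v2final-pretrained"
G2PW_DIR = Path("GPT_SoVITS") / "text" / "G2PWModel"


def install_g2pw(zip_path, repo_root):
    """Unzip the G2PW archive and place it at GPT_SoVITS/text/G2PWModel.

    Assumption: the archive unpacks to exactly one top-level folder.
    """
    target = Path(repo_root) / G2PW_DIR
    if target.exists():
        return target  # already installed, leave it untouched
    target.parent.mkdir(parents=True, exist_ok=True)
    # Extract next to the target so the final move is a simple rename.
    with tempfile.TemporaryDirectory(dir=target.parent) as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        entries = list(Path(tmp).iterdir())
        if len(entries) != 1 or not entries[0].is_dir():
            raise ValueError("expected a single top-level folder in the archive")
        shutil.move(str(entries[0]), str(target))
    return target


def missing_v2_paths(repo_root, need_chinese=True):
    """Return the expected v2 folders that are absent or empty under repo_root."""
    expected = [PRETRAINED_DIR]
    if need_chinese:
        expected.append(G2PW_DIR)
    missing = []
    for rel in expected:
        path = Path(repo_root) / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel.as_posix())
    return missing
```

Calling `install_g2pw("G2PWModel_1.1.zip", ".")` from the repository root and then `missing_v2_paths(".")` should return an empty list once the pretrained models are also in place. The check only confirms that the folders exist and are non-empty; it does not validate the model files themselves.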