RADTTS Open-Source Project User Guide

radtts: Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over Low Dimensional (F0 and Energy) Speech Attributes. Project address: https://gitcode.com/gh_mirrors/ra/radtts

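Project Introduction

RADTTS is a text-to-speech (TTS) framework based on normalizing flows. It offers state-of-the-art acoustic fidelity and a highly robust audio-transcription alignment module. Developed by NVIDIA, the project provides training and inference scripts, along with voice conversion recipes, for the RADTTS and RADTTS++ models. These models support diverse synthesis and generative modeling, with fine-grained control over low-dimensional speech attributes such as F0 and energy.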

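Quick Start

Environment Setup

Before starting, make sure the required dependencies, including Python and the related libraries, are installed in your environment. The required Python packages can be installed with the following command: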

pip install -r requirements.txt
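
After installation, it can be useful to confirm that every listed package is actually present. The following is a minimal sketch, not part of the RADTTS repository: the helper names parse_requirement_names and missing_packages are hypothetical, and the parser only handles simple "name==version" style lines (URL or VCS requirements are not supported).

```python
from importlib import metadata
from pathlib import Path
import re


def parse_requirement_names(text):
    """Return the distribution names found in requirements-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line or line.startswith("-"):    # skip blanks and pip options
            continue
        # The name is the leading run of characters before any specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # assumed to be in the working directory
    if req_file.exists():
        absent = missing_packages(parse_requirement_names(req_file.read_text()))
        print("missing packages:", absent)
    else:
        print("requirements.txt not found in the current directory")
```

Run it from the repository root; an empty list means every listed package was found.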

Training the Model

The basic steps for training a RADTTS model are as follows:

Train the decoder:

python train.py -c config_ljs_radtts.json -p train_config.output_directory=outdir
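
To illustrate what a -p override does, here is a minimal sketch of applying "section.key=value" strings to a nested configuration dictionary. It assumes the overrides use dotted keys, and it is not the repository's own implementation: the function apply_overrides and the sample dictionary contents are hypothetical.

```python
import ast
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with 'section.key=value' overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        try:
            value = ast.literal_eval(raw_value)   # numbers, lists, booleans
        except (ValueError, SyntaxError):
            value = raw_value                     # otherwise keep the string
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node[name]                     # KeyError if a section is missing
        node[leaf] = value
    return result


# Illustrative config; the real JSON file has many more fields.
config = {
    "train_config": {"output_directory": "", "warmstart_checkpoint_path": ""},
    "model_config": {"include_modules": "placeholder"},
}
updated = apply_overrides(config, ["train_config.output_directory=outdir"])
print(updated["train_config"]["output_directory"])
```

The sketch copies the dictionary before modifying it, so the values loaded from the JSON file stay available for comparison.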

Further train the model with the duration predictor:

python train.py -c config_ljs_radtts.json -p train_config.output_directory=outdir_dir train_config.warmstart_checkpoint_path=model_path.pt model_config.include_modules="decatndur"
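
The two training stages can also be scripted. The sketch below only assembles and prints the two command lines; it assumes the -p overrides use the dotted "section.key=value" form, and the helper build_train_command is hypothetical. The paths outdir, outdir_dir and model_path.pt are the placeholders used in this guide.

```python
import shlex
import sys


def build_train_command(config_path, overrides):
    """Build the argv list for one train.py run with -p overrides."""
    params = [f"{key}={value}" for key, value in overrides.items()]
    return [sys.executable, "train.py", "-c", config_path, "-p", *params]


# Stage 1: train the decoder.
stage1 = build_train_command(
    "config_ljs_radtts.json",
    {"train_config.output_directory": "outdir"},
)

# Stage 2: warm-start from a stage 1 checkpoint and add the duration predictor.
stage2 = build_train_command(
    "config_ljs_radtts.json",
    {
        "train_config.output_directory": "outdir_dir",
        "train_config.warmstart_checkpoint_path": "model_path.pt",
        "model_config.include_modules": "decatndur",
    },
)

for command in (stage1, stage2):
    print(shlex.join(command))
    # To actually launch training from inside a radtts checkout:
    # import subprocess; subprocess.run(command, check=True)
```

Replace model_path.pt with the checkpoint produced by the first stage before launching the second.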