python - Increasing the speech length limit of Mozilla TTS

Question

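I'm a beginner. I downloaded the model and am trying to study it. But whenever I convert a sentence to speech, the model stops at 35 seconds or about 440 characters and gives a max_decoder_steps warning. I want to convert a story of about 1000 characters to speech. Is there a way to get around this limit?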

score 1 · Accepted Answer

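No, because the model was trained on shorter inputs. You can either train the model yourself (very time-consuming) or split the input into smaller sequences, such as sentences.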
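The splitting approach could look roughly like the sketch below. It is not part of the TTS library: split_into_chunks is a hypothetical helper, the 400-character default is only an assumption chosen to stay under the roughly 440 characters mentioned in the question, and the regex-based sentence detection is deliberately naive.

```python
import re


def split_into_chunks(text, max_chars=400):
    """Group sentences into chunks of at most max_chars characters.

    Hypothetical helper, not a TTS API. A single sentence longer than
    max_chars is kept whole rather than cut mid-sentence.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first sentence of a new chunk.
            current = candidate
        else:
            # Adding the sentence would exceed the limit: start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately and the resulting audio files joined afterwards.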

score 1 · Accepted Answer

Of course you can. I just opened tacotron2.py, searched for max_decoder_steps, and set the value to 5000 as a test. It now creates longer wav files.

score 0 · Accepted Answer

Increase the value of "max_decoder_steps".

For example, I use the Tacotron2 model.

tts --text "Hello"
 > tts_models/en/ljspeech/tacotron2-DDC is already downloaded.
 > vocoder_models/en/ljspeech/hifigan_v2 is already downloaded.
 > Using model: Tacotron2
 > Models reduction rate r is set to: 1
 > Vocoder Model: hifigan
 > Generator Model: hifigan_generator
 > Discriminator Model: hifigan_discriminator

The installed package can be found here (on Debian 10):

/home/user/.local/lib/python3.7/site-packages/TTS

We need the config file:

/home/user/.local/lib/python3.7/site-packages/TTS/tts/configs/tacotron_config.py

Change the value from

max_decoder_steps: int = 500

to

max_decoder_steps: int = 10000

Thanks to Steve Dalton.

If you are working in a virtual environment, the tacotron_config.py file is located in this folder:

.venv/lib/python3.8/site-packages/TTS/tts/configs/
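Because the install location differs between a user install and a virtual environment, the file can also be located programmatically. The following is a minimal sketch using only the Python standard library. It assumes the package is importable as TTS and that the relative path matches the ones shown above, which may not hold for other versions. find_package_file is a hypothetical helper, not part of TTS.

```python
import importlib.util
from pathlib import Path


def find_package_file(package, relative_path):
    """Return the path of a file inside an installed package, or None.

    Hypothetical helper: looks the package up in the active Python
    environment without importing it.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        # Package is not installed, or the name is a plain module.
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative_path
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Relative path taken from the answer above; it may differ between versions.
    print(find_package_file("TTS", "tts/configs/tacotron_config.py"))
```

Run it with the same interpreter that runs tts, so that it searches the same environment.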
