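I am starting to learn how to use the Google APIs by modifying the Python sample code for the texttospeech API. I found a problem: when I pass text to the API as SSML read from a .txt file, the resulting MP3 audio replaces the character 'é' with the phrase 'derechos de autor' and renders the character 'á' as silence. This only happens when I read the text from a file; if I pass the SSML string directly to the application as an argument when calling it, the change does not occur. I searched for this problem but found nothing. Does anyone have a hint about what is happening here?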

This is the function that takes the SSML text from the console and creates the correct MP3 audio file:

def synthesize_ssml(ssml, output):
    from google.cloud import texttospeech as texttospeech   
    client = texttospeech.TextToSpeechClient()
    input_text = texttospeech.types.SynthesisInput(ssml=ssml)
    voice = texttospeech.types.VoiceSelectionParams(language_code='es-ES')
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)
    response = client.synthesize_speech(input_text, voice, audio_config)
    with open(output, 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file "%s"' % output)

This is the function that takes the SSML from a file; the same text produces a different audio file:

def synthesize_ssml_file(input, output):
    from google.cloud import texttospeech as texttospeech   
    with open(input,'r') as inp:
        input_text=texttospeech.types.SynthesisInput(ssml=str(inp.read()))
    client = texttospeech.TextToSpeechClient()
    voice = texttospeech.types.VoiceSelectionParams(language_code='es-ES')
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)
    response = client.synthesize_speech(input_text, voice, audio_config)
    with open(output, 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file "%s"' % output)
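
One possible explanation, not confirmed here, is a text-encoding mismatch: if the file is saved as UTF-8 but decoded with a single-byte codec such as Latin-1, 'é' turns into 'Ã©', and a Spanish voice could plausibly read the '©' sign aloud as "derechos de autor". The sketch below is only an assumption-based illustration; `read_ssml` and `mojibake` are hypothetical helper names, and it assumes the .txt file is actually UTF-8.

```python
import io


def read_ssml(path, encoding="utf-8"):
    """Read an SSML file as text, decoding it with an explicit encoding.

    Assumption: the file on disk is UTF-8. Pass another codec if it is not.
    """
    # io.open accepts the encoding argument on both Python 2 and Python 3
    # and always returns text (unicode), never raw bytes.
    with io.open(path, "r", encoding=encoding) as inp:
        return inp.read()


def mojibake(text, wrong_encoding="latin-1"):
    """Show what UTF-8 text looks like when decoded with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)
```

If this is the cause, building the request in `synthesize_ssml_file` with `texttospeech.types.SynthesisInput(ssml=read_ssml(input))` instead of `str(inp.read())` may be enough to make both functions produce the same audio. Calling `mojibake(u"é")` shows the two-character sequence that a wrongly decoded file would send to the API.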