python - Google Cloud Speech-To-Text API response does not return words

Question

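I am trying to implement speech-to-text in my application using the Google Cloud Speech-To-Text API with Python. I get the correct transcription, but the response contains only the transcript and the confidence, not the individual words. If I try to access the words, I get an empty list.

To access the results, I use the following code: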

best_alternative = result.alternatives[0]
word = best_alternative
transcript = best_alternative.transcript
confidence = best_alternative.confidence
print(f'Transcript: {transcript}')
print(f'Confidence: {confidence:.0%}')

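Printing best_alternative.__dict__ gives me the transcript and the confidence, but not the words. Is there a special way to access the words in the transcript, or am I missing something?

Update: Initially, I was initializing the recognition config as follows: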

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=RATE,
    language_code=lan_code)
streaming_config = speech.StreamingRecognitionConfig(
        config=config,
        interim_results=True,
        enable_speaker_diarization=True)

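With this configuration, the returned response contained no words, only the transcript and the confidence. I then changed the configuration to: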

config = speech.RecognitionConfig()
config.sample_rate_hertz = 16000
config.language_code = 'en-US'
config.encoding = speech.RecognitionConfig.AudioEncoding.LINEAR16
config.enable_speaker_diarization = True

This finally gave me the words along with the transcript and the confidence. The words can be accessed with:

response.results[0].alternatives[0].words[i].word

score 0 · Accepted Answer

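According to the Cloud Speech-to-Text API REST documentation, the speech.recognize method returns a speech recognition response containing a SpeechRecognitionResult for each entry in results[], whereas SpeechRecognitionAlternative holds the transcript, confidence and words[] for a particular hypothesis.

Looking at the implementation of the Python google-cloud-speech library, I assume that the actual SpeechRecognitionAlternative() class exposes words, a list of word-specific WordInfo entries, one for each recognized word.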

print("Words: {}".format(result.alternatives[0].words[0].word))
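
To gather every word rather than only the first one, the results[] / alternatives[] / words[] nesting described above can be walked in a loop. The following is only a minimal sketch: collect_words is a hypothetical helper name, and fake_response is a hand-built stand-in with the same attribute layout, so the snippet runs without credentials or the google-cloud-speech package. A real response would come from the client instead.

```python
from types import SimpleNamespace


def collect_words(response):
    """Return the words of the top alternative of every result, in order."""
    words = []
    for result in response.results:
        if not result.alternatives:
            continue  # nothing recognized for this result
        best = result.alternatives[0]
        # best.words may be empty, as the question observed with its
        # first configuration; extend() then simply adds nothing.
        words.extend(info.word for info in best.words)
    return words


# Hand-built stand-in (an assumption for illustration, not real API output).
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[
        SimpleNamespace(
            transcript="hello world",
            confidence=0.93,
            words=[SimpleNamespace(word="hello"),
                   SimpleNamespace(word="world")],
        ),
    ]),
    SimpleNamespace(alternatives=[
        SimpleNamespace(transcript="bye", confidence=0.80, words=[]),
    ]),
])

print("Words: {}".format(collect_words(fake_response)))
```

Because the helper relies only on attribute access, it should also accept a response object from the library, provided that object has the shape the documentation describes. An empty return value means the alternatives carried no word-level information.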
