google-cloud-speech - Why can't my Python script recognize speech from an audio file?

Question

The following code successfully recognizes a short (under 1 minute) test audio file, but fails on another, long audio file (1.5 h).

from google.cloud import speech


def run_quickstart():
    speech_client = speech.Client()
    sample = speech_client.sample(source_uri="gs://linear-arena-2109/zoom0070.flac", encoding=speech.Encoding.FLAC)
    alternatives = sample.recognize('uk-UA')
    for alternative in alternatives:
        print(u'Transcript: {}'.format(alternative.transcript))

    with open("Output.txt", "w") as text_file:
        for alternative in alternatives:
            text_file.write(alternative.transcript.encode('utf8'))

if __name__ == '__main__':
    run_quickstart()

Both files are uploaded to Google Cloud.

The first: https://storage.googleapis.com/linear-arena-2109/sample.flac

The second: https://storage.googleapis.com/linear-arena-2109/zoom0070.flac

Both were converted from mp3 using the ffmpeg utility:

ffmpeg -i sample.mp3 -ac 1 sample.flac
ffmpeg -i zoom0070.mp3 -ac 1 zoom0070.flac

The first file is recognized successfully, but the second one produces the following error:

google.gax.errors.RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.)>)

But I am already using the uri parameter in my Python script. What is wrong?

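Update

@NieDzejkob helped me understand the error: the long_running_recognize method should be used instead of recognize. A comprehensive long_running_recognize usage example can be found on the corresponding documentation page.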

score 10 · Accepted Answer

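For any audio file longer than 1 minute, you need to use asynchronous speech recognition, and the file must be uploaded to Google Cloud Storage so that you can pass in a gcs_uri.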

In addition, you will need to use the .long_running_recognize method in your script. An example can be found in the GCP documentation.
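As a rough illustration of that call, here is a minimal sketch. It assumes a recent google-cloud-speech release, where the client class is speech.SpeechClient rather than the older speech.Client used in the question. The helper names (build_request, join_transcripts, transcribe_long_audio) and the three-hour timeout are made up for this example and are not part of the library. The encoding is left out on the assumption that the FLAC header supplies it.

```python
def build_request(gcs_uri, language_code="uk-UA"):
    """Build a LongRunningRecognize request as a plain dict (hypothetical helper)."""
    return {
        # Encoding and sample rate are omitted: for FLAC they are read
        # from the file header, so only the language is set here.
        "config": {"language_code": language_code},
        # Long audio has to be referenced by its Cloud Storage URI.
        "audio": {"uri": gcs_uri},
    }


def join_transcripts(response):
    """Join the top alternative of every result into one string."""
    return " ".join(
        result.alternatives[0].transcript.strip()
        for result in response.results
        if result.alternatives
    )


def transcribe_long_audio(gcs_uri, language_code="uk-UA",
                          timeout=3 * 60 * 60, client=None):
    """Run asynchronous recognition and block until the transcript is ready."""
    if client is None:
        # Imported lazily so the helpers above work without the library.
        from google.cloud import speech  # pip install google-cloud-speech
        client = speech.SpeechClient()
    # Returns a long-running operation instead of an immediate result.
    operation = client.long_running_recognize(
        request=build_request(gcs_uri, language_code)
    )
    # The timeout (in seconds) is an arbitrary choice for this sketch.
    response = operation.result(timeout=timeout)
    return join_transcripts(response)
```

With credentials configured, calling transcribe_long_audio("gs://linear-arena-2109/zoom0070.flac") should block until the operation finishes and return the transcript as a single string, which can then be written to Output.txt with open(..., "w", encoding="utf-8").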

I realize the OP figured it out, but I thought it would be useful to provide an answer and generalize it.
