
I tried to transcribe the audio of this clip with a script that uses the SpeechRecognition package with the Bing ASR service:

#!/usr/bin/env python3

"""Recognize speech using Microsoft Bing Voice Recognition."""

import speech_recognition as sr

from os import path
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "input.wav")

# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # read the entire audio file


# Microsoft Bing Voice Recognition API uses keys which are
# 32-character lowercase hexadecimal strings
BING_KEY = "FOOBAR - insert your key here"
try:
    print("Microsoft Bing Voice Recognition thinks you said:\n\n" +
          r.recognize_bing(audio, key=BING_KEY, language="de-DE"))
except sr.UnknownValueError:
    print("Microsoft Bing Voice Recognition could not understand audio")
except sr.RequestError as e:
    print(("Could not request results from Microsoft Bing Voice Recognition "
           "service; {0}").format(e))

It outputs:

Microsoft Bing Voice Recognition thinks you said:

Reaser Was ist haben sie Lust mit dem Kino zu kommen war schon dass ich könnte den Film gar nicht folgen

Clearly it works, but it does not transcribe the complete file. Why? How can I get it to transcribe the complete file?


1 Answer


The problem is that the SpeechRecognition package uses the REST interface rather than the WebSocket interface. The REST interface is limited to 15 seconds of audio.

Source: https://docs.microsoft.com/de-de/azure/cognitive-services/speech/home
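
One possible workaround, sketched here under assumptions rather than taken from the package documentation for this use case, is to stay on the REST interface and send the file in pieces that each fit under the 15-second limit. The helper names chunk_offsets and transcribe_in_chunks and the 10-second chunk length are hypothetical choices for illustration. The sketch relies on the offset and duration parameters of Recognizer.record and on the DURATION attribute that AudioFile exposes inside its with block.

```python
def chunk_offsets(total_seconds, chunk_seconds=10.0):
    """Return (offset, duration) pairs, in seconds, that cover the whole file."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    offsets = []
    start = 0.0
    while start < total_seconds:
        # The last chunk is shortened so it does not run past the end.
        offsets.append((start, min(chunk_seconds, total_seconds - start)))
        start += chunk_seconds
    return offsets


def transcribe_in_chunks(wav_path, key, language="de-DE", chunk_seconds=10.0):
    """Transcribe a WAV file piece by piece (hypothetical helper).

    chunk_seconds is an assumed value chosen to stay below the 15-second
    REST limit mentioned above.
    """
    # Imported here so chunk_offsets can be used without the package installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        total = source.DURATION  # length of the file in seconds

    pieces = []
    for offset, duration in chunk_offsets(total, chunk_seconds):
        # Reopen the file for each chunk so offset counts from the start.
        with sr.AudioFile(wav_path) as source:
            audio = r.record(source, offset=offset, duration=duration)
        try:
            pieces.append(r.recognize_bing(audio, key=key, language=language))
        except sr.UnknownValueError:
            # Nothing intelligible in this chunk (for example, silence).
            continue
    return " ".join(pieces)
```

Cutting at fixed boundaries can split a word across two chunks, so the joined text may contain errors at the seams. Splitting at pauses instead would likely give cleaner results, but that is not shown here.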

answered 2017-10-08T11:18:40.873