python - Google Speech API is faster with a higher sample rate

Translated from: https://stackoverflow.com/questions/42837990 2017-03-16T15:13:20.350

Viewed 561 times

I am using the Google Cloud Speech API Python library to extract text from video files. In a preceding step, each video file is converted to a FLAC audio file.

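import subprocess
from google.cloud import speech

sample_rate = 48000
client = speech.Client()
# Extract a mono FLAC track at the chosen sample rate; mpg_file and flac_file are paths set elsewhere.
# Passing an argument list avoids needing shell=True for the command string.
cmd = ["ffmpeg", "-i", mpg_file, "-vn", "-ac", "1", "-ar", str(sample_rate), flac_file]
subprocess.run(cmd, check=True)
with open(flac_file, 'rb') as f:
    audio = client.sample(f.read(), sample_rate=sample_rate, encoding='FLAC')
audio.sync_recognize()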

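To reduce the time spent in sync_recognize(), I set sample_rate = 16000. My reasoning was that both the communication with the web API and the processing of the audio file should be faster, because the file is smaller, there is less data to process, and the information density is lower.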
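For a rough sense of how much less data is involved, the uncompressed size can be estimated from the sample rate. This is only a sketch: it assumes 16-bit mono PCM and a hypothetical 60-second clip, and the helper name pcm_bytes is made up for illustration. The actual FLAC files will be smaller by a factor that depends on the audio content.

```python
def pcm_bytes(sample_rate_hz, duration_s, bits_per_sample=16, channels=1):
    """Uncompressed PCM size in bytes (assumed 16-bit mono by default)."""
    return sample_rate_hz * duration_s * (bits_per_sample // 8) * channels

# Hypothetical 60-second clip at the two sample rates from the question.
size_16k = pcm_bytes(16000, 60)
size_48k = pcm_bytes(48000, 60)

print(size_16k, size_48k, size_48k / size_16k)
```

Before compression, the 48 kHz file carries three times as many bytes as the 16 kHz one, which is why the smaller file was expected to be faster.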

Repeated runtime measurements at 16 kHz and 48 kHz sample rates, using the same list of files, give:

16kHz: 26.16s per call
48kHz: 17.68s per call
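The question does not show how these timings were taken. One way such a per-call average could be measured is sketched below; mean_seconds_per_call, fake_recognize, and the file names are all hypothetical stand-ins, and in the real setup the timed function would wrap the sync_recognize() call for each FLAC file.

```python
import time

def mean_seconds_per_call(func, inputs):
    """Call func once per input and return the mean wall-clock time per call."""
    inputs = list(inputs)
    start = time.perf_counter()
    for item in inputs:
        func(item)
    return (time.perf_counter() - start) / len(inputs)

calls = []

def fake_recognize(path):
    # Stand-in workload; replace with the real recognition call.
    calls.append(path)
    time.sleep(0.01)

files = ["a.flac", "b.flac", "c.flac"]  # hypothetical file names
avg = mean_seconds_per_call(fake_recognize, files)
print("{:.2f}s per call".format(avg))
```

Running the same file list several times per sample rate and comparing the averages helps separate a real difference from network jitter.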

I expected the opposite result. Is my reasoning wrong? Do you have an explanation for this?


