python - How to iterate over an audio file in 20-second intervals?

Question

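I am trying to transcribe an audio file roughly 3 minutes long using SpeechRecognition, but it does not seem to transcribe anything beyond the first 20 seconds. Here is the code I am using: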

import speech_recognition as sr
from mutagen.flac import FLAC

r = sr.Recognizer()

audio = FLAC(output_name + '.' + output_format)
audio_length = audio.info.length

file = sr.AudioFile(output_name + '.' + output_format)

with file as source:
    audio = r.record(source, duration=20)

google = r.recognize_google(audio, language='ru-RU')
print(google)

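How can I loop this so that it transcribes 0s - 20s, then 20s - 40s, and so on until the end of the audio file?

I would like to avoid splitting the file into separate 20-second files if possible.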

score 2 · Accepted Answer

So I figured it out. I had not read the SpeechRecognition module's documentation carefully enough: the record method has an offset parameter!

import math

import speech_recognition as sr
from mutagen.flac import FLAC  # n.b. mutagen is used only to get the audio length

# audio_files, audio_list and output_format are defined earlier in the script (not shown)
r = sr.Recognizer()

for count, audio_path in enumerate(audio_files):
    filename = audio_list[count] + '.' + output_format

    # Length of the audio file in seconds
    audio_length = FLAC(filename).info.length

    # Round up so the final, shorter chunk is not dropped; always run at least once
    number_of_iterations = max(1, math.ceil(audio_length / 20))

    file = sr.AudioFile(filename)

    for i in range(number_of_iterations):
        # Re-entering the context manager reopens the file,
        # so offset is always measured from the start of the audio
        with file as source:
            audio = r.record(source, offset=i * 20, duration=20)

        google = r.recognize_google(audio, language='ru-RU')
        print(google)
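
The chunk boundaries used by the loop above can be checked in isolation. The following is a minimal sketch and not part of the original answer: `chunk_offsets` is a hypothetical helper name, and it assumes the audio length in seconds is already known (for example from mutagen's `info.length`). It rounds the chunk count up so that a trailing partial chunk is included.

```python
import math


def chunk_offsets(audio_length, chunk_seconds=20):
    """Return (offset, duration) pairs, in seconds, that cover audio_length.

    Hypothetical helper for illustration; not part of SpeechRecognition.
    """
    if audio_length <= 0:
        return []
    # Round up so the last, shorter chunk is not skipped
    n_chunks = math.ceil(audio_length / chunk_seconds)
    return [
        (i * chunk_seconds, min(chunk_seconds, audio_length - i * chunk_seconds))
        for i in range(n_chunks)
    ]


print(chunk_offsets(65))  # [(0, 20), (20, 20), (40, 20), (60, 5)]
```

Each pair could then be passed to the recognizer as `r.record(source, offset=offset, duration=duration)`.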
