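I have a database of sentences stored in .wav files. I want to split each sentence into its individual words, so that every word is saved as a separate .wav file in an output folder. I have done this using min_silence_len and silence_thresh, and these parameters work quite well. The problem is that every speaker has their own speaking flow (pauses, silences, and so on), so fixed values do not suit all recordings. I want to segment a sentence wherever there is a word, regardless of the individual speaker's pause and silence characteristics. I used this code to segment the sentences statically, but I want dynamic segmentation.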

from pydub import AudioSegment
from pydub.silence import split_on_silence
import os

src_dir = r'/home/user/Pictures/samples/sentences/sent11'
out_dir = '/home/user/Documents/chunks/sent11'

for filename in os.listdir(src_dir):
    if filename.endswith(".wav"):
        # from_wav accepts a path, so no file handle is left open
        sound_file = AudioSegment.from_wav(os.path.join(src_dir, filename))
        audio_chunks = split_on_silence(sound_file,
            # must be silent for at least 220 ms
            min_silence_len=220,

            # consider it silent if quieter than -30 dBFS
            silence_thresh=-30
        )

        # base name of the input file without the .wav extension
        filename1 = os.path.splitext(filename)[0]
        of = os.path.join(out_dir, filename1 + "_{0}.wav")
        for i, chunk in enumerate(audio_chunks):
            out_file = of.format(i)
            print("exporting", out_file)
            chunk.export(out_file, format="wav")
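
One possible direction for the "dynamic" part is to derive both parameters from each recording instead of hard-coding them. The sketch below is only a heuristic under stated assumptions, not a pydub feature: the function name adaptive_silence_params, the 10th/90th percentile choice, the "half the median pause" rule and the 50 ms floor are all hypothetical and would need tuning on real data. It takes the loudness of consecutive short frames in dBFS and returns a per-file threshold and minimum pause length. It still relies on pauses, so words spoken with no gap between them will not be separated.

```python
import math
import statistics


def percentile(sorted_vals, q):
    """Nearest-rank percentile of an already sorted list (q in 0..100)."""
    idx = round(q / 100 * (len(sorted_vals) - 1))
    return sorted_vals[idx]


def adaptive_silence_params(frame_dbfs, frame_ms=10, min_len_floor=50):
    """Estimate (silence_thresh, min_silence_len) for one recording.

    frame_dbfs: loudness of consecutive frames in dBFS (may contain -inf).
    frame_ms:   duration of each frame in milliseconds.
    """
    finite = sorted(v for v in frame_dbfs if math.isfinite(v))
    if not finite:
        raise ValueError("recording contains only digital silence")

    # Assumption: the quietest frames approximate the noise floor and the
    # loudest ones approximate speech; place the threshold halfway between.
    noise_floor = percentile(finite, 10)
    speech_level = percentile(finite, 90)
    thresh = (noise_floor + speech_level) / 2

    loud = [v > thresh for v in frame_dbfs]
    if True not in loud:
        return thresh, min_len_floor

    # Collect the lengths (ms) of silent runs between the first and the
    # last loud frame, i.e. this speaker's pauses inside the sentence.
    first = loud.index(True)
    last = len(loud) - 1 - loud[::-1].index(True)
    pauses, run = [], 0
    for is_loud in loud[first:last + 1]:
        if is_loud:
            if run:
                pauses.append(run * frame_ms)
            run = 0
        else:
            run += 1

    if not pauses:
        return thresh, min_len_floor

    # Assumption: half the median pause is short enough to catch most
    # pauses of this speaker; never go below the floor.
    min_len = int(max(min_len_floor, statistics.median(pauses) / 2))
    return thresh, min_len
```

With pydub, the frame loudness could be collected as [sound_file[i:i + 10].dBFS for i in range(0, len(sound_file), 10)], and the two returned values passed to split_on_silence as silence_thresh and min_silence_len in place of the fixed -30 and 220.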