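I have a database of sentences stored as .wav files. I want to split each sentence into its individual words, so that every word ends up in its own .wav file in an output folder. I have done this with pydub's split_on_silence, using the min_silence_len and silence_thresh parameters, and they work well for a given recording. The problem is that every speaker has their own speaking rhythm (pauses, silences, and so on), so one fixed pair of values does not fit every speaker. I want to split a sentence wherever a word occurs, regardless of the speaker's pause and silence characteristics. The code below segments sentences with static parameters; what I want is dynamic segmentation.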
from pydub import AudioSegment
from pydub.silence import split_on_silence
import os

src_dir = r'/home/user/Pictures/samples/sentences/sent11'
out_dir = '/home/user/Documents/chunks/sent11'

for filename in os.listdir(src_dir):
    if not filename.endswith(".wav"):
        continue
    sound_file = AudioSegment.from_wav(os.path.join(src_dir, filename))
    audio_chunks = split_on_silence(
        sound_file,
        # must be silent for at least 220 ms
        min_silence_len=220,
        # consider it silent if quieter than -30 dBFS
        silence_thresh=-30,
    )
    # output name: <original name without extension>_<chunk index>.wav
    base = os.path.splitext(filename)[0]
    of = os.path.join(out_dir, base + "_{0}.wav")
    for i, chunk in enumerate(audio_chunks):
        out_file = of.format(i)
        print("exporting", out_file)
        chunk.export(out_file, format="wav")
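
To illustrate what I mean by "dynamic", here is a rough sketch of one possible approach, not a tested solution: estimate both parameters from each recording instead of hard-coding them. The helper names (percentile, pick_split_params, split_adaptively) and the constants (10 ms frames, the 10th/90th percentiles, the 0.3 fraction, the 40 ms lower bound) are my own assumptions for illustration. Only AudioSegment slicing, the dBFS property, and split_on_silence come from pydub. It is still silence-based, so words spoken without any pause between them would not be separated.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def pick_split_params(levels_db, frame_ms=10, fraction=0.3, min_pause_ms=40):
    """Derive (silence_thresh, min_silence_len) from per-frame dBFS levels.

    All defaults are hypothetical starting points, not tuned values.
    """
    noise = percentile(levels_db, 10)   # quiet frames approximate the noise floor
    speech = percentile(levels_db, 90)  # loud frames approximate the speech level
    thresh = noise + fraction * (speech - noise)

    # Collect the length (in frames) of every run of below-threshold frames.
    runs, current = [], 0
    for level in levels_db:
        if level < thresh:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)

    # Use half the median pause so shorter-than-typical pauses still split.
    median_ms = statistics.median(runs) * frame_ms if runs else min_pause_ms
    return thresh, max(min_pause_ms, int(median_ms // 2))


def split_adaptively(sound_file, frame_ms=10, floor_db=-90.0):
    """Split a pydub AudioSegment with parameters estimated from the audio."""
    from pydub.silence import split_on_silence  # lazy import; requires pydub

    levels = []
    for start in range(0, len(sound_file), frame_ms):  # len() is in milliseconds
        level = sound_file[start:start + frame_ms].dBFS
        levels.append(max(level, floor_db))  # dBFS is -inf for digital silence
    if not levels:
        return []
    thresh, min_len = pick_split_params(levels, frame_ms)
    return split_on_silence(sound_file, min_silence_len=min_len,
                            silence_thresh=thresh)
```

In the loop above, this would replace the split_on_silence call with audio_chunks = split_adaptively(sound_file), so a quiet speaker with long pauses and a loud speaker with short pauses each get their own threshold and minimum pause length.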