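I want to do speaker recognition with GMM-UBM and Sidekit. The first step is to extract MFCC features from my audio files with FeaturesExtractor. However, when I inspect the generated .h5 files, all of the cepstral coefficients are always zero. I would expect the cepstra to contain non-zero values as well.
I extract the features with the following code: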
import os

import sidekit

audioDir = 'Data'
# list the files in Data and strip the ".wav" extension to get the show names
fileList = os.listdir(audioDir)
for i in range(0, len(fileList)):
    fileList[i] = fileList[i].replace(".wav", "")

# feature extraction configuration (read from fileList and save mfcc features in audio_features folder)
extractor = sidekit.FeaturesExtractor(audio_filename_structure=audioDir+"/{}.wav",
                                      feature_filename_structure="./audio_features/{}.h5",
                                      sampling_frequency=44100,
                                      lower_frequency=0,
                                      higher_frequency=20050,
                                      filter_bank="log",
                                      filter_bank_size=32,
                                      window_size=0.01,
                                      shift=0.005,
                                      ceps_number=12,
                                      pre_emphasis=0.97,
                                      save_param=["energy", "cep"],
                                      keep_all_features=True)

# save in audio_features folder
for i in range(0, len(fileList)):
    a = './audio_features/' + fileList[i] + '.h5'
    # remove any stale feature file so it is written from scratch
    try:
        os.remove(a)
    except OSError:
        pass
    extractor.save(fileList[i])
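One sanity check that might help narrow this down is to confirm that each WAV header's sample rate matches the sampling_frequency passed to the extractor (44100 here) and that the samples themselves are not all zero. The following is only a minimal sketch using the standard library wave module, not part of Sidekit; the helper name describe_wav is hypothetical, and it assumes the files are 16-bit PCM.

```python
import array
import sys
import wave


def describe_wav(path):
    """Return header fields and the peak absolute sample of a 16-bit PCM WAV file.

    Hypothetical helper for illustration only; it is not part of Sidekit.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        frames = wav.readframes(wav.getnframes())

    # Assumption: 16-bit PCM. Other sample widths need different decoding.
    if width != 2:
        raise ValueError("expected 16-bit PCM, got %d bytes per sample" % width)

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian

    # A peak of 0 means the file contains only silence.
    peak = max((abs(s) for s in samples), default=0)
    return {"rate": rate, "channels": channels,
            "n_samples": len(samples), "peak": peak}
```

Calling describe_wav("Data/somefile.wav") (the path is a placeholder) returns a dict. A "rate" other than 44100 or a "peak" of 0 would point to a problem in the input rather than in the extractor settings.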
Thanks for any help.