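I want to do speaker recognition with GMM-UBM and Sidekit. The first step is to extract MFCC features from my audio files with Sidekit's FeaturesExtractor. However, when I inspect the generated .h5 files, all cepstral coefficients are always zero. I expected the cepstra to contain nonzero values as well.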

I use the following code to extract the features:

import os
import sidekit

audioDir = 'Data'
# base names (without the .wav extension) of the audio files in audioDir
fileList = [os.path.splitext(f)[0] for f in os.listdir(audioDir) if f.endswith(".wav")]

# feature extraction configuration (read from fileList and save mfcc features in audio_features folder)
extractor = sidekit.FeaturesExtractor(audio_filename_structure=audioDir+"/{}.wav",
                                      feature_filename_structure="./audio_features/{}.h5",
                                      sampling_frequency=44100,     
                                      lower_frequency=0,            
                                      higher_frequency=20050,       
                                      filter_bank="log",            
                                      filter_bank_size=32,          
                                      window_size=0.01,             
                                      shift=0.005,                  
                                      ceps_number=12,               
                                      pre_emphasis=0.97,            
                                      save_param=["energy", "cep"], 
                                      keep_all_features=True)

# save in audio_features folder
for name in fileList:
    featurePath = './audio_features/' + name + '.h5'
    # remove any stale feature file before extracting again
    try:
        os.remove(featurePath)
    except OSError:
        pass
    extractor.save(name)
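
One thing that may be worth ruling out alongside the extractor settings is whether the input files really match `sampling_frequency=44100` and contain non-silent samples. The following is only a minimal sketch using the Python standard library. `describe_wav` is a hypothetical helper and not part of Sidekit, it assumes uncompressed PCM WAV files, and it computes the peak value only for 16-bit samples.

```python
import array
import os
import sys
import wave


def describe_wav(path):
    """Return header info and the peak absolute sample value of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()  # bytes per sample
        frames = wav.readframes(wav.getnframes())

    peak = None  # left as None for sample widths other than 16 bit
    if width == 2:
        samples = array.array("h")
        samples.frombytes(frames)
        if sys.byteorder == "big":
            samples.byteswap()  # WAV sample data is little-endian
        peak = max((abs(s) for s in samples), default=0)
    return {"rate": rate, "channels": channels, "sample_width": width, "peak": peak}


audio_dir = "Data"  # same folder name as in the code above
if os.path.isdir(audio_dir):
    for name in sorted(os.listdir(audio_dir)):
        if name.lower().endswith(".wav"):
            try:
                print(name, describe_wav(os.path.join(audio_dir, name)))
            except wave.Error as err:
                # raised by the wave module for files it cannot parse, e.g. non-PCM encodings
                print(name, "could not be read:", err)
```

A `rate` other than 44100 or a `peak` of 0 would point at the input audio rather than at the extractor configuration.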
    

Thanks for any help.
