I am currently doing speaker diarization in Python and use pyannote to compute the embeddings. My embedding function looks like this:
import numpy as np
import torch
import librosa
from pyannote.core import Segment

def embeddings_(audio_path, resegmented, range):
    # Load the pretrained pyannote embedding model from torch.hub
    model_emb = torch.hub.load('pyannote/pyannote-audio', 'emb')
    # Sliding-window embeddings over the whole file
    embedding = model_emb({'audio': audio_path})
    for window, emb in embedding:
        assert isinstance(window, Segment)
        assert isinstance(emb, np.ndarray)

    # File description passed to crop(): path plus duration in seconds
    y, sr = librosa.load(audio_path)
    myDict = {}
    myDict['audio'] = audio_path
    myDict['duration'] = len(y) / sr

    data = []
    for i in resegmented:
        # Window that starts at the segment start and lasts `range` seconds
        excerpt = Segment(start=i[0], end=i[0] + range)
        emb = model_emb.crop(myDict, excerpt)
        data.append(emb.T)
    data = np.asarray(data)
    return data.reshape(len(data), 512)
When I run
embeddings = embeddings_(audiofile,resegmented,2)
I get this error:
ParameterError: Mono data must have shape (samples,). Received shape=(1, 87488721)
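The message suggests that some array reaches librosa with a leading channel axis, shape (1, samples), where a flat (samples,) array is expected. Which call in the function above produces that array is not clear from the single error line shown. The snippet below is only a minimal sketch of the shape difference, using NumPy alone and a short dummy signal. The helper name `to_mono_1d` is hypothetical and is not part of pyannote or librosa, and the assumption that the offending array is a mono waveform with a singleton channel axis is a guess based on the reported shape.

```python
import numpy as np


def to_mono_1d(waveform):
    """Return a 1-D (samples,) array from a mono waveform.

    Hypothetical helper for illustration; not part of pyannote or librosa.
    """
    waveform = np.asarray(waveform)
    if waveform.ndim == 1:
        # Already in the (samples,) layout
        return waveform
    if waveform.ndim == 2 and 1 in waveform.shape:
        # Drop the singleton channel axis: (1, N) or (N, 1) -> (N,)
        return waveform.reshape(-1)
    raise ValueError(f"expected mono audio, got shape {waveform.shape}")


# A short stand-in for the real signal, using the same (1, samples)
# layout as the shape reported in the error message.
y = np.zeros((1, 16000), dtype=np.float32)
print(y.shape)              # (1, 16000)
print(to_mono_1d(y).shape)  # (16000,)
```

If the assumption holds, the fix would be to flatten the array at the point where it is created, before it is handed to librosa.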