OK, I figured it out. It can be done like this:
clusterCount = 1024;
datasetTrain = single(rand(128, 100000));
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 1 - cluster train data and get train assignments
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[clusterCenters, trainAssignments_actual] = vl_kmeans(datasetTrain, clusterCount, ...
    'Algorithm', 'ANN', ...
    'Distance', 'l2', ...
    'NumRepetitions', 1, ...
    'NumTrees', 3, ...
    'MaxNumComparisons', ceil(clusterCount / 50) ...
);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 2 - assign train data to cluster centers
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
forest = vl_kdtreebuild(clusterCenters, ...
    'Distance', 'l2', ...
    'NumTrees', 3 ...
);
trainAssignments_expected = vl_kdtreequery(forest, clusterCenters, datasetTrain);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 3 - validate second assignment
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
validation = isequal(trainAssignments_actual, trainAssignments_expected);
In step 2, I build a new kd-tree forest from the cluster centers and then assign the data to the centers again. It gives a valid result.
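The same forest can also be used to assign data that was not part of the clustering. Below is a minimal sketch of that, assuming VLFeat is on the MATLAB path; `datasetTest` is a hypothetical held-out set that does not appear in the code above. Since the `'ANN'` algorithm is approximate and the query in step 2 is exact by default, the two assignment vectors may not match on every point, so the sketch reports the fraction that agree rather than relying on `isequal` alone.

```matlab
% Sketch only: assumes VLFeat is installed and vl_setup has been run.
clusterCount = 1024;
datasetTrain = single(rand(128, 100000));
datasetTest  = single(rand(128, 5000));    % hypothetical held-out data

% Cluster the training data with the approximate (ANN) algorithm.
[clusterCenters, trainAssignments] = vl_kmeans(datasetTrain, clusterCount, ...
    'Algorithm', 'ANN', ...
    'Distance', 'l2', ...
    'NumTrees', 3, ...
    'MaxNumComparisons', ceil(clusterCount / 50));

% Build a kd-tree forest over the cluster centers.
forest = vl_kdtreebuild(clusterCenters, ...
    'Distance', 'l2', ...
    'NumTrees', 3);

% Assign unseen points, using the same comparison budget as during clustering.
testAssignments = vl_kdtreequery(forest, clusterCenters, datasetTest, ...
    'MaxComparisons', ceil(clusterCount / 50));

% Re-assign the training data with an exact query (no comparison limit)
% and measure how often it agrees with the assignments from vl_kmeans.
exactAssignments = vl_kdtreequery(forest, clusterCenters, datasetTrain);
agreement = mean(trainAssignments(:) == exactAssignments(:));
fprintf('Fraction of matching assignments: %.4f\n', agreement);
```

If the agreement is below 1, that does not necessarily mean something is wrong. It may only reflect the approximate search used inside `vl_kmeans`.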