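For my current project, I have to discretize quasi-continuous values into bins defined by a predefined bin resolution. To do this, I wrote a function that I expected to be very efficient, since it uses bsxfun to handle both scalar and vector input. After some profiling, however, I found that nearly all of the processing time of my larger project is spent in this function, and within it mostly in the bsxfun part, followed by the min query. Long story short, I am looking for suggestions on how to solve this task faster in MATLAB. Side note: I usually pass vectors with about 50k elements.
Here is the code: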
function sampleNo = value2sample(value,bins)
%Make sure both vectors have orientations fitting bsxfun
value = value(:);
bins = bins(:)';
%Recover bin resolution (avoids passing another parameter)
delta = median(diff(bins));
%Calculate distance matrix between all combinations
dist = abs(bsxfun(@minus,value,bins));
%What we really want to know is the minimum distance per row
[minval,ind] = min(dist,[],2);
%Make sure we don't accidentally further process NaNs as 1st bin
ind(isnan(minval))=NaN;
sampleNo = ind;
sampleNo(minval>delta) = NaN;
end
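
For comparison, here is a minimal sketch of one direction that might avoid building the full distance matrix. It assumes the bins are sorted in ascending order and evenly spaced, which the use of median(diff(bins)) suggests but the code above does not guarantee. The name value2sampleDirect is hypothetical, and this is only an illustration of the idea, not a verified drop-in replacement: for values exactly halfway between two bins it picks the higher bin, whereas min picks the lower one.

```matlab
function sampleNo = value2sampleDirect(value, bins)
%Hypothetical variant of value2sample.
%ASSUMPTION: bins is sorted ascending and evenly spaced.
value = value(:);
bins = bins(:);
n = numel(bins);
%Recover bin resolution, as in the original function
delta = median(diff(bins));
%Nearest-bin index computed arithmetically, one value at a time
ind = round((value - bins(1)) / delta) + 1;
%Clamp out-of-range indices to the first or last bin
%(max/min ignore NaN, so NaN inputs are handled separately below)
ind = min(max(ind, 1), n);
%Start from all-NaN so that NaN inputs stay NaN
sampleNo = nan(size(value));
valid = ~isnan(value);
idx = ind(valid);
%Distance to the chosen bin, used for the same delta check as before
minval = abs(value(valid) - bins(idx));
idx(minval > delta) = NaN;
sampleNo(valid) = idx;
end
```

This needs memory proportional to numel(value) rather than numel(value) times numel(bins), so with 50k values the intermediate arrays stay small. If the bins are not evenly spaced, the arithmetic index is no longer the nearest bin and this sketch does not apply.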