Use pdist2 or pdist. Note that Matlab's pdist2 is slow...
Code:
X = rand(100, 3);
K = squareform(pdist(X, 'euclidean'));
K = exp(-K.^2);
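As a hedged variation on the snippet above: pdist and pdist2 (both from the Statistics and Machine Learning Toolbox) accept a 'squaredeuclidean' distance, which avoids taking a square root only to square it again. The variable names Ka and Kb are mine, and I am assuming the toolbox is installed.

```matlab
% Sketch: same kernel as above, using squared distances directly.
X = rand(100, 3);

% pdist returns a condensed vector; squareform expands it to 100x100.
Ka = exp(-squareform(pdist(X, 'squaredeuclidean')));

% pdist2 returns the full matrix directly (and also works for X vs. Y).
Kb = exp(-pdist2(X, X, 'squaredeuclidean'));

maxDiff = max(abs(Ka(:) - Kb(:)));
fprintf('max abs difference between pdist and pdist2: %g\n', maxDiff);
```

Both calls should agree up to round-off; which one is faster depends on the Matlab version and problem size, so it is worth timing on your own data.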
I'll write this for the more general case, where you have two matrices and want all pairwise distances between their rows.
(x-y)^2 = x'x - 2x'y + y'y
To compute the Gram matrix you need this difference for every combination of rows.
X = rand(100, 3);
Y = rand(50, 3);
A = sum(X .* X, 2);
B = -2 *X * Y';
C = sum(Y .* Y, 2);
K = bsxfun(@plus, A, B);
K = bsxfun(@plus, K, C');  % C' is 1x50, so it is added to every row of the 100x50 matrix
K = exp(-K);
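A quick way to convince yourself that the expansion is right is to compare it against the definition on a tiny input. The following is only a sketch: the sizes, the seed, and the names D, Kref, and maxErr are my own choices. It relies on implicit expansion, which needs R2016b or newer (on older versions, keep the bsxfun calls). It also clamps small negative values, because round-off in the expanded form can push a true zero slightly below zero.

```matlab
% Sanity check: expanded form vs. a direct double loop on small matrices.
rng(0);                        % fixed seed so the check is repeatable
X = rand(5, 3);
Y = rand(4, 3);

% Expanded squared distances: (5x1) - (5x4) + (1x4), via implicit expansion.
D = sum(X.^2, 2) - 2 * (X * Y') + sum(Y.^2, 2)';
D = max(D, 0);                 % clamp tiny negatives caused by round-off
K = exp(-D);

% Direct definition, one pair of rows at a time.
Kref = zeros(size(X, 1), size(Y, 1));
for i = 1:size(X, 1)
    for j = 1:size(Y, 1)
        d = X(i, :) - Y(j, :);
        Kref(i, j) = exp(-(d * d'));
    end
end

maxErr = max(abs(K(:) - Kref(:)));
fprintf('max abs difference: %g\n', maxErr);
```

The difference should be at the level of floating-point round-off.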
Edit: Speed comparison
Code
% http://stackoverflow.com/questions/13109826/compute-a-gramm-matrix-in-matlab-without-loops/24407122#24407122
function time_gramm()
% I have a matrix X(10000, 800). I want to compute gramm matrix K(10000,10000), where K(i,j)= exp(-(X(i,:)-X(j,:))^2).
X = rand(100, 800);
%% The straight-forward pdist solution.
tic;
K = squareform(pdist(X, 'euclidean'));
K1 = exp(-K .^2);
t1 = toc;
fprintf('pdist took \t%e seconds\n', t1);
%% The vectorized solution
tic;
A = sum(X .* X, 2);
B = -2 * X * X';
K = bsxfun(@plus, A, B);
K = bsxfun(@plus, K, A');
K2 = exp(-K);
t2 = toc;
fprintf('Vectorized solution took \t%e seconds.\n', t2);
%% The not-so-smart triple-loop solution
tic;
N = size(X, 1);
K3 = zeros(N, N);
for i=1:N
% fprintf('Running outer loop for i= %d\n', i);
for j=1:N
xij = X(i,:) - X(j,:);
xij = norm(xij, 2);
xij = xij ^ 2;
K3(i,j) = -xij;
% d = X(i,:) - X(j,:); % Alternative way, twice as fast but still
% orders of magnitude slower than the other solutions.
% K3(i,j) = exp(-d * d');
end
end
K3 = exp(K3);
t3 = toc;
fprintf('Triple-nested loop took \t%e seconds\n', t3);
%% Assert results are the same...
assert(all(abs(K1(:) - K2(:)) < 1e-6 ));
assert(all(abs(K1(:) - K3(:)) < 1e-6 ));
end
Results
I ran the code above with N=100.
pdist took 8.600000e-03 seconds
Vectorized solution took 3.916000e-03 seconds.
Triple-nested loop took 2.699330e-01 seconds
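Note that the code suggested in the other answer (O(m^2 n)) is two orders of magnitude slower, even at 1/100 of the problem size requested in the question. When I plugged in 100k as the size of the X matrix, it took far longer than I was willing to wait.
Performance on the full-size problem (X = rand(10000, 800)) looks like this: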
pdist took 5.470632e+01 seconds
Vectorized solution took 1.141894e+01 seconds.
Remarks
The vectorized solution took 11 seconds, Matlab's pdist took 55 seconds, and the manual loop solution suggested in the other answer never finished.
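The numbers above come from a single tic/toc run each, which can be noisy. If you want to repeat the comparison, a sketch using Matlab's timeit may give steadier figures. The problem size and the names gramVec and tVec below are hypothetical, chosen only to keep the example quick.

```matlab
% Sketch: time the vectorized kernel with timeit instead of a single tic/toc.
X = rand(200, 50);             % hypothetical small problem size

% Anonymous function wrapping the expansion used in the vectorized solution.
gramVec = @() exp(-(bsxfun(@plus, ...
                    bsxfun(@plus, sum(X .* X, 2), -2 * (X * X')), ...
                    sum(X .* X, 2)')));

tVec = timeit(gramVec);        % median over several runs, in seconds
fprintf('Vectorized solution: %e seconds (timeit)\n', tVec);
```

The same wrapper pattern works for the pdist variant, so the two can be compared on equal footing.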