3

I am working on a project for which I need to compute a lot of inner products in high dimensions. I am aware that we should always try to vectorize operations in matlab, but I am not sure how to do this ...

Lets say we have two matrices, A and B of size N x d where N is the number of inner products to compute in d dimensions.

It is easy to implement using for-loops, but I suspect more efficient ways exist. A for-loop implementation might look like this:

innerprods=zeros(N,1);
for i=1:N
    innerprods(i)=A(i,:)*B(i,:)';
end

Does anyone have ideas how to vectorize this? I guess bsxfun should come into play at some point, but I cannot figure out how to use it for this ...

Thanks in advance!

4

2 回答 2

6

简单的呢:

sum(A.*B,2)

一个简单的测试(R2013a WinVista 32bit Core duo):

N = 8e3;
A = rand(N);
B = rand(N);

tic
out = zeros(N,1);
for ii = 1:N
    out(ii) = A(ii,:)*B(ii,:)';
end
toc

tic
out2 = sum(A.*B,2);
toc

all(out-out2 < 1e5*eps) % note difference in precision

时代

loop     5.6 sec
multsum  0.8 sec

R2013a Win7 64 Xeon E5 的附加测试

Avg time loop:           2.00906 seconds
Avg time multSum:        0.18114 seconds
Avg time bsxfun:         0.18203 seconds
Avg time reshapeMultSum: 0.18088 seconds

主要要点:

  • 在这种情况下循环是非常低效的(预期的);
  • bsxfun()是完全多余的,尽管开销并不大(预期);
  • 由于 MATLAB 引擎对列操作(即首先沿行)的内部“偏好”,重新整形为一列然后返回矩阵也有望提供好处;
  • 为了语法清晰,我仍然推荐 multSum: sum(A.*B,2)

测试套件(100 次试验的平均时间,固定矩阵大小 8e3,结果等于 1e5*eps):

N = 8e3;
A = rand(N);
B = rand(N);

tic
for r = 1:100
    out = zeros(N,1);
    for ii = 1:N
        out(ii) = A(ii,:)*B(ii,:)';
    end
end
sprintf('Avg time loop: %.5f seconds', toc/100)

tic
for r = 1:100; out2 = sum(A.*B,2);                                  end
sprintf('Avg time multSum: %.5f seconds', toc/100)

tic
for r = 1:100; out3 = sum(reshape(bsxfun(@times,A(:),B(:)),N,N),2); end
sprintf('Avg time bsxfun: %.5f seconds', toc/100)

tic
for r = 1:100; out4 = sum(reshape(A(:).*B(:),N,N),2);               end
sprintf('Avg time reshapeMultSum: %.5f seconds', toc/100)
于 2013-06-06T22:50:38.910 回答
2

你的怀疑bsxfun是完全有道理的!您可以使用以下单线bsxfun和良好的旧冒号运算符有效地计算内积:

innerprods=sum(reshape(bsxfun(@times,A(:),B(:)),N,d),2);

有关此处发生的事情的更详细说明:

    bsxfun(@times,A(:),B(:)) --> element-wise product, returns Nd x 1 vector
    reshape( ^^^ , N, d)      --> reshape into N x d matrix
    sum( ^^^ , 2)             --> sum over columns to get N x 1 vector

编辑:我做了一些时间来给出一些关于性能差异的想法。我使用了 R2010b(不要问...)。该图显示了使用问题中的循环和我建议的单线N=500以及不同数量的维度的一些经验结果。使用循环比使用bsxfun. 由于更好的并行化,较新版本的速度差异可能更大。

在此处输入图像描述

于 2013-06-06T22:35:19.533 回答