For an n-by-N matrix with N >> n, I noticed that MATLAB's sum() is not efficient. For example, consider:
N = 10000000;
T = 30;
c=rand(2,N);
tic;for ii=1:T;d=sum(c);end;toc
tic;for ii=1:T;d=c(1,:)+c(2,:);end;toc
> Elapsed time is 1.250268 seconds.
> Elapsed time is 0.567871 seconds.
c=rand(3,N);
tic;for ii=1:T;d=sum(c);end;toc
tic;for ii=1:T;d=c(1,:)+c(2,:)+c(3,:);end;toc
> Elapsed time is 1.514810 seconds.
> Elapsed time is 0.821631 seconds.
c=rand(4,N);
tic;for ii=1:T;d=sum(c);end;toc
tic;for ii=1:T;d=c(1,:)+c(2,:)+c(3,:)+c(4,:);end;toc
> Elapsed time is 1.519009 seconds.
> Elapsed time is 1.069865 seconds.
In all cases the explicit sum takes less time, but the timings suggest that sum will eventually win as n increases further.
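To check where that crossover might lie, the comparison can be swept over several values of n. This is only a rough sketch: the helper addRows, the list ns, and the smaller N are my own assumptions for illustration, and the loop in addRows only approximates the hand-written c(1,:)+c(2,:)+... expressions. It uses timeit instead of tic/toc, and the local function at the end of the script needs R2016b or later.

```matlab
% Sketch: sweep n to look for the crossover between sum() and explicit row addition.
N  = 1e6;                 % assumed: smaller than in the post, to keep the run short
ns = [2 3 4 8 16 32];     % assumed: hypothetical values of n to try

for n = ns
    c = rand(n, N);
    tSum = timeit(@() sum(c, 1));    % built-in sum along the first dimension
    tAdd = timeit(@() addRows(c));   % explicit row-by-row addition
    fprintf('n = %2d: sum %.4f s, explicit %.4f s\n', n, tSum, tAdd);
end

function d = addRows(c)
% Hypothetical helper: add the rows one at a time.
d = c(1, :);
for k = 2:size(c, 1)
    d = d + c(k, :);
end
end
```

The absolute numbers will depend on the machine and MATLAB release, so only the trend across n is meaningful.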
Why is sum not efficient here?
In addition, sum does not seem to benefit from more computational threads. For example:
c=rand(10,N);
maxNumCompThreads(2);
tic;for ii=1:T;d=sum(c);end;toc
maxNumCompThreads(1);
tic;for ii=1:T;d=sum(c);end;toc
> Elapsed time is 2.496837 seconds.
> Elapsed time is 2.450345 seconds.
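I guess this makes sense to some extent, since the parallelization may only kick in when n is large.
If n stays small (say, n<50), is there a way to make this computation benefit from multithreading? Or is there a better strategy?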
Many thanks!