
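I am working on a function with three nested for loops that is too slow for its intended use. The bottleneck is clearly the looping part: almost 100% of the execution time is spent in the innermost loop.
The function takes a 2D matrix called rM as input and returns a 3D matrix called ec: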

rows = size(rM, 1);
cols = size(rM, 2);

%preallocate.
ec = zeros(rows+1, cols, numRiskLevels);
ec(1, :, :) = 100;

for risk = minRisk:stepRisk:maxRisk;
    for c = 1:cols,
        for r = 2:rows+1,
            ec(r, c, risk) = ec(r-1, c, risk) * (1 + risk * rM(r-1, c));
        end
    end
end

Any help with speeding up the for loops would be greatly appreciated...


2 Answers


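The problem is that the inner loop is the slowest one, and it is also almost impossible to vectorize, because each iteration depends directly on the previous one.

The outer two loops can be vectorized: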

clc;
rM = rand(50);

rows = size(rM, 1);
cols = size(rM, 2);

minRisk = 1;
stepRisk = 1;
maxRisk = 100;
numRiskLevels = maxRisk/stepRisk;

%preallocate.
ec = zeros(rows+1, cols, numRiskLevels);
ec(1, :, :) = 100;

riskArray = (minRisk:stepRisk:maxRisk)';
tic
for r = 2:rows+1
    tmp = riskArray * rM(r-1, :);
    tmp = permute(tmp, [3 2 1]);
    ec(r, :, :) = ec(r-1, :, :) .* (1 + tmp);
end
toc


%preallocate.
ec2 = zeros(rows+1, cols, numRiskLevels);
ec2(1, :, :) = 100;
tic
for risk = minRisk:stepRisk:maxRisk;
    for c = 1:cols
        for r = 2:rows+1
            ec2(r, c, risk) = ec2(r-1, c, risk) * (1 + risk * rM(r-1, c));
        end
    end
end
toc

all(all(all(ec == ec2)))

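To my surprise, however, the vectorized code is actually slower. (Maybe someone can improve on the code, though, so I am leaving it here for you.)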
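One possible improvement, not part of the original answer: the recurrence only multiplies the previous row by a factor, so each column is a running product and cumprod might replace the row loop as well. The sketch below assumes the risk levels are 1, 2, ..., 100 as in the test script above. The names factors, ecCum, ecLoop and maxRelErr are made up for illustration. It uses bsxfun so that it should also run on older MATLAB releases, and it has not been timed here.

```matlab
% Sketch: the recurrence is a running product, so cumprod can replace the row loop.
rM = rand(50);                       % example input, as in the script above
riskArray = 1:100;                   % assumed risk levels: 1, 2, ..., 100
[rows, cols] = size(rM);
numRiskLevels = numel(riskArray);

% factors(r, c, k) = 1 + riskArray(k) * rM(r, c)
factors = 1 + bsxfun(@times, rM, reshape(riskArray, 1, 1, []));

% Prepend the starting value 100 and take the cumulative product down the rows.
ecCum = cumprod(cat(1, 100 * ones(1, cols, numRiskLevels), factors), 1);

% Reference: the original triple loop, kept only for comparison.
ecLoop = zeros(rows+1, cols, numRiskLevels);
ecLoop(1, :, :) = 100;
for k = 1:numRiskLevels
    for c = 1:cols
        for r = 2:rows+1
            ecLoop(r, c, k) = ecLoop(r-1, c, k) * (1 + riskArray(k) * rM(r-1, c));
        end
    end
end

% Compare with a relative tolerance rather than ==, since rounding may differ.
maxRelErr = max(abs(ecCum(:) - ecLoop(:)) ./ abs(ecLoop(:)));
```

Whether this is actually faster will depend on the input sizes, so it is worth timing against the loop versions on real data.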

answered 2013-08-17T12:24:16.267

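I just tried vectorizing the outer loop and actually noticed a significant speed-up. Of course, it is hard to judge the speed of a script without knowing the (size of the) inputs, but I would say this is a good starting point: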

% Here you can change the input parameters
riskVec = 1:3:120;
rM = rand(50);


%preallocate and calculate non vectorized solution
%(rows come from size(rM,1), columns from size(rM,2), so non-square rM works too)
ec2 = zeros(size(rM,1)+1, size(rM,2), max(riskVec));
ec2(1, :, :) = 100;
tic
for risk = riskVec
    for c = 1:size(rM,2)
        for r = 2:size(rM,1)+1
            ec2(r, c, risk) = ec2(r-1, c, risk) * (1 + risk * rM(r-1, c));
        end
    end
end
t1=toc;

%preallocate and calculate vectorized solution
%(rows come from size(rM,1), columns from size(rM,2), so non-square rM works too)
ec = zeros(size(rM,1)+1, size(rM,2), max(riskVec));
ec(1, :, :) = 100;
tic
for c = 1:size(rM,2)
    for r = 2:size(rM,1)+1
        ec(r, c, riskVec) = ec(r-1, c, riskVec) .* reshape(1 + riskVec * rM(r-1, c),[1 1 length(riskVec)]);
    end
end
t2=toc;

% Check whether the vectorization is done correctly and show the timing results
if ec(:) == ec2(:)
    t1
    t2
end

The output is:

 t1 =

    0.1288


t2 =

    0.0408

So for this riskVec and rM, it is about 3 times as fast as the non-vectorized solution.
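A side note, not part of the original answer: the script above uses the risk value itself as the index into the third dimension, so with riskVec = 1:3:120 it allocates max(riskVec) = 118 pages but fills only 40 of them. A minimal sketch of indexing by position instead, assuming the caller can map page k back to riskVec(k); the names ecCompact, nRisk and riskPage are hypothetical:

```matlab
% Sketch: index the third dimension by position in riskVec, not by risk value.
riskVec = 1:3:120;                        % same example risk levels as above
rM = rand(50);
[rows, cols] = size(rM);
nRisk = numel(riskVec);

ecCompact = zeros(rows+1, cols, nRisk);   % one page per risk level actually used
ecCompact(1, :, :) = 100;
riskPage = reshape(riskVec, 1, 1, nRisk); % risk levels laid out along dimension 3
for c = 1:cols
    for r = 2:rows+1
        % page k holds the result for riskVec(k)
        ecCompact(r, c, :) = ecCompact(r-1, c, :) .* (1 + riskPage .* rM(r-1, c));
    end
end
```

This also works for risk levels that are not positive integers, which cannot be used directly as array indices.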

answered 2013-08-20T09:26:47.043