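I want to parallelize a for loop in Octave on a single machine (not a cluster). Some time ago I asked a question about a parallel version of Octave: Parallel computing in Octave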


I also found another question on SO, but it did not give me a good answer on how to parallelize loops in Octave: Running parts of a loop in parallel with Octave?

Does anyone know where I can find an example of running a for loop in parallel in Octave?


3 Answers


I am computing a large number of RGB histograms. I need to use explicit loops to do it, so computing each histogram takes a noticeable amount of time. For this reason, running the computations in parallel makes sense. In Octave there is an (experimental) function, parcellfun, written by Jaroslav Hajek, that can be used to do this.

My original loop:

histograms = zeros(size(files,2), bins^3);
% calculate histogram for each image
for c = 1 : size(files,2)
  I = imread(fullfile(dir, files{c}));
  h = myhistRGB(I, bins);
  histograms(c, :) = h(:); % change to 1D vector
end

To use parcellfun, I need to refactor the body of my loop into a separate function.

function histogram = loadhistogramp(file)
  I = imread(fullfile('.', file));
  h = myhistRGB(I, 8);
  histogram = h(:); % change to 1D vector
end

Then I can call it like this:

histograms = parcellfun(8, @loadhistogramp, files);
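To try the same pattern without any image files, here is a minimal sketch. It assumes parcellfun is provided by the Octave Forge parallel package (the benchmark session below loads it from general instead) and falls back to serial cellfun if the package cannot be loaded. The names toy_hist and inputs are hypothetical stand-ins for loadhistogramp and the list of image files.

```octave
% Hypothetical stand-in for the per-image work: counts of the values 0..3,
% returned as a column vector (like h(:) in loadhistogramp).
toy_hist = @(data) histc(data(:), 0:3);

% Hypothetical inputs; in the answer these would be the image file names.
inputs = {[0 1 1 3], [2 2 2], [0 3 3 1 1]};

try
  pkg load parallel   % assumption: parcellfun is installed via this package
  results = parcellfun(2, toy_hist, inputs, 'UniformOutput', false);
catch
  % Serial fallback so the sketch still runs without the package.
  results = cellfun(toy_hist, inputs, 'UniformOutput', false);
end

% One row per input, one column per bin.
H = [results{:}]';
disp(H);
```

Asking for 'UniformOutput', false returns a cell array, which makes the final layout explicit instead of depending on how parcellfun concatenates vector results.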

I ran a small benchmark on my computer, which has 4 physical cores with Intel Hyper-Threading enabled.

My original code

tic(); histograms2 = loadhistograms('images.txt', 8); toc();
warning: your version of GraphicsMagick limits images to 8 bits per pixel
Elapsed time is 107.515 seconds.

With parcellfun

octave:1> pkg load general; tic(); histograms = loadhistogramsp('images.txt', 8); toc();
parcellfun: 0/178 jobs donewarning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
warning: your version of GraphicsMagick limits images to 8 bits per pixel
parcellfun: 178/178 jobs done
Elapsed time is 29.02 seconds.

The results from the parallel and serial versions were the same (only transposed):

octave:6> sum(sum((histograms'.-histograms2).^2))
ans = 0

When I repeated this several times, the running times stayed pretty much the same. The parallel version ran in around 30 seconds (± approx. 2 s) with 4, 8, and 16 subprocesses alike.
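The two timings above can be turned into a speedup and a per-core efficiency with a few lines of arithmetic. This is only a sketch using the figures reported in this answer. That the speedup stays near the number of physical cores, with no further gain at 8 or 16 subprocesses, is one plausible reading of the numbers, not something measured separately here.

```octave
t_serial       = 107.515;  % seconds, serial run reported above
t_parallel     = 29.02;    % seconds, parcellfun run reported above
physical_cores = 4;        % as stated for the benchmark machine

speedup    = t_serial / t_parallel;      % roughly 3.7
efficiency = speedup / physical_cores;   % roughly 0.93

printf('speedup %.2fx, efficiency %.0f%% of %d cores\n', ...
       speedup, 100 * efficiency, physical_cores);
```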

answered 2013-11-05T20:55:03.297

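Octave loops are slow, slow, slow, and you are far better off expressing things in terms of array operations. Let's take as an example evaluating a simple trig function over a 2D domain, as in this 3D Octave graphics example (but with a more realistic number of points for computation, as opposed to plotting):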


% vectorized.m
tic();
x = -2:0.01:2;
y = -2:0.01:2;
[xx,yy] = meshgrid(x,y);
z = sin(xx.^2-yy.^2);
toc();

Converting it to for loops gives us forloops.m:

tic();
x = -2:0.01:2;
y = -2:0.01:2;
z = zeros(401,401);
for i=1:401
    for j=1:401
        lx = x(i);
        ly = y(j);
        % rows follow y and columns follow x, matching meshgrid's layout
        z(j,i) = sin(lx^2 - ly^2);
    end
end
toc();


$ octave --quiet vectorized.m 
Elapsed time is 0.02057 seconds.

$ octave --quiet forloops.m 
Elapsed time is 2.45772 seconds.

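So if you are using for loops, and you had perfect parallelism with no overhead, you would have to spread the work over about 119 processors just to break even with the non-for-loop version!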
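The break-even figure is simply the ratio of the two timings, 2.45772 / 0.02057 ≈ 119. To measure that ratio on another machine, a sketch along these lines times both versions in one script and checks that they produce the same array. The smaller grid size n is an assumption to keep the loop quick, and the ratio it prints will vary with the Octave version and hardware.

```octave
n = 101;                      % assumption: smaller grid than the 401 points above
x = linspace(-2, 2, n);
y = linspace(-2, 2, n);

% Vectorized version.
tic();
[xx, yy] = meshgrid(x, y);
z_vec = sin(xx.^2 - yy.^2);
t_vec = toc();

% Loop version; rows follow y and columns follow x, as meshgrid lays them out.
tic();
z_loop = zeros(n, n);
for i = 1:n
  for j = 1:n
    z_loop(j, i) = sin(x(i)^2 - y(j)^2);
  end
end
t_loop = toc();

max_diff = max(abs(z_vec(:) - z_loop(:)));
printf('max difference %g, loop/vectorized time ratio %.1f\n', ...
       max_diff, t_loop / t_vec);
```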


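Almost all of Octave's built-in functions are already vectorized, in the sense that they work equally well on scalars or on entire arrays, so it is usually easy to convert things to array operations instead of working element by element. For the times when it is not so easy, you will generally find that utility functions already exist (such as meshgrid, which generates a 2D grid from the Cartesian product of 2 vectors) to help you.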
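As a small illustration of both points, the sketch below applies built-ins to whole arrays and builds the same 2D grid two ways: with meshgrid, and with automatic broadcasting of a row vector against a column vector. Broadcasting is not mentioned in the answer; it is shown here as an alternative and assumes a reasonably recent Octave (3.6 or later). The coarser step of 0.5 is also an assumption, chosen to keep the arrays small.

```octave
x = -2:0.5:2;          % row vector (coarser step than above, for readability)
y = -2:0.5:2;

% Built-ins accept whole arrays: one expression squares every element.
x2 = x.^2;
y2 = y.^2;

% Broadcasting: a row minus a column expands to the full 2D grid,
% so z_b(j, i) = sin(x(i)^2 - y(j)^2) without building xx and yy.
z_b = sin(x2 - y2');

% Same result via meshgrid, for comparison.
[xx, yy] = meshgrid(x, y);
z_m = sin(xx.^2 - yy.^2);

disp(max(abs(z_b(:) - z_m(:))));
```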

answered 2012-05-10T22:25:51.630

Usage examples for pararrayfun can now be found here: http://wiki.octave.org/Parallel_package
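For a quick idea of what such a call looks like, here is a minimal sketch. It is not copied from that page. It assumes the Octave Forge parallel package is installed and falls back to serial arrayfun otherwise. The names fun, vector_x, and vector_y are illustrative.

```octave
fun = @(x) x.^2;       % work applied to each element
vector_x = 1:10;

try
  pkg load parallel    % assumption: the Octave Forge parallel package is installed
  % Spread the elements over as many processes as there are processors.
  vector_y = pararrayfun(nproc(), fun, vector_x);
catch
  vector_y = arrayfun(fun, vector_x);   % serial fallback
end

disp(vector_y);
```

For work this cheap, the cost of starting subprocesses will likely outweigh any gain; the pattern pays off only when each call does a substantial amount of computation.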

answered 2014-08-21T17:35:43.020