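I am currently working on a project in which I have to run many repetitions of a time-consuming MATLAB function in parallel. For the purposes of this question, let's call the function myfunc.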

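myfunc uses a MEX file and ends in a random segmentation violation about once every 3 hours. I cannot diagnose the segmentation fault because it originates in a proprietary API that I did not write myself. However, I know that it occurs in the MEX file, and I also know that it is not deterministically related to any setting that I can change.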

I would like to work around the segmentation violation and, ideally, keep using parfor in MATLAB. My idea right now is to use a try/catch block inside the parfor loop, as follows:

    % create an output cell to store nreps of output from 'myfunc'
    output = cell(1,nreps);

    % create a vector to keep track of which runs finish successfully without the error
    successfulrun = false(1,nreps);

    % run myfunc in parallel
    parfor i = 1:nreps
       try
        output{i} = myfunc();   % arguments to myfunc omitted here
        successfulrun(i) = true;
       end
    end

    % rerun experiments that did not finish successfully
    while sum(successfulrun) < nreps

      % find the experiments to rerun and initialize variables to hold those results
      reps_to_rerun = find(successfulrun == 0);
      nreps_to_rerun = numel(reps_to_rerun);
      newoutput = cell(1,nreps_to_rerun);
      newsuccessfulrun = false(1,nreps_to_rerun);

      % rerun experiments
      parfor i = 1:nreps_to_rerun
         try
          newoutput{i} = myfunc();   % arguments to myfunc omitted here
          newsuccessfulrun(i) = true;
         end
      end

      % transfer the rerun results back into the full result arrays
      for i = 1:nreps_to_rerun

         rerun_index = reps_to_rerun(i);
         successfulrun(rerun_index) = newsuccessfulrun(i);

         if newsuccessfulrun(i)
             output{rerun_index} = newoutput{i};
         end
      end
    end

My questions are:

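  1. Even if a segmentation violation occurs in the MEX file, is it OK to keep running more repetitions like this? Or should I clear memory or restart matlabpool? I assume this should not be a problem, since the segmentation violation happens in C.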

  2. Is there a way to "break" out of a parfor loop?