
I have a few nested loops and have parallelized the outermost one. apar and mpar are structs whose values are modified in the loop; the function breakLogic is then called, and it produces a struct that I store in a pre-allocated vector of those structs. one, two, ... have been declared earlier in the function.

I have tried adding ordered and critical to ensure correctness, but I am still getting incorrect results.

#pragma omp parallel for ordered private(appFlip, atur, apar, mpar, i, j, k, l, m, n) shared(rawFlip)
for(i=0; i<oneL; i++)
    {
         initialize mpar
         #pragma omp critical
         apar.one = one[i];
         for(j=0; j<twoL; j++)
         {
             apar.two = two[j];
             for(k=0; k<threeL; k++)
             {
                  apar.three = floor(three[k]*apar.two);
                  appFlip = applyParamSin(rawFlip, apar);
                  for(l=0; l< fourL; l++)
                  {
                      mpar.four = four[l];
                      for(m=0; m<fiveL; m++)
                      {
                          mpar.five = five[m];
                          for(n=0; n<sixL; n++)
                          {
                              mpar.six = add[n];
                              atur = breakLogic(appFlip,  mpar, dt);
                              #pragma omp ordered
                              {
                                  sinResVec[itr] = atur;
                                  itr++;
                              }
                          }
                      }
                  }
                  r0(appFlip);
              }
         }
    }

Or is this code not conducive to parallelism? Are there any tools for g++ that can profile code for parallel processing and indicate potential issues?

This modified code works but gives no performance improvement.

1 Answer

Your original code can be parallelized with a few modifications.

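  • Set apar and mpar as firstprivate. They should be thread-local variables and be initialized when entering the parallel for region;

  • Remove all the critical and ordered clauses, including the one in the parallel for directive. They do not work the way you expect;

  • Calculate itr from i, j, k, l, m, n to eliminate the dependency between iterations.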


itr = ((((i*twoL+j)*threeL+k)*fourL+l)*fiveL+m)*sixL+n;
sinResVec[itr] = atur;
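As a minimal sketch of computing the output index from all six loop counters, the following stand-alone function fills a vector in the same order a sequential run would. Here fakeBreakLogic and fillResults are hypothetical names used only for illustration (they stand in for breakLogic and the surrounding function), and the result type is simplified to long.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for breakLogic: it encodes the six loop counters
// in one value so the position of each result can be checked afterwards.
static long fakeBreakLogic(int i, int j, int k, int l, int m, int n) {
    return ((((i * 10L + j) * 10 + k) * 10 + l) * 10 + m) * 10 + n;
}

// Fills the result vector without a shared counter. Each iteration derives
// its own slot from the loop counters, so no ordered or critical is needed.
std::vector<long> fillResults(int oneL, int twoL, int threeL,
                              int fourL, int fiveL, int sixL) {
    const std::size_t total = static_cast<std::size_t>(oneL) * twoL * threeL
                              * fourL * fiveL * sixL;
    std::vector<long> res(total);

    // Counters declared inside the loops are private to each thread.
    #pragma omp parallel for
    for (int i = 0; i < oneL; i++)
        for (int j = 0; j < twoL; j++)
            for (int k = 0; k < threeL; k++)
                for (int l = 0; l < fourL; l++)
                    for (int m = 0; m < fiveL; m++)
                        for (int n = 0; n < sixL; n++) {
                            // Row-major flattening over all six dimensions.
                            std::size_t itr =
                                ((((static_cast<std::size_t>(i) * twoL + j) * threeL + k)
                                  * fourL + l) * fiveL + m) * sixL + n;
                            assert(itr < res.size());
                            res[itr] = fakeBreakLogic(i, j, k, l, m, n);
                        }
    return res;
}
```

Because every iteration writes to a distinct slot, the result should not depend on how the iterations are distributed across threads. Compiled without -fopenmp, the pragma is ignored and the function runs sequentially with the same output.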

Update

For more details on OpenMP, especially the difference between private and firstprivate, see here:

http://msdn.microsoft.com/en-us/library/tt15eb9t.aspx
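As a small illustration of why firstprivate matters here: each thread's copy of a firstprivate variable starts out with the value the original had before the region, whereas a private copy starts uninitialized. The sketch below assumes a hypothetical Params struct and scaleAll function; they are not from the question.

```cpp
#include <vector>

// Hypothetical parameter struct, loosely modeled on apar/mpar.
struct Params {
    int one;
    int two;
};

// Multiplies every input by base.two using a per-thread copy of the struct.
std::vector<int> scaleAll(const std::vector<int>& in, Params base) {
    std::vector<int> out(in.size());
    Params p = base;

    // firstprivate: every thread gets its own p, copy-initialized from the
    // p above. With private(p) instead, p.two would be uninitialized here.
    #pragma omp parallel for firstprivate(p)
    for (int i = 0; i < static_cast<int>(in.size()); i++) {
        p.one = in[i];           // changes only this thread's copy
        out[i] = p.one * p.two;  // p.two still holds the value of base.two
    }
    return out;
}
```

Calling scaleAll({1, 2, 3}, Params{0, 5}) relies on p.two being 5 inside the loop, which is what firstprivate provides.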

Answered 2013-10-24T13:13:52.890