I am sorry if this has been asked before; I could not find it. It is a simple question: I am trying to use OpenMP so that each thread runs all the statements inside the for loop.

Example: assume there are two CPUs, and therefore two threads.

#pragma omp for schedule(dynamic)
for (int n = 0; n < n_size; ++n) {
    foo1();
    foo2();
}

I want Thread[1] to process foo1() and then foo2() sequentially, Thread[2] to process another iteration, again running foo1() and then foo2(), and so on. I tried using sections right after the for statement, but the program stopped behaving correctly.

Any help would be appreciated.

Cheers, -Rawi

######################################################

Following the comments and the discussion below, here is a simple program:

// put inside main()
int k;
#pragma omp parallel num_threads(2)
{
    #pragma omp for schedule(dynamic) // or using this: schedule(dynamic); I don't know which one is faster
    for (int n = 0; n < 4; ++n) {
        // #pragma omp single
        {
            k = 0;
            foo1(k);
            foo2(k);
        }
    }
}

// main ends here

// foo1 prints k, which is passed by reference, and then increments it; foo2 does the same. So the highest value of k should be 2. Here is what they look like:
void foo1(int &n){
    cout<<"calling foo1"<<" k= "<<n<<" T["<<omp_get_thread_num()<<endl;
    ++n;

}

void foo2(int &n){
    cout<<"calling foo2"<<" k= "<<n<<" T["<<omp_get_thread_num()<<endl;
    ++n;
}

Here is the output:

calling foo1 k= calling foo1 k= 0 T[00 T[1
calling foo2 k= 1 T[0
calling foo1 k= 0 T[0
calling foo2 k= 1 T[0

calling foo2 k= 2 T[1
calling foo1 k= 0 T[1
calling foo2 k= 1 T[1

As we can see, k was 2 for T[1] at foo2, while it should be 1.

Why am I getting this error? foo2 depends on the values computed by foo1 (in my application, actual parameters are passed to the functions).
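A likely explanation is that k is declared before the parallel region, which makes it shared between the threads by default, so one thread can reset or increment it between another thread's calls to foo1 and foo2. The sketch below is a minimal illustration of that reading, not the original program: foo1 and foo2 are simplified stand-ins that only increment, run_workshared is a hypothetical helper name, and k is declared inside the loop body so that every iteration gets its own copy.

```cpp
#include <vector>

// Simplified stand-ins for the question's foo1/foo2 (assumption: they only increment).
void foo1(int &k) { ++k; }
void foo2(int &k) { ++k; }

// Hypothetical helper: runs the work-shared loop and records the final k of each iteration.
std::vector<int> run_workshared(int n_size) {
    std::vector<int> result(n_size, 0);
    #pragma omp parallel for schedule(dynamic) num_threads(2)
    for (int n = 0; n < n_size; ++n) {
        int k = 0;      // declared inside the loop body, so it is private to this iteration
        foo1(k);        // k becomes 1
        foo2(k);        // k becomes 2, unaffected by other threads
        result[n] = k;  // each iteration writes its own element, so there is no race
    }
    return result;
}
```

With k private, every element of the returned vector should be 2 regardless of how the iterations are distributed. Adding private(k) to the pragma would be another way to get a per-thread copy.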

Using '#pragma omp single' helped a bit, but there was a comment saying that it should not be nested inside the loop! Here is the output after using '#pragma omp single':

calling foo1 k= 0 T[0
calling foo2 k= 1 T[0
calling foo1 k= 0 T[1
calling foo2 k= 1 T[1

However, shouldn't there be 4 more lines of output (for the odd values of n)?

1 Answer

Simply do not parallelize the for loop, but still place it inside a parallel region.

#pragma omp parallel
{
  for(int n=0; n<n_size; ++n)  // every thread will run all iterations
  { 
    foo1();
    foo2();
  }
  // threads are not synchronised here! (no implicit barrier)
}
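
To make the effect visible, here is a minimal sketch of the same idea with a per-thread counter. It rests on a few assumptions: foo1 and foo2 are simplified stand-ins that only increment their argument, run_replicated is a hypothetical helper name, and the _OPENMP guard is only there so the code still compiles when OpenMP is disabled.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Simplified stand-ins for the question's foo1/foo2 (assumption: they only increment).
void foo1(int &k) { ++k; }
void foo2(int &k) { ++k; }

// Hypothetical helper: every thread runs ALL n_size iterations.
// Returns, per thread, the sum of the final k values it computed.
std::vector<int> run_replicated(int n_size, int num_threads) {
    std::vector<int> totals(num_threads, 0);
    #pragma omp parallel num_threads(num_threads)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        // No "omp for" here: the loop is not work-shared, so each thread executes it fully.
        for (int n = 0; n < n_size; ++n) {
            int k = 0;          // declared inside the region, so private to this thread
            foo1(k);
            foo2(k);
            totals[tid] += k;   // each thread only touches its own element
        }
    }
    return totals;
}
```

Calling run_replicated(4, 2) should give a total of 8 for every thread that actually ran, since each one performs all four iterations with k ending at 2. The runtime may provide fewer threads than requested, in which case the remaining entries stay 0.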
Answered 2013-10-01T15:52:19.447