OpenMP on Intel i7

I have a problem with OpenMP on an i7 CPU.

I only use OpenMP for parallel 'for' loops. The algorithm has run on several different PCs without problems. We recently tried it on an i7 system and ran into trouble: the software takes a while to start, and after a few cycles it reports "not enough memory". We looked for a memory leak, but instead found that the memory used for thread stacks was far too large: there were many threads, each with a 1 MB stack, that were never closed. Somehow the threads created by OpenMP were all left alive, and memory filled up with their stacks.

Has anyone experienced this behavior? The code is very simple, just '#pragma omp parallel for' on a loop that works well on other PCs.

I am using the Microsoft Visual C++ 9.0 compiler with its built-in OpenMP library.

Thanks, Sergei



4 answers


Thanks for the answers. I realized that when OpenMP starts a parallel loop, it creates several threads that are not terminated at the end of the loop but are reused by the next parallel loop. On the i7 they are not reused; new threads are created for every parallel loop, so memory grows steadily in 1 MB steps.



I also tried writing a very simple application that just uses OpenMP to parallelize multiple loops, and I didn't notice any problem with it on the i7. It looks like some condition in the main software triggers this concurrency problem. Trying to find more ...
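A minimal sketch of what such a test application might look like, for anyone who wants to reproduce the check. The function names `scale_by_two` and `run_many_parallel_loops` are hypothetical and not taken from the original software; the idea is only to enter many parallel regions back to back.

```c
/* Hypothetical helper: doubles every element using one parallel loop. */
void scale_by_two(double *data, int n)
{
    int i;
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        data[i] = data[i] * 2.0;
    }
}

/* Enters the parallel loop `rounds` times in a row. If the runtime reuses
   its worker threads, the process's thread count should stay flat. */
void run_many_parallel_loops(double *data, int n, int rounds)
{
    int r;
    for (r = 0; r < rounds; r++) {
        scale_by_two(data, n);
    }
}
```

Calling `run_many_parallel_loops` from `main` with a large `rounds` value while watching the process's thread count and memory in a tool such as Task Manager should show whether threads pile up. Without OpenMP enabled in the compiler, the pragma is ignored and the loop simply runs serially.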



You could try the Intel Threading Building Blocks (TBB) library, which is very similar to OpenMP and makes it quite easy to parallelize a for loop like the one you describe, to see whether it makes a difference.





This looks like an OS issue rather than an application issue, assuming the compiler generates exactly the same assembly for the same code. If you have an older hyper-threading processor, try your code on it and see whether the same problem appears there.



Since I can't see the code, I'll try to guess ...

To me it looks like a nested loop problem when using #pragma omp for.

If you have nested loops, you need to declare the inner loop counter variables as private.

Take a look at this sample:

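int i, j;   /* j is declared outside the loop, so it is shared unless listed as private */

#pragma omp for private(j)   /* i, the parallel loop counter, is private automatically */
for (i = 0; i < 100; i++)
{
    for (j = 0; j < 10; j++)
    {
        A[i] = A[i] * 2;
    }
}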

      

The variable j is made private so that each thread has its own copy of it, rather than all threads sharing a single instance.
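An alternative that avoids the private clause altogether is to declare the inner counter inside the body of the parallel loop, since variables declared there are local to each thread. A rough sketch, where the function name `scale_nested` and the `double` element type are assumptions made for illustration:

```c
/* Hypothetical helper: doubles each element ten times (a factor of 1024). */
void scale_nested(double *A, int n)
{
    int i;
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        int j;  /* declared inside the loop body, so every thread has its own j */
        for (j = 0; j < 10; j++) {
            A[i] = A[i] * 2;
        }
    }
}
```

Either form should give the same result; the choice is mostly a matter of style and of whether j is needed after the loop.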

Check it out in your code, maybe that's the problem.

And (your compiler should tell you this) don't use break; in your parallelized loops. It won't work.
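If a loop relied on break to stop at the first match, one possible restructuring is to let every iteration run and record the smallest matching index under a critical section. This is only a sketch; `find_first` is a hypothetical name, and it trades the early exit for correctness.

```c
/* Hypothetical helper: returns the smallest index i with data[i] == target,
   or -1 if there is none. All iterations run; no thread leaves the loop early. */
int find_first(const int *data, int n, int target)
{
    int first = n;  /* n means "no match found yet" */
    int i;
#pragma omp parallel for shared(first)
    for (i = 0; i < n; i++) {
        if (data[i] == target) {
#pragma omp critical
            {
                /* Only one thread at a time may compare and update first. */
                if (i < first) {
                    first = i;
                }
            }
        }
    }
    return (first < n) ? first : -1;
}
```

For example, searching the array {5, 7, 7, 9} for 7 returns index 1, and searching it for 4 returns -1.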

Good luck!
