OpenMP: understanding deadlock in the critical construct

I am trying to understand why a deadlock occurs when a critical construct is nested inside another critical construct in a parallel region.

I have consulted the following resources. In this source, the author writes:

In OpenMP, this can happen if a function that contains a critical region is called from inside another critical region. In this case, the critical region of the called function will wait for the first critical region to complete, which will never happen.

Okay, but why not? Also, Hager, Georg and Gerhard Wellein. Introduction to High Performance Computing for Scientists and Engineers. CRC Press, 2010, p. 149:

When a thread encounters a CRITICAL directive inside a critical region, it will block forever.

The same question, why?

Finally, Chapman, Barbara, Gabriele Jost and Ruud van der Pas. Using OpenMP: Portable Shared Memory Parallel Programming. Vol. 10. MIT Press, 2008 also gives an example using locks, but not with the critical construct.

From my current understanding, there are two different possible explanations for why a deadlock occurs in a nested critical region:

First explanation:

If two threads arrive at a nested critical construct (one critical region inside another), thread one enters the "outer" critical region and thread two waits. Quoting Chapman et al.:

When a thread encounters a critical construct, it waits until no other thread is executing a critical region of the same name.

OK, so far so good. Now thread one does NOT enter the nested critical region, because it is a synchronization point at which threads wait for all other threads to arrive before continuing. And since thread two is waiting for thread one to exit the "outer" critical region, the two are deadlocked.

End of the first explanation.

Second explanation:

Both threads arrive at the "outer" critical construct. Thread one enters the "outer" critical construct, thread two waits. Now thread one ENTERS the "inner" critical construct and stops at its implied barrier, because it is waiting for thread two. Thread two, on the other hand, is waiting for thread one to exit the "outer" critical construct, and therefore both wait forever.

End of the second explanation.

Here is a small Fortran program that produces the deadlock:

subroutine foo

  ! This critical construct is unnamed, just like the one in the caller.
  !$OMP PARALLEL
  !$OMP CRITICAL
    print*, 'Hallo i am just a single thread and I like it that way'
  !$OMP END CRITICAL
  !$OMP END PARALLEL

end subroutine foo

program deadlock
  implicit none
  integer :: i,sum = 0

  !$OMP PARALLEL
  !$OMP DO
  do i = 1, 100
  !$OMP CRITICAL
     sum = sum + i
     call foo()   ! reaches a second unnamed critical construct while still inside the first
  !$OMP END CRITICAL
  enddo
  !$OMP END DO
  !$OMP END PARALLEL

  print*, sum
end program deadlock

      

So my question is: is one of these two explanations correct, or is there another reason why a deadlock occurs in this situation?



1 answer


There is no implied barrier, that is, no "synchronization point at which threads wait for other threads to arrive", associated with CRITICAL constructs. Instead, at the start of a critical construct, a thread waits for any thread already inside a critical construct of the same name to leave that construct.

Critical constructs with the same name cannot be nested, because the current OpenMP rules say they cannot (see the restrictions on nesting in section 2.16 of OpenMP 4.0). Note that all unnamed critical constructs count as having the same (unspecified) name. That is really the answer to your question and the end of the discussion: if you break that prohibition, anything can happen.
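As a minimal sketch of code that stays within this rule, the example from the question can be rearranged so that the two critical constructs have different names. The names `work_mod`, `calls`, `foo_count` and `accumulate` are made up for this illustration, and `foo` here only increments a counter instead of opening its own parallel region. It assumes an OpenMP-enabled compiler (for example `gfortran -fopenmp`).

```fortran
module work_mod
  implicit none
  integer :: calls = 0          ! hypothetical counter, shared by all threads
contains
  subroutine foo()
    ! The inner construct has its own name, so it is not a same-name nesting.
    !$OMP CRITICAL (foo_count)
    calls = calls + 1
    !$OMP END CRITICAL (foo_count)
  end subroutine foo
end module work_mod

program no_deadlock
  use work_mod
  implicit none
  integer :: i, total

  total = 0
  !$OMP PARALLEL DO
  do i = 1, 100
    !$OMP CRITICAL (accumulate)
    total = total + i
    call foo()                  ! enters (foo_count) while holding (accumulate)
    !$OMP END CRITICAL (accumulate)
  end do
  !$OMP END PARALLEL DO

  print *, total, calls         ! expected values: 5050 and 100
end program no_deadlock
```

Because every thread takes `accumulate` first and `foo_count` second, the nesting order is the same everywhere and no thread can end up holding one name while waiting for the other in the opposite order.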

In practice, this prohibition allows implementations to assume that critical constructs with the same name will not be nested. One common implementation choice is then that a thread encountering a critical construct waits for all threads, including itself, to leave the construct. A thread that is waiting cannot be leaving. The result is a deadlock.
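If the same thread really has to re-enter a protected region, the OpenMP runtime library offers nestable locks, which the owning thread may acquire repeatedly. This is a different mechanism from CRITICAL, shown here only as a hedged sketch: the program name, `lck`, `total` and the internal subroutine `bump` are invented for the illustration, and it needs an OpenMP-enabled compiler because of `use omp_lib`.

```fortran
program nest_lock_sketch
  use omp_lib
  implicit none
  integer(kind=omp_nest_lock_kind) :: lck
  integer :: i, total

  total = 0
  call omp_init_nest_lock(lck)

  !$OMP PARALLEL DO
  do i = 1, 100
    call omp_set_nest_lock(lck)    ! outer acquisition
    total = total + i
    call bump()                    ! acquires lck again on the same thread
    call omp_unset_nest_lock(lck)  ! lock is free only after the matching unset
  end do
  !$OMP END PARALLEL DO

  call omp_destroy_nest_lock(lck)
  print *, total                   ! expected value: 5150

contains

  subroutine bump()
    ! The thread that already owns lck may set it again; other threads wait.
    call omp_set_nest_lock(lck)
    total = total + 1
    call omp_unset_nest_lock(lck)
  end subroutine bump

end program nest_lock_sketch
```

With a simple lock (`omp_set_lock`) in place of the nestable one, the second acquisition by the owning thread would not be allowed, which mirrors the situation with same-name critical constructs.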



Critical constructs with different names can be nested. In this case, a deadlock is possible if the nesting is not consistent. Consider:

!$OMP PARALLEL

!$OMP CRITICAL (A)
!$OMP CRITICAL (B)      ! Thread one waiting here.
!...
!$OMP END CRITICAL (B)
!$OMP END CRITICAL (A)

!$OMP CRITICAL (B)
!$OMP CRITICAL (A)      ! Thread two waiting here.
!...
!$OMP END CRITICAL (A)
!$OMP END CRITICAL (B)

!$OMP END PARALLEL

      

If this situation arises, each thread holds the construct the other one needs, and both will wait forever.
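A minimal sketch of consistent nesting, assuming both places really need both named constructs: each one takes (A) first and (B) second, so the circular wait described above cannot form. The program name and the counters `count_a` and `count_b` are placeholders for this illustration.

```fortran
program consistent_order
  implicit none
  integer :: count_a, count_b

  count_a = 0
  count_b = 0

  !$OMP PARALLEL

  ! First place that needs both: A, then B.
  !$OMP CRITICAL (A)
  !$OMP CRITICAL (B)
  count_a = count_a + 1
  count_b = count_b + 1
  !$OMP END CRITICAL (B)
  !$OMP END CRITICAL (A)

  ! Second place that needs both: again A, then B (not B, then A).
  !$OMP CRITICAL (A)
  !$OMP CRITICAL (B)
  count_b = count_b + 1
  count_a = count_a + 1
  !$OMP END CRITICAL (B)
  !$OMP END CRITICAL (A)

  !$OMP END PARALLEL

  ! Both counters end up equal to twice the number of threads.
  print *, count_a, count_b
end program consistent_order
```

A thread waiting for (B) always already holds (A), and the thread inside (B) never needs to acquire (A) again, so some thread can always make progress.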
