Combining OpenMP and Array with Fortran 90
I am trying to calculate the pressure tensor of a crystal structure. To do this, I need to loop over all pairs of particles, as in the simplified code below:
do i=1, atom_number                          ! sum over atoms i
   type1 = ATOMS(i)
   do nj=POINT(i), POINT(i+1)-1              ! sum over atoms j in i's neighbour list
      j = LIST(nj)
      type2 = ATOMS(j)
      call get_ff_param(type1,type2,Aab,Bab,Cab,Dab)
      call distance_sqr2(i,j,r,VECT_R)
      call gettensor_rij(i,j,T)
      r = sqrt(r)
      if (r < coub_cutoff) then
         local_sum_real(id+1) = local_sum_real(id+1) + ( erfc(alpha_ewald*r)
         derivative = -( erfc(alpha_ewald*r)
         ff = -(1.0d0/r)*derivative
         STRESS_EWALDD = STRESS_EWALDD + ff * T   ! a 3x3 matrix
         FDX(i) = FDX(i) + VECT_R(1) * ff
         FDY(i) = FDY(i) + VECT_R(2) * ff
         FDZ(i) = FDZ(i) + VECT_R(3) * ff
         FDX(j) = FDX(j) - VECT_R(1) * ff
         FDY(j) = FDY(j) - VECT_R(2) * ff
         FDZ(j) = FDZ(j) - VECT_R(3) * ff
      end if
   end do                                    ! sum over atoms j
   sum_kin = sum_kin + ATMMASS(i) * (VX(i)**2 + VY(i)**2 + VZ(i)**2)
   call gettensor_v(i,Q)
   STRESS_KINETIC = STRESS_KINETIC + ATMMASS(i) * Q
end do
I tried to parallelize this double loop with a "parallel do" directive, but got wrong results for the STRESS_EWALDD tensor and the FDX/FDY/FDZ forces. So I tried to assign the particles to each thread manually, as in the code below, but I still get a wrong tensor value.
!$omp parallel private(id,j,i,nj,type1,type2,T,Q,r,VECT_R,derivative,Aab,Bab,Cab,Dab,ff)
id = omp_get_thread_num()
do i=id+1, atom_number, nthreads             ! sum over atoms i
   type1 = ATOMS(i)
   do nj=POINT(i), POINT(i+1)-1              ! sum over atoms j in i's neighbour list
      j = LIST(nj)
      type2 = ATOMS(j)
      call get_ff_param(type1,type2,Aab,Bab,Cab,Dab)
      call distance_sqr2(i,j,r,VECT_R)
      call gettensor_rij(i,j,T)
      r = sqrt(r)
      if (r < coub_cutoff) then
         local_sum_real(id+1) = local_sum_real(id+1) + ( erfc(alpha_ewald*r)
         derivative = -( erfc(alpha_ewald*r)
         ff = -(1.0d0/r)*derivative
         local_STRESS_EWALDD(id+1,:,:) = local_STRESS_EWALDD(id+1,:,:) + ff * T
         FDX(i) = FDX(i) + VECT_R(1) * ff
         FDY(i) = FDY(i) + VECT_R(2) * ff
         FDZ(i) = FDZ(i) + VECT_R(3) * ff
         FDX(j) = FDX(j) - VECT_R(1) * ff
         FDY(j) = FDY(j) - VECT_R(2) * ff
         FDZ(j) = FDZ(j) - VECT_R(3) * ff
      end if
   end do                                    ! sum over atoms j
   local_sum_kin(id+1) = local_sum_kin(id+1) + ATMMASS(i) * (VX(i)**2 + VY(i)**2 + VZ(i)**2)
   call gettensor_v(i,Q)
   local_STRESS_KINETIC(id+1,:,:) = local_STRESS_KINETIC(id+1,:,:) + ATMMASS(i) * Q
end do                                       ! sum over atoms i
!$omp end parallel
do i=1,nthreads
   sum_real = sum_real + local_sum_real(i)
   sum_virial_buck = sum_virial_buck + local_sum_virial_buck(i)
   STRESS_BUCK = STRESS_BUCK + local_STRESS_BUCK(i,:,:)
   STRESS_EWALDD = STRESS_EWALDD + local_STRESS_EWALDD(i,:,:)
   sum_buck = sum_buck + local_sum_buck(i)
   sum_kin = sum_kin + local_sum_kin(i)
   STRESS_KINETIC = STRESS_KINETIC + local_STRESS_KINETIC(i,:,:)
end do
The scalar values and STRESS_KINETIC come out correct, but STRESS_EWALDD is wrong and I can't figure out why. I haven't checked the forces yet. I would be very grateful for any hint. Thanks,
Éric.
You've taken a somewhat unorthodox approach to using OpenMP.
OpenMP provides the reduction(op:vars) clause, which performs a reduction with the operation op over the thread-private copies of the variables in vars, and you should use it for STRESS_EWALDD, sum_kin, sum_real and STRESS_KINETIC. The force accumulation for the i-th atom works as written, because with a Verlet list indexed through POINT each value of i is handled by exactly one thread (you took the code that builds it from Allen and Tildesley's book, right?), but the same is not true for the j-th atom: different threads can update FDX(j), FDY(j) and FDZ(j) for the same j at the same time. That is why you must perform a reduction on the force arrays too. Don't worry if some outdated OpenMP manual tells you that reduction variables must be scalars: newer versions of OpenMP support reductions over array variables in Fortran 90+.
!$OMP PARALLEL DO PRIVATE(j,nj,type1,type2,T,Q,r,VECT_R,derivative,Aab,Bab,Cab,Dab,ff) &
!$OMP&   REDUCTION(+:FDX,FDY,FDZ,sum_kin,sum_real,STRESS_EWALDD,STRESS_KINETIC) &
!$OMP&   SCHEDULE(DYNAMIC)
do i=1, atom_number                          ! sum over atoms i
   type1 = ATOMS(i)
   do nj=POINT(i), POINT(i+1)-1              ! sum over atoms j in i's neighbour list
      j = LIST(nj)
      type2 = ATOMS(j)
      call get_ff_param(type1,type2,Aab,Bab,Cab,Dab)
      call distance_sqr2(i,j,r,VECT_R)
      call gettensor_rij(i,j,T)
      r = sqrt(r)
      if (r < coub_cutoff) then
         sum_real = sum_real + ( erfc(alpha_ewald*r)
         derivative = -( erfc(alpha_ewald*r)
         ff = -(1.0d0/r)*derivative
         STRESS_EWALDD = STRESS_EWALDD + ff * T   ! a 3x3 matrix
         FDX(i) = FDX(i) + VECT_R(1) * ff
         FDY(i) = FDY(i) + VECT_R(2) * ff
         FDZ(i) = FDZ(i) + VECT_R(3) * ff
         FDX(j) = FDX(j) - VECT_R(1) * ff
         FDY(j) = FDY(j) - VECT_R(2) * ff
         FDZ(j) = FDZ(j) - VECT_R(3) * ff
      end if
   end do                                    ! sum over atoms j
   sum_kin = sum_kin + ATMMASS(i) * (VX(i)**2 + VY(i)**2 + VZ(i)**2)
   call gettensor_v(i,Q)
   STRESS_KINETIC = STRESS_KINETIC + ATMMASS(i) * Q
end do
!$OMP END PARALLEL DO
Using dynamic loop scheduling will reduce the load imbalance that arises when the number of neighbours varies greatly from atom to atom.