Optimizing Fortran code with the Intel VTune analyzer

I am working on a Fortran project that simulates vegetation dynamics. The code is slow, so I am always looking for ways to optimize it. I've read that there is a rule of thumb saying that typically 90% of the time is spent in 10% of the code. To find these bottlenecks, I started using the Intel VTune Performance Analyzer. The analysis of a simulation shows that a large amount of time is spent in certain parts of the code (Picture 1). Figure 2 shows the most expensive part, leaftw_derivs.

Below is the code mentioned in the analysis.

   !---- Update soil moisture and energy from transpiration/root uptake. ------------------!
   if (rk4aux(ibuff)%any_resolvable) then
      do k1 = klsl, mzg    ! loop over extracted water
         do k2=k1,mzg
            if (rk4site%ntext_soil(k2) /= 13) then
               !---------------------------------------------------------------------------!
               !     Transpiration happens only when there is some water left down to this !
               ! layer.                                                                    !
               !---------------------------------------------------------------------------!
               if (rk4aux(ibuff)%avail_h2o_int(k1) > 0.d0) then
                  !------------------------------------------------------------------------!
                  !    Find the contribution of layer k2 for the transpiration from        !
                  ! cohorts that reach layer k1.                                           !
                  !------------------------------------------------------------------------!
                  ext_weight = rk4aux(ibuff)%avail_h2o_lyr(k2) / rk4aux(ibuff)%avail_h2o_int(k1)

                  !------------------------------------------------------------------------!
                  wloss_tot      = 0.d0
                  qloss_tot      = 0.d0
                  wvlmeloss_tot  = 0.d0
                  qvlmeloss_tot  = 0.d0

                  do ico=1,cpatch%ncohorts
                     !----- Find the loss from this cohort. -------------------------------!
                     wloss         = rk4aux(ibuff)%extracted_water(ico,k1) * ext_weight
                     qloss         = wloss * tl2uint8(initp%soil_tempk(k2),1.d0)
                     wvlmeloss     = wloss * wdnsi8 * dslzi8(k2)
                     qvlmeloss     = qloss * dslzi8(k2)
                     !---------------------------------------------------------------------!


                     !---------------------------------------------------------------------!
                     !      Add the internal energy to the cohort.  This energy will be    !
                     ! eventually lost to the canopy air space because of transpiration,   !
                     ! but we will do it in two steps so we ensure energy is conserved.    !
                     !---------------------------------------------------------------------!
                     dinitp%leaf_energy(ico) = dinitp%leaf_energy(ico)  + qloss
                     dinitp%veg_energy(ico)  = dinitp%veg_energy(ico)   + qloss
                     initp%hflx_lrsti(ico) = initp%hflx_lrsti(ico)      + qloss
                     !---------------------------------------------------------------------!

                     !----- Integrate the total to be removed from this layer. ------------!
                     wloss_tot     = wloss_tot     + wloss
                     qloss_tot     = qloss_tot     + qloss
                     wvlmeloss_tot = wvlmeloss_tot + wvlmeloss
                     qvlmeloss_tot = qvlmeloss_tot + qvlmeloss
                     !---------------------------------------------------------------------!
                  end do
                  !------------------------------------------------------------------------!



                  !----- Update derivatives of water, energy, and transpiration. ----------!
                  dinitp%soil_water   (k2) = dinitp%soil_water(k2)    - wvlmeloss_tot
                  dinitp%soil_energy  (k2) = dinitp%soil_energy(k2)   - qvlmeloss_tot
                  dinitp%avg_transloss(k2) = dinitp%avg_transloss(k2) - wloss_tot
                  !------------------------------------------------------------------------!
               end if
               !---------------------------------------------------------------------------!
            end if
            !------------------------------------------------------------------------------!
         end do
         !---------------------------------------------------------------------------------!
      end do
      !------------------------------------------------------------------------------------!
   end if
   !---------------------------------------------------------------------------------------!

      

I have a very basic understanding of optimization, but I don't see what can be done here to improve the code. In particular, I do not understand what Instructions Retired means and how it is measured. Is there a way to speed up these calculations?

EDIT

After thinking a little, I realized that there are some simple optimizations here. For example, the conditional if (rk4aux(ibuff)%avail_h2o_int(k1) > 0.d0) depends only on k1, so it can be moved outside the k2 loop, and the call tl2uint8(initp%soil_tempk(k2),1.d0) depends only on k2, so it can be moved outside the innermost (cohort) loop.
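A minimal sketch of this kind of hoisting, reduced to a single cohort loop. The names layer_energy, accumulate_naive and accumulate_hoisted are hypothetical and not part of the model; layer_energy is only a placeholder for an expensive per-layer call such as tl2uint8, and its arithmetic is made up.

```fortran
module hoist_demo
   implicit none
   integer, parameter :: dp = kind(1.d0)
contains
   ! Hypothetical stand-in for an expensive function of the layer temperature.
   ! The formula is a placeholder, not the model's actual tl2uint8.
   pure function layer_energy(tempk) result(uint)
      real(dp), intent(in) :: tempk
      real(dp)             :: uint
      uint = 4.186d3 * (tempk - 273.15d0)
   end function layer_energy

   ! Baseline: the test and the function call are repeated for every cohort,
   ! although neither depends on the cohort index.
   subroutine accumulate_naive(avail, tempk, extracted, energy)
      real(dp), intent(in)    :: avail, tempk
      real(dp), intent(in)    :: extracted(:)
      real(dp), intent(inout) :: energy(:)
      integer :: ico
      do ico = 1, size(extracted)
         if (avail > 0.d0) then
            energy(ico) = energy(ico) + extracted(ico) * layer_energy(tempk)
         end if
      end do
   end subroutine accumulate_naive

   ! Hoisted: the test is done once and the function is evaluated once,
   ! so the loop body is a plain multiply-add.
   subroutine accumulate_hoisted(avail, tempk, extracted, energy)
      real(dp), intent(in)    :: avail, tempk
      real(dp), intent(in)    :: extracted(:)
      real(dp), intent(inout) :: energy(:)
      real(dp) :: uint_here
      integer  :: ico
      if (avail <= 0.d0) return
      uint_here = layer_energy(tempk)
      do ico = 1, size(extracted)
         energy(ico) = energy(ico) + extracted(ico) * uint_here
      end do
   end subroutine accumulate_hoisted
end module hoist_demo
```

Both routines should give the same result for the same inputs. An optimizing compiler may already do part of this on its own, so the gain has to be measured rather than assumed.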

However, I cannot figure out why VTune attributes so much time to these three lines:

             dinitp%leaf_energy(ico) = dinitp%leaf_energy(ico)  + qloss
             dinitp%veg_energy(ico)  = dinitp%veg_energy(ico)   + qloss
             initp%hflx_lrsti(ico) = initp%hflx_lrsti(ico)      + qloss

They only perform additions, which should be very fast, yet the analyzer says a lot of time is spent there. Why is this so?

EDIT2

I rewrote the whole loop, trying to optimize as much as I could. This is the code I came up with:

   !---- Update soil moisture and energy from transpiration/root uptake. ------------------!
   if (rk4aux(ibuff)%any_resolvable) then
      do k1 = klsl, mzg    ! loop over extracted water
         !---------------------------------------------------------------------------------!
         !     Transpiration happens only when there is some water left down to this       !
         ! layer.                                                                          !
         !---------------------------------------------------------------------------------!
         if (rk4aux(ibuff)%avail_h2o_int(k1) > 0.d0) then

            !----- Integrate the total water extracted by all cohorts that reach k1. ------!
            wloss_tot_k1 = 0.d0
            do ico=1,cpatch%ncohorts
               wloss_tot_k1 = wloss_tot_k1 + rk4aux(ibuff)%extracted_water(ico,k1)
            end do
            !------------------------------------------------------------------------------!

            do k2=k1,mzg
               if (rk4site%ntext_soil(k2) /= 13) then
                  !----- Quantities that depend only on k1 and k2. ------------------------!
                  ext_weight = rk4aux(ibuff)%avail_h2o_lyr(k2) / rk4aux(ibuff)%avail_h2o_int(k1)
                  uint_here  = tl2uint8(initp%soil_tempk(k2),1.d0)
                  !------------------------------------------------------------------------!

                  !----- Add the internal energy to each cohort. --------------------------!
                  do ico=1,cpatch%ncohorts
                     wloss         = rk4aux(ibuff)%extracted_water(ico,k1) * ext_weight
                     uint_here1    = wloss * uint_here

                     dinitp%leaf_energy(ico) = dinitp%leaf_energy(ico) + uint_here1
                     dinitp%veg_energy(ico)  = dinitp%veg_energy(ico)  + uint_here1
                     initp%hflx_lrsti(ico)   = initp%hflx_lrsti(ico)   + uint_here1
                  end do
                  !------------------------------------------------------------------------!

                  !----- Totals for this layer, without looping over cohorts again. -------!
                  wloss_tot     = wloss_tot_k1 * ext_weight
                  wvlmeloss_tot = wloss_tot * dslzi8(k2) * wdnsi8
                  qvlmeloss_tot = wloss_tot * dslzi8(k2) * uint_here

                  !----- Update derivatives of water, energy, and transpiration. ----------!
                  dinitp%soil_water   (k2) = dinitp%soil_water(k2)    - wvlmeloss_tot
                  dinitp%soil_energy  (k2) = dinitp%soil_energy(k2)   - qvlmeloss_tot
                  dinitp%avg_transloss(k2) = dinitp%avg_transloss(k2) - wloss_tot
                  !------------------------------------------------------------------------!
               end if
               !---------------------------------------------------------------------------!
            end do
            !------------------------------------------------------------------------------!
         end if
         !---------------------------------------------------------------------------------!
      end do
      !------------------------------------------------------------------------------------!
   end if
   !---------------------------------------------------------------------------------------!

      

It's a bit long, so I don't expect people to go through it. When I run the analyzer, I now get significantly reduced times (from 290 to 185, see the "New times" figure), although the speedup appears to be somewhat smaller in real simulations.
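The main algebraic change in the rewritten loop is that the per-layer totals no longer need their own accumulation inside the cohort loop: ext_weight is a common factor, so the sum over cohorts of extracted_water(ico,k1) * ext_weight equals ext_weight times the sum of extracted_water(ico,k1). A small sketch that checks this, with hypothetical function names that are not part of the model:

```fortran
module factor_demo
   implicit none
   integer, parameter :: dp = kind(1.d0)
contains
   ! Original form: scale each cohort's extraction, then accumulate.
   pure function total_scaled_in_loop(extracted, ext_weight) result(wloss_tot)
      real(dp), intent(in) :: extracted(:), ext_weight
      real(dp)             :: wloss_tot
      integer :: ico
      wloss_tot = 0.d0
      do ico = 1, size(extracted)
         wloss_tot = wloss_tot + extracted(ico) * ext_weight
      end do
   end function total_scaled_in_loop

   ! Rewritten form: accumulate once, then scale the total.
   ! The sum does not depend on k2, so it can be reused for every layer.
   pure function total_factored(extracted, ext_weight) result(wloss_tot)
      real(dp), intent(in) :: extracted(:), ext_weight
      real(dp)             :: wloss_tot
      wloss_tot = sum(extracted) * ext_weight
   end function total_factored
end module factor_demo
```

The two forms are mathematically identical, but floating-point rounding can differ in the last digits because the operations are done in a different order. Results should therefore be compared with a tolerance, not expected to match bit for bit.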

However, looking at the samples, a significant amount of time is still spent on operations that I would not expect to be expensive. I still don't understand what Instructions Retired means or how it is measured. For now, I think that's enough, and I'm guessing that the correct way to speed things up further would be to use OpenMP, as Holmes suggests.
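For reference, a rough sketch of what an OpenMP version of the cohort loop might look like. This is an assumption on my part and not something I have tested in the model: the routine name and argument list are hypothetical, and it relies on each iteration touching only its own cohort index, which is the case for the three per-cohort updates above.

```fortran
module omp_demo
   implicit none
   integer, parameter :: dp = kind(1.d0)
contains
   ! Hypothetical routine: adds the energy of the extracted water to each cohort.
   ! Iterations are independent because every one writes only to element ico.
   subroutine add_cohort_energy(extracted, scale, leaf_energy, veg_energy)
      real(dp), intent(in)    :: extracted(:), scale
      real(dp), intent(inout) :: leaf_energy(:), veg_energy(:)
      real(dp) :: qloss
      integer  :: ico
      ! qloss must be private so that threads do not overwrite each other's value.
      !$omp parallel do default(shared) private(ico, qloss)
      do ico = 1, size(extracted)
         qloss = extracted(ico) * scale
         leaf_energy(ico) = leaf_energy(ico) + qloss
         veg_energy(ico)  = veg_energy(ico)  + qloss
      end do
      !$omp end parallel do
   end subroutine add_cohort_energy
end module omp_demo
```

The directives are ordinary comments unless OpenMP is enabled at compile time (for example -fopenmp with gfortran or -qopenmp with the Intel compiler), so the same source still builds serially. Whether this actually helps depends on the number of cohorts: for short loops the cost of starting and synchronizing threads can easily exceed the work, so parallelizing at a coarser level may be the better choice.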
