Segmentation fault when sending a derived type with an allocatable array in MPI
I am trying to send data of a derived type that contains an allocatable array in MPI, and I am getting a segmentation fault:
program test_type
  use mpi
  implicit none

  type mytype
    real, allocatable :: x(:)
    integer :: a
  end type mytype

  type(mytype), allocatable :: y(:)
  type(mytype) :: z
  integer :: n, i, ierr, myid, ntasks, status, request
  integer :: datatype, oldtypes(2), blockcounts(2)
  integer(KIND=MPI_ADDRESS_KIND) :: offsets(2)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myid, ierr)
  call mpi_comm_size(mpi_comm_world, ntasks, ierr)

  n = 2
  allocate(z%x(n))

  if (myid == 0) then
    allocate(y(ntasks-1))
    do i = 1, ntasks-1
      allocate(y(i)%x(n))
    enddo
  else
    call random_number(z%x)
    z%a = myid
    write(0,*) "z in process", myid, z%x, z%a
  endif

  call mpi_get_address(z%x, offsets(1), ierr)
  call mpi_get_address(z%a, offsets(2), ierr)
  offsets = offsets - offsets(1)
  oldtypes = (/ mpi_real, mpi_integer /)
  blockcounts = (/ n, 1 /)

  write(0,*) "before commit", myid, offsets, blockcounts, oldtypes
  call mpi_type_create_struct(2, blockcounts, offsets, oldtypes, datatype, ierr)
  call mpi_type_commit(datatype, ierr)
  write(0,*) "after commit", myid, datatype, ierr

  if (myid == 0) then
    do i = 1, ntasks-1
      call mpi_irecv(y(i), 1, datatype, 1, 0, mpi_comm_world, request, ierr)
      write(0,*) "received", y(i)%x, y(i)%a
    enddo
  else
    call mpi_isend(z, 1, datatype, 0, 0, mpi_comm_world, request, ierr)
    write(0,*) "sent"
    write(0,*) myid, z%x, z%a
  end if

  call mpi_finalize(ierr)
end program
This is the output I get with two processes:
before commit 0 0 -14898056
2 1 13 7
after commit 0 73 0
z in process 1 3.9208680E-07 2.5480442E-02 1
before commit 1 0 -491689432
2 1 13 7
after commit 1 73 0
received 0.0000000E+00 0.0000000E+00 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
It seems that a negative address offset is being produced. Please help. Thanks.
There are several problems with this code.

With most Fortran compilers, allocatable arrays behave like pointers in C/C++: the real object behind the array name is something that holds a pointer to the allocated data. That data is usually allocated on the heap and can end up anywhere in the process's virtual address space, which explains the negative offset. Negative offsets are, by the way, perfectly acceptable in MPI datatypes (which is why MPI_ADDRESS_KIND specifies a signed integer kind), so there isn't much of a problem here.
The real problem is that the offset between dynamically allocated objects usually differs from one allocation to the next. You can check that

ADDR(y(1)%x) - ADDR(y(1)%a)

is completely different from

ADDR(y(i)%x) - ADDR(y(i)%a), for i = 2..ntasks-1

(ADDR here is just shorthand for the object address as returned by MPI_GET_ADDRESS). Even if some of the offsets happen to match for certain values of i, that is a coincidence rather than a rule.
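For instance, a quick diagnostic on rank 0 (not part of the original program; addr_x and addr_a are helper variables introduced only for this check) will in general print a different x-to-a displacement for each element of y:

integer(kind=MPI_ADDRESS_KIND) :: addr_x, addr_a

do i = 1, ntasks-1
  call mpi_get_address(y(i)%x, addr_x, ierr)   ! address of the heap data behind y(i)%x
  call mpi_get_address(y(i)%a, addr_a, ierr)   ! address of the integer component inside y(i)
  write(0,*) "element", i, "displacement x -> a:", addr_a - addr_x
enddo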
The consequence is that the MPI datatype you construct from the offsets of the variable z cannot be used to send the elements of the array y. To solve this, simply remove the allocatable attribute of mytype%x if possible (for example, if n is known in advance).

Another option that should work well for small values of ntasks is to define as many MPI datatypes as there are elements in the array y, then use datatype(i), built from the offsets of y(i)%x and y(i)%a, to receive into y(i), as sketched below.
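A rough sketch of that second option, assuming datatype becomes an array and a helper variable base is introduced (both of these are my names, not taken from the question); it would run on rank 0 after the y(i)%x arrays have been allocated and blockcounts/oldtypes have been set:

integer, allocatable :: datatype(:)
integer(kind=MPI_ADDRESS_KIND) :: base

allocate(datatype(ntasks-1))
do i = 1, ntasks-1
  call mpi_get_address(y(i)%x, offsets(1), ierr)   ! address of the heap data behind y(i)%x
  call mpi_get_address(y(i)%a, offsets(2), ierr)   ! address of the integer component of y(i)
  base = offsets(1)
  offsets = offsets - base                         ! displacements relative to y(i)%x
  call mpi_type_create_struct(2, blockcounts, offsets, oldtypes, datatype(i), ierr)
  call mpi_type_commit(datatype(i), ierr)
enddo

Because the displacements are measured from y(i)%x, the matching receive must pass y(i)%x (not y(i)) as the buffer argument; the corrected loop at the end of this answer shows this.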
An even bigger problem is that you use non-blocking MPI operations and never wait for them to complete before accessing the data buffers. This code simply will not work as written:

do i=1,ntasks-1
  call mpi_irecv(y(i),1,datatype,1,0,mpi_comm_world,request,ierr)
  write(0,*) "received", y(i)%x,y(i)%a
enddo

MPI_IRECV only starts an asynchronous receive operation. That operation is most likely still in progress when the WRITE statement executes, so the statement reads completely random data (some memory allocators zero the memory in debug mode, which would explain the zeros in your output). Either insert a call to MPI_WAIT between MPI_IRECV and WRITE, or use the blocking receive MPI_RECV instead.
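A minimal sketch of the MPI_WAIT pattern, where buf, src and dtype are placeholders for whatever buffer, source rank and datatype end up being used (they are not names from the question), and request is the integer already declared in the program:

call mpi_irecv(buf, 1, dtype, src, 0, mpi_comm_world, request, ierr)
! ... other work can overlap with the receive here ...
call mpi_wait(request, MPI_STATUS_IGNORE, ierr)   ! buf only holds valid data after the wait returns
write(0,*) "received", buf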
A similar problem exists with the non-blocking send MPI_ISEND. Since you never wait on the request or test it for completion, the MPI library is allowed to postpone the actual progression of the operation indefinitely, and the send may never happen at all. Since there is absolutely no reason to use a non-blocking send in your case, replace MPI_ISEND with MPI_SEND.
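On the non-zero ranks that amounts to something like the line below. Note that I pass z%x rather than z as the buffer, since the displacements in datatype were computed relative to the address of z%x; that is an extra detail, not something spelled out above.

call mpi_send(z%x, 1, datatype, 0, 0, mpi_comm_world, ierr)   ! blocking send, no request object needed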
Last but not least, rank 0 receives messages from rank 1 only:

call mpi_irecv(y(i),1,datatype,1,0,mpi_comm_world,request,ierr)
                              ^^^

while all the other ranks send to rank 0, so the program only works correctly when run with exactly two MPI processes. You probably want to replace the highlighted 1 in the receive call with i, as in the corrected loop below.
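Putting the fixes together, the receiving loop on rank 0 could look like this sketch; it assumes the per-element datatype(i) array from the earlier sketch:

do i = 1, ntasks-1
  ! blocking receive from rank i, using the datatype built from y(i)'s own offsets;
  ! the buffer is y(i)%x because the displacements were measured relative to it
  call mpi_recv(y(i)%x, 1, datatype(i), i, 0, mpi_comm_world, MPI_STATUS_IGNORE, ierr)
  write(0,*) "received", y(i)%x, y(i)%a
enddo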