Segfault when sending a derived type with an allocatable array in MPI
I am trying to send data of a derived type that contains an allocatable array in MPI, and I am getting a segmentation fault.
program test_type
  use mpi
  implicit none
  type mytype
    real,allocatable :: x(:)
    integer :: a
  end type mytype
  type(mytype),allocatable :: y(:)
  type(mytype) :: z
  integer :: n,i,ierr,myid,ntasks,status,request
  integer :: datatype, oldtypes(2), blockcounts(2)
  integer(KIND=MPI_ADDRESS_KIND) :: offsets(2)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world,myid,ierr)
  call mpi_comm_size(mpi_comm_world,ntasks,ierr)

  n = 2
  allocate(z%x(n))
  if (myid == 0) then
    allocate(y(ntasks-1))
    do i = 1, ntasks-1
      allocate(y(i)%x(n))
    end do
  else
    call random_number(z%x)
    z%a = myid
    write(0,*) "z in process", myid, z%x, z%a
  end if

  call mpi_get_address(z%x, offsets(1), ierr)
  call mpi_get_address(z%a, offsets(2), ierr)
  offsets = offsets - offsets(1)
  oldtypes = (/ mpi_real, mpi_integer /)
  blockcounts = (/ n, 1 /)
  write(0,*) "before commit", myid, offsets, blockcounts, oldtypes
  call mpi_type_create_struct(2, blockcounts, offsets, oldtypes, datatype, ierr)
  call mpi_type_commit(datatype, ierr)
  write(0,*) "after commit", myid, datatype, ierr

  if (myid == 0) then
    do i = 1, ntasks-1
      call mpi_irecv(y(i), 1, datatype, 1, 0, mpi_comm_world, request, ierr)
      write(0,*) "received", y(i)%x, y(i)%a
    end do
  else
    call mpi_isend(z, 1, datatype, 0, 0, mpi_comm_world, request, ierr)
    write(0,*) "sent"
    write(0,*) myid, z%x, z%a
  end if

  call mpi_finalize(ierr)
end program
And this is what I printed out with two processes:
before commit 0 0 -14898056 2 1 13 7
after commit 0 73 0
z in process 1 3.9208680E-07 2.5480442E-02 1
before commit 1 0 -491689432 2 1 13 7
after commit 1 73 0
received 0.0000000E+00 0.0000000E+00 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
It seems that a negative address offset is occurring. Please help. Thanks.
There are several problems with this code.
Allocatable arrays with most Fortran compilers are like pointers in C/C++: the real object behind the array name is a descriptor that holds the pointer to the allocated data. That data is usually allocated on the heap, and it could be anywhere in the virtual address space of the process, which explains the negative offset. By the way, negative offsets are perfectly acceptable in MPI datatypes (which is why MPI_ADDRESS_KIND specifies a signed integer kind), so there isn't much of a problem here.
The big problem is that the offsets between dynamically allocated things usually differ with each allocation. You can check that:

ADDR(y(1)%x) - ADDR(y(1)%a)

is completely different from

ADDR(y(i)%x) - ADDR(y(i)%a), for i = 2..ntasks-1

(ADDR(...) is here just shorthand notation for the object address as returned by MPI_GET_ADDRESS). Even if the offsets happen to match for some value(s) of i, that would be more of a coincidence than a rule.
This leads to the following: the datatype that you create using offsets taken from the z variable cannot be used to receive elements of the y array. To solve this, simply remove the allocatable property of mytype%x if possible (for example, if n is known in advance).
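For example, if n is fixed at compile time, the type could simply be (a sketch; the literal 2 matches the n = 2 in your program):

```fortran
type mytype
  real :: x(2)    ! fixed size: x is now stored inline in the structure
  integer :: a
end type mytype
```

With a fixed-size component, the offset of x relative to a is the same for every variable of the type, so a single committed datatype works for z and for every element of y.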
Another option that should work well for small values of ntasks is to define as many MPI datatypes as there are elements of the y array. Then use a datatype(i) that is based on the offsets of y(i)%x and y(i)%a in order to receive into y(i).
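A minimal sketch of that per-element option, reusing blockcounts and oldtypes from your program and assuming y has already been allocated on rank 0 (the array name datatypes is my own):

```fortran
integer, allocatable :: datatypes(:)
integer(KIND=MPI_ADDRESS_KIND) :: offs(2)

allocate(datatypes(ntasks-1))
do i = 1, ntasks-1
  ! take the offsets from y(i) itself, not from z
  call mpi_get_address(y(i)%x, offs(1), ierr)
  call mpi_get_address(y(i)%a, offs(2), ierr)
  offs = offs - offs(1)
  call mpi_type_create_struct(2, blockcounts, offs, oldtypes, datatypes(i), ierr)
  call mpi_type_commit(datatypes(i), ierr)
end do
! later: receive into y(i) with datatypes(i)
```

Each datatype should be released with MPI_TYPE_FREE once it is no longer needed.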
An even bigger problem is the fact that you use non-blocking MPI operations and never wait for them to complete before accessing the data buffers. This code simply will not work as expected:
do i=1,ntasks-1
  call mpi_irecv(y(i),1,datatype,1,0,mpi_comm_world,request,ierr)
  write(0,*) "received", y(i)%x,y(i)%a
enddo
MPI_IRECV starts an asynchronous receive operation. The operation is probably still in progress by the time the WRITE statement is executed, so completely random data is being accessed (some memory allocators actually zero out the data in debug mode, which would explain the zeros in your output). Either insert a call to MPI_WAIT between the MPI_IRECV and WRITE calls, or use the blocking receive MPI_RECV instead.
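The MPI_WAIT variant is a sketch of the smallest possible change, keeping everything else in the loop as it is (the datatype and source-rank issues discussed elsewhere in this answer still apply):

```fortran
do i = 1, ntasks-1
  call mpi_irecv(y(i), 1, datatype, 1, 0, mpi_comm_world, request, ierr)
  ! block until the receive has actually completed before touching y(i)
  call mpi_wait(request, MPI_STATUS_IGNORE, ierr)
  write(0,*) "received", y(i)%x, y(i)%a
end do
```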
A similar problem exists with the use of the non-blocking send MPI_ISEND. Since you never wait on the request or test it for completion, the MPI library is allowed to postpone the actual progression of the operation indefinitely, and the send may never actually happen. Again, since there is absolutely no justification for using a non-blocking send in your case, replace MPI_ISEND with MPI_SEND.
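On the sender side that is a one-line change; note that MPI_SEND, being blocking, takes no request argument:

```fortran
call mpi_send(z, 1, datatype, 0, 0, mpi_comm_world, ierr)
```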
Last but not least, rank 0 only receives messages from rank 1:
call mpi_irecv(y(i),1,datatype,1,0,mpi_comm_world,request,ierr)
                              ^^^
At the same time, all other processes send to rank 0. Therefore, your program will only work when run with exactly two MPI processes. You may want to replace the underlined 1 in the receive call with i.
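Putting the receive-side fixes together, the loop on rank 0 might look like this (a sketch: it assumes a per-element datatype datatypes(i) has been built and committed as described in the multiple-datatypes option above):

```fortran
do i = 1, ntasks-1
  ! blocking receive, matched to the sender with rank i
  call mpi_recv(y(i), 1, datatypes(i), i, 0, mpi_comm_world, &
                MPI_STATUS_IGNORE, ierr)
  write(0,*) "received", y(i)%x, y(i)%a
end do
```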