Segmentation fault when sending a derived type with an allocatable array in MPI
I am trying to send data of a derived type that contains an allocatable array in MPI, and I am getting a segmentation fault:
program test_type
  use mpi
  implicit none

  type mytype
    real, allocatable :: x(:)
    integer :: a
  end type mytype

  type(mytype), allocatable :: y(:)
  type(mytype) :: z
  integer :: n, i, ierr, myid, ntasks, status, request
  integer :: datatype, oldtypes(2), blockcounts(2)
  integer(KIND=MPI_ADDRESS_KIND) :: offsets(2)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myid, ierr)
  call mpi_comm_size(mpi_comm_world, ntasks, ierr)

  n = 2
  allocate(z%x(n))

  if (myid == 0) then
    allocate(y(ntasks-1))
    do i = 1, ntasks-1
      allocate(y(i)%x(n))
    enddo
  else
    call random_number(z%x)
    z%a = myid
    write(0,*) "z in process", myid, z%x, z%a
  endif

  call mpi_get_address(z%x, offsets(1), ierr)
  call mpi_get_address(z%a, offsets(2), ierr)
  offsets = offsets - offsets(1)
  oldtypes = (/ mpi_real, mpi_integer /)
  blockcounts = (/ n, 1 /)

  write(0,*) "before commit", myid, offsets, blockcounts, oldtypes
  call mpi_type_create_struct(2, blockcounts, offsets, oldtypes, datatype, ierr)
  call mpi_type_commit(datatype, ierr)
  write(0,*) "after commit", myid, datatype, ierr

  if (myid == 0) then
    do i = 1, ntasks-1
      call mpi_irecv(y(i), 1, datatype, 1, 0, mpi_comm_world, request, ierr)
      write(0,*) "received", y(i)%x, y(i)%a
    enddo
  else
    call mpi_isend(z, 1, datatype, 0, 0, mpi_comm_world, request, ierr)
    write(0,*) "sent"
    write(0,*) myid, z%x, z%a
  end if

  call mpi_finalize(ierr)
end program
This is the output I get with two processes:
before commit 0 0 -14898056
2 1 13 7
after commit 0 73 0
z in process 1 3.9208680E-07 2.5480442E-02 1
before commit 1 0 -491689432
2 1 13 7
after commit 1 73 0
received 0.0000000E+00 0.0000000E+00 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
It seems that a negative address offset is being produced. Please help. Thanks.
There are several problems with this code.

With most Fortran compilers, allocatable arrays behave like pointers in C/C++: the real object behind the array name is something that holds a pointer to the allocated data. That data is usually allocated on the heap and can end up anywhere in the process's virtual address space, which explains the negative offset. Negative offsets are, by the way, perfectly acceptable in MPI datatypes (which is why MPI_ADDRESS_KIND specifies a signed integer kind), so there isn't much of a problem here.
The real problem is that the offset between dynamically allocated objects usually differs from one allocation to the next. You can check that

ADDR(y(1)%x) - ADDR(y(1)%a)

is completely different from

ADDR(y(i)%x) - ADDR(y(i)%a), for i = 2..ntasks-1

(ADDR here is just shorthand for the object address as returned by MPI_GET_ADDRESS). Even if some of the offsets happen to match for certain values of i, that is a coincidence rather than a rule.
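For instance, a quick diagnostic on rank 0 (not part of the original program; addr_x and addr_a are helper variables introduced only for this check) will in general print a different x-to-a displacement for each element of y:

integer(kind=MPI_ADDRESS_KIND) :: addr_x, addr_a

do i = 1, ntasks-1
  call mpi_get_address(y(i)%x, addr_x, ierr)   ! address of the heap data behind y(i)%x
  call mpi_get_address(y(i)%a, addr_a, ierr)   ! address of the integer component inside y(i)
  write(0,*) "element", i, "displacement x -> a:", addr_a - addr_x
enddo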
The consequence is that the MPI datatype you construct from the offsets of the variable z cannot be used to send the elements of the array y. To solve this, simply remove the allocatable attribute of mytype%x if possible (for example, if n is known in advance).

Another option that should work well for small values of ntasks is to define as many MPI datatypes as there are elements in the array y, then use datatype(i), built from the offsets of y(i)%x and y(i)%a, to receive into y(i), as sketched below.
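A rough sketch of that second option, assuming datatype becomes an array and a helper variable base is introduced (both of these are my names, not taken from the question); it would run on rank 0 after the y(i)%x arrays have been allocated and blockcounts/oldtypes have been set:

integer, allocatable :: datatype(:)
integer(kind=MPI_ADDRESS_KIND) :: base

allocate(datatype(ntasks-1))
do i = 1, ntasks-1
  call mpi_get_address(y(i)%x, offsets(1), ierr)   ! address of the heap data behind y(i)%x
  call mpi_get_address(y(i)%a, offsets(2), ierr)   ! address of the integer component of y(i)
  base = offsets(1)
  offsets = offsets - base                         ! displacements relative to y(i)%x
  call mpi_type_create_struct(2, blockcounts, offsets, oldtypes, datatype(i), ierr)
  call mpi_type_commit(datatype(i), ierr)
enddo

Because the displacements are measured from y(i)%x, the matching receive must pass y(i)%x (not y(i)) as the buffer argument; the corrected loop at the end of this answer shows this.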
An even bigger problem is that you use non-blocking MPI operations and never wait for them to complete before accessing the data buffers. This code simply will not work as written:

do i=1,ntasks-1
  call mpi_irecv(y(i),1,datatype,1,0,mpi_comm_world,request,ierr)
  write(0,*) "received", y(i)%x,y(i)%a
enddo

MPI_IRECV only starts an asynchronous receive operation. That operation is most likely still in progress when the WRITE statement executes, so the statement reads completely random data (some memory allocators zero the memory in debug mode, which would explain the zeros in your output). Either insert a call to MPI_WAIT between MPI_IRECV and WRITE, or use the blocking receive MPI_RECV instead.
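A minimal sketch of the MPI_WAIT pattern, where buf, src and dtype are placeholders for whatever buffer, source rank and datatype end up being used (they are not names from the question), and request is the integer already declared in the program:

call mpi_irecv(buf, 1, dtype, src, 0, mpi_comm_world, request, ierr)
! ... other work can overlap with the receive here ...
call mpi_wait(request, MPI_STATUS_IGNORE, ierr)   ! buf only holds valid data after the wait returns
write(0,*) "received", buf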
A similar problem exists with the non-blocking send MPI_ISEND. Since you never wait on the request or test it for completion, the MPI library is allowed to postpone the actual progression of the operation indefinitely, and the send may never happen at all. Since there is absolutely no reason to use a non-blocking send in your case, replace MPI_ISEND with MPI_SEND.
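On the non-zero ranks that amounts to something like the line below. Note that I pass z%x rather than z as the buffer, since the displacements in datatype were computed relative to the address of z%x; that is an extra detail, not something spelled out above.

call mpi_send(z%x, 1, datatype, 0, 0, mpi_comm_world, ierr)   ! blocking send, no request object needed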
Last but not least, rank 0 receives messages from rank 1 only:

call mpi_irecv(y(i),1,datatype,1,0,mpi_comm_world,request,ierr)
                              ^^^

while all the other ranks send to rank 0, so the program only works correctly when run with exactly two MPI processes. You probably want to replace the highlighted 1 in the receive call with i, as in the corrected loop below.
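Putting the fixes together, the receiving loop on rank 0 could look like this sketch; it assumes the per-element datatype(i) array from the earlier sketch:

do i = 1, ntasks-1
  ! blocking receive from rank i, using the datatype built from y(i)'s own offsets;
  ! the buffer is y(i)%x because the displacements were measured relative to it
  call mpi_recv(y(i)%x, 1, datatype(i), i, 0, mpi_comm_world, MPI_STATUS_IGNORE, ierr)
  write(0,*) "received", y(i)%x, y(i)%a
enddo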