Terminating MPI processes early when using collective functions
I am writing a simple program in C with the MPI library. The purpose of this program is as follows:
I have a group of processes that run an iterative loop. At the end of each iteration, all processes in the communicator must call two collective functions (MPI_Allreduce and MPI_Bcast). The first determines the rank of the process that generated the minimum value of the variable num.val, and the second broadcasts from the source num_min.idx_v to all processes in the communicator MPI_COMM_WORLD.
The problem is that I don't know whether the i-th process will terminate before the collective functions are called. In each iteration, every process has a 1/10 probability of terminating. This simulates the behavior of a real program that I am implementing. When the first process terminates, the others deadlock.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

typedef struct double_int{
    double val;
    int idx_v;
}double_int;

int main(int argc, char **argv)
{
    int n = 10;
    int max_it = 4000;
    int proc_id, n_proc;
    double *x = (double *)malloc(n*sizeof(double));

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &n_proc);
    MPI_Comm_rank(MPI_COMM_WORLD, &proc_id);
    srand(proc_id);

    double_int num_min;
    double_int num;
    int k;
    for(k = 0; k < max_it; k++){
        num.idx_v = proc_id;
        num.val = rand()/(double)RAND_MAX;
        if((rand() % 10) == 0){
            printf("iter %d: proc %d terminato\n", k, proc_id);
            MPI_Finalize();
            exit(EXIT_SUCCESS);
        }
        MPI_Allreduce(&num, &num_min, 1, MPI_DOUBLE_INT, MPI_MINLOC, MPI_COMM_WORLD);
        MPI_Bcast(x, n, MPI_DOUBLE, num_min.idx_v, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    exit(EXIT_SUCCESS);
}
Perhaps I need to create a new group and a new communicator before calling the MPI_Finalize function in the if statement? How do I solve this?
If you have control of a process before it terminates, you should have it send a flag, using a non-blocking send, to a rank that cannot terminate early (let's call it the root rank). Then, instead of a blocking MPI_Allreduce, every rank sends its value to the root rank.
The root rank can post non-blocking receives for either a flag or a value from each rank. Every rank has to send one or the other. Once all ranks are accounted for, the root rank can perform the reduction itself, remove the terminated ranks from further communication, and send the result to the remaining ranks.
If your ranks terminate without warning, I'm not sure what your options are.