Limits with MPI_Send or MPI_Recv?

Posted 2019-09-01 12:13

Question:

Are there any limits on the message size for MPI_Send or MPI_Recv, or limits imposed by the machine? When I try to send large data, the program does not complete. This is my code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <math.h>
#include <string.h>

void AllGather_ring(void* data, int count, MPI_Datatype datatype,MPI_Comm communicator)
{
  int me;
  MPI_Comm_rank(communicator, &me);
  int world_size;
  MPI_Comm_size(communicator, &world_size);
  int next=me+1;
  if(next>=world_size)
      next=0;
  int prev=me-1;
  if(prev<0)
      prev=world_size-1;
  int i,curi=me;
  for(i=0;i<world_size-1;i++)
  {
     MPI_Send(data+curi*sizeof(int)*count, count, datatype, next, 0, communicator);
     curi=curi-1;
     if(curi<0)
         curi=world_size-1;
     MPI_Recv(data+curi*sizeof(int)*count, count, datatype, prev, 0, communicator, MPI_STATUS_IGNORE);
  }
}


void test(void* buff,int world_size,int count)
{
    MPI_Barrier(MPI_COMM_WORLD);
    AllGather_ring(buff,count,MPI_INT,MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
}
void main(int argc, char* argv[]) {
    int count = 20000;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    MPI_Init(&argc,&argv);
    int world_rank,world_size,namelen;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int* buff=(int*) malloc(world_size*sizeof(int)*count);
    int i;
    for (i = 0; i < world_size; i++) {
        buff[i]=world_rank;
    }
    test(buff,world_size,count);
    MPI_Finalize();
}

It stops when I try to run it with a buffer of about 80000 bytes (40000 integers), i.e. with count = 20000 and 4 processes.

Answer 1:

Your code is incorrect: you post each receive only after the corresponding send has completed. MPI_Send is only guaranteed to complete once a matching MPI_Recv has been posted, so you run into a classic deadlock.

It happens to work for small messages because they are handled differently (via the eager protocol, which uses an unexpected-message buffer as a performance optimization). In that case MPI_Send is allowed to complete before the matching MPI_Recv is posted.

Instead, you can:

  • Post immediate (nonblocking) sends and receives (MPI_Isend, MPI_Irecv) to resolve the deadlock.
  • Use MPI_Sendrecv (a sketch follows this list).
  • Use MPI_Allgather.
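
For the MPI_Sendrecv route, here is a minimal sketch of the ring loop from the question rewritten so that each iteration sends and receives in a single call. The function name AllGather_ring_sendrecv and the use of MPI_Type_size instead of the hard-coded sizeof(int) are my own choices for the sketch, not part of the original code:

#include <mpi.h>

void AllGather_ring_sendrecv(void* data, int count, MPI_Datatype datatype,
                             MPI_Comm communicator)
{
    int me, world_size;
    MPI_Comm_rank(communicator, &me);
    MPI_Comm_size(communicator, &world_size);

    int next = (me + 1) % world_size;
    int prev = (me + world_size - 1) % world_size;

    int type_size;
    MPI_Type_size(datatype, &type_size);   /* avoid hard-coding sizeof(int) */

    char* buf = (char*)data;               /* byte pointer for offset arithmetic */
    int send_idx = me, i;
    for (i = 0; i < world_size - 1; i++) {
        int recv_idx = (send_idx + world_size - 1) % world_size;
        /* Send the current block forward and receive the previous rank's block
           in one call, so no rank blocks waiting for the other side. */
        MPI_Sendrecv(buf + (size_t)send_idx * count * type_size, count, datatype, next, 0,
                     buf + (size_t)recv_idx * count * type_size, count, datatype, prev, 0,
                     communicator, MPI_STATUS_IGNORE);
        send_idx = recv_idx;
    }
}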

I recommend the last option, MPI_Allgather; a sketch of that variant follows as well.
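
With MPI_Allgather the hand-written ring collapses to a single call. A minimal sketch, assuming each rank's own count integers already sit at offset world_rank*count of buff (which is what the ring version expects), using the in-place variant:

/* Drop-in replacement for the AllGather_ring call inside test():
   every rank contributes the count ints stored in its own slot of buff,
   and after the call every rank holds all world_size blocks. */
MPI_Allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
              buff, count, MPI_INT, MPI_COMM_WORLD);

The collective is deadlock-free by construction and lets the MPI library choose an algorithm appropriate for the message size, so it does not depend on the eager-protocol behavior described above.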



Tags: c mpi openmpi hpc