I have a Fortran MPI code in which a compute-intensive function is invoked on every element of a 2D array. I'm trying to split the tasks among the ranks. For example, if there are 30 columns and 10 ranks, then each rank gets 3 columns. The following code does this split and gathers the results with MPI_ALLGATHERV, but the final array doesn't have the values from all ranks.
program allgather
  include 'mpif.h'
  !create a 2 x 30 myarray
  integer :: x=2,y=30
  integer :: numprocs,myid
  integer :: i,j,k,myelements,mycolumns,jb,je
  integer*4,dimension(:),allocatable :: displacement,recvcnt
  real :: checksum
  real,dimension(:,:),allocatable :: myarr,combinedarr

  call MPI_INIT(IERR)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,NUMPROCS,IERR)
  call MPI_COMM_RANK(MPI_COMM_WORLD,MYID,IERR)

  mycolumns = y/numprocs
  myelements = x * mycolumns
  allocate(displacement(numprocs),recvcnt(numprocs))
  jb = 1 + ( myid * mycolumns )
  je = ( myid + 1 ) * mycolumns
  allocate(myarr(x,mycolumns))
  allocate(combinedarr(x,y))

  myarr(:,:) = 0
  do j=jb,je
    do i=1,x
      myarr(i,j) = 1
    enddo
  enddo
  !myarr(:,:)=1
  if(mod(y,numprocs) > 0) then
    if(myid==numprocs-1) then
      jb=(myid + 1) * mycolumns + 1
      do j=jb,y
        do i=1,x
          myarr(i,j) = 1
        enddo
      enddo
    endif
  endif

  combinedarr(:,:) = 0
  recvcnt(:) = myelements
  do k=1,numprocs
    displacement(k) = (k-1) * myelements
  enddo
  call MPI_ALLGATHERV(myarr,myelements,MPI_REAL,combinedarr,recvcnt,displacement,MPI_REAL,MPI_COMM_WORLD,IERR)
  if(mod(y,numprocs) > 0) then
    recvcnt(:) = 0
    recvcnt(numprocs) = (x*y) - myelements * (numprocs)
    displacement(numprocs) = displacement(numprocs) + myelements
    call MPI_ALLGATHERV(myarr,recvcnt(numprocs),MPI_REAL,combinedarr,recvcnt,displacement,MPI_REAL,MPI_COMM_WORLD,IERR)
  endif

  if (myid==0) then
    checksum=0
    write(6,*) "mycolumns:",mycolumns,"myelements:",myelements
    do j=1,y
      do i=1,x
        checksum = checksum + combinedarr(i,j)
      enddo
    enddo
    write(6,*) checksum
  endif
end
First of all, you are using MPI_ALLGATHERV() just as MPI_ALLGATHER() and get no benefit from its ability to send a different number of elements from/to each process.
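For illustration, the per-rank counts could carry the remainder columns directly, so a single call gathers everything. This is only a sketch reusing the question's variables; it assumes each rank allocates and fills mycolumns columns of myarr, with the last rank taking the extra mod(y,numprocs) columns as well:

  ! per-rank receive counts: the last rank also contributes the leftover columns
  recvcnt(:) = x * mycolumns
  recvcnt(numprocs) = x * (mycolumns + mod(y, numprocs))
  displacement(1) = 0
  do k = 2, numprocs
    displacement(k) = displacement(k-1) + recvcnt(k-1)
  enddo
  call MPI_ALLGATHERV(myarr, recvcnt(myid+1), MPI_REAL, &
                      combinedarr, recvcnt, displacement, MPI_REAL, &
                      MPI_COMM_WORLD, ierr)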
But that's not the error in your program. The error lies in the way it fills myarr. You allocate it as myarr(x,mycolumns), but when filling it from column jb to column je you go past the end of the array in all processes but rank 0, since jb and je are greater than mycolumns there. Thus myarr contains ones only in rank 0 and zeroes in all other ranks. So, yes, the final array does not have the values that you expect, but that's because you filled them wrong, not because of the way the MPI subroutines are used.

Writing past the end of an allocatable array destroys the hidden structures that are used to manage heap allocation and usually crashes the program. In your case you are just lucky: I ran your code with Open MPI and it crashed with core dumps each time.
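In other words, the fill loop has to use local column indices. A minimal sketch, keeping the question's variable names (jb is still useful for knowing which global columns this rank owns, it just must not be used to index myarr):

  ! fill only the local columns 1..mycolumns of myarr;
  ! local column j corresponds to global column jb + j - 1
  do j = 1, mycolumns
    do i = 1, x
      myarr(i,j) = 1      ! the real code would compute using global column jb + j - 1
    enddo
  enddo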
You are also missing a call to MPI_FINALIZE() at the end of your code.

Hint: use the Fortran 90 interface if available - replace include 'mpif.h' with use mpi.
Here is the final version of the code. I have implemented the fixes suggested by Hristo Iliev and also fixed the part where the number of ranks does not evenly divide the number of columns: here the last rank does the computation on the leftover columns.
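A sketch of such a corrected version, reconstructed from the fixes above. It keeps the original variable names, switches to use mpi, adds MPI_FINALIZE, fills myarr with local column indices, and gathers everything with a single MPI_ALLGATHERV using per-rank counts instead of the original two calls:

program allgather
  use mpi
  implicit none
  integer :: x=2, y=30
  integer :: numprocs, myid, ierr
  integer :: i, j, k, mycolumns, mylocalcols, myelements, jb
  integer, dimension(:), allocatable :: displacement, recvcnt
  real :: checksum
  real, dimension(:,:), allocatable :: myarr, combinedarr

  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

  mycolumns = y / numprocs
  ! the last rank also takes the leftover columns
  mylocalcols = mycolumns
  if (myid == numprocs-1) mylocalcols = mycolumns + mod(y, numprocs)
  myelements = x * mylocalcols
  jb = 1 + myid * mycolumns            ! first global column owned by this rank

  allocate(myarr(x, mylocalcols))
  allocate(combinedarr(x, y))
  allocate(displacement(numprocs), recvcnt(numprocs))

  ! fill the local array with LOCAL indices; local column j is global column jb+j-1
  do j = 1, mylocalcols
    do i = 1, x
      myarr(i,j) = 1                   ! replace with the real computation on column jb+j-1
    enddo
  enddo

  ! every rank contributes x*mycolumns elements, the last one a bit more
  recvcnt(:) = x * mycolumns
  recvcnt(numprocs) = x * (mycolumns + mod(y, numprocs))
  displacement(1) = 0
  do k = 2, numprocs
    displacement(k) = displacement(k-1) + recvcnt(k-1)
  enddo

  combinedarr(:,:) = 0
  call MPI_ALLGATHERV(myarr, myelements, MPI_REAL, &
                      combinedarr, recvcnt, displacement, MPI_REAL, &
                      MPI_COMM_WORLD, ierr)

  if (myid == 0) then
    checksum = 0
    do j = 1, y
      do i = 1, x
        checksum = checksum + combinedarr(i,j)
      enddo
    enddo
    write(6,*) 'checksum =', checksum  ! should equal x*y = 60
  endif

  call MPI_FINALIZE(ierr)
end program allgather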