derived data types with MPI

Posted 2020-07-06 09:16

Question:

I'm learning about broadcasting (MPI_BCAST) derived data types in Fortran and have a code that reads two values from the terminal and displays them on each process. For value1/value2 of type integer/integer and integer/real this works; however, for the combination integer/real*8 it fails.

The code is:

use mpi
implicit none

integer :: ierror, pid, ncpu, root = 0

integer :: counts, newtype, extent
integer, dimension(2) :: oldtypes, blockcounts, offsets

type value
    integer :: value1 = 0
    real*8 :: value2
end type

type (value) input

call MPI_INIT(ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, pid, ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, ncpu, ierror)

! setup of 1 MPI_INTEGER field: value1
offsets(1) = 0
oldtypes(1) = MPI_INTEGER
blockcounts(1) = 1

! setup of 1 MPI_REAL8 field: value2
call MPI_TYPE_EXTENT(MPI_INTEGER, extent, ierror)  !determine offset of MPI_INTEGER
offsets(2) = blockcounts(1)*extent                 !offset is 1 MPI_INTEGER extents
oldtypes(2) = MPI_REAL8
blockcounts(2) = 1

! define struct type and commit
counts = 2 !for MPI_INTEGER + MPI_REAL8
call MPI_TYPE_STRUCT(counts, blockcounts, offsets, & 
                     oldtypes, newtype, ierror)
call MPI_TYPE_COMMIT(newtype, ierror)

do while (input%value1 >= 0)
    if (pid == root) then
        read(*,*) input
        write(*,*) 'input was: ', input
    end if
    call MPI_BCAST(input, 1, newtype, &
                   root, MPI_COMM_WORLD, ierror)
    write(*,*) 'process ', pid, 'received: ', input
end do

call MPI_TYPE_FREE(newtype, ierror)
call MPI_FINALIZE(ierror)

It can be checked that integer/integer and integer/real work fine by changing the corresponding declaration and oldtype. The combination integer/real*8 fails; with inputs -1 2.0, for example, it generates:

input was:           -1   2.0000000000000000     
process            0 received:           -1   2.0000000000000000     
process            1 received:           -1   0.0000000000000000     
process            2 received:           -1   0.0000000000000000     
process            3 received:           -1   0.0000000000000000

This thread with a similar issue suggests that using MPI_TYPE_EXTENT is not correct, as there might be additional padding that is not taken into account. Unfortunately I haven't been able to fix the problem and hope someone here can enlighten me.

Thanks in advance.

Answer 1:

You have the basic idea right - you've created the structure - but you're assuming that the double precision value is stored immediately after the integer value, and that generally isn't correct. Hristo's answer in the thread you link to covers this well in C.

The issue is that the compiler will normally align your data structure fields for you. Most systems can read and write values that are aligned in memory much faster than they can perform non-aligned accesses, if they can perform those at all. Typically, the requirement is that the alignment matches the element size: an 8-byte double precision number has to be aligned to an 8-byte boundary (that is, the address of its first byte is zero modulo 8), whereas the integer only has to be 4-byte aligned. This almost certainly means that there are 4 bytes of padding between the integer and the double.
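As a quick check, a minimal sketch using the Fortran 2008 storage_size intrinsic prints 16 bytes rather than 4 + 8 = 12 for this type on a typical 64-bit system:

program padding_demo
    implicit none

    type value
        integer :: value1
        real*8  :: value2
    end type

    type (value) :: v

    ! storage_size returns the size in bits of an array element of this
    ! type, including any padding; dividing by 8 gives bytes
    write(*,*) 'size in bytes: ', storage_size(v) / 8
end program padding_demo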

In many cases you can cajole the compiler into relaxing this behaviour; in Fortran, you can also use the sequence keyword to require that the data be stored contiguously, as in the sketch below. Either way, from a performance point of view (which is why you're using Fortran and MPI, one assumes) this is almost never the right thing to do, but it can be useful for byte-for-byte compatibility with other externally imposed data types or formats.
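A sketch of the sequence variant of the type from the question:

type value
    sequence
    integer :: value1            ! occupies the first storage unit
    double precision :: value2   ! stored immediately after, with no padding
end type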

Given the padding likely imposed for performance reasons, you could assume the alignment and hardcode it into your program; but that probably isn't the right thing to do either. If you add other fields, or change the kind of the real number to a 4-byte single precision value, the hardcoded offsets would be wrong again. Best is to use MPI_Get_address to explicitly find the locations and calculate the correct offsets yourself:

integer(kind=MPI_ADDRESS_KIND) :: startloc, endloc
integer :: counts, newtype
integer, dimension(2) :: oldtypes, blockcounts
integer(kind=MPI_ADDRESS_KIND), dimension(2) :: offsets

type value
    integer :: value1 = 0
    double precision :: value2
end type

type (value) :: input

!...    

! setup of 1 MPI_INTEGER field: value1
call MPI_Get_address(input, startloc, ierror)
oldtypes(1) = MPI_INTEGER
blockcounts(1) = 1
call MPI_Get_address(input%value1, endloc, ierror)
offsets(1) = endloc - startloc

! setup of 1 MPI_DOUBLE_PRECISION field: value2
oldtypes(2) = MPI_DOUBLE_PRECISION
blockcounts(2) = 1
call MPI_Get_address(input%value2, endloc, ierror)
offsets(2) = endloc - startloc

if (pid == 0) then
    print *,'offsets are: ', offsets
endif
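With the offsets measured, the type is then built and committed. Note that the call is MPI_TYPE_CREATE_STRUCT - the replacement for the deprecated MPI_TYPE_STRUCT, and the variant that accepts the MPI_ADDRESS_KIND offsets computed above:

! build and commit the struct type from the measured offsets
counts = 2
call MPI_TYPE_CREATE_STRUCT(counts, blockcounts, offsets, &
                            oldtypes, newtype, ierror)
call MPI_TYPE_COMMIT(newtype, ierror)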

Note that if you had an array of such derived types, to cover the case of padding between the last element of one item and the start of the next, you'd want to measure that explicitly as well, and set the overall extent of the type - the offset between the start of one element and the start of the next - with MPI_Type_create_resized, as sketched below.
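A sketch of that measurement; the two-element array here is hypothetical, introduced only to measure the stride:

type (value) :: inputs(2)   ! hypothetical array, used only to measure the stride
integer :: resizedtype
integer(kind=MPI_ADDRESS_KIND) :: lb, stride, first, second

! the distance between consecutive array elements is the true extent
! of the type, including any padding after the last component
call MPI_Get_address(inputs(1), first, ierror)
call MPI_Get_address(inputs(2), second, ierror)
lb = 0
stride = second - first
call MPI_TYPE_CREATE_RESIZED(newtype, lb, stride, resizedtype, ierror)
call MPI_TYPE_COMMIT(resizedtype, ierror)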