MPI_Type_Create_Hindexed_Block generates wrong extent

Posted 2019-07-11 21:03

Question:

Using Fortran, I'm trying to build a derived datatype for dynamically allocated structs, but the new type ends up with the wrong extent. The code is as follows:

PROGRAM MAIN
IMPLICIT NONE
INCLUDE 'mpif.h'
INTEGER :: I
INTEGER :: MYID,NUMPROCS,IError
INTEGER :: Extent,Size,Disp(2)
INTEGER :: Status(MPI_STATUS_SIZE)
INTEGER :: New_Type, Blocks(3), Types(3), Offsets(3), POS(2)
INTEGER :: POS_(4)
INTEGER :: ElmOffset(3),Send_Type
INTEGER :: M

TYPE Struct    
    INTEGER :: N
    REAL :: A
    REAL :: B(2)
END TYPE Struct
TYPE(Struct),ALLOCATABLE :: Structs(:)

    M=9

    CALL MPI_INIT( IError )
    CALL MPI_COMM_SIZE( MPI_COMM_WORLD, NUMPROCS, IError )
    CALL MPI_COMM_RANK( MPI_COMM_WORLD, MYID,     IError )

    ALLOCATE( Structs(M) )
    DO I=1,M
        Structs(I)%N = I*1000 + MYID
        Structs(I)%A = 250.0_8 + MYID*1.0
        Structs(I)%B(1) = 10.0_8 + MYID*1.0
        Structs(I)%B(2) = 20.0_8 + MYID*1.0
    END DO

    CALL MPI_GET_ADDRESS( Structs(1)%N,    POS_(1), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%A,    POS_(2), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%B(1), POS_(3), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%B(2), POS_(4), IError )
    POS_=POS_ - POS_(1)
    IF (MYID.EQ.0) THEN
        WRITE(*,*) MYID, POS_
    END IF

    Types(1) = MPI_INTEGER
    Types(2) = MPI_DOUBLE_PRECISION
    Types(3) = MPI_DOUBLE_PRECISION

    Offsets(1) = 0
    CALL MPI_GET_ADDRESS( Structs(1)%N, Disp(1), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%A, Disp(2), IError )
    Offsets(2) = Offsets(1) + Blocks(1)*( Disp(2)-Disp(1) )
    Disp(1) = Disp(2)
    CALL MPI_GET_ADDRESS( Structs(1)%B(1), Disp(2), IError )
    Offsets(3) = Offsets(2) + Blocks(2)*( Disp(2)-Disp(1) )

    CALL MPI_TYPE_STRUCT( 3, Blocks, Offsets, Types, New_Type, IError )
    CALL MPI_TYPE_COMMIT( New_Type, IError )

    CALL MPI_TYPE_EXTENT(New_Type, Extent, IError)
    CALL MPI_TYPE_SIZE(New_Type, Size, IError)
    IF (MYID.EQ.0) THEN
        WRITE(*,*) 'New_Type extents = ', Extent
        WRITE(*,*) 'New_Type size = ', Size
    END IF

    CALL MPI_GET_ADDRESS( Structs(1)%N, ElmOffset(1), IError )
    CALL MPI_GET_ADDRESS( Structs(2)%N, ElmOffset(2), IError )
    CALL MPI_GET_ADDRESS( Structs(3)%N, ElmOffset(3), IError )
    ElmOffset=ElmOffset - ElmOffset(1)

    IF (MYID.EQ.0) THEN
        WRITE(*,*) MYID,ElmOffset
    END IF

    CALL MPI_TYPE_CREATE_HINDEXED_BLOCK( 3, 1, ElmOffset, New_Type, Send_Type, IError )
    CALL MPI_TYPE_COMMIT( Send_Type, IError )

    CALL MPI_TYPE_EXTENT( Send_Type, Extent, IError )
    CALL MPI_TYPE_SIZE( Send_Type, Size, IError )

    IF (MYID.EQ.0) THEN
        WRITE(*,*) 'Send_Type extents = ', Extent
        WRITE(*,*) 'Send_Type size = ', Size
    END IF

    CALL MPI_TYPE_FREE(Send_Type,IError)
    CALL MPI_TYPE_FREE(New_Type,IError)
    CALL MPI_FINALIZE(IError)

END PROGRAM MAIN

The output is as follows:

            POS_ : 0  8  16  24
New_Type Extents : 32
   New_Type Size : 28

The results above show no problem.

      ElemOffsets :  0  32  64
Send_Type Extents : -32             <= Problem is here !!! It should be 96
   Send_Type Size :  84

I actually want to send 3 blocks of Structs using the derived datatype Send_Type:

IF (MYID.EQ.0) THEN
    DO I=1,(NUMPROCS-1)
         CALL MPI_SEND( Structs(1)%N, 1, Send_Type, I, 0, MPI_COMM_WORLD, IError)
    END DO
ELSE
    CALL MPI_RECV( Structs(1)%N, 1, Send_Type, 0, 0, MPI_COMM_WORLD, Status, IError)
END IF

WRITE( (MYID+10),*) Structs(1)%N, Structs(1)%A
WRITE( (MYID+10),*) Structs(1)%B(1), Structs(1)%B(2)

WRITE( (MYID+100),*) Structs(3)%N, Structs(3)%A
WRITE( (MYID+100),*) Structs(3)%B(1), Structs(3)%B(2)

But this fails with an error: Program Exception - access violation

I don't know what's wrong, but it must be that Send_Type is not created properly.

How can this problem be solved?

Answer 1:

The problem is due to the fact that on a 64-bit OS, addresses are larger than a 32-bit integer. Hence, the function int MPI_Get_address(const void *location, MPI_Aint *address) outputs an MPI_Aint, large enough to contain an address. Indeed, an MPI_Aint can be larger than an int.

In Fortran, MPI_Aint corresponds to INTEGER (KIND=MPI_ADDRESS_KIND). See also MPI_Aint in MPI_(I)NEIGHBOR_ALLTOALLW() vs int in MPI_(I)ALLTOALLW() and section 2.5.6 of the MPI Standard on page 48.

Consequently, the datatype INTEGER (KIND=MPI_ADDRESS_KIND) must be used whenever addresses are involved (for POS_, Disp, Offsets, Extent and ElmOffset).
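
Concretely, this is what the declarations look like with the address-sized kind (a minimal sketch reusing your variable names; the full corrected listing follows below). Note also that the deprecated MPI-1 routines MPI_TYPE_STRUCT and MPI_TYPE_EXTENT only accept default INTEGER arguments, which is why the listing switches to MPI_TYPE_CREATE_STRUCT and MPI_TYPE_GET_EXTENT:

INTEGER :: New_Type, Send_Type, Size, IError                         ! handles and byte counts stay default INTEGER
INTEGER (KIND=MPI_ADDRESS_KIND) :: POS_(4), Disp(2)                  ! absolute addresses from MPI_GET_ADDRESS
INTEGER (KIND=MPI_ADDRESS_KIND) :: Offsets(3), ElmOffset(3), Extent  ! byte displacements and extents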

A corrected sample code based on yours, to be compiled with mpif90 main.f90 -o main -Wall and run with mpirun -np 2 main, is:

PROGRAM MAIN
IMPLICIT NONE
INCLUDE 'mpif.h'
INTEGER :: I
INTEGER :: MYID,NUMPROCS,IError
INTEGER :: Size
INTEGER :: Status(MPI_STATUS_SIZE)
INTEGER :: New_Type, Blocks(3), Types(3)
INTEGER :: Send_Type
INTEGER :: M
INTEGER (KIND=MPI_ADDRESS_KIND):: Offsets(3),POS_(4), ElmOffset(3), Disp(2),Extent

TYPE Struct    
    INTEGER :: N
    REAL*8 :: A
    REAL*8 :: B(2)
END TYPE Struct
TYPE(Struct),ALLOCATABLE :: Structs(:)
    WRITE(*,*) 'Size of Integer = ',SIZEOF(M)
    WRITE(*,*) 'Size of Integer (KIND=MPI_ADDRESS_KIND)= ',SIZEOF(Extent)
    M=9

    CALL MPI_INIT( IError )
    CALL MPI_COMM_SIZE( MPI_COMM_WORLD, NUMPROCS, IError )
    CALL MPI_COMM_RANK( MPI_COMM_WORLD, MYID,     IError )

    ALLOCATE( Structs(M) )
    DO I=1,M
        Structs(I)%N = I*1000 + MYID
        Structs(I)%A = 250.0_8 + MYID*1.0
        Structs(I)%B(1) = 10.0_8 + MYID*1.0
        Structs(I)%B(2) = 20.0_8 + MYID*1.0
    END DO

    Blocks(1)=1
    Blocks(2)=1
    Blocks(3)=2

    CALL MPI_GET_ADDRESS( Structs(1)%N,    POS_(1), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%A,    POS_(2), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%B(1), POS_(3), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%B(2), POS_(4), IError )
    POS_=POS_ - POS_(1)
    IF (MYID.EQ.0) THEN
        WRITE(*,*) MYID, POS_
    END IF

    Types(1) = MPI_INTEGER
    Types(2) = MPI_DOUBLE_PRECISION
    Types(3) = MPI_DOUBLE_PRECISION

    Offsets(1) = 0
    CALL MPI_GET_ADDRESS( Structs(1)%N, Disp(1), IError )
    CALL MPI_GET_ADDRESS( Structs(1)%A, Disp(2), IError )
    !Offsets(2) = Offsets(1) + Blocks(1)*( Disp(2)-Disp(1) )
    Offsets(2) = Offsets(1) + ( Disp(2)-Disp(1) )
    Disp(1) = Disp(2)
    CALL MPI_GET_ADDRESS( Structs(1)%B(1), Disp(2), IError )
    !Offsets(3) = Offsets(2) + Blocks(2)*( Disp(2)-Disp(1) )
    Offsets(3) = Offsets(2) + ( Disp(2)-Disp(1) )

    CALL MPI_TYPE_CREATE_STRUCT( 3, Blocks, Offsets, Types, New_Type, IError )
    CALL MPI_TYPE_COMMIT( New_Type, IError )

    CALL MPI_TYPE_GET_EXTENT(New_Type, Extent, IError)
    CALL MPI_TYPE_SIZE(New_Type, Size, IError)
    IF (MYID.EQ.0) THEN
        WRITE(*,*) 'New_Type extents = ', Extent
        WRITE(*,*) 'New_Type size = ', Size
    END IF

    CALL MPI_GET_ADDRESS( Structs(1)%N, ElmOffset(1), IError )
    CALL MPI_GET_ADDRESS( Structs(2)%N, ElmOffset(2), IError )
    CALL MPI_GET_ADDRESS( Structs(3)%N, ElmOffset(3), IError )
    ElmOffset=ElmOffset - ElmOffset(1)

    IF (MYID.EQ.0) THEN
        WRITE(*,*) MYID,ElmOffset
    END IF

    CALL MPI_TYPE_CREATE_HINDEXED_BLOCK( 3, 1, ElmOffset, New_Type, Send_Type, IError )
    CALL MPI_TYPE_COMMIT( Send_Type, IError )

    CALL MPI_TYPE_GET_EXTENT( Send_Type, Extent, IError )
    CALL MPI_TYPE_SIZE( Send_Type, Size, IError )

    IF (MYID.EQ.0) THEN
        WRITE(*,*) 'Send_Type extents = ', Extent
        WRITE(*,*) 'Send_Type size = ', Size
    END IF


    IF (MYID.EQ.0) THEN
        DO I=1,(NUMPROCS-1)
            CALL MPI_SEND( Structs(1)%N, 1, Send_Type, I, 0, MPI_COMM_WORLD, IError)
        END DO       
    ELSE
        CALL MPI_RECV( Structs(1)%N, 1, Send_Type, 0, 0, MPI_COMM_WORLD, Status, IError)

    END IF

    WRITE( (MYID+10),*) Structs(1)%N, Structs(1)%A
    WRITE( (MYID+10),*) Structs(1)%B(1), Structs(1)%B(2)

    WRITE( (MYID+100),*) Structs(3)%N, Structs(3)%A
    WRITE( (MYID+100),*) Structs(3)%B(1), Structs(3)%B(2)

    CALL MPI_TYPE_FREE(Send_Type,IError)
    CALL MPI_TYPE_FREE(New_Type,IError)
    CALL MPI_FINALIZE(IError)

END PROGRAM MAIN

I changed REAL :: A to REAL*8 :: A (and likewise for B) to remove a warning about double-to-float conversion on the line Structs(I)%A = 250.0_8 + MYID*1.0. As noted by Hristo Iliev, this is consistent with the derived datatype, which uses MPI_DOUBLE_PRECISION.
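
In other words, the Fortran component declarations and the MPI datatypes handed to MPI_TYPE_CREATE_STRUCT must describe the same representation. A minimal sketch of the two consistent pairings (either works, as long as they are not mixed; REAL*8 here assumes the usual 8-byte double precision):

! Pairing used in the corrected listing above:
TYPE Struct
    INTEGER :: N        ! described as MPI_INTEGER
    REAL*8  :: A        ! described as MPI_DOUBLE_PRECISION
    REAL*8  :: B(2)     ! described as MPI_DOUBLE_PRECISION
END TYPE Struct

! Alternative: keep REAL :: A and REAL :: B(2) and describe them as MPI_REAL,
! which is what the second answer below does.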



Answer 2:

The proper way to implement what you want is as follows.

1) Create a structured datatype that represents one record.

CALL MPI_GET_ADDRESS(Structs(1)%N,    POS_(1), IError)
CALL MPI_GET_ADDRESS(Structs(1)%A,    POS_(2), IError)
CALL MPI_GET_ADDRESS(Structs(1)%B(1), POS_(3), IError)
Offsets = POS_ - POS_(1)

Types(1) = MPI_INTEGER
Types(2) = MPI_REAL
Types(3) = MPI_REAL

Blocks(1) = 1
Blocks(2) = 1
Blocks(3) = 2

CALL MPI_TYPE_CREATE_STRUCT(3, Blocks, Offsets, Types, Elem_Type, IError)

This datatype can now be used to send one record of that structure:

CALL MPI_TYPE_COMMIT(Elem_Type, IError)
CALL MPI_SEND(Structs(1), 1, Elem_Type, ...)

2) To send more than one record, first resize the new datatype (force its extent to a certain size) to match the true size of the structure. This is done to account for any padding that the compiler might insert at the end of the record.

CALL MPI_TYPE_GET_EXTENT(Elem_Type, Lb, Extent, IError)
CALL MPI_GET_ADDRESS(Structs(1)%N, POS_(1), IError)
CALL MPI_GET_ADDRESS(Structs(2)%N, POS_(2), IError)
Extent = POS_(2) - POS_(1)
CALL MPI_TYPE_CREATE_RESIZED(Elem_Type, Lb, Extent, ElemSized_Type, IError)

3) You can now use the new datatype to send multiple records of the structure:

CALL MPI_TYPE_COMMIT(ElemSized_Type, IError)
CALL MPI_SEND(Structs(1), 3, ElemSized_Type, ...)

Alternatively, you can create a contiguous datatype that covers three elements at once:

CALL MPI_TYPE_CONTIGUOUS(3, ElemSized_Type, BunchOfElements_Type, IError)
CALL MPI_TYPE_COMMIT(BunchOfElements_Type, IError)
CALL MPI_SEND(Structs(1), 1, BunchOfElements_Type, ...)

Note: It is not necessary to commit datatypes that are not used in communication or I/O operations.
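
Applied to the scheme above, that means only the type actually passed to the communication call has to be committed; the intermediate building blocks can even be freed right away, since freeing a datatype does not affect datatypes constructed from it. A minimal sketch (using the names defined in the steps above):

CALL MPI_TYPE_CONTIGUOUS(3, ElemSized_Type, BunchOfElements_Type, IError)
CALL MPI_TYPE_COMMIT(BunchOfElements_Type, IError)  ! only this type is used in MPI_SEND
CALL MPI_TYPE_FREE(Elem_Type, IError)               ! building blocks may be freed immediately
CALL MPI_TYPE_FREE(ElemSized_Type, IError)
CALL MPI_SEND(Structs(1), 1, BunchOfElements_Type, ...)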