I'm trying to create a "Hello, world!" application in (Open)MPI such that each process will print out in order.
My idea was to have the first process send a message to the second when it's finished, then the second to the third, etc.:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    // See:
    if (rank == 0) {
        // This is the first process.
        // Print out immediately.
        printf("Hello, World! I am rank %d of %d.\n", rank, size);
        if (size > 1) {
            // Tell the next process it may print.
            MPI_Send(&rank, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        }
    } else {
        // Wait until the previous one finishes.
        int receivedData;
        MPI_Recv(&receivedData, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Hello, World! I am rank %d of %d (message: %d).\n", rank, size, receivedData);
        if (rank + 1 < size) {
            // We're not the last one. Send out a message.
            MPI_Send(&rank, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        } else {
            printf("Hello world completed!\n");
        }
    }
    MPI_Finalize();
    return 0;
}
When I run this on an eight-core cluster, it runs perfectly every time. However, when I run it on a sixteen-core cluster, sometimes it works, and sometimes it outputs something like this:
Hello, world, I am rank 0 of 16.
Hello, world, I am rank 1 of 16 (message: 0).
Hello, world, I am rank 2 of 16 (message: 1).
Hello, world, I am rank 3 of 16 (message: 2).
Hello, world, I am rank 4 of 16 (message: 3).
Hello, world, I am rank 5 of 16 (message: 4).
Hello, world, I am rank 6 of 16 (message: 5).
Hello, world, I am rank 7 of 16 (message: 6).
Hello, world, I am rank 10 of 16 (message: 9).
Hello, world, I am rank 11 of 16 (message: 10).
Hello, world, I am rank 8 of 16 (message: 7).
Hello, world, I am rank 9 of 16 (message: 8).
Hello, world, I am rank 12 of 16 (message: 11).
Hello, world, I am rank 13 of 16 (message: 12).
Hello, world, I am rank 14 of 16 (message: 13).
Hello, world, I am rank 15 of 16 (message: 14).
Hello world completed!
That is, most of the output is in order, but some is out of place.
Why is this happening? How is it even possible? How can I fix it?
MPI codes are not guaranteed to complete in any specific order. This is especially true when running on multiple nodes, but still true even on one node.
Your sequential sends and receives do enforce the order in which the ranks call printf, but not the order in which the output reaches the screen. Each rank's output is forwarded from the application process through the MPI runtime to the launcher process (mpirun or mpiexec), which prints it. This forwarding can happen in any order and is interleaved with other communication, since it uses a completely different communication topology from your point-to-point messages.
If you really must ensure that the messages are printed in order, you have to make sure that the same MPI rank prints all of them.