Gathering strings with MPI_Gather openmpi c

Posted 2019-07-20 06:04

Question:

I want to generate a string in each process and then gather them all on one process. The string in each process is built by appending ints and chars.

I still can't gather everything correctly. I can print all the partial strings one by one, but if I try to print rcv_string, I get only one partial string, or sometimes a segmentation fault.

I've tried putting zeros at the end of the strings with memset, and reserving memory for the strings both dynamically and statically, but I can't find a way that works.

It would be great if someone could show how to initialize the strings and do the gather properly.

int main(int argc, char *argv[]) {

    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *string;        // ????????????
    char *rcv_string;    // ????????????

    if (rank == 0)  {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 1) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 2) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 3) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 4) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 5) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }

    MPI_Gather(string,???,MPI_CHAR,rcv_string,???,MPI_CHAR,0,MPI_COMM_WORLD);

    if (rank == 0) {
        printf("%s",rcv_string);
    }

    MPI_Finalize();
    return 0;
}

Answer 1:

I managed to reproduce the incorrect behavior where only one partial string is printed.

It is related to your usage of sprintf.

How does C handle char arrays?

When working with arrays in C, you must first allocate memory for them, either dynamically or statically. Suppose you allocate enough memory for 10 chars.

char my_string[10];

Without initializing it, it contains nonsense characters.

Let's pretend my_string contains "qwertyuiop".

Suppose you want to fill my_string with the string foo. You use sprintf.

sprintf(my_string, "foo");

How does C fill 10 slots with 3 characters?

It fills the first 3 slots with the 3 characters. Then it fills the 4th slot with an "end of string" character, the null terminator. It is written '\0' in source code and is stored as a single byte with value 0.

So, after your command, my_string contains "foo\0tyuiop". If you print my_string with printf and %s, printing stops at the \0, so the nonsense characters after it are never shown.
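To make the byte layout concrete, here is a minimal sketch in plain C (no MPI needed). The helper name overwrite_with_foo is made up for this illustration, and the "garbage" is placed in the buffer on purpose with memcpy so the result is predictable.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only.
 * Fills a 10-byte buffer with stand-in "garbage", then writes "foo" into it.
 * Returns the length that strlen() reports afterwards. */
size_t overwrite_with_foo(char buf[10]) {
    memcpy(buf, "qwertyuiop", 10);  /* pretend leftover contents, no '\0' */
    sprintf(buf, "foo");            /* writes 'f','o','o','\0' into buf[0..3] */
    return strlen(buf);             /* counts up to the first '\0' */
}
```

After the call, strlen reports 3, buf[3] is the terminator, and buf[4] through buf[9] still hold "tyuiop". Those leftover bytes are still part of the array, which matters once the whole array is sent somewhere.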

How does this relate to MPI_Gather?

MPI_Gather collects arrays from different processes, and puts them all into one array on one process.

If you had "foo\0tyuiop" on process 0 and "bar\0ghjkl;" on process 1, they get combined into "foo\0tyuiopbar\0ghjkl;".

As you can see, the array from process 1 appears after the "end of string" character from process 0. Functions such as printf stop at that first \0, so they never reach the characters from process 1.
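The effect can be reproduced without MPI. The sketch below only simulates the receive buffer layout with memcpy; fake_gather and CHUNK are names invented for this example and are not part of any MPI API.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 10  /* bytes contributed by each simulated process */

/* Copy two CHUNK-byte buffers back to back, the way the root's receive
 * buffer is laid out after a gather with a count of CHUNK per process.
 * This is a plain memcpy simulation, not an MPI call. */
void fake_gather(char out[2 * CHUNK], const char p0[CHUNK], const char p1[CHUNK]) {
    memcpy(out, p0, CHUNK);          /* bytes 0..9   come from "process 0" */
    memcpy(out + CHUNK, p1, CHUNK);  /* bytes 10..19 come from "process 1" */
}
```

Calling fake_gather(out, "foo\0tyuiop", "bar\0ghjkl;") leaves a buffer where strlen(out) is 3, so printing out with %s shows only "foo". The second string is intact, but it starts at out + CHUNK.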

A patchy solution

Rather than attempting to print all of rcv_string at once, acknowledge that there are "end of string" characters scattered throughout. Then, print out strings with different "start of string" positions, according to the process it came from.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {

  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  int part_str_len = 18;

  char *my_string;
  char *rcv_string;

  if ((my_string = malloc(part_str_len*sizeof(char))) == NULL){
    MPI_Abort(MPI_COMM_WORLD,1);
  }
  if ((rcv_string = malloc(part_str_len*size*sizeof(char))) == NULL){
    MPI_Abort(MPI_COMM_WORLD,1);
  }

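  // "255r255g255b0l\n" is 15 characters plus the terminating '\0',
  // which fits in part_str_len (18) bytes.
  sprintf(my_string, "%dr%dg%db%dl\n",255,255,255,0);

  // Every rank sends part_str_len chars; rank 0 receives them back to back.
  MPI_Gather(my_string,part_str_len,MPI_CHAR,rcv_string,part_str_len,MPI_CHAR,0,MPI_COMM_WORLD);

  if (rank == 0) {
    // Prints only the first partial string: printing stops at its '\0'.
    printf("%s",rcv_string);
  }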

  char *cat_string;
  if ((cat_string = malloc(part_str_len*size*sizeof(char))) == NULL){
    MPI_Abort(MPI_COMM_WORLD,1);
  }

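  if (rank == 0){
    int i;
    // Start with the string from rank 0, then append the string that
    // begins at each later rank's offset in rcv_string.
    sprintf(cat_string, "%s", rcv_string);
    for (i = 1; i < size; i++){
      strcat(cat_string, &rcv_string[part_str_len*i]);
    }
  }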

  if (rank == 0) {
    printf("%s",cat_string);
  }

  free(my_string);
  free(rcv_string);
  free(cat_string);

  MPI_Finalize();
  return 0;
}
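If you want to check the concatenation step on its own, it can be pulled out into a small helper that does not depend on MPI. This is only a sketch: join_chunks is a name made up here, and unlike the strcat loop above it uses memchr so that a chunk with no terminator cannot make the search run past its own bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only.
 * Join `nchunks` fixed-width chunks of `chunk_len` bytes each into one
 * NUL-terminated string. Each chunk holds a string that ends at its first
 * '\0', or fills the whole chunk if no '\0' is present.
 * `dst` must have room for nchunks * chunk_len + 1 bytes.
 * Returns the length of the joined string. */
size_t join_chunks(char *dst, const char *src, size_t nchunks, size_t chunk_len) {
    size_t out = 0;
    for (size_t i = 0; i < nchunks; i++) {
        const char *chunk = src + i * chunk_len;
        /* memchr bounds the search to this chunk, unlike strlen */
        const char *end = memchr(chunk, '\0', chunk_len);
        size_t n = end ? (size_t)(end - chunk) : chunk_len;
        memcpy(dst + out, chunk, n);
        out += n;
    }
    dst[out] = '\0';
    return out;
}
```

On rank 0 this could be called as join_chunks(cat_string, rcv_string, size, part_str_len), provided cat_string is allocated with one extra byte for the final terminator.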


Answer 2:

Try the following:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STR_LEN 100

int main(int argc, char *argv[]) {

    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char string[MAX_STR_LEN] = "some string";

    char *rcv_string = NULL;
    if (rank == 0) {
        // Only the master needs to allocate the memory
        // for the result string which needs to be large
        // enough to contain the input strings from `size`
        // peers.
        rcv_string = malloc(MAX_STR_LEN * size);
    }

    ...same code...

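    // The send and receive counts must match in MPI_Gather. Every rank
    // builds a string of the same length here, so len is the same everywhere.
    int len = (int)strlen(string);
    MPI_Gather(string, len, MPI_CHAR,
               rcv_string, len, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        // The gathered pieces are packed back to back without terminators,
        // so terminate the result before printing it.
        rcv_string[len * size] = '\0';
        printf("%s",rcv_string);
        free(rcv_string);
    }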

    MPI_Finalize();
    return 0;
}

Running this code with mpirun -n 5 ./a.out produces the following:

some string255r255g255b0l
some string255r255g255b0l
some string255r255g255b0l
some string255r255g255b0l
some string255r255g255b0l

Make sure to define MAX_STR_LEN so that it is big enough for your requirements. If the value grows too big, you may want to consider heap allocation (i.e. malloc).
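One way to pick a buffer size without guessing is to ask snprintf how long the formatted text would be. The sketch below assumes a C99 or later library; record_size is a name made up for this example, and the format string is the one from the question.

```c
#include <stdio.h>

/* Hypothetical helper for illustration only.
 * Number of bytes needed to hold one record, including the '\0'.
 * snprintf with a NULL buffer and size 0 writes nothing and returns the
 * length the formatted text would have (C99). Returns -1 on error. */
int record_size(int r, int g, int b, int l) {
    int n = snprintf(NULL, 0, "%dr%dg%db%dl\n", r, g, b, l);
    return n < 0 ? -1 : n + 1;
}
```

For the values in the question, record_size(255, 255, 255, 0) gives 16. Keep in mind that MPI_Gather needs the same count from every rank, so if the values (and therefore the lengths) differ between ranks, you would either size every buffer for the longest case or look at MPI_Gatherv, which accepts a separate count per rank.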



Tags: c mpi openmpi