How do I debug an MPI program?

Posted 2020-01-24 03:39

I have an MPI program which compiles and runs, but I would like to step through it to make sure nothing bizarre is happening. Ideally, I would like a simple way to attach GDB to any particular process, but I'm not really sure whether that's possible or how to do it. An alternative would be having each process write debug output to a separate log file, but this doesn't really give the same freedom as a debugger.

Are there better approaches? How do you debug MPI programs?

Tags: debugging mpi
16 answers
Fickle 薄情
#2 · 2020-01-24 04:00

I use this little homebrewed method to attach a debugger to MPI processes: call the following function, DebugWait(), right after MPI_Init() in your code. While rank 0 waits for keyboard input and the other ranks block in MPI_Bcast(), you have all the time you need to attach the debugger to them and set breakpoints. When you are done, type a single character (followed by Enter) on rank 0's standard input and all ranks continue.

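#include <stdio.h>
#include <mpi.h>

/* Hold every rank until a character is typed on rank 0's stdin. */
static void DebugWait(int rank) {
    char a = 0;

    if (rank == 0) {
        /* Only rank 0 reads the keyboard; the other ranks block in MPI_Bcast. */
        scanf("%c", &a);
    }

    MPI_Bcast(&a, 1, MPI_BYTE, 0, MPI_COMM_WORLD);
    printf("%d: Starting now\n", rank);
    fflush(stdout);
}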

Of course you would want to compile this function for debug builds only.
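As a variation that does not need keyboard input, each process can print its PID and spin until a debugger releases it. The following is only a minimal sketch of that idea, not part of the function above: the name debug_wait_for_attach and the environment variable MPI_DEBUG_WAIT are hypothetical, chosen for illustration, and the runtime switch is an assumed alternative to compiling the wait out of release builds.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical switch; this name is not defined by any MPI implementation. */
#define DEBUG_WAIT_ENV "MPI_DEBUG_WAIT"

/* Returns 1 if the process paused for a debugger, 0 if it returned at once. */
static int debug_wait_for_attach(int rank) {
    const char *flag = getenv(DEBUG_WAIT_ENV);
    if (flag == NULL || flag[0] != '1') {
        return 0; /* switch not set to 1: do not pause */
    }

    char host[256];
    if (gethostname(host, sizeof host) != 0) {
        snprintf(host, sizeof host, "unknown-host");
    }
    host[sizeof host - 1] = '\0';

    printf("rank %d: PID %ld on %s waiting for debugger\n",
           rank, (long)getpid(), host);
    fflush(stdout);

    /* volatile so the loop re-reads the flag; clear it from the debugger. */
    volatile int holding = 1;
    while (holding) {
        sleep(1);
    }
    return 1;
}
```

Assuming it is called right after MPI_Init() with the rank from MPI_Comm_rank(), you would attach with gdb -p <PID> on the printed host, move up to the debug_wait_for_attach frame, run set var holding = 0, and then continue.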

时光不老,我们不散
#3 · 2020-01-24 04:01

If you are a tmux user, you will feel very comfortable using Benedikt Morbach's script: tmpi

Original source: https://github.com/moben/scripts/blob/master/tmpi

Fork: https://github.com/Azrael3000/tmpi

With it you get one pane per process, all synchronized: every command you type is sent to all panes (processes) at the same time, which saves a lot of time compared with the xterm -e approach. You can also inspect a variable in every process with a single print, without moving to another pane: each pane prints the variable's value for its own process.

If you are not a tmux user, I strongly recommend trying it.

来,给爷笑一个
#4 · 2020-01-24 04:05

Another solution is to run your code within SMPI, the simulated MPI. It is an open source project in which I'm involved. Every MPI rank is converted into a thread of the same UNIX process, so you can easily use gdb to step through the MPI ranks.

SMPI offers other advantages for studying MPI applications: clairvoyance (you can observe every part of the system), reproducibility (several runs lead to exactly the same behavior unless you specify otherwise), absence of heisenbugs (as the simulated platform is kept separate from the host one), etc.

For more information, see this presentation, or that related answer.
