Use gdb to debug MPI in multiple screen windows

2020-06-23 09:58发布

问题:

If I have an MPI program that I want to debug with gdb while being able to see all of the separate processes' outputs, I can use:

mpirun -n <NP> xterm -hold -e gdb -ex run --args ./program [arg1] [arg2] [...]

which is well and good when I have a GUI to play with. But that is not always the case.

Is there a similar set up I can use with screen such that each process gets its own window? This would be useful for debugging in a remote environment since it would allow me to flip between outputs using Ctrl+a n.

回答1:

I think this answer in the "How do I debug an MPI program?" thread does what you want.


EDITS:

In response to the comment, you can do it somewhat more easily, although succinct isnt exactly the term I would use:

Launch a detached screen via mpirun - running your debugger and process. I've called the session mpi, and im passing through my library path because it gets stripped by screen and my demo needs it (also I'm on a mac, hence lldb and DYLD):

mpirun -np 4 screen -AdmS mpi env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out

Then launch a seperate screen session, which i've called 'debug':

screen -AdmS debug

Use screen -ls to list the running sessions:

>> screen -ls

There are screens on:

19871.mpi (Detached)

19872.mpi (Detached)

19875.mpi (Detached)

19876.mpi (Detached)

20105.debug (Detached)

Now launch 4 new tabs in the debug session, attaching each to one of the mpi sessions:

screen -S debug -X screen -t tab0 screen -r 19871.mpi
screen -S debug -X screen -t tab1 screen -r 19872.mpi
screen -S debug -X screen -t tab2 screen -r 19875.mpi
screen -S debug -X screen -t tab3 screen -r 19876.mpi

Then simply attach to your debug session with screen -r debug. Now you have 4 tabs, each running a serial instance of the debugger attached to an mpi process similarly to the xterm method you described before. Its not exactly the quickest set of commands, but at least you dont need to modify your code or chase PIDs etc.


Another method I tried, but doesnt seem to work:

Launch a detached screen

screen -AdmS ashell

Launch two mpi processes that start new screen tabs in the detached session, launching lldb with my demo mpi application:

mpirun -np 1 screen -S ashell -X screen -t tab1 env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out : -np 1 screen -S ashell -X screen -t tab2 env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out

Or alternatively just

mpirun -np 2 screen -S ashell -X screen env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out

Then attach to screen with

screen -r ashell 

And you'll have 3 tabs, 2 of them running lldb with your program, and one with whatever your standard shell is. Unfortunately when you try running the programs, each process thinks its the only one in the comm world, and im not sure what to do about that...



回答2:

How do you debug a C/C++ MPI program?

One way is to start a separate terminal and gdb session for each of the processes:

mpirun -n <NP> xterm -hold -e gdb -ex run --args ./program [arg1] [arg2] [...]

where NP is the number of processes.

What if you don't have a GUI handy?

(See below for a handy script.)

This is based on timofiend's answer here.

Spin up the mpi program in its debugger in a number of screen sessions:

mpirun -np 4 screen -AdmS mpi gdb ./parallel_pit_fill.exe one retain ./beauford.tif 500 500

Spin up a new screen session to access the debugger:

screen -AdmS debug

Load the debugger's screen sessions in to the new screen session

screen -list |                   #Get list of screen sessions
   grep -E "[0-9]+.mpi"  |       #Extract the relevant ones
   awk '{print NR-1,$1}' |       #Generate tab #s and session ids, drop rest of the string
   xargs -n 2 sh -c '
     screen -S debug -X screen -t tab$0 screen -r $1
   '

Jump into the new screen session:

screen -r debug

I've encapsulated the above in a handy script:

#!/bin/bash
if [ $# -lt 2 ]
then
  echo "Parallel Debugger Syntax: $0 <NP> <PROGRAM> [arg1] [arg2] [...]"
  exit 1
fi

the_time=`date +%s` #Use this so we can run multiple debugging sessions at once
                    #(assumes we are only starting one per second)

#The first argument is the number of processes. Everything else is what we want
#to run. Make a new mpi screen for each process.
mpirun -np $1 screen -AdmS ${the_time}.mpi gdb "${@:2}"

#Create a new screen for debugging from
screen -AdmS ${the_time}.debug

#The following are used for loading the debuggers into the debugging screen
firstpart="screen -S ${the_time}.debug"
secondpart=' -X screen -t tab$0 screen -r $1'

screen -list |                         #Get list of mpi screens
   grep -E "[0-9]+.${the_time}.mpi"  | #Extract the relevant ones
   awk '{print NR-1,$1}' |             #Generate tab #s and session ids, drop rest of the string
   xargs -n 2 sh -c "$firstpart$secondpart"

screen -r ${the_time}.debug            #Enter debugging screen