If I have an MPI program that I want to debug with gdb while being able to see all of the separate processes' outputs, I can use:
mpirun -n <NP> xterm -hold -e gdb -ex run --args ./program [arg1] [arg2] [...]
which is well and good when I have a GUI to play with. But that is not always the case.
Is there a similar setup I can use with screen such that each process gets its own window? This would be useful for debugging in a remote environment, since it would let me flip between outputs using Ctrl+a n.
I think this answer in the "How do I debug an MPI program?" thread does what you want.
EDITS:
In response to the comment, you can do it somewhat more easily, although succinct isn't exactly the term I would use:
Launch a detached screen via mpirun, running your debugger and process. I've called the session mpi, and I'm passing through my library path because it gets stripped by screen and my demo needs it (also, I'm on a Mac, hence lldb and DYLD):
mpirun -np 4 screen -AdmS mpi env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out
Then launch a separate screen session, which I've called 'debug':
screen -AdmS debug
Use screen -ls to list the running sessions:
>> screen -ls
There are screens on:
19871.mpi (Detached)
19872.mpi (Detached)
19875.mpi (Detached)
19876.mpi (Detached)
20105.debug (Detached)
Now launch 4 new tabs in the debug session, attaching each to one of the mpi sessions:
screen -S debug -X screen -t tab0 screen -r 19871.mpi
screen -S debug -X screen -t tab1 screen -r 19872.mpi
screen -S debug -X screen -t tab2 screen -r 19875.mpi
screen -S debug -X screen -t tab3 screen -r 19876.mpi
Then simply attach to your debug session with screen -r debug. Now you have 4 tabs, each running a serial instance of the debugger attached to an MPI process, similar to the xterm method you described. It's not exactly the quickest set of commands, but at least you don't need to modify your code or chase PIDs.
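The four attach commands above all follow one pattern, so they can be generated from the screen -ls listing instead of being typed by hand. A minimal sketch, assuming the session names used above (mpi and debug); the function name attach_commands is made up for illustration, and it only prints the commands so they can be checked before anything is run:

```shell
#!/bin/sh
# Hypothetical helper: read `screen -ls` output on stdin and print one
# attach command per session whose id ends in ".<worker>".
# $1 = worker session name (e.g. mpi), $2 = viewer session name (e.g. debug)
attach_commands() {
    awk -v suffix=".$1" -v viewer="$2" '
        # Session lines look like "  19871.mpi  (Detached)"; field 1 is the session id.
        $1 ~ /^[0-9]+\./ && substr($1, length($1) - length(suffix) + 1) == suffix {
            printf "screen -S %s -X screen -t tab%d screen -r %s\n", viewer, count++, $1
        }'
}
```

Once the printed commands look right, they can be executed with: screen -ls | attach_commands mpi debug | sh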
Another method I tried, which doesn't seem to work:
Launch a detached screen:
screen -AdmS ashell
Launch two mpi processes that start new screen tabs in the detached session, launching lldb with my demo mpi application:
mpirun -np 1 screen -S ashell -X screen -t tab1 env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out : -np 1 screen -S ashell -X screen -t tab2 env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out
Or alternatively just
mpirun -np 2 screen -S ashell -X screen env DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH lldb demo.out
Then attach to screen with
screen -r ashell
And you'll have 3 tabs: 2 of them running lldb with your program, and one with whatever your standard shell is. Unfortunately, when you try running the programs, each process thinks it's the only one in the comm world, and I'm not sure what to do about that...
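A possible explanation, not confirmed in this thread: screen -X only sends a command to the already-running ashell session, so the debugger is started by that session rather than by mpirun and may not inherit the environment the launcher sets up for each rank. A small check like the one below can show whether a given tab sees that environment. It is a sketch under assumptions: Open MPI exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE, MPICH-style launchers export PMI_RANK and PMI_SIZE, and other launchers may use different names. The function name mpi_launch_info is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether this shell sees MPI launcher variables.
# Assumption: Open MPI sets OMPI_COMM_WORLD_RANK/OMPI_COMM_WORLD_SIZE and
# MPICH-style launchers set PMI_RANK/PMI_SIZE; other launchers may differ.
mpi_launch_info() {
    if [ -n "${OMPI_COMM_WORLD_RANK:-}" ]; then
        echo "rank ${OMPI_COMM_WORLD_RANK} of ${OMPI_COMM_WORLD_SIZE:-?} (Open MPI variables)"
    elif [ -n "${PMI_RANK:-}" ]; then
        echo "rank ${PMI_RANK} of ${PMI_SIZE:-?} (PMI variables)"
    else
        echo "no MPI launcher variables found"
    fi
}
```

Calling it from a script started directly by mpirun and again from a tab created with screen -X lets you compare the two environments.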
How do you debug a C/C++ MPI program?
One way is to start a separate terminal and gdb session for each of the processes:
mpirun -n <NP> xterm -hold -e gdb -ex run --args ./program [arg1] [arg2] [...]
where NP is the number of processes.
What if you don't have a GUI handy?
(See below for a handy script.)
This is based on timofiend's answer here.
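Spin up the MPI program in its debugger in a number of screen sessions (--args makes gdb pass the trailing arguments to the program instead of treating them as its own):
mpirun -np 4 screen -AdmS mpi gdb --args ./parallel_pit_fill.exe one retain ./beauford.tif 500 500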
Spin up a new screen session to access the debugger:
screen -AdmS debug
Load the debugger's screen sessions into the new screen session:
screen -list | #Get list of screen sessions
grep -E "[0-9]+\.mpi" | #Extract the relevant ones (dot escaped so it matches literally)
awk '{print NR-1,$1}' | #Generate tab #s and session ids, drop rest of the string
xargs -n 2 sh -c '
screen -S debug -X screen -t tab$0 screen -r $1
'
Jump into the new screen session:
screen -r debug
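The detached mpi sessions stay around until each debugger exits, so some cleanup may be needed after a debugging run. A minimal sketch, assuming the sessions are still named mpi as above; the function name quit_commands is made up for illustration, and it only prints the screen -X quit commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: read `screen -list` output on stdin and print a
# quit command for every session whose id is "<pid>.mpi".
quit_commands() {
    awk '$1 ~ /^[0-9]+\.mpi$/ { print "screen -S " $1 " -X quit" }'
}
```

To actually close the sessions: screen -list | quit_commands | sh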
I've encapsulated the above in a handy script:
#!/bin/bash
if [ $# -lt 2 ]
then
echo "Parallel Debugger Syntax: $0 <NP> <PROGRAM> [arg1] [arg2] [...]"
exit 1
fi
the_time=$(date +%s) #Use this so we can run multiple debugging sessions at once
#(assumes we are only starting one per second)
#The first argument is the number of processes. Everything else is what we want
#to run. Make a new mpi screen for each process.
mpirun -np "$1" screen -AdmS "${the_time}.mpi" gdb --args "${@:2}"
#Create a new screen for debugging from
screen -AdmS ${the_time}.debug
#The following are used for loading the debuggers into the debugging screen
firstpart="screen -S ${the_time}.debug"
secondpart=' -X screen -t tab$0 screen -r $1'
screen -list | #Get list of mpi screens
grep -E "[0-9]+\.${the_time}\.mpi" | #Extract the relevant ones
awk '{print NR-1,$1}' | #Generate tab #s and session ids, drop rest of the string
xargs -n 2 sh -c "$firstpart$secondpart"
screen -r ${the_time}.debug #Enter debugging screen
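The script takes it on trust that one mpi session per process shows up in the listing. A sanity check before attaching could compare the number of matching sessions against the requested process count. This is a sketch only: the function name have_sessions is made up for illustration, and it reads the screen -list output from stdin so it can be tried without screen running.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the `screen -list` output on stdin
# contains exactly $1 sessions whose id ends in ".$2".
have_sessions() {
    found=$(awk -v suffix=".$2" '
        $1 ~ /^[0-9]+\./ && substr($1, length($1) - length(suffix) + 1) == suffix { n++ }
        END { print n + 0 }')
    [ "$found" -eq "$1" ]
}
```

Inside the script above it could be used as: screen -list | have_sessions "$1" "${the_time}.mpi" || echo "unexpected number of mpi sessions" >&2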