Debugging

Note

Also see Profiling.

Debugging is one of the most fundamental parts of working with software: a debugger lets you look inside a running program, which is essential for any serious development. Any reasonable programming language gets a debugger as one of the first tools built for it.

Serial code debugging

GDB is the usual GNU debugger.

Note: the latest versions of gcc/gfortran available through modules require the -gdwarf-2 option along with -g to work with the default gdb command. The default version 4.4 works normally with just -g.
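As a minimal sketch of the rule in this note, the helper below picks the debug flags from a compiler version string. The function name debug_flags is hypothetical, and treating every version other than the 4.4 series as needing -gdwarf-2 is a simplifying assumption.

```shell
# Hypothetical helper: choose debug flags for a given gcc/gfortran version.
# Assumption: only the default 4.4 series works with plain -g; any other
# version gets -gdwarf-2 as well so that the default gdb can read the
# debug information.
debug_flags() {
    case "$1" in
        4.4|4.4.*) echo "-g" ;;
        *)         echo "-g -gdwarf-2" ;;
    esac
}
```

It could then be used as gcc $(debug_flags "$(gcc -dumpversion)") -O0 -o my_app my_app.c followed by gdb ./my_app, where my_app.c is a placeholder for your own source file.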

Valgrind is another tool that helps you to debug and profile your serial code on Triton.
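A typical memory-check run might look like the sketch below. The wrapper name run_memcheck and the program name ./my_app are hypothetical; the options are standard Valgrind Memcheck options, but the Valgrind version installed on Triton is not stated here.

```shell
# Hypothetical wrapper around Valgrind's default tool (Memcheck).
#   --leak-check=full    report each leaked block with its allocation stack
#   --track-origins=yes  show where uninitialised values came from
#   --error-exitcode=1   exit with status 1 if Valgrind reported any errors
run_memcheck() {
    valgrind --leak-check=full --track-origins=yes --error-exitcode=1 "$@"
}
```

Compile with -g so that the reports include file names and line numbers, then run, for example, run_memcheck ./my_app.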

MPI debugging & profiling

GDB with the MPI code

Compile your MPI app with -g, then run GDB for every MPI rank with:

salloc -p play --nodes 1 --ntasks 4 srun xterm -e gdb mpi_app

You should get 4 xterm windows, one per rank. From then on you have full control of your MPI app with the serial debugger.
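When no X display is available for xterm, a non-interactive variant is sometimes enough. The sketch below is an assumption, not part of the recipe above: it runs every rank under gdb in batch mode and prints a backtrace when a rank stops, for example on a crash. The function name mpi_backtrace is hypothetical; the play partition is taken from the command above.

```shell
# Hypothetical helper: run each MPI rank under gdb in batch mode.
#   $1      number of MPI tasks
#   $2...   the executable and its arguments
mpi_backtrace() {
    ntasks=$1
    shift
    # --label prefixes every output line with the task number,
    # so the backtraces of different ranks can be told apart.
    srun --partition play --nodes 1 --ntasks "$ntasks" --label \
        gdb -batch -ex run -ex bt --args "$@"
}
```

For example, mpi_backtrace 4 ./mpi_app starts four ranks. A rank that exits normally prints no useful backtrace, so this is mainly useful for crashes.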

PADB

A parallel debugging tool. It works on top of SLURM and supports only OpenMPI and MPICH (as of June 2015); MVAPICH2 is not supported. It does not require recompiling your code: run your MPI code normally, then launch padb separately to analyze its behavior.

Usage summary (for the full list of options and explanations, please consult http://padb.pittman.org.uk/):

# assume you have your openmpi module loaded already
module load padb
padb --create-secret-file    # for the very first time only

# Show all your current active jobs in the SLURM queue
padb --show-jobs

# Target a specific jobid and report its process state
padb <jobid> --proc-summary
# or, for all running jobs
padb --all --proc-summary

# Target a specific jobid, and report its MPI message queue, stack traceback, etc.
padb --full-report=<jobid>

# Target a specific jobid, and report its stack trace for a given MPI process (rank)
padb <jobid> --stack-trace --tree --rank <rank>

# Target a specific jobid, and report its stack trace including information about parameters and local variables for a given MPI process (rank)
padb <jobid> --stack-trace --tree --rank <rank> -Ostack-shows-locals=1 -Ostack-shows-params=1

# Target a specific jobid and report its MPI message queues
padb <jobid> --mpi-queue

# Target a specific jobid and report its MPI process progress (queries repeatedly in a loop)
padb <jobid> --mpi-watch --watch -Owatch-clears-screen=no
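Several of these reports can be collected in one go. The sketch below is a hypothetical helper, not part of padb: it assumes padb is already on your PATH (module load padb) and that the job ID can be given as a positional argument next to the mode options. The function name padb_snapshot and the output file name are illustrative.

```shell
# Hypothetical helper: save a one-off snapshot of a running job to a file.
#   $1  SLURM job ID
padb_snapshot() {
    jobid=$1
    out="padb-$jobid.txt"
    {
        echo "== process summary =="
        padb "$jobid" --proc-summary
        echo "== stack traces =="
        padb "$jobid" --stack-trace --tree
        echo "== MPI message queues =="
        padb "$jobid" --mpi-queue
    } > "$out" 2>&1
    # Print the name of the file that was written.
    echo "$out"
}
```

Calling padb_snapshot with a job ID from padb --show-jobs writes the three reports to padb-<jobid>.txt in the current directory, which is convenient for comparing the state of a hung job at two points in time.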