- An Overview of Debugging Python
- Michael Hudson's backtrace.py tool
An Overview of Debugging Python
Some types of bugs are difficult to debug from within Python itself. These include:
- segfaults (as opposed to uncaught Python exceptions)
- hung processes (in cases where you can't get a Python traceback or debug with pdb)
- out-of-control daemon processes
In these cases, C-level debugging with gdb can be helpful; in some cases it may be the only way to find out what is going on. To gather the information, the following steps need to be performed:
- get a Python interpreter with debugging symbols
- install the Python-specific GDB macros
- run the program under GDB, or attach to the already-running process
- obtain a backtrace
Even if the information obtained doesn't make sense to you, it may help someone else track down the problem. If you are trying to track down an intermittent problem, perform the first two steps right away and the remaining steps when the problem occurs.
Ubuntu Dapper provides detached debugging symbols in the python2.4-dbg package:
sudo apt-get install python2.4-dbg
A set of GDB macros is distributed with Python to aid in debugging the Python process. You can install them by copying this script to ~/.gdbinit (or, if the file already exists, by appending to it).
Note that the new GDB commands this file adds will only work correctly if debugging symbols are available.
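As a concrete sketch of that installation step: the macros live in Misc/gdbinit in the Python source tree, but the PYSRC path below is only an assumption; point it at wherever your source is unpacked.

```shell
# Sketch: append Python's bundled GDB macros (Misc/gdbinit from the
# Python source tree) to ~/.gdbinit.
# PYSRC is an assumption -- adjust it to your own source tree location.
PYSRC=/usr/src/python2.4
if [ -f "$PYSRC/Misc/gdbinit" ]; then
    cat "$PYSRC/Misc/gdbinit" >> ~/.gdbinit
else
    echo "Misc/gdbinit not found under $PYSRC" >&2
fi
```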
Attaching GDB To Python
There are two ways to attach gdb to a Python process:
- run the program under gdb from the start and wait for the problem
- attach to the already-running Python process
To run under gdb from the start, run the following commands:
$ gdb python
...
(gdb) run <programname>.py <arguments>
This will run the program until it exits, segfaults, or you stop execution manually (with Ctrl+C).
If the process is already running, you can attach to it provided you know the process ID.
$ gdb python <pid of running process>
Attaching to a running process like this will cause it to stop. You can tell it to continue running with the cont command.
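A typical attach session might look like this (an illustrative transcript, not real output; the PID is made up):

```
$ gdb python 12345
...
(gdb) cont        # let the process keep running
^C                # interrupt it again when you want a prompt
(gdb) bt          # get a backtrace
(gdb) detach      # release the process so it resumes normally
(gdb) quit
```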
Getting a Stack Trace
If you are debugging a segfault, this is probably the first thing you want to do.
At the (gdb) prompt, just run the following command:
(gdb) bt
#0  0x0000002a95b3b705 in raise () from /lib/libc.so.6
#1  0x0000002a95b3ce8e in abort () from /lib/libc.so.6
#2  0x00000000004c164f in posix_abort (self=0x0, noargs=0x0)
    at ../Modules/posixmodule.c:7158
#3  0x0000000000489fac in call_function (pp_stack=0x7fbffff110, oparg=0)
    at ../Python/ceval.c:3531
#4  0x0000000000485fc2 in PyEval_EvalFrame (f=0x66ccd8)
    at ../Python/ceval.c:2163
...
With luck, this will give some idea of where the problem is occurring. Even if it doesn't help you fix the problem yourself, it can help someone else track it down.
The quality of the results will depend greatly on the amount of debug information available.
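If you want to practice these steps on a known crash first, you can force one deliberately. This transcript is only a sketch: os.abort() raises SIGABRT, producing a backtrace like the one above.

```
$ gdb --args python -c 'import os; os.abort()'
(gdb) run
...
Program received signal SIGABRT, Aborted.
(gdb) bt
```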
Getting a Stack Trace with Multiple Threads (such as Launchpad uses)
When working with multiple threads, as we do in both Launchpad and the LP test suite, you need to apply the backtrace (bt) command to all threads in the process. If you do not, you will only see the main thread, which is of little value.
Here is how you get a traceback of all running threads:
(gdb) thread apply all backtrace

Thread 2 (Thread 0x7faf0de0f710 (LWP 2251)):
#0  0x00007faf24a5147d in read () from /lib/libc.so.6
#1  0x00007faf249ec348 in _IO_file_underflow () from /lib/libc.so.6
#2  0x00007faf249edeee in _IO_default_uflow () from /lib/libc.so.6
...

Thread 1 (Thread 0x7faf25799700 (LWP 1800)):
#0  0x00007faf24a57fb3 in select () from /lib/libc.so.6
#1  0x00007faf247754a9 in floatsleep (self=<value optimised out>, args=<value optimised out>)
    at /build/buildd/python2.5-2.5.5/Modules/timemodule.c:910
#2  time_sleep (self=<value optimised out>, args=<value optimised out>)
    at /build/buildd/python2.5-2.5.5/Modules/timemodule.c:206
Working With Hung Processes
If a process appears hung, it will either be waiting on something (a lock, IO, etc.) or be stuck in a busy loop somewhere. In either case, attaching to the process and getting a backtrace can help.
If the process is in a busy loop, you may want to let execution continue for a bit (using the cont command), then interrupt it again (Ctrl+C) and bring up a stack trace.
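For a suspected busy loop, a session might go like this (illustrative; repeating the cycle a few times is the point):

```
(gdb) cont        # let the looping process run for a few seconds
^C                # interrupt it
(gdb) bt          # where is it now?
(gdb) cont        # repeat a few times; if the backtraces all look
^C                # alike, you have probably found the loop
(gdb) bt
```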
Getting Python Stack Traces From GDB
NOTE: Initial reports say these macros just hang. If you know the conditions that will get them to work, please let us know! -- mars 2010-04-27 13:52:27
At the gdb prompt, you can get a Python-level stack trace with the pystack macro:
(gdb) pystack
Alternatively, the pystackv macro prints the Python locals along with each stack frame:
(gdb) pystackv
Michael Hudson's backtrace.py tool
Michael Hudson wrote a useful little tool for pulling a Python stack trace from a running Python process, similar to the pystack GDB Macro mentioned above. This is useful to see where exactly a Python program died or hung. (See this thread for a live use case.)
Using the tool
To use the tool:
- Make sure you have the python2.X-dbg package installed on the system (X is the Python version of the running process). You can install this even while your Python program is running or hung. --mars
- Get Michael's pygdb branch locally: bzr branch lp:pygdb, for instance.
- cd into that branch. (Alternatively, make the pygdb package in that branch available to your Python in whatever other way you want.)
- Run python backtrace.py $pid to look at a process, or python backtrace.py -c $core to look at a core dump.
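Put together, a session against a hung Python 2.5 process might look like this (illustrative; the PID and Python version are examples):

```
$ sudo apt-get install python2.5-dbg
$ bzr branch lp:pygdb
$ cd pygdb
$ python backtrace.py 14620
```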
Some sample output with interpretive comments
Here is some sample output from a hung test suite, with useful comments for interpreting the results mixed in (thanks to Max Bowsher for those):
ec2test@ip-10-195-162-31:~/pygdb$ python backtrace.py 14620

Thread 3
#0  0x00002b8523165dc2 in select () from None
#1  0x00002b8527a402c3 in select_select (self=<value optimized out>, args=<value optimized out>)
    from /build/buildd/python2.5-2.5.2/Modules/selectmodule.c
/usr/lib/python2.5/asyncore.py (104): poll
/usr/lib/python2.5/asyncore.py (181): loop
/var/launchpad/tmp/eggs/lazr.smtptest-1.1-py2.5.egg/lazr/smtptest/server.py (107): start
/usr/lib/python2.5/threading.py (445): run
/usr/lib/python2.5/threading.py (469): __bootstrap_inner
/usr/lib/python2.5/threading.py (461): __bootstrap

### maxb says: The lazr.smtptest thread is a daemon thread AFAIK, so should be
### ignorable for the purposes of this debugging.

Thread 2
#0  0x00002b85227fd7fb in accept () from None
#1  0x00002b852388f947 in sock_accept (s=0x94409c0)
    from /build/buildd/python2.5-2.5.2/Modules/socketmodule.c
/usr/lib/python2.5/socket.py (167): accept
/usr/lib/python2.5/SocketServer.py (374): get_request
/usr/lib/python2.5/SocketServer.py (216): handle_request
/var/launchpad/tmp/eggs/windmill-1.3beta3_lp_r1440-py2.5.egg/windmill/server/https.py (394): start
/usr/lib/python2.5/threading.py (445): run
/usr/lib/python2.5/threading.py (469): __bootstrap_inner
/usr/lib/python2.5/threading.py (461): __bootstrap

### maxb says: This must be the culprit of the hang, it appears similar to one
### I've been looking at for the Python 2.6 migration. Whatever was supposed
### to knock this thread out of its accept loop, hasn't.

Thread 1
#0  0x00002b85227fc991 in sem_wait () from None
#1  0x00000000004b371d in PyThread_acquire_lock (lock=0xc220e90, waitflag=1)
    from ../Python/thread_pthread.h
#2  0x00000000004b68d0 in lock_PyThread_acquire_lock (self=0x11de38d0, args=<value optimized out>)
    from ../Modules/threadmodule.c
/usr/lib/python2.5/threading.py (208): wait
/usr/lib/python2.5/threading.py (580): join
/usr/lib/python2.5/threading.py (682): _exitfunc

### maxb says: This is the main thread calling threading._shutdown to wait for
### non-daemon non-main threads to exit. We can ignore it.