Friday, September 7, 2007

Debugging

Debugging is a major part of any coding project, large or small. Many a student has stayed up until the wee hours of the morning trying to figure out why their program churns out the wrong output. The process can be a long, frustrating ordeal, but when the insight dawns or the bug finally reveals its source, it's like winning the lottery. Our backend code base is primarily C and C++, and since we're a Linux shop, we do most of our debugging with a tool called gdb ("The GNU Debugger"). We employ a number of techniques to help us track down bugs, and I thought I'd share a few with you.

Core files are very important. A core file (or core dump) is basically a snapshot of the program's state at the moment it crashed. When you compile your program, you can also include debugging symbols (the -g flag in gcc and g++), which give you function names and variable names to work with when poring through the core file. We keep track of when core files are produced and how often, so we can track down the most common bugs. To see the contents of a core file in gdb, pass in both the original binary and the core file, e.g. gdb path/to/binary corefile, and then you can go about tracing through the snapshot.
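As a rough illustration of getting a core file in the first place, here is a minimal sketch in C. It assumes a Linux system where the soft core-size limit may be set to 0 by default, so no core file is written. The names enable_core_dumps and crash_now are made up for this example; they are not from our code base.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit so that a crash is allowed to produce a core file.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit);
}

/* Hypothetical stand-in for a real bug: abort() raises SIGABRT, which
 * by default terminates the process and triggers a core dump. */
void crash_now(void)
{
    abort();
}
```

If main calls enable_core_dumps() and then crash_now(), and the program was built with -g, the resulting core file can be opened with gdb path/to/binary corefile as described above. Where the core file ends up (and whether it is written at all) depends on the system's configuration, so treat this as a starting point rather than a guarantee.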

Sometimes you may want to debug a program while it's running, rather than after it has crashed. In that case you can employ different techniques.

To get a deeper look at the program's execution, you can use gdb to "attach" to the running program, e.g. gdb path/to/binary 1234 if 1234 is the process ID (you can find this out by using the ps command). Attaching to a running process allows you to set breakpoints, look at backtraces, and do all the normal things you would do if you had started the program under the debugger. Be careful, though: gdb stops the program when you attach, so you'll have to use "continue" to resume execution, or "detach" (or simply quit gdb) to let the program run on its own again. This technique is really only useful if you have debugging information compiled into your program (otherwise you'll see a lot of ?? where you'd expect to see function names =p), so just keep that in mind.
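If you want something safe to practice attaching on, a small long-running program helps. The sketch below is only an illustration: run_worker is a hypothetical function, and the sleep call stands in for whatever real work your program does.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical worker loop: prints the process ID once, then "works"
 * (here it just sleeps one second per iteration) so there is time to
 * attach a debugger. Returns the number of iterations completed. */
long run_worker(long iterations)
{
    long completed = 0;
    long i;

    printf("worker running with pid %ld\n", (long)getpid());
    fflush(stdout);   /* make sure the pid is visible before we start looping */

    for (i = 0; i < iterations; i++) {
        sleep(1);     /* stand-in for real work */
        completed++;
    }
    return completed;
}
```

Call run_worker(600) from main, build with -g, start the program, and attach from another terminal using the pid it prints. A breakpoint on run_worker or a backtrace should then show the function name and the current values of i and completed, assuming the debugging information is present.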

Another tool you can use is called strace. strace lets you see the system calls (requests made to the operating system) your program is making. This is useful when, say, your program seems to be taking up too much CPU or memory. strace can attach to a running program (strace -p 1234), although tracing does slow the program down, noticeably so if it makes a lot of system calls. It also works even if you don't have debugging symbols. The one downside is that you only see system calls, not calls to your own functions.
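To make the difference between system calls and your own calls concrete, here is a tiny sketch. say_hello is a hypothetical function written for this example; the only system call it makes is write.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes a fixed message to standard output using
 * the write(2) system call directly. Returns the number of bytes
 * written, or -1 on error. */
ssize_t say_hello(void)
{
    const char *message = "hello\n";

    return write(STDOUT_FILENO, message, strlen(message));
}
```

If a program calls say_hello() from main and you run it under strace, the trace should include a line along the lines of write(1, "hello\n", 6) among the usual startup system calls, but say_hello itself will not appear anywhere, because it is an ordinary function call inside your process and not a request to the operating system.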

One last tip: sometimes you may not want to ship debugging information in your binary. In that case there is a way to have your cake and eat it too. Using strip (or eu-strip from elfutils, if your version of strip doesn't support it), you can keep your symbols in a separate file from the stripped binary. gdb will load the symbols from that file automatically as long as it's where gdb expects to find it. If you've moved it, you can pass --symbols=SYMFILE to let gdb know where to find it.

Hope that helps!

--
Warm regards
Saurabh Tiwari
