There’s no way I can possibly say everything about debugging in just one blog post, but I can certainly share a few useful tips.
First, let’s talk about what debugging fundamentally is. It’s the art of seeing what actually is, not what you meant or what you thought. It’s going to be uncomfortable, and if you get too tied up in your own ego you won’t be able to do it at all. Years ago one of my teachers at Camosun told us (I’m paraphrasing heavily here because I don’t remember the exact words) that there’s no point insisting you didn’t change anything. If it used to work and now it doesn’t, you obviously changed something. Just accept that you broke it and start trying to fix the problem.
One of the first things you need to do when you’re debugging is to make sure you can reproduce the problem reliably. If you can’t do that, then you don’t really know what the problem is (or you’ve got some sort of unholy race condition bug and you’re beyond my help :) ). If you’re not the one who found the bug, ask the person who did if they can show you, or ask for more information if it came through a helpdesk and you don’t have direct access to the user who found the bug.
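One way to pin a reproduction down is to turn it into a tiny script that fails every single time. Here’s a minimal sketch in Python. The `parse_quantity` function and the input are hypothetical stand-ins for whatever code and steps your bug report actually describes.

```python
def parse_quantity(text):
    """Hypothetical buggy function: rejects input with surrounding whitespace."""
    if not text.isdigit():
        raise ValueError(f"not a number: {text!r}")
    return int(text)


def reproduce():
    """Run the exact steps from the bug report; return True if the bug shows up."""
    try:
        parse_quantity(" 42 ")  # the input the reporter used (made up for this sketch)
    except ValueError:
        return True
    return False


# Run it several times: a reliable reproduction fails on every run.
results = [reproduce() for _ in range(5)]
print(results)
```

If the list is a mix of `True` and `False`, you don’t have a reliable reproduction yet. Once you have a fix, the same script tells you whether it worked.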
Once you can reproduce the problem, you’ll be able to track down what’s going wrong and figure out whether a fix actually… fixes the problem. The first step I recommend after reproducing the issue is double-checking all of your inputs, even the most stupidly simple stuff you’re sure you couldn’t possibly have gotten wrong. Last week I thought I had broken staging when I actually just hadn’t chosen the right value in a dropdown box. Garbage in, garbage out, as they say.
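A cheap way to double-check inputs is to print them with their type and `repr`, which shows things like stray whitespace or a number that is really a string. This is just a sketch; `describe_input` and the sample values are made up for illustration.

```python
def describe_input(name, value):
    """Format an input so its type and exact contents are visible."""
    return f"{name}: type={type(value).__name__} value={value!r}"


# Hypothetical inputs: note the trailing space and the string that looks like a number.
print(describe_input("region", "west "))
print(describe_input("count", "10"))
```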
After that, you’re going to be very tempted to just read through your code and hope you can spot the problem. Resist this temptation! I’m always pretty sure I know where the problem is or that it’ll jump out at me right away, but it almost never does unless it’s an extremely simple bug. Read over the code once if you really want to, but then move on to narrowing down exactly where the bug is. Believe me, it’s faster than staring blankly at your code and feeling dumb.
If you have a particularly chatty log, you may be able to start narrowing things down while you reproduce the issue. Find the last log message that shows up before the bug happens and start looking at the code that runs right after it. If you’re very lucky, something will be obviously wrong close to that log line.
If you’re not quite as lucky, you’re going to need to run the code locally and start commenting things out. Assuming you have some idea where the problem is happening, start cutting the suspect method in half. Either comment half of it out or add a log line halfway through, then reproduce the problem and check whether that log line shows up before things go wrong. If it does, the first half ran and the problem is probably in the second half; if it doesn’t, look in the first half. Keep narrowing it down until you know exactly where the problem is. At that point you should know what’s going wrong, if not exactly why. It’s not unusual to have to trace back through your code to find the place that set up the issue that wasn’t triggered until later. Bad config, for example, may not cause an actual bug until long after it’s saved.
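The halving idea is really a binary search, and if your code happens to be a sequence of steps you can let the computer do the halving. The sketch below is one way to do that in Python, not the only way. It assumes the state is fine before any step runs, broken after all of them, and stays broken once it breaks. The pipeline steps and the `is_non_negative` check are invented for the example.

```python
def first_bad_step(steps, initial, is_good):
    """Binary-search for the index of the first step that makes the state bad.

    Assumes is_good(initial) is True, the state after all steps is bad,
    and once the state goes bad it stays bad.
    """
    def state_after(n):
        # Re-run the first n steps from scratch.
        state = initial
        for step in steps[:n]:
            state = step(state)
        return state

    lo, hi = 0, len(steps)  # good after lo steps, bad after hi steps
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(state_after(mid)):
            lo = mid  # still fine at the halfway point: bug is later
        else:
            hi = mid  # already broken at the halfway point: bug is earlier
    return hi - 1


# Hypothetical pipeline with a planted bug in the third step.
def add_one(x):
    return x + 1


def double(x):
    return x * 2


def negate(x):
    return -x  # the planted bug


def is_non_negative(x):
    return x >= 0


steps = [add_one, double, negate, double]
culprit = first_bad_step(steps, 1, is_non_negative)
print(steps[culprit].__name__)
```

Commenting out half a method by hand is the same search, just with you playing the part of the loop.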
The most important things you can remember when you’re debugging are to be systematic and to not make assumptions. Don’t assume that your input is good. Don’t assume that a certain piece of code can’t be the problem because it hasn’t been changed in ages/just passed testing/doesn’t seem to be related. Don’t assume that your config is what you think it is. Don’t assume that you know what’s going on – if you did, you wouldn’t have written a bug in the first place :)
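One way to stop assuming your config is what you think it is: write down what you expect and compare it against what the program actually loaded. A rough sketch, assuming the config is a flat dictionary with string keys. The `config_surprises` helper and the sample values are hypothetical.

```python
def config_surprises(expected, actual):
    """Return {key: (expected, actual)} for every key whose value differs."""
    surprises = {}
    for key in sorted(expected.keys() | actual.keys()):
        want = expected.get(key, "<missing>")
        got = actual.get(key, "<missing>")
        if want != got:
            surprises[key] = (want, got)
    return surprises


# What I believe the config is vs. what was really loaded (both made up).
expected = {"env": "staging", "retries": 3}
actual = {"env": "staging", "retries": "3", "debug": True}
print(config_surprises(expected, actual))
```

Here the comparison turns up a `retries` value that is a string instead of a number and a `debug` flag nobody expected, exactly the kind of thing that is easy to assume away.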
2 Comments
Bob Warwick
I’m a native code guy, so maybe it’s easier in my world, but I can’t suggest a good debugger enough.
The formula I teach is to put in breakpoints, make predictions, and check those predictions against reality.
Mel Reams
Yes! I firmly believe in making the computer do as much work for you as possible. I like the making predictions and testing them idea, that sounds a lot more efficient than the way I just throw a breakpoint in at the beginning and follow the code through all sorts of irrelevant stuff before I finally stumble over the bug.