10 essential debugging rules
This is based on the book Debugging: The Nine Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems by David J. Agans. David lists 9 rules for debugging based on his experience. I would encourage everyone to read it. It's a short and easy read, littered with war stories to make you smile.
I thought it would be worth summarizing the book for developers too damn lazy to read even a short book.
I added a tenth rule (debug the data) because, in my experience, it's too easy to start looking at the code when it's the data that is messed up - and it's an easy check to overlook.
Arguably some of the rules could be merged but I don't think it's worth doing. As they are now I think they are simple and unambiguous. Almost a cheat-sheet for debugging!
Debug the data
Check that the data is what you, and the design, expect. Sure, it might mean writing a ton of code to verify the data, but the data only needs to be a "bit" wrong to break things.
Consider building the ability to check the data into the system by providing a way to dump the data to plain text.
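As a rough illustration of a plain-text dump, here is a minimal Python sketch. The Order record, its fields and the validity rules are all made up for the example; substitute whatever your own design says the data should look like.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Order:  # hypothetical record type, for illustration only
    order_id: int
    quantity: int
    unit_price_cents: int


def dump_orders(orders):
    """Return one JSON line per order so the data can be read, diffed or grepped."""
    return "\n".join(json.dumps(asdict(o), sort_keys=True) for o in orders)


def find_suspect_orders(orders):
    """Flag records that break the (assumed) design rules: positive quantity, non-negative price."""
    return [o for o in orders if o.quantity <= 0 or o.unit_price_cents < 0]


orders = [Order(1, 2, 499), Order(2, -1, 1250)]
print(dump_orders(orders))
print("suspect:", find_suspect_orders(orders))
```

One line per record keeps the dump friendly to grep and diff, which is most of the point of going to plain text.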
Understand the system
Read the manual, get the instructions out, and check your code against sample code. Make sure you're not sending a long to something that's expecting a short - and while you're at it, ensure that a short and a long are the sizes you expect them to be.
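One quick way to check what a short and a long really are on the machine in front of you is to ask the platform. This is only a sketch using Python's ctypes module; native_sizes and fits_in_short are helper names I made up for the example.

```python
import ctypes


def native_sizes():
    """Sizes in bytes of a C short and a C long on this platform."""
    return {
        "short": ctypes.sizeof(ctypes.c_short),
        "long": ctypes.sizeof(ctypes.c_long),
    }


def fits_in_short(value):
    """True if value can be stored in a signed C short without overflow."""
    bits = 8 * ctypes.sizeof(ctypes.c_short)
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


print(native_sizes())
print(fits_in_short(100))
print(fits_in_short(1 << 70))
```

The size of a long differs between common platforms (for example 64-bit Linux and 64-bit Windows), which is exactly the kind of thing worth checking rather than assuming.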
Make it fail
Reproduce the problem. Hard-to-reproduce problems should be stimulated by shaking and baking. If something happens infrequently, use your best judgment to guess at what the variables might be and try to increase the frequency.
If your headphones have a crackle and you suspect a loose wire then wiggle the wire around to see if it increases the frequency of the crackle. Do the same in your software.
Don't assume something has no effect on your problem. Assumptions can waste hours. Check and double check.
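The software version of wiggling the wire might look something like the sketch below: run the suspect code many times while varying one variable you think matters, and watch whether the failure rate moves. Everything here is hypothetical; flaky_parse stands in for your real code and packet size stands in for whatever variable you suspect.

```python
import random


def flaky_parse(packet, rng):
    """Hypothetical stand-in for the suspect code: sometimes fails on large packets."""
    if len(packet) > 900 and rng.random() < 0.5:
        raise ValueError("parse failed")
    return len(packet)


def failure_rate(max_size, runs=2000, seed=1):
    """Run the suspect code many times with packets up to max_size bytes.

    Returns the fraction of runs that failed. The fixed seed makes a run repeatable.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(runs):
        packet = bytes(rng.randint(1, max_size))
        try:
            flaky_parse(packet, rng)
        except ValueError:
            failures += 1
    return failures / runs


# Wiggle one variable (packet size) and see whether the failure rate follows it.
for max_size in (500, 1000, 5000):
    print(max_size, failure_rate(max_size))
```

If the rate climbs as you push the variable, you have both a lead and a faster way to reproduce the fault. If it doesn't move, you have checked an assumption instead of trusting it.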
Quit thinking and look
Don't jump to conclusions based on too little information. Get stuck in and start stepping through the code and eliminate the unknowns.
Add debug log output to your software so if it does fail you might be better armed right from the moment the customer has a problem.
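Here is a minimal sketch of built-in debug output using Python's standard logging module. The "payments" logger name and the apply_discount function are invented for the example.

```python
import logging

logger = logging.getLogger("payments")  # hypothetical component name


def configure_logging(debug=False):
    """Send log records to stderr; debug=True also shows DEBUG-level detail."""
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )


def apply_discount(price_cents, percent):
    """Hypothetical business function that logs its inputs and its result."""
    logger.debug("apply_discount price_cents=%d percent=%d", price_cents, percent)
    result = price_cents * (100 - percent) // 100
    logger.debug("apply_discount result=%d", result)
    return result


configure_logging(debug=True)
apply_discount(1999, 15)
```

Logging the inputs as well as the result means a customer's log can tell you what the code was given, not just what it produced.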
Use any debugging tools you can, from static code analysis through to single stepping in a debugger.
Divide and conquer
Narrow your search. Reduce the places a bug can hide. Use easy-to-spot test data where possible. Don't ignore bugs you find that are unrelated to the one you're actively trying to fix - one bug can hide the true symptoms of another. Remember, if you fix another bug, be sure to go back and reproduce the first one again.
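Narrowing the search can often be done mechanically by halving. The sketch below bisects a batch of input to find the one record that makes a job fail. The process function is a made-up stand-in for your real job, and the approach assumes a single record causes the failure on its own rather than some interaction between records.

```python
def process(records):
    """Hypothetical batch job that fails if any record is bad (here: a negative number)."""
    for r in records:
        if r < 0:
            raise ValueError(f"bad record {r}")


def fails(records):
    """True if processing this batch raises the failure we are hunting."""
    try:
        process(records)
    except ValueError:
        return True
    return False


def find_failing_record(records):
    """Halve the input repeatedly until a single failing record remains.

    Returns None if the whole batch passes.
    """
    if not fails(records):
        return None
    while len(records) > 1:
        mid = len(records) // 2
        first, second = records[:mid], records[mid:]
        records = first if fails(first) else second
    return records[0]


data = list(range(1, 1000))
data[613] = -7  # easy-to-spot test value
print(find_failing_record(data))
```

The same halving idea applies to a list of commits, a set of config options or a chain of processing stages.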
Change just one thing at a time
If you try a fix and it doesn't work, take it out. A change that doesn't fix your bug leaves you asking "What does it do?", so remove it before you try the next thing.
Try to figure out what has changed since the system worked last. Compare run logs for successful and failed runs.
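Comparing logs by eye gets old quickly, so let a diff tool do it. This is a small sketch using Python's standard difflib module; the log lines are invented for the example.

```python
import difflib


def diff_logs(good_lines, bad_lines):
    """Return unified-diff lines showing where a failed run departs from a good one."""
    return list(
        difflib.unified_diff(
            good_lines, bad_lines, fromfile="good.log", tofile="bad.log", lineterm=""
        )
    )


good = ["load config", "open db", "read 120 rows", "done"]
bad = ["load config", "open db", "read 0 rows", "done"]
for line in diff_logs(good, bad):
    print(line)
```

Real logs usually carry timestamps and process IDs that differ on every run, so strip those before diffing or the noise will bury the one line that matters.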
Keep an audit
Keep a log of events, no matter how small, when trying to track down a tricky bug. Even minor details can have a major impact, and you may not be the best person to decide whether something is important to reproducing the bug.
Look for patterns in the audit log.
Check the obvious first
Don't assume your assumptions are correct. Question everything.
Is it a compiler bug, a runtime library bug, an operating system bug, a bug in the debugger or maybe even a bug in the test tool?
When looking at problems with memory allocation, check that you did indeed allocate the right amount of memory and that you initialized it.
Ask someone else
Ask a friend or colleague - often just explaining a problem helps to clarify your thoughts and focus you enough to realize where the problem is.
Asking an expert might yield an answer immediately. Ask in the Usenet newsgroups and leave it for a day or so while you wait to see if anyone has any suggestions.
If you didn't fix it then it's not fixed
Bugs don't fix themselves, so if you didn't fix it and the bug can't now be reproduced, then all you've done is make it harder to reproduce.
Check that it's really fixed by undoing your fix and then reproducing the fault, then reapply your fix and try to reproduce the fault once again.
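The undo-and-redo check can be made routine. In the sketch below a flag stands in for reverting and reapplying the fix; in real life you would more likely do that through version control. The average function and its bug are hypothetical.

```python
def average(values, fixed=True):
    """Hypothetical function; fixed=False restores the original buggy behaviour."""
    if fixed and not values:
        return 0.0  # the fix: handle empty input instead of dividing by zero
    return sum(values) / len(values)


def reproduces_fault(fixed):
    """True if the original fault (a crash on empty input) still occurs."""
    try:
        average([], fixed=fixed)
    except ZeroDivisionError:
        return True
    return False


print("without fix:", reproduces_fault(fixed=False))
print("with fix:", reproduces_fault(fixed=True))
```

If the fault doesn't come back when the fix is removed, then the fix wasn't what made it go away.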