Debugging and (Automated) Testing

Dr. Lawlor, CS 202, CS, UAF

It's surprisingly tough to tell when code is correct. For example, quick--what's wrong with this code that makes it loop forever? (Answers at the bottom of this page.)

const int n=10;
int sum=0;
for (int i=0;i<n;i++)
	for (int j=0;j<n;i++)
		sum+=i*j;
return sum;

(Try this in NetRun now!)

Or this linked list code, which segfaults?

class llnode {
public:
	int data;
	llnode *next;
	llnode(int v,llnode *n) {data=v; next=n;}
};
int total(llnode *cur,int sum)
{
	sum+=cur->data;
	return total(cur->next,sum);
}
llnode *make_list(void) {
	int x;
	if (cin>>x) { return new llnode(x,make_list()); }
	else { return NULL; }
}
int foo(void) {
	llnode *cur=make_list();
	return total(cur,0);
}

(Try this in NetRun now!)

Even code as simple as this can have bugs:

int x=0;
cin>>x;
return x;

(Try this in NetRun now!)

(For example, what if the user enters "3.999"? This consumes the "3", leaving the ".999" to trip up the next input.)
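One way to catch that kind of leftover input is to read a whole line at a time and accept it only if the entire line parses as an integer. Here's a minimal sketch of the idea; the helper name read_whole_int is made up for illustration, and it assumes one number per line.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original notes): read one line and
// accept it only if the whole line is a single integer.  Whitespace around
// the number is allowed; anything else makes the line invalid.
bool read_whole_int(std::istream &in, int &value)
{
	std::string line;
	if (!std::getline(in, line)) return false; // end of input or read error
	std::istringstream parse(line);
	int v;
	if (!(parse >> v)) return false; // no integer at the start, like "abc"
	parse >> std::ws; // skip any trailing whitespace
	if (!parse.eof()) return false; // leftover characters, like ".999"
	value = v; // only overwrite the caller's value on success
	return true;
}
```

Calling read_whole_int(std::cin, x) on the line "3.999" returns false and leaves x alone, and because the whole line was consumed, the ".999" can't trip up the next read.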

Debugging Strategies

OK, your code doesn't work. You are now debugging. What do you do?

The most efficient approach by far is the scientific method.

First, throw away your preconceptions. Don't blindly ASSUME that anything works until you've verified it. I've seen *multiple* compiler/machine combinations where code as simple as cout<<"Hello!" just segfaults horribly.

Second, formulate a HYPOTHESIS about what's broken (for example, "It's gotta be foolib!" or "My compiler is borked" or "I'm not reading the file right"). If you can't come up with a reasonable hypothesis, you need to gather more data, via:

- Debug statements, like "cout" or file output statements to a log file. This is probably overused, but it's a very powerful technique, because scrolling backward from the point of the crash takes you backward in time, letting you home in on the error!
- Using a debugger. The point of the crash is somewhat useful about half the time, although like in an airplane accident, the other half of the time the crash itself is usually several minutes after the real problem happened. For example, it's common to corrupt memory, or bork some internal data structure of a library, and still have the program lurch along for quite a while, as things go more and more wrong, until finally something comes totally unglued and crashes. Another problem with debuggers is Heisenbugs: errors that disappear when you look at them in a debugger. Debuggers change the timing and speed of the code, which can affect a variety of things in strange ways.
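As a rough sketch of the debug-statement approach, here's one way to tag each message with the file and line it came from. The names debug_log, DEBUG_LOG, and sum_products are invented for this example. The important detail is flushing after every message, so the last line in the log really is the last thing that happened before a crash.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: write one trace line and flush it immediately, so
// the message isn't stuck in a buffer if the program dies on the next line.
void debug_log(std::ostream &log, const char *file, int line, const std::string &what)
{
	log << file << ":" << line << ": " << what << std::endl; // endl flushes
}

// Fills in the source location automatically; messages go to std::cerr.
#define DEBUG_LOG(what) debug_log(std::cerr, __FILE__, __LINE__, (what))

// The nested loop from the top of the page (with the bug fixed), traced
// once per outer iteration.
int sum_products(int n)
{
	int sum = 0;
	for (int i = 0; i < n; i++) {
		DEBUG_LOG("outer loop, i=" + std::to_string(i));
		for (int j = 0; j < n; j++)
			sum += i * j;
	}
	return sum;
}
```

With the original "i++" bug in the inner loop, a trace like this would show "i=0" once and then nothing else, which points straight at the inner loop.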

Third, perform a CONTROLLED EXPERIMENT to determine if your hypothesis is correct. The control part is very important, and easy to leave out. One of the most useful possible controls is a previous version of the program--if it worked yesterday, and it doesn't work today, something (in your code OR outside it) has changed in the last day. Keeping previous working versions, both to run and to diff the code, is extremely valuable.
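Here's a small sketch of using an older working version as the control. It assumes you kept the old function around (called sum_products_reference here) next to a rewritten one (sum_products_fast); both names and the rewrite itself are hypothetical. The experiment feeds both versions the same inputs and reports the first place they disagree.

```cpp
#include <cassert>

// Control: the straightforward version that is known to work.
int sum_products_reference(int n)
{
	int sum = 0;
	for (int i = 0; i < n; i++)
		for (int j = 0; j < n; j++)
			sum += i * j;
	return sum;
}

// Experiment: a rewritten version using the closed form
// (0+1+...+(n-1)) squared.  This is the code under suspicion.
int sum_products_fast(int n)
{
	int s = n * (n - 1) / 2;
	return s * s;
}

// Returns the first n in [0,limit) where the two versions disagree,
// or -1 if they agree everywhere in that range.
int first_mismatch(int limit)
{
	for (int n = 0; n < limit; n++)
		if (sum_products_reference(n) != sum_products_fast(n)) return n;
	return -1;
}
```

If first_mismatch comes back with a specific n, you now have a small reproducible input to dig into, rather than a vague "it's broken somewhere".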

Fourth, and finally, you can start to fix the problem. This is much easier once you know exactly what the problem is!

Testing Strategies

Because debugging is so painful, it's much easier to catch bugs as they're written, rather than waiting until the last minute.

One testing scheme used at a lot of organizations is the "nightly build": at midnight every day, an automated system checks out the current code from version control, builds it, and runs it through a battery of tests. Any errors are sent back to the developers (all of them, only those who just checked in code, or via a manual blame assignment process). "Breaking the build" is considered shameful, but it's much better to find these errors the next day, rather than a year later!
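The "battery of tests" part can be very simple. Below is a minimal sketch, not any particular organization's setup: check_equal and run_all_tests are made-up names, and sum_products stands in for whatever code is being tested. Each check reports a failure but keeps going, so one run lists everything that's broken.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Code under test: the nested loop from the top of the page, fixed.
int sum_products(int n)
{
	int sum = 0;
	for (int i = 0; i < n; i++)
		for (int j = 0; j < n; j++)
			sum += i * j;
	return sum;
}

// Hypothetical check helper: returns 1 and prints a message on failure,
// returns 0 on success.
int check_equal(const std::string &name, int got, int expected)
{
	if (got == expected) return 0;
	std::cerr << "FAIL " << name << ": got " << got
	          << ", expected " << expected << std::endl;
	return 1;
}

// Runs every test and returns the number of failures.
int run_all_tests()
{
	int failures = 0;
	failures += check_equal("n=0", sum_products(0), 0);
	failures += check_equal("n=1", sum_products(1), 0);
	failures += check_equal("n=2", sum_products(2), 1);
	failures += check_equal("n=10", sum_products(10), 2025);
	return failures;
}
```

A test program's main() can return nonzero whenever run_all_tests() reports any failures; a nonzero exit status is what a build script typically checks to decide whether to send out the "you broke the build" email.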

Software that has to run on multiple platforms (like Windows and MacOS) or multiple compilers (gcc and the Intel compiler) is much more reliable with an automated build system. I built an automated build setup for Charm++, using a simple shell script that connects to various build machines using ssh (with a special automated-build SSH key, so nobody has to enter a password). Command-line applications are typically a lot easier to script for automatic testing, but it's also possible to set up automated testing for GUI applications.

Answers to code bugs above:
- The "i++" in the inner loop should be a "j++".
- The "total" function is recursive, but lacks a base case to handle the "cur==0" case.
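For the linked list answer, one possible fix looks like the sketch below. The recursive version just adds the missing base case; total_loop is an extra alternative that wasn't in the original code, shown because it needs no stack space per node.

```cpp
#include <cassert>

class llnode {
public:
	int data;
	llnode *next;
	llnode(int v, llnode *n) { data = v; next = n; }
};

// Recursive version with the missing base case added.
int total(llnode *cur, int sum)
{
	if (cur == nullptr) return sum; // base case: ran off the end of the list
	sum += cur->data;
	return total(cur->next, sum);
}

// Loop version (an alternative, not from the original code): same answer,
// without one function call per node.
int total_loop(llnode *cur)
{
	int sum = 0;
	for (; cur != nullptr; cur = cur->next) sum += cur->data;
	return sum;
}
```

Both versions return 0 for an empty list, which is exactly the case the original code crashed on.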