Memory Problems: Use-after-delete, etc
CS 301 Lecture, Dr. Lawlor
Unfortunately, it's easy to write bad code that manipulates memory
incorrectly, and then crashes. Worse, the crash can occur
*hours* after the bad code has screwed up memory, during which the code
might appear to work properly!
Memory Leak
In assembly, you've seen how every single stack allocation (such as a push) must be matched by a deallocation (such as a pop). Similarly, every call to malloc or new must be matched by a call to free or delete (respectively!).
In a short-running program, you can actually survive if you malloc some
space that you never free. This is called a "memory leak".
When your program exits, the OS will clean up all your memory,
including leaked memory, and life goes on. But in a long-running
program, or a program that repeatedly allocates memory, failing to free
space will eventually cause malloc to run out of space. When
malloc is out of space to allocate, it returns NULL, and your program
usually crashes horribly (from the segfault when accessing NULL).
In C++, when out of memory space, "new" throws an exception (std::bad_alloc), and your
program usually crashes horribly (from the unexpected exception).
A memory leak is sort of like eating too much. It's bad, but if
you only do it on Thanksgiving, you'll be fine. It gets dangerous
if you eat too much day after day, because you'll soon run out of space
(e.g., for your organs).
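To make the leak concrete, here is a minimal sketch comparing a function that forgets its free with one that matches it. The names leaky_sum and clean_sum are made up for this illustration; they are not from the lecture's NetRun examples.

```cpp
#include <stdlib.h>
#include <cassert>

// Hypothetical example: sums 0..n-1 using a scratch buffer, but never frees it.
// Every call leaks n*sizeof(int) bytes.
long leaky_sum(int n) {
    assert(n > 0);
    int *buf = (int *)malloc(n * sizeof(int));
    if (buf == NULL) return -1; // malloc is out of space
    long total = 0;
    for (int i = 0; i < n; i++) { buf[i] = i; total += buf[i]; }
    return total; // oops: no free(buf), so the buffer is leaked
}

// Same work, but the malloc is matched by a free.
long clean_sum(int n) {
    assert(n > 0);
    int *buf = (int *)malloc(n * sizeof(int));
    if (buf == NULL) return -1; // malloc is out of space
    long total = 0;
    for (int i = 0; i < n; i++) { buf[i] = i; total += buf[i]; }
    free(buf); // every malloc gets exactly one free
    return total;
}
```

Both functions return the same answer, which is exactly why leaks are easy to miss: calling leaky_sum once is harmless, but calling it in a long-running loop steadily eats memory.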
Some languages, like Java and C#, use "garbage collection" to avoid
memory leaks. The nice part about garbage collection is you never
need to explicitly call "free". The not-so-nice part is your
program periodically has to stop, look through all of its memory, and
get rid of allocated blocks it's not using (the pause for garbage
collection). You can get a garbage collector for C++ too.
Use After Delete
One common problem is using a pointer after you no longer have the right to use the pointer:
int *A=new int[10];
A[7]=11;
delete[] A;
return A[7]; // aieeeee!
(executable NetRun link)
In a perfect world this would segfault, but in practice it usually returns 11 (officially, the behavior is undefined, so anything can happen).
The problem is that internally, "delete" just marks the buffer as free
for reuse, but doesn't actually change the value (except maybe the
first few elements).
So the next allocation to come along might actually get the exact same addresses, and overwrite the deleted array with its own stuff:
int *A=new int[10];
A[7]=11;
delete[] A;
int *B=new int[10];
B[7]=2222222;
std::cout<<"Array A: "<<A<<" and B: "<<B<<"\n";
return A[7];
(executable NetRun link)
Now the program returns 2222222, the B[7] value, because A == B!
The solution: don't call delete until you're really sure everybody's done
with that memory! Use-after-delete bugs are one big argument in favor
of garbage collection.
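Short of a full garbage collector, one way to make "don't delete until everybody's done" automatic is reference counting. The sketch below uses std::shared_ptr from C++11, which is a different technique from the raw new/delete in the examples above; the function and type names are hypothetical.

```cpp
#include <memory>
#include <vector>
#include <cassert>

// Hypothetical sketch: several "users" can share one heap array.
// std::shared_ptr frees the array only when the last owner lets go.
typedef std::shared_ptr<std::vector<int> > shared_ints;

shared_ints make_array(int n) {
    assert(n > 0);
    return std::make_shared<std::vector<int> >(n); // n zero-initialized ints
}

int read_after_first_owner_leaves() {
    shared_ints A = make_array(10);
    (*A)[7] = 11;
    shared_ints other = A; // second owner of the same array
    A.reset();             // first owner is done; the array is NOT freed yet
    return (*other)[7];    // still valid, because "other" keeps the array alive
}
```

The cost is a little bookkeeping on every copy of the pointer, and it only helps if every user holds a shared_ptr rather than a raw pointer into the array.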
Use-after-delete can happen on the stack, too, where it's called
"return reference to temporary". The problem is that as soon as
your function returns, your local variables are released back to the stack,
and the next function to be called will overwrite them. So if you
return a pointer to a local variable, you'll get weird inconsistent
behavior. The solution in this case is to return a buffer object, or
an array allocated with new or malloc, and have your caller delete the
thing when it's done.
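Here is a minimal sketch of both fixes, assuming std::vector as the "buffer object". The function names are invented for this example, and the broken version is left in a comment so nobody runs it by accident.

```cpp
#include <vector>
#include <cassert>

// BAD (comment only): returns a pointer to a local array, which is
// released as soon as the function returns.
//   int *bad_make() { int local[10]; local[7]=11; return local; }

// OK: return a buffer object by value; the vector owns its heap storage.
std::vector<int> make_vector() {
    std::vector<int> v(10);
    v[7] = 11;
    return v;
}

// OK: return an array from new[]; the CALLER must delete[] it.
int *make_array() {
    int *arr = new int[10];
    arr[7] = 11;
    return arr;
}

int use_both() {
    std::vector<int> v = make_vector(); // cleans itself up
    int *arr = make_array();
    assert(v[7] == arr[7]);
    int result = v[7] + arr[7];
    delete[] arr; // caller's job, since make_array used new[]
    return result;
}
```

The vector version is harder to get wrong, since there is no delete[] for the caller to forget.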
Allocation/Deallocation Mismatch
It's really easy to forget what a pointer is pointing to, and call the wrong deallocation routine on it.
- Stack-allocated stuff deallocates itself when the function returns
- Pointers from malloc must be deallocated with free
- Single objects from new (NOT arrays) must be deallocated with delete
- Arrays from new[] (NOT single objects!) must be deallocated with delete[]
If you call free on a stack-allocated array, the best you can hope for is a quick crash:
int A[10];
A[7]=11;
free(A); // aieeee!
return 0;
(executable NetRun link)
(This crashes immediately at runtime with "free: invalid pointer". Nice!)
If you call free on a new'd array, your code is wrong, but might work anyway.
int *A=new int[10];
A[7]=11;
free(A); // aieeee!
return 0;
(executable NetRun link)
(This is WRONG, but on Linux, it appears to run fine... for now! It'll crash eventually, possibly way later.)
The really confusing one is new/delete and new[]/delete[]. If you
write a class that prints out its construction and destruction, like
this, then you can actually watch the construction and destruction
happen.
class ctortest {
public:
int value;
ctortest() { std::cout<<"You created an object at "<<this<<"\n";}
~ctortest() { std::cout<<"You deleted an object at "<<this<<"\n";}
};
This correct code calls three constructors, and three destructors:
ctortest *arr=new ctortest[3];
delete[] arr;
(executable NetRun link)
This incorrect code calls only ONE of the three destructors, because delete (no brackets) expects a pointer to a single object:
ctortest *arr=new ctortest[3];
delete arr;
(executable NetRun link)
This incorrect code allocates one object but calls SEVENTEEN destructors here (the exact number is garbage and can vary), because delete[] only works with pointers from new[]:
ctortest *arr=new ctortest;
delete[] arr;
(executable NetRun link)
This correct code calls one constructor and one destructor.
ctortest *arr=new ctortest;
delete arr;
(executable NetRun link)
If you use new, use delete.
If you use new[], use delete[].
It's that simple (in principle!).
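As a quick self-check of the matching rule, here is a sketch of a class that counts live objects instead of printing. The class name "counted" and the function matched_pairs are hypothetical stand-ins, not part of the ctortest example above.

```cpp
#include <cassert>

// Hypothetical stand-in for ctortest that counts instead of printing.
struct counted {
    static int live; // objects constructed but not yet destroyed
    counted() { live++; }
    ~counted() { live--; }
};
int counted::live = 0;

// Returns the number of live objects left over after matched allocations.
int matched_pairs() {
    counted *one = new counted;     // new ...
    counted *many = new counted[3]; // new[] ...
    assert(counted::live == 4);     // four constructors have run
    delete one;                     // ... matched by delete
    delete[] many;                  // ... matched by delete[]
    return counted::live;           // every constructor got its destructor
}
```

With correct matching, the count of live objects drops back to zero. Swapping delete and delete[] here would be undefined behavior, so the leftover count could be anything.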
Memory Corruption
All of the above are examples of memory corruption. This
is when the values in memory get "corrupted", or messed up, which
usually results in a segfault (eventually!) when some code accesses the messed-up data
structure.
One of the most common ways to corrupt memory is to allocate an n
element array, and access element [n] and beyond; a "write past the end
of the buffer". In a perfect world, the first such beyond-the-end
access would segfault, and you'd immediately find and fix the
problem. Sadly, this almost *never* happens, unless you access
way past the end of the buffer.
"malloc" (and new) store their housekeeping information past the end
of your arrays, so past-the-end accesses usually mess up the heap data
structures. Again, you might then get a crash when you try to
free (or delete) the offending array. Sadly, even that often works fine too.
In fact, heap corruption usually shows up several allocations later, way past the original source of the problem!
int *arr=(int *)malloc(1023*sizeof(int));
int *bystander=(int *)malloc(3*sizeof(int));
std::cout<<"About to write past end of my buffer "<<arr<<"\n";
arr[1023]=5; //aieeeee!
std::cout<<"Well, that worked. Deleting my buffer "<<arr<<"\n";
free(arr);
std::cout<<"Also OK. Deleting totally unrelated buffer "<<bystander<<"\n";
free(bystander);
return 0;
(executable NetRun link)
This delayed action makes heap corruption very tricky to find. In
particular, maybe the bad manipulation of "arr" is happening inside
"dumbguy.cpp", and the problem only shows up inside "bystander" inside
"mycode.cpp". So dumbguy's screwup causes my perfectly good code
to break!
The solution: NEVER access past the end of an array. Be careful
with array indexing, and add some index checking code if you're at all
in doubt!
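One possible shape for that index checking code is sketched below. The helper names checked and write_is_rejected are made up for illustration; std::vector's at() member does the same kind of check for vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical checked accessor for a raw array of n elements:
// throws an exception instead of silently corrupting the heap.
int &checked(int *arr, std::size_t n, std::size_t i) {
    if (i >= n) throw std::out_of_range("array index out of bounds");
    return arr[i];
}

// Returns true if writing element i of an n-element array was rejected.
bool write_is_rejected(std::size_t n, std::size_t i) {
    assert(n > 0);
    std::vector<int> storage(n); // owns the n-element buffer
    try {
        checked(storage.data(), n, i) = 5; // the write only happens if i < n
    } catch (const std::out_of_range &) {
        return true; // caught the bad index before touching memory
    }
    return false;
}
```

With this check, the arr[1023] write into a 1023-element array is caught at the exact line that caused it, instead of showing up several allocations later. The price is one comparison per access.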
On Linux, there's an awesome program called "valgrind"
that checks every memory access you make, and immediately prints out an
error if you access memory you shouldn't. It detects
past-the-end, before-the-start, use-after-delete,
allocation-deallocation mismatch, and also detects memory leaks.
You use it like "valgrind ./mycode.exe". There's a similar
commercial tool for Windows called Purify.
Protected Memory
The one saving grace regarding memory problems is that they are
confined to a single execution of a program. Because the OS constructs
your process's entire memory image from scratch when your program
starts, and destroys it completely after your program exits, errors in your memory can only be caused by errors in your
program, and errors in your program cannot cause errors in other
processes' memory. In the bad old days of DOS, Windows 3.1, and MacOS
classic, before "protected memory", any program could trash any other
program's memory. So during debugging, if your code freaked out, you
might have to reboot your whole machine! On my old 1992 Mac, I learned
to get really paranoid about array indexing by suffering through
5-minute reboots every time I screwed up...