Memory Allocation Using 'malloc'

CS 301 Lecture, Dr. Lawlor

Dynamic allocation in C++

There are many ways to allocate space for an array in C or C++:
There are also many ways to do this in assembly. 

Stack Allocation

We've seen you can also allocate space on the stack, by just moving the stack pointer down to make room:
sub esp,4*10; Make space on the stack
mov eax, esp; Point eax to the start of that space
... ; Use eax as array here
add esp,4*10; Restore stack
C/C++ of course can't directly access the stack pointer, but there's often a compiler-builtin function called "alloca" (which looks suspiciously like malloc!) that allocates a user-defined amount of space on the stack.  Unlike malloc, memory allocated with alloca is automatically freed when the function returns.  "alloca" may not exist on all machines, since some machines have rather funky stacks.

Malloc and Free

Other than the stack, the most common memory allocation method is to call malloc and free.  "malloc" is a C function that takes one parameter, the number of bytes to allocate, and returns a pointer to the allocated memory.  You can then use the pointer like an array, or struct, or whatever you like.  "free" takes one parameter, the pointer to the allocated memory to free (which has to come from malloc--you'll get a bizarre crash if you free a stack pointer!).  Here's how you call "malloc" and "free" in assembly:
push 4*10; Number of bytes to allocate
extern malloc
call malloc
add esp,4; Undo "push"
; Malloc's return value, a pointer, comes back in eax.
... ; use eax as array here
; Eventually, you'll need to call "free" to release that memory:
push eax
extern free
call free
add esp,4; Undo "push"
(executable NetRun link)

Fixed Allocation

Finally, you can allocate space in the initialized (".data") or uninitialized (".bss") section of your executable file:
mov eax,thingy; Point eax to thingy's data
...

section .bss; <- Uninitialized (but writeable!) data area
thingy:
resb 4*10; Reserve this many bytes of space here

Any of these allocation schemes will work, and the pointer you end up with is just plain old memory.  But there are tradeoffs in deciding which memory allocation scheme to use.  Malloc is slow and a pain to use, but offers the most possible memory (gigabytes).  The stack is very fast, but must be given back at the end of your subroutine, and is limited to a few megabytes on many machines.  Initialized or uninitialized data is a fixed size, and always allocated; this is a waste of space unless you're storing small objects of a few kilobytes.

Allocation
Method
Allocation
Speed
Storage
Available
Assembly Example
Disadvantages
malloc/free
Slow (100ns)
Gigabytes
push 100
call malloc
Must remember to call free
stack or alloca
Fast (1ns)
Megabytes
sub esp,100
mov eax,esp
Must remember to move stack pointer back;
storage is limited to stack size;
alloca() not always available in C/C++
"section"
Not needed
(fixed allocation!)
Megabytes
mov eax,my_pointer
my_pointer:
    resb 100
Wastes space even when not in use;
usually not "threadsafe"

Bad Memory Access

One all-too-common problem in C++ is accessing beyond the end of the memory you're allowed to touch.  For example, this buggy program actually seems to run fine:
int n=read_input();
int *arr=(int *)malloc(n); /* OOPS! n bytes, not n ints! */
std::cout<<"Arr runs from "<<arr<<" to "<<arr+n<<"\n";
for (int i=0;i<n;i++) arr[i]=i;
iarray_print(arr,n);
std::cout<<"OK, done with arr--freeing it.\n";
free(arr);

(Try this in NetRun now!)

The problem is this trashes 3*n bytes of heap memory, but if you're "lucky" nobody's using those bytes, and your program survives.  It will crash eventually, though, usually after some other code is added/changed.  This code only crashes (when the guilty array is freed) once you add some innocent pointer manipulation elsewhere in the program:
int *innocent=(int *)malloc(sizeof(int));
*innocent=3;
std::cout<<"Innocent pointer at "<<innocent<<"\n";

int n=read_input();
int *arr=(int *)malloc(n); /* OOPS! n bytes, not n ints! */
std::cout<<"Arr runs from "<<arr<<" to "<<arr+n<<"\n";
for (int i=0;i<n;i++) arr[i]=i;
iarray_print(arr,n);
std::cout<<"OK, done with arr--freeing it.\n";
free(arr);

if (*innocent!=3) std::cout<<"Innocent bystander shot like moose!(page A3)\n";
std::cout<<"Freeing innocent bystander\n";
free(innocent);

(Try this in NetRun now!)

You can get an immediate crash on accessing off the end of your memory by linking with "electric fence", a cool little library by Bruce Perens.  Just link with "-lefence".  On Linux, valgrind is similar but doesn't even require re-linking.
int n=read_input();
int *arr=(int *)malloc(n); /* OOPS! n bytes, not n ints! */
std::cout<<"Arr runs from "<<arr<<" to "<<arr+n<<"\n";
for (int i=0;i<n;i++) arr[i]=i;
iarray_print(arr,n);
std::cout<<"OK, done with arr--freeing it.\n";
free(arr);

(Try this in NetRun now!)

If you're getting weird crashes in a big C/C++ program, it's quite possible that somebody's overwriting memory!  This is one place where the explicit array size checks built into Java or C# really shine--you can't corrupt the heap in those languages, because every array access verifies that the index is in-bounds.  Of course, those extra checks can take some time, so those languages can be a bit slower than C++.

Explict Array Size Checking in C++

You can actually check your array indices in C++, if you're willing to build your own array class:
class myArray {
int *arr;
int length;
public:
myArray(int n) {
arr=(int *)malloc(n*sizeof(int));
length=n;
}

int & operator[] (int index) {
if (index<0 || index>=length)
{ /* out of bounds! */
std::cout<<"I'm very sorry, but you've gone out of bounds.\n";
std::cout<<" Your index "<<index<<" is more than the array length "<<length<<".\n";
exit(1);
}
return arr[index];
}
};

int n=read_input();
myArray a(n);
a[1]=123;
a[n+1]=1234;

return a[1];

(Try this in NetRun now!)