Dynamic Memory Allocation via malloc, and the stack

sizeof(dog_in_window)

Any time you're doing low level memory manipulation, you need to keep in mind how many bytes are used by an object. For example, to allocate memory, you need the size in bytes. To do pointer manipulation, like skipping over an object to find the next object, you need the size in bytes to know how far to skip.

Array type	First element	Move to next	Access i'th element
Array of char (string)	BYTE [ ptr ]	ptr+1	BYTE [ ptr + i ]
Array of integer	DWORD [ ptr ]	ptr+4	DWORD [ ptr + 4*i ]
Array of longs	QWORD [ ptr ]	ptr+8	QWORD [ ptr + 8*i ]
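
The byte offsets in this table are the same arithmetic C and C++ do for you every time you index an array. As a rough sketch (the helper name read_int_at is made up for illustration), here is element i of an int array fetched by explicit byte offset, the way the last column does it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: fetch arr[i] using explicit byte arithmetic,
// mirroring DWORD [ ptr + 4*i ]. Using sizeof(int) instead of a
// hard-coded 4 keeps it correct even where int is not 4 bytes.
int read_int_at(const int *arr, std::size_t i) {
    assert(arr != nullptr);
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(arr);
    int value;
    // Skip over i whole elements, then copy out one element's bytes.
    std::memcpy(&value, bytes + sizeof(int) * i, sizeof(int));
    return value;
}
```

Calling read_int_at(arr, i) should give the same answer as plain arr[i]; the compiler just hides the multiply for you.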

There's a handy keyword "sizeof" built into C and C++ that you can apply to any variable or type name to find out how many bytes of storage it requires. For example, if you want to know if you're running on a 64-bit Linux or Mac machine, where "long" is 64 bits or 8 bytes, you just print the size (64-bit Windows keeps "long" at 4 bytes, so this test doesn't work there):

return sizeof(long);

(Try this in NetRun now!)

This returns 8 on my machines, indicating "long" takes 8 bytes.
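
Since sizeof works on variables as well as type names, it's also a common way to count the elements of an array. A small sketch, with function names invented for this example:

```cpp
#include <cassert>
#include <cstddef>

// sizeof applied to a type name: typically 8 on 64-bit Linux.
std::size_t bytes_in_long() { return sizeof(long); }

// sizeof applied to a variable. For a true array (not a pointer),
// sizeof gives the total byte count, so dividing by one element's
// size recovers the number of elements.
std::size_t count_elements() {
    int arr[10];
    std::size_t total_bytes = sizeof(arr);      // 10 * sizeof(int)
    std::size_t one_element = sizeof(arr[0]);   // same as sizeof(int)
    assert(total_bytes % one_element == 0);
    return total_bytes / one_element;
}
```

Note this only works on the array itself: once the array has been passed to a function as a pointer, sizeof just reports the size of the pointer.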

This provides an interesting way to see inside objects, by measuring their size. For example, this secretive class is still just 8 bytes, indicating in memory it's just one "long" integer. You can even typecast a class pointer to a long int pointer, dereference the pointer, and there's the value. No real mystery!

class arcane_mystery {
private: long dark_secrets;
};

return sizeof(arcane_mystery);

(Try this in NetRun now!)
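
Here's a sketch of the pointer typecast trick described above. The constructor and the names peek_secret and demo_peek are additions for this example only, so there's a value to find:

```cpp
#include <cassert>

class arcane_mystery {
public:
    explicit arcane_mystery(long s) : dark_secrets(s) {}  // added for this sketch
private:
    long dark_secrets;
};

// In memory, the class is exactly one long.
static_assert(sizeof(arcane_mystery) == sizeof(long), "class should be one long");

// Typecast the class pointer to a long pointer, and dereference it.
long peek_secret(const arcane_mystery *m) {
    const long *raw = reinterpret_cast<const long *>(m);
    return *raw;
}

// Usage sketch: the "private" value is readable through the raw pointer.
void demo_peek() {
    arcane_mystery m(42);
    assert(peek_secret(&m) == 42);
}
```

"private" is only enforced by the compiler; the bytes in memory don't know about it.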

There isn't any equivalent to "sizeof" in assembly language--you just need to remember the sizes of everything yourself!

Dynamic allocation with malloc

malloc is the standard C way to allocate memory from "the heap", the area of memory that holds a program's dynamically allocated data. Unlike the C++ "new" operator, malloc doesn't know which data type it's allocating, since its only parameter is the number of bytes to allocate. It returns a bare untyped "void *" pointer, so you need an ugly pointer typecast on the return value (required in C++, optional in plain C).

Plain C, also works in C++	C++
int *arr=(int *)malloc(100*sizeof(int));	int *arr=new int[100];
myClass *c=(myClass *)malloc(sizeof(myClass));	myClass *c=new myClass;

To free memory, call the function free, like "free(ptr);". As usual, you'll typically get a horrible crash, sometimes delayed and sometimes instant, if you free the same block more than once, free memory that didn't come from malloc, keep using memory after it has been freed, or commit many other misdeeds. If you forget to call free, you won't get an immediate crash, but in a long-running program like a network server, these un-freed objects (a "memory leak") will build up and eventually consume all the server's memory, causing it to slow down and eventually crash.

int len=3;
int *arr=(int *)malloc(len*sizeof(int));
int i;
for (i=0;i<len;i++)
	arr[i]=7+i;
iarray_print(arr,len);
free(arr);

(Try this in NetRun now!)
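
One common defensive habit, sketched here with made-up helper names (make_array and release_array are not part of any library), is to check malloc's return value and to zero out the pointer right after freeing it. This doesn't fix every misdeed listed above, but it makes an accidental second free harmless:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: allocate len ints and fill them with 7+i.
// Returns nullptr if malloc fails.
int *make_array(int len) {
    assert(len > 0);
    int *arr = static_cast<int *>(std::malloc(len * sizeof(int)));
    if (arr == nullptr) return nullptr;   // out of memory
    for (int i = 0; i < len; i++)
        arr[i] = 7 + i;
    return arr;
}

// Hypothetical helper: free the block and null out the caller's pointer.
// free(nullptr) is defined to do nothing, so calling this twice is safe.
void release_array(int *&arr) {
    std::free(arr);
    arr = nullptr;
}
```

This only protects the one pointer you pass in; any other copies of the pointer are still dangling after the free.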

Unlike C++ "new", it's easy to call malloc from assembly. In fact, you just "call malloc", with the number of bytes to allocate in rdi, and you get back a pointer in rax. Here's a simple example where we allocate space for one long, write into the pointer returned by malloc, and immediately free it.

mov rdi,8  ; a byte count to allocate
extern malloc
call malloc
; rax is start of our array

mov QWORD[rax],3 ; yay writeable memory!

mov rdi,rax ; pointer to deallocate
extern free
call free

ret

(Try this in NetRun now!)

Here's a complete example, including looping and printing the array, equivalent to the above C code:

push rbx ; save register
mov rbx,3 ; number of integers (4-byte DWORDs) to allocate

mov rdi,rbx ; integer count
imul rdi,4  ; now a byte count
extern malloc
call malloc
; rax is start of our array

mov rcx,0 ; loop index
jmp loopTest
loopStart:
	mov rdx,7
	add rdx,rcx ; ==7+i
	mov DWORD[rax+4*rcx],edx ; arr[i]=7+i  (int version)

	add rcx,1
	loopTest:
	cmp rcx,rbx ; i<len
	jl loopStart

mov rdi,rax ; pointer to start of array
mov rsi,rbx ; number of integers
extern iarray_print
push rax ; save pointer
call iarray_print
pop rax ; restore pointer

mov rdi,rax; pointer to memory to free
extern free
call free

pop rbx ; restore register
ret

(Try this in NetRun now!)

Stack Allocation

We've been using the stack all semester, first in the form of the "call" and "ret" instructions, and later in the form of "push" and "pop" instructions. Internally, the stack is represented via a pointer stored in the register rsp. The general rules for use of this stack pointer register are:

Any memory address >= rsp is already in use: like a preserved register, you're not allowed to change this memory. If you do, the program is likely to crash, probably when you try to return.
Any memory address < rsp is free: like a scratch register, anybody can use this memory.
Before you return, you must restore the pointer rsp to the value it had when your function was called.

This means you can actually manually allocate space on the stack, by:

Marking n bytes as in-use by subtracting* n from rsp: sub rsp,n
Releasing n bytes by adding to rsp: add rsp,n

Just as "call" corresponds to pushing the return address and jumping to the called function, and "ret" corresponds to popping the return address and jumping there, you can fake "push thingy" by allocating 8 bytes of stack space and then "mov QWORD [rsp], thingy", and you can fake "pop thingy" by "mov thingy, QWORD [rsp]" and then releasing 8 bytes of stack space.
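
To see the fake push and pop in slow motion, here is a toy model in C++. It is only a sketch: an ordinary array stands in for stack memory, and a member variable named rsp plays the role of the real register (ToyStack is invented for this example, and each array slot is one 8-byte value, so moving the pointer by one slot corresponds to 8 bytes).

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: 'memory' stands in for the stack region,
// and 'rsp' for the stack pointer register.
struct ToyStack {
    std::uint64_t memory[16];
    std::uint64_t *rsp = memory + 16;   // empty stack: just past the highest address

    void push(std::uint64_t thingy) {
        assert(rsp > memory);           // otherwise the toy stack is full
        rsp -= 1;                       // sub rsp,8   (allocate one 8-byte slot)
        *rsp = thingy;                  // mov QWORD [rsp], thingy
    }
    std::uint64_t pop() {
        assert(rsp < memory + 16);      // otherwise there's nothing to pop
        std::uint64_t thingy = *rsp;    // mov thingy, QWORD [rsp]
        rsp += 1;                       // add rsp,8   (release the slot)
        return thingy;
    }
};
```

Pushing 3 and then 7 and popping twice gives back 7 and then 3, and rsp ends up exactly where it started, which is the rule a real function has to follow too.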

More generally, if you need to allocate dynamic storage for use during a single function call, the stack is an easy and efficient place to do it.

The only downside of stack allocation is you MUST release your stack space before you can return from your function. This means long-lived data structures, like a user info class used for the rest of the program, need to be allocated on the heap with malloc.
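
The same tradeoff shows up in C and C++, where local variables typically live on the stack. A minimal sketch, with user_info, sum_scratch, and make_user all made up for illustration:

```cpp
#include <cassert>
#include <cstdlib>

struct user_info { long id; };   // hypothetical stand-in for a "user info class"

// Short-lived scratch storage: a local array typically lives on the
// stack, and is released automatically when the function returns.
long sum_scratch() {
    long scratch[3] = {1, 2, 3};
    long total = 0;
    for (int i = 0; i < 3; i++)
        total += scratch[i];
    return total;   // scratch is gone after this: never return its address!
}

// Long-lived storage: memory from malloc survives the return, and
// stays allocated until somebody calls free on it.
user_info *make_user(long id) {
    user_info *u = static_cast<user_info *>(std::malloc(sizeof(user_info)));
    if (u != nullptr) u->id = id;
    return u;
}
```

The caller of make_user owns the returned pointer and is responsible for eventually passing it to free.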

* It's quite weird that unlike everything else on the machine, the stack "grows down", allocating by subtracting from the pointer value. This is despite the most recent allocation being called the "top of the stack", when it's really the smallest address--this kinda makes sense if you think of writing out all the memory values at increasing addresses from top to bottom, but that's not usually how memory is shown! There are a few advantages to making the stack grow down. First, if you allocate an array of a few longs, the stack pointer also points to the start of the newly created array: the first one is stored in memory at QWORD [rsp], and the next one is at QWORD [rsp+8] (but if the stack grew upward, like everything else, you'd be pointing at the *end* of the array after allocating it by moving the pointer, requiring more error-prone pointer shuffling). Second, since the heap grows upward, toward larger pointer values, the program's entire memory is usually organized with the heap down low at one end of memory, and the stack up high near the other end of memory, and the unused memory in the middle can be traded off between stack and heap, whichever needs it.
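
The "pointer ends up at the start of the array" advantage is easy to check with a toy model. In this sketch an ordinary array stands in for the stack and a local pointer sp stands in for rsp (both invented for this example, and assuming 8-byte longs for the byte counts in the comments):

```cpp
#include <cassert>

// Toy model: 'memory' stands in for the stack, 'sp' for rsp.
// Returns the index in 'memory' where the new 3-long array starts,
// and hands back the array's second element through the pointer.
int toy_stack_array(long *second_element) {
    long memory[8] = {0};
    long *sp = memory + 8;    // empty stack, one past the highest slot
    sp -= 3;                  // like "sub rsp,24": allocate three longs
    sp[0] = 10;               // QWORD [rsp]
    sp[1] = 20;               // QWORD [rsp+8]
    sp[2] = 30;               // QWORD [rsp+16]
    assert(&sp[2] == &memory[7]);   // last element sits just below the old top
    *second_element = sp[1];
    int start_index = static_cast<int>(sp - memory);
    sp += 3;                  // like "add rsp,24": release the array
    return start_index;
}
```

After the subtract, sp already points at element 0 and the rest of the array sits at increasing addresses, so no extra pointer shuffling is needed.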

Storage Area	Allocate	Deallocate
The Stack	sub rsp,n	add rsp,n
The Heap	malloc	free

CS 301 Lecture Note, 2014, Dr. Orion Lawlor, UAF Computer Science Department.