Pointer Arithmetic and "Malloc"

CS 301 Lecture, Dr. Lawlor

Pointer Arithmetic in Assembly

Consider a little two-int array, allocated using "dd" (data dword):
mov eax,[myint+4]
ret

myint:
dd 3
dd 7

(Try this in NetRun now!)

Note that [myint] is int "0", while [myint+4] is int "1".  In general, if myPtr points to the start of an array of ints, then int "i" is at [myPtr+4*i].  Amazingly, x86 actually supports this very expression!  Which is good, because this style is very handy for accessing arrays.  It even works if myPtr and i are variables stored in registers:
extern read_input
call read_input

mov rdx,myint ; pointer to start of array
mov eax,[rdx+4*rax] ; access array element rax
ret

myint:
dd 3
dd 7

(Try this in NetRun now!)
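For comparison, here's a minimal C sketch of the same address calculation, done by hand instead of letting the compiler scale the index.  The helper name element_at is made up for illustration, and the sketch assumes 4-byte ints, as on x86:

```c
#include <stdint.h>
#include <string.h>

/* Same data as the two "dd" lines in the assembly above. */
const int32_t myint[2] = {3, 7};

/* Hypothetical helper: fetch int number i by computing the
   byte address ptr + 4*i ourselves, like [rdx+4*rax]. */
int32_t element_at(const int32_t *ptr, long i) {
    const unsigned char *bytes = (const unsigned char *)ptr; /* view pointer as a byte address */
    int32_t value;
    memcpy(&value, bytes + 4 * i, sizeof value); /* load 4 bytes starting at ptr + 4*i */
    return value;
}
```

In ordinary C you would just write ptr[i]; the compiler generates the same "multiply the index by the element size" arithmetic for you.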

Memory Allocation in General

Memory (like real estate) in theory could be used by anybody for anything at any time (the anarchist squatter's paradise!).  Of course, in practice, it works a lot better to set up rules by which you can figure out what memory's yours, and what isn't (e.g., deeds, leases, rental contracts).  So a piece of memory can be:

  1. Owned by you, and used by you.  This is the good kind of memory, the only kind you should be using.
  2. Owned by somebody else, and erroneously used by you.  You can read or write surprisingly far past the end of an array before crashing, although you can easily end up overwriting something used by some other part of the program, or crashing yourself.  (A confusing "memory corruption" error.)
  3. Owned by somebody else, and deadly to even look at.  You get a "segmentation fault" access violation if you access this memory.  The CPU enforces the OS's wishes using the "page table", which you'll hear about in CS 321 (unless I tell you first!).

You can't tell the difference between type 2 memory (dangerous, but not right now) and type 3 memory (immediate death), so stick to type 1!

The bottom line is you really need to claim memory before using it, and then only use the part you claimed.  It's easy to accidentally run off the end of an array (owned by you, type 1 memory) into other bytes of memory owned by some other part of the program (type 2 memory), or delete an array (so it's no longer owned by you) and use it later, etc.  Sadly, in C++ it's up to you the programmer to make sure your uses of memory are correct, unlike Java or C#, where raw pointers aren't normally used and every array index is checked when the program runs.
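Since C and C++ won't check indices for you, one common defensive habit is to carry the array length along with the pointer and check it yourself.  Here's a minimal sketch; the name checked_read and its return convention are assumptions for illustration, not a standard library function:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: read arr[i] only if index i lies inside
   the n ints we own.  On success, stores the value in *out and
   returns true; otherwise leaves *out alone and returns false. */
bool checked_read(const int *arr, size_t n, size_t i, int *out) {
    if (i >= n) {
        return false;   /* past the end: not our memory, don't touch it */
    }
    *out = arr[i];
    return true;
}
```

This only helps if the length you pass in is right, so the ownership rules above still apply.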

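Anyway, there are a bunch of different ways for your code to legally claim some memory, including static data declared with "dd" (as above), calling malloc, and claiming space on the stack.  Malloc and the stack are covered below.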

Calling Malloc from Assembly Language

It's a pretty straightforward function: pass the number of *BYTES* you want as the only parameter, in rdi.  "call malloc."  You'll get back a pointer to the allocated bytes returned in rax.  To clean up the space afterwards, copy the pointer over to rdi, and "call free" (I'm leaving off the free below, because you need the stack to do that properly).

Here's a complete example of assembly memory access.  I call malloc to get 40 bytes of space.  malloc returns the starting address of this space in rax (the 64-bit version of eax).  That is, the rax register is acting like a pointer.  I can then read and write from the pointed-to memory using the usual assembly bracket syntax:
mov edi, 40; malloc's first (and only) parameter: number of bytes to allocate
extern malloc
call malloc
; on return, rax points to our newly-allocated memory
mov ecx,7; set up a constant
mov [rax],ecx; write it into memory
mov edx,[rax]; read it back from memory
mov eax,edx; copy into return value register
ret

(Try this in NetRun now!)

Rather than copy via the ecx register, you can specify you want a 32-bit memory write and read using "DWORD" in front of the brackets, like this:
mov edi, 40; malloc's first (and only) parameter: number of bytes to allocate
extern malloc
call malloc
; on return, rax points to our newly-allocated memory
mov DWORD [rax],7; write constant into memory
mov eax,DWORD [rax]; read it back from memory
ret

(Try this in NetRun now!)
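Here's roughly what that assembly corresponds to in C, as a sketch.  The function name malloc_demo is made up, returning -1 when malloc fails is an arbitrary choice, and unlike the assembly this version also calls free:

```c
#include <stdlib.h>

/* Hypothetical C-level equivalent of the assembly above. */
int malloc_demo(void) {
    int *p = malloc(40);      /* like "mov edi,40" then "call malloc" */
    if (p == NULL) {
        return -1;            /* malloc can fail; the assembly doesn't check */
    }
    *p = 7;                   /* like "mov DWORD [rax],7" */
    int result = *p;          /* like "mov eax,DWORD [rax]" */
    free(p);                  /* hand the memory back when done */
    return result;
}
```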

Malloc on arrays

The typical place you use malloc is to make some space for a variable-length array.  A bunch of "dd" commands works fine if you know how many integers to allocate (for example, "times 100 dd 0" makes room for a hundred integers), but you have to know how many you need at compile time.

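To allocate an array of n integers, multiply n by 4 (bytes per integer) to get a byte count, pass that count to malloc in rdi, and then access integer i at [rax+4*i].  For example: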
mov edi,10 ; ten integers in our array
imul edi,4 ; multiply by 4 to get a byte count
extern malloc
call malloc
; rax is a pointer to the allocated space
mov rdi,10; n
mov rcx,0 ; i
jmp testloop
initloop:
mov DWORD[rax+4*rcx],ecx; write to integer at index rcx
add rcx,1 ; i++
testloop:
cmp rcx,rdi
jl initloop

mov eax,DWORD[rax+4*7] ; pull out the integer at index 7
ret

(Try this in NetRun now!)

To allocate an array of n 64-bit "long" values, you just need to replace the "4" bytes/integer above with "8" bytes/long:
mov edi,10 ; ten longs in our array
imul edi,8 ; multiply by 8 to get a byte count
extern malloc
call malloc
; rax is a pointer to the allocated space
mov rdi,10; n
mov rcx,0 ; i
jmp testloop
initloop:
mov QWORD[rax+8*rcx],rcx; write to long at index rcx
add rcx,1 ; i++
testloop:
cmp rcx,rdi
jl initloop

mov rax,QWORD[rax+8*7] ; pull out the long at index 7
ret

(Try this in NetRun now!)
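In C, the "4" versus "8" scaling is handled by sizeof and by ordinary array indexing.  Here's a sketch of the 64-bit version of the loop; the function name array_demo and its -1 error return are assumptions for illustration:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n 64-bit values, fill slot i
   with i, and return the value stored at index pick.
   Returns -1 if pick is out of range or malloc fails. */
int64_t array_demo(size_t n, size_t pick) {
    if (pick >= n) {
        return -1;                        /* never index past what we own */
    }
    int64_t *arr = malloc(n * sizeof *arr); /* n * 8 bytes, like "imul edi,8" */
    if (arr == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        arr[i] = (int64_t)i;              /* like "mov QWORD[rax+8*rcx],rcx" */
    }
    int64_t result = arr[pick];           /* like "mov rax,QWORD[rax+8*7]" when pick is 7 */
    free(arr);
    return result;
}
```

Calling array_demo(10, 7) mirrors the assembly example: ten values, then pull out index 7.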


The Stack Pointer

"push" and "pop" are implemented using the "stack pointer" to point to the most recently pushed value.  On x86, the stack pointer is stored in the register called "rsp" (Register: Stack Pointer). 

Conceptually, the stack is divided into two areas: higher addresses, which are all in use and reserved (you can't change these values!), and lower addresses, which are unused (free or scratch space).  The stack pointer points to the most recently pushed value, which is the lowest-addressed in-use part of the stack.  The standard convention is that when your function starts up, you can claim some of the stack by moving the stack pointer down--this indicates to any functions you might call that you're using those bytes of the stack.  You can then use that memory for anything you want, as long as you move the stack pointer back before your function returns.

Address        Contents
0x000...000    "low memory"
               unused stack area (you can claim this space)
rsp->          end of reserved data ("top of the stack")
               reserved stack data (main's variables)
0xfff...fff    "high memory"

It's very annoying that the stack starts at high addresses and grows toward lower addresses: everything else on the machine (arrays, malloc space, strings, even integers) starts at low addresses and grows toward higher addresses.  The reason is historical: on ancient machines with only a little memory space to work with, they'd put their data at one end of memory (near address zero), and the stack as far away as it could get, near high memory.  Then the program's data or stack space could grow as far as possible without overwriting the other.  Of course, on a 64-bit machine you've got billions of gigabytes of address space, so you're unlikely to run out no matter which way the stack grows, but we're stuck with the convention that "the stack grows toward lower memory".  Confusingly, the last reserved value (at the lowest address rsp) is called the "top" of the stack.

"push" and "pop" are implemented via the stack pointer:
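  "push rax" does the same thing as "sub rsp,8" followed by "mov QWORD [rsp],rax"
  "pop rax" does the same thing as "mov rax,QWORD [rsp]" followed by "add rsp,8"

Sadly, if you screw up the stack, such as by forgetting to pop or move the stack pointer back, or overwriting part of the stack that isn't yours, then the function that called you (such as main) will normally crash horribly.  So be careful with the stack!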
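Here's a toy model in C of how push and pop move the stack pointer.  It's only a sketch: the ToyStack type and the toy_ functions are made up, the "stack pointer" is an array index rather than a real address, and each slot stands for one 8-byte value:

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

typedef struct {
    uint64_t mem[STACK_WORDS]; /* pretend stack memory, 8 bytes per slot */
    size_t sp;                 /* index of the most recently pushed slot;
                                  STACK_WORDS means the stack is empty */
} ToyStack;

void toy_init(ToyStack *s) {
    s->sp = STACK_WORDS;       /* start at the high end, like a real stack */
}

/* Returns 0 on success, -1 if the toy stack is full. */
int toy_push(ToyStack *s, uint64_t value) {
    if (s->sp == 0) {
        return -1;
    }
    s->sp -= 1;                /* move the pointer down one slot first */
    s->mem[s->sp] = value;     /* then store at the new top */
    return 0;
}

/* Returns 0 on success, -1 if the toy stack is empty. */
int toy_pop(ToyStack *s, uint64_t *out) {
    if (s->sp == STACK_WORDS) {
        return -1;
    }
    *out = s->mem[s->sp];      /* load from the top first */
    s->sp += 1;                /* then move the pointer back up */
    return 0;
}
```

Unlike this model, the real push and pop do no checking at all, which is exactly why a mismatched push or pop corrupts your caller.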

Here's how we allocate some space on the stack, then read and write it:
sub rsp,16 ; I claim the next sixteen bytes in the name of... me!

mov QWORD [rsp],1492 ; store a long integer into our stack space
mov rax,QWORD [rsp] ; read our long from where we stored it

add rsp,16 ; Hand back the stack space
ret

(Try this in NetRun now!)

Here's how we'd allocate one hundred long integers on the stack, then use just one of them:
sub rsp,800 ; I claim the next eight hundred bytes

mov rdi,rsp ; points to the start of our 100-integer array
add rdi,320 ; move to integer 40 in the array (40 * 8 bytes)
mov QWORD [rdi],1492 ; store an integer into our stack space
mov rax,QWORD [rdi] ; read our integer from where we stored it

add rsp,800 ; Hand back the stack space
ret

(Try this in NetRun now!)
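In C, a local array is the usual way to get this kind of stack space.  This is a sketch under two assumptions: that long is 8 bytes (true on 64-bit Linux, not everywhere), and that the compiler really keeps the array on the stack, which an optimizer is free not to do.  The function name is made up:

```c
/* Hypothetical C-level equivalent of the assembly above. */
long stack_array_demo(void) {
    long arr[100];     /* typically claimed by moving the stack pointer
                          down, roughly like "sub rsp,800" */
    arr[40] = 1492;    /* store into element 40 */
    return arr[40];    /* read it back; the space is handed back
                          automatically when the function returns */
}
```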

"push" and "pop" are handy if you've only got one integer to stick on or pull off the stack.  In 32-bit mode, push and pop are really useful for function arguments, which by convention in 32-bit mode are stored on top of the stack when you call the function:
push 19
extern print_int
call print_int
pop eax ; MUST clean up the stack
ret

(Try this in NetRun now!) (32-bit mode)

This prints the "19" that's stored on top of the stack.  In 32-bit mode, all function arguments are stored on the stack (unlike 64-bit code, where the first several arguments are passed in registers such as rdi).  This means the stack is a rather funny mix of function arguments, local and temporary variables, totally unused space for alignment, etc.

Stack Frames: rbp

There's one fairly handy saved register called rbp, the "base pointer" (the 64-bit version of ebp, the "extended base pointer").  Here's the standard use of rbp: to stash the value of the stack pointer at the start of the function.  This is sometimes a little easier than indexing from rsp directly, since rsp changes every time you push or pop--rbp, by contrast, can stay the same through your entire function.
push rbp; stash old value of rbp on the stack
mov rbp,rsp; rbp == stack pointer at start of function

sub rsp,1000 ; make some room on the stack
mov DWORD[rbp-4],7 ; local variables are at negative offsets from the base pointer (a QWORD here would overwrite part of the saved rbp)
mov eax,DWORD[rbp-4]; same local variable

mov rsp,rbp; restore stack pointer (easier than figuring the correct "add"!)
pop rbp; restore rbp
ret

(Try this in NetRun now!)

rbp isn't used this way very often in 64-bit mode, but in 32-bit mode it's almost standard.  The piece of the stack around the base pointer is often called the function's "stack frame": negative offsets get to the function's local variables, positive offsets get to the return address and (in 32-bit mode, where parameters live on the stack) the caller's parameters, and directly at rbp is the saved copy of the old rbp.  This effectively makes a chain of rbp pointers (assuming every function uses the frame pointer correctly); on some machines you can "unwind the stack" or print a "stack trace" by following this chain of pointers.
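Here's a small C model of that chain of saved base pointers.  It's only a sketch: the Frame struct and unwind_depth are made up for illustration, and a real stack frame also holds the return address and locals, which this model leaves out:

```c
#include <stddef.h>

/* Hypothetical model of one stack frame. */
typedef struct Frame {
    struct Frame *saved_rbp;  /* what sits directly at [rbp]: the caller's frame */
    const char *name;         /* label for illustration only; not on a real stack */
} Frame;

/* Walk the chain of saved base pointers, the way a stack trace
   would, and count how many frames are on it. */
size_t unwind_depth(const Frame *rbp) {
    size_t depth = 0;
    while (rbp != NULL) {     /* the outermost frame has no caller in this model */
        depth++;
        rbp = rbp->saved_rbp; /* step from this frame to its caller's frame */
    }
    return depth;
}
```

If any function along the way skips the "push rbp / mov rbp,rsp" setup, the chain is broken, which is the "assuming every function uses the frame pointer correctly" caveat above.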