Any time you're doing low-level memory manipulation, you need to keep in mind how many bytes are used by an object. For example, to allocate memory, you need the size in bytes. To do pointer manipulation, like skipping over an object to find the next object, you need the size in bytes to know how far to skip.
Array type | First element | Move to next | Access i'th element |
Array of char (string) | BYTE [ ptr ] | ptr+1 | BYTE [ ptr + i ] |
Array of integer | DWORD [ ptr ] | ptr+4 | DWORD [ ptr + 4*i ] |
Array of longs | QWORD [ ptr ] | ptr+8 | QWORD [ ptr + 8*i ] |
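The table's byte arithmetic can be tried out in C++ too. Here's a minimal sketch, assuming a typical machine where int is 4 bytes; the helper names int_at and step_in_bytes are made up for illustration. It views a pointer as raw bytes, so the "ptr + 4*i" scaling that assembly makes you write by hand is visible:

```cpp
#include <cstddef>

// Read element i of an int array with explicit byte arithmetic,
// mirroring the assembly access DWORD [ptr + 4*i].
int int_at(const int *arr, std::size_t i) {
    const char *bytes = reinterpret_cast<const char *>(arr); // byte-granular view
    return *reinterpret_cast<const int *>(bytes + sizeof(int) * i);
}

// Number of bytes that "ptr+1" actually moves for a given element type.
template <typename T>
std::size_t step_in_bytes(const T *ptr) {
    const char *here = reinterpret_cast<const char *>(ptr);
    const char *next = reinterpret_cast<const char *>(ptr + 1);
    return static_cast<std::size_t>(next - here);
}
```

Normally C++ does this scaling for you: arr[i] and arr+1 already move in whole elements, which is why step_in_bytes reports sizeof(T) rather than 1.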
There's a handy keyword "sizeof" built into C and C++ that you can apply to any variable or type name to find out how many bytes of storage it requires. For example, if you want to know if you're running on a 64-bit Linux or Mac machine, where "long" is 64 bits or 8 bytes, you just print the size (64-bit Windows keeps "long" at 4 bytes, so this check doesn't work there):
return sizeof(long);
This returns 8 on my machines, indicating "long" takes 8 bytes.
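Since the size of "long" varies between platforms (64-bit Windows keeps it at 4 bytes), checking the size of a pointer is usually a more dependable 64-bit test. A small sketch, where bits_of and is_64_bit_build are hypothetical helper names:

```cpp
#include <climits>
#include <cstddef>

// Storage size of a type in bits: sizeof counts bytes, CHAR_BIT is bits per byte.
template <typename T>
std::size_t bits_of() {
    return sizeof(T) * CHAR_BIT;
}

// Pointer width is a more portable "is this a 64-bit build?" check than long.
bool is_64_bit_build() {
    return bits_of<void *>() == 64;
}
```

For example, bits_of<long>() is 64 wherever sizeof(long) returns 8.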
This provides an interesting way to see inside objects, by measuring their size. For example, this secretive class is still just 8 bytes, indicating in memory it's just one "long" integer. You can even typecast a class pointer to a long int pointer, dereference the pointer, and there's the value. No real mystery!
class arcane_mystery {
private:
    long dark_secrets;
};
return sizeof(arcane_mystery);
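Here's a sketch of the typecast trick described above. The constructor and the peek_secret function are additions for illustration only, so there's a value to find:

```cpp
class arcane_mystery {
public:
    // Hypothetical constructor, added only so the example can store a value.
    explicit arcane_mystery(long secret) : dark_secrets(secret) {}
private:
    long dark_secrets;
};

// Read the private field by treating the object's address as a long pointer.
long peek_secret(const arcane_mystery &m) {
    return *reinterpret_cast<const long *>(&m);
}

// The class adds no storage beyond its single long member.
static_assert(sizeof(arcane_mystery) == sizeof(long), "one long, nothing else");
```

This works because the class has exactly one data member and no virtual functions, so the object's address is also the address of dark_secrets. "private" is enforced by the compiler, not by the memory layout.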
There isn't any equivalent to "sizeof" in assembly language--you just need to remember the sizes of everything yourself!
In assembly you can easily convert between pointer and numeric types without doing anything special--read QWORD[rcx], add to rcx, write BYTE[rcx]. In C or C++, you need to typecast pointers to convert data between types. An example of a C-style cast is:
return *(long *)&foo;
This converts the bytes of the foo function itself (!) into a long integer, and returns it (0xC3FFFFFFF9058B48). In C++, you can use the same simple C syntax, or a fancier-looking C++ cast that produces exactly the same results:
return *reinterpret_cast<long *>(&foo);
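The same "bytes become a number" idea can be tried on an ordinary byte array. This is a minimal sketch, assuming a little-endian machine like x86-64; bytes_as_u64 is a made-up helper name. It uses memcpy rather than a pointer cast, which sidesteps the alignment and aliasing trouble a cast can cause on arbitrary data:

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the first 8 bytes at p as one 64-bit integer.
// memcpy copies the raw bytes, avoiding the alignment and aliasing
// pitfalls of dereferencing a cast pointer.
std::uint64_t bytes_as_u64(const unsigned char *p) {
    std::uint64_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;
}
```

On a little-endian machine the first byte in memory is the lowest byte of the number, so the bytes 48 8B 05 F9 FF FF FF C3 come back as 0xC3FFFFFFF9058B48, the same value shown above.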
By default, a string defined with "db" is treated as part of the program's executable code, so the string's bytes can't be modified--if you write to BYTE[rdi], the program will crash with a write to unwriteable memory. But you can tell the assembler with "section .data" to put the string into modifiable memory, which you can read or write:
mov rdi, daString ; pointer to string
mov BYTE [rdi+0], 'Y' ; change the string's bytes
extern puts
call puts ; print the string
ret

section .data ; switch storage mode to modifiable data
daString:
    db `No.`,0 ; sets bytes of string in memory
Again, there are a number of "sections" you can access. By default everything's in the code section ".text". The section directive applies to everything listed in the code until you hit another section directive.
Name | Use | Discussion |
section .data | r/w data | This data is initialized, but can be modified. |
section .rodata | r/o data | This data can't be modified, which lets it be shared across copies of the program. |
section .bss | r/w space | This is automatically initialized to zero, meaning the contents don't need to be stored explicitly. |
section .text | r/o code | This is the program's executable machine code (it's binary data, not plain text!). |
In C or C++, global or static variables get stored in section .data if you give them an initial value. If they don't have an initial value, they're put in section .bss to get zero-initialized on load. If they're "const", they go in .rodata.
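Here's a sketch of one global of each kind. The variable and function names are made up, and the section named in each comment is where typical compilers put that kind of variable, not something the language guarantees:

```cpp
long initialized_counter = 3;     // has an initial value: typically lands in .data
long zeroed_scratch[4];           // no initial value: typically .bss, zero on load
const char greeting[] = "No.";    // const: typically .rodata, so it can't be written

// Writable sections can be modified at run time.
long bump_counter() {
    zeroed_scratch[0] += 1;       // starts from the zero provided at load time
    return ++initialized_counter;
}

// Read-only data can still be read like any other memory.
char first_greeting_byte() {
    return greeting[0];
}
```

The first call to bump_counter returns 4, and zeroed_scratch[0] goes from 0 to 1 without the program ever having stored that starting 0 itself.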
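We've been using the stack all semester, first in the form of the "call" and "ret" instructions, and later in the form of "push" and "pop" instructions. Internally, the stack is represented via a pointer stored in the register rsp. The general rules for use of this stack pointer register are:
- rsp points to the most recently pushed value, the "top" of the stack.
- The stack grows down*: pushing subtracts from rsp, and popping adds to it.
- When your function returns, rsp must be back where it was when the function started, so "ret" pops the right return address.
This means you can actually manually allocate space on the stack, by:
- Subtracting the number of bytes you need from rsp: sub rsp, nBytes
- Using the memory from rsp up to rsp+nBytes however you like.
- Adding the same number of bytes back to rsp before you return: add rsp, nBytes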
Exactly like "call" corresponds to pushing the return address and jumping to the called function, and "ret" corresponds to popping the return address and jumping there, you can fake a "push thingy" by allocating 8 bytes of stack space and then doing "mov QWORD [rsp], thingy", and you can fake a "pop thingy" by doing "mov thingy, QWORD [rsp]" and then releasing 8 bytes of stack space.
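The fake push and pop can be modeled in C++ with a small array and a pointer standing in for rsp. This is a toy sketch with made-up names and no overflow checking, assuming a machine where a long is 8 bytes so that moving the pointer by one long matches moving rsp by 8:

```cpp
// Toy model of the machine stack: 16 longs of memory and a stack pointer
// that starts just past the high end and moves down as values are pushed.
// All names here are hypothetical, for illustration only.
struct toy_stack {
    long memory[16];
    long *sp = memory + 16;   // plays the role of rsp; the stack starts empty

    void push(long thingy) {
        sp -= 1;              // like: sub rsp, 8
        *sp = thingy;         // like: mov QWORD [rsp], thingy
    }
    long pop() {
        long thingy = *sp;    // like: mov thingy, QWORD [rsp]
        sp += 1;              // like: add rsp, 8
        return thingy;
    }
};
```

After pushing 1 and then 2, sp[0] is 2 and sp[1] is 1: the newest value sits at the lowest address, and popping hands the values back in reverse order.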
More generally, if you need to allocate dynamic storage for use during a single function call, the stack is an easy and efficient place to do it.
The only downside of stack allocation is you MUST release your stack space before you can return from your function. This means long-lived data structures, like a user info class used for the rest of the program, need to be allocated somewhere else.
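In C++ the compiler does this stack bookkeeping for local variables. A minimal sketch of the lifetime difference, with made-up function names: the first function's scratch array disappears when it returns, while the second hands back heap memory that survives the call.

```cpp
#include <cstdlib>

// Scratch space that only this call needs: a local array lives on the stack
// and is released automatically when the function returns.
long sum_of_squares_below_4() {
    long scratch[4];
    for (int i = 0; i < 4; i++) scratch[i] = static_cast<long>(i) * i;
    long total = 0;
    for (int i = 0; i < 4; i++) total += scratch[i];
    return total;   // returning the value is fine; returning &scratch[0] would dangle
}

// Data that must outlive the call has to live elsewhere, such as the heap.
// The caller owns the result and must pass it to free exactly once.
long *make_long_lived(long value) {
    long *p = static_cast<long *>(std::malloc(sizeof(long)));
    if (p != nullptr) *p = value;
    return p;
}
```

Returning a pointer into scratch would compile, but the space gets reused by the very next function call, so the data would be overwritten at some unpredictable later time.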
* It's quite weird that unlike everything else on the machine, the stack "grows down", allocating by subtracting from the pointer value. This is despite the most recent allocation being called the "top of the stack", when it's really the smallest address--this kinda makes sense if you think of writing out all the memory values at increasing addresses from top to bottom, but that's not usually how memory is shown! There are a few advantages to making the stack grow down. First, if you allocate an array of a few longs, the stack pointer also points to the start of the newly created array: the first one is stored in memory at QWORD [rsp], and the next one is at QWORD [rsp+8] (but if the stack grew upward, like everything else, you'd be pointing at the *end* of the array after allocating it by moving the pointer, requiring more error-prone pointer shuffling). Second, since the heap grows upward, toward larger pointer values, the program's entire memory is usually organized with the heap down low at one end of memory, and the stack up high near the other end of memory, and the unused memory in the middle can be traded off between stack and heap, whichever needs it.
malloc is the standard C way to allocate memory from "the heap", the area of memory where most of a program's stuff is stored. Unlike the C++ "new" operator, malloc doesn't explicitly know which data type it's allocating, since its only parameter is the number of bytes to allocate. This means you need a pointer typecast to add a type to the bare pointer returned by malloc.
Plain C, also works in C++ | C++ |
int *arr=(int *)malloc(100*sizeof(int)); | int *arr=new int[100]; |
myClass *c=(myClass *)malloc(sizeof(myClass)); | myClass *c=new myClass; |
To free memory, call the function free, like "free(ptr);". You need to eventually call free exactly once for every malloc. As usual, you'll get a horrible crash, sometimes delayed and sometimes instant, if you free the same block more than once, free memory that didn't come from malloc, free memory and then somebody keeps using it, or many other memory misdeeds. If you forget to call free, you won't get an immediate crash, but in a long-running program like a network server, these un-freed objects will build up and eventually consume all the server's memory, causing it to slow down and eventually crash.
int len=3;
int *arr=(int *)malloc(len*sizeof(int));
int i;
for (i=0;i<len;i++)
    arr[i]=7+i;
iarray_print(arr,len);
free(arr);
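Here's a self-contained sketch of the same allocate-and-fill pattern, wrapped in a function with a made-up name (make_counting_array) and without the course's iarray_print helper. It also checks for malloc returning a null pointer, which is how malloc reports that it couldn't find the memory:

```cpp
#include <cstddef>
#include <cstdlib>

// Allocate len ints on the heap and fill them with 7, 8, 9, ...
// Returns nullptr if malloc fails; otherwise the caller must free the result.
int *make_counting_array(std::size_t len) {
    int *arr = static_cast<int *>(std::malloc(len * sizeof(int)));
    if (arr == nullptr) return nullptr;   // out of memory
    for (std::size_t i = 0; i < len; i++) arr[i] = 7 + static_cast<int>(i);
    return arr;
}
```

A caller would use it like "int *arr=make_counting_array(3);", read arr[0] through arr[2], and then call "free(arr);" exactly once.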
Unlike C++ "new", it's easy to call malloc from assembly. In fact, you just "call malloc", with the number of bytes to allocate in rdi, and you get back a pointer in rax. Here's a simple example where we allocate space for one long, write into the pointer returned by malloc, and immediately free it.
mov rdi,8 ; a byte count to allocate
extern malloc
call malloc ; rax is start of our array
mov QWORD[rax],3 ; yay writeable memory!
mov rdi,rax ; pointer to deallocate
extern free
call free
ret
Here's a complete example, including looping and printing the array, equivalent to the above C code:
push rbx ; save register
mov rbx,3 ; number of integers (4-byte DWORDs) to allocate
mov rdi,rbx ; integer count
imul rdi,4 ; now a byte count
extern malloc
call malloc ; rax is start of our array

mov rcx,0 ; loop index
jmp loopTest
loopStart:
    mov rdx,7
    add rdx,rcx ; ==7+i
    mov DWORD[rax+4*rcx],edx ; arr[i]=7+i (int version)
    add rcx,1
loopTest:
    cmp rcx,rbx ; i<len
    jl loopStart

mov rdi,rax ; pointer to start of array
mov rsi,rbx ; number of integers
extern iarray_print
push rax ; save pointer
call iarray_print
pop rax ; restore pointer

mov rdi,rax ; pointer to memory to free
extern free
call free
pop rbx ; restore register
ret
Storage Area | Allocate | Deallocate |
Static read-only constant | readHere: dq 3 | n/a |
Static read-write data | section .data readWriteHere: dq 3 | n/a |
The Stack | sub rsp, nBytes ; ptr = rsp | add rsp,nBytes |
The Heap | mov rdi, nBytes call malloc ; ptr = rax | mov rdi, ptr call free |