Function Calling, and Preserved Registers

CS 301 Lecture, Dr. Lawlor

In assembly, you declare that a function exists using "extern function_name". This serves the same purpose as the "#include <iostream>" at the start of your C++ program: it tells the assembler which outside functions exist.

There's a very handy instruction named "call" that... calls a function. Very simple.

Exactly like your function returns its value in eax, whatever function you call will return its value in eax. So if you want to return that value yourself, you just do *nothing*!

extern read_input
call read_input
ret

(Try this in NetRun now!)

The tricky part is that if you have values in registers before calling the function, they're probably gone after the call, because the function can (and often will) change the values in registers, just like you can change them. In a week or so, you'll hear how to save values "on the stack", but for now, just don't keep anything in registers across a call!
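One way to stay out of trouble, as a minimal sketch: make the call first, and only load your own values afterward, so nothing has to survive the call. This assumes the same NetRun read_input helper used above; the constant 7 is just an illustration.

```nasm
extern read_input
call read_input ; return value arrives in eax; scratch registers may have been overwritten
mov ecx,7 ; loaded after the call, so the call cannot clobber it
add eax,ecx ; eax = input + 7
ret
```

If the mov were placed before the call instead, ecx could hold anything by the time the add runs.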

Assembly Registers

Registers are where you store data in assembly language--there aren't any variables, so everything has to either go in registers or somewhere in memory (which we haven't covered yet!).

"Scratch" registers you're allowed to overwrite and use for anything you want. "Preserved" registers serve some important purpose somewhere else, so you have to put them back ("save" the register) if you use them--for now, just leave them alone!

Each of these registers is available in several sizes:

rax is the 64-bit, "long" size register. It was added in 2003. The register names added along with 64-bit mode are listed in the table below.
eax is the 32-bit, "int" size register. It was added in 1985. I'm in the habit of using this register size, since it also works in 32-bit mode, although I should probably use the longer rax registers for everything.
ax is the 16-bit, "short" size register. It was added in 1978, with the 8086.
al and ah are the 8-bit, "char" size parts of the register. al is the low 8 bits (like ax&0xff), ah is the high 8 bits (like ax>>8). They date back to 1972.

Curiously, you can write a 64-bit value into rax, then read off the low 32 bits from eax, or the low 16 bits from ax, or the low 8 bits from al--it's just one register, but they keep on extending it!

rax: 64-bit

eax: 32-bit

ax: 16-bit

For example,

mov rcx,0xf00d00d2beefc03; load 64-bit constant
mov eax,ecx; pull out low 32 bits
ret

(Try this in NetRun now!)
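The 8-bit pieces work the same way. As a small sketch, this copies the high byte of cx into al. The constant 0x1234 is arbitrary, and eax is zeroed first because writing al changes only the low 8 bits and leaves the rest of the register alone.

```nasm
mov eax,0 ; clear the whole register before touching just al
mov ecx,0x1234 ; cx = 0x1234
mov al,ch ; ch is the high byte of cx (like cx>>8), so al = 0x12
ret ; returns 0x12, which is 18
```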

Here's the full list of x86 general-purpose registers. The Notes column says which registers are scratch and which are preserved.

Notes	64-bit	32-bit	16-bit	8-bit
Values are returned from functions in this register. Multiply instructions put the low bits of the result here too.	rax	eax	ax	ah and al
Typical scratch register. Some instructions use it as a counter (such as SAL or REP).	rcx	ecx	cx	ch and cl
Scratch register. Multiply instructions put the high bits of the result here.	rdx	edx	dx	dh and dl
Preserved register: don't use it without saving it!	rbx	ebx	bx	bh and bl
The stack pointer. Points to the top of the stack (wait for the details!)	rsp	esp	sp	spl
Preserved register. Sometimes used to store the old value of the stack pointer, or the "base".	rbp	ebp	bp	bpl
Scratch register. Also used to pass function argument #2 in 64-bit mode (on Linux).	rsi	esi	si	sil
Scratch register. Function argument #1.	rdi	edi	di	dil
Scratch register. These were added in 64-bit mode, so the names are more modern.	r8	r8d	r8w	r8b
Scratch register.	r9	r9d	r9w	r9b
Scratch register.	r10	r10d	r10w	r10b
Scratch register.	r11	r11d	r11w	r11b
Preserved register.	r12	r12d	r12w	r12b
Preserved register.	r13	r13d	r13w	r13b
Preserved register.	r14	r14d	r14w	r14b
Preserved register.	r15	r15d	r15w	r15b
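To illustrate the multiply notes in the rax and rdx rows, here is a hedged sketch using the unsigned "mul" instruction. The constants are chosen so the product is exactly 2^32, which does not fit in 32 bits: the low half lands in eax and the high half in edx.

```nasm
mov eax,0x10000 ; 2^16
mov ecx,0x10000 ; 2^16
mul ecx ; edx:eax = eax*ecx = 2^32, so eax = 0 and edx = 1
mov eax,edx ; return the high half
ret ; returns 1
```

Note that mul overwrites edx without being asked, which is one reason rdx is treated as a scratch register.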

The big problem with registers is they're in *hardware*: you're stuck with the existing names and sizes, and every function has to share them. If you'd defined a compiled language where there are only eight global variables declared in 1972, with weird hardcoded names, you'd be laughed straight to the HR office to be fired!