Calling Functions in Assembly

CS 301: Assembly Language Programming Lecture, Dr. Lawlor

As a reminder, on our x86-64 Linux machines:

rax is the return value register
rcx and rdx are "scratch" registers you're allowed to use for temporary values
rdi is the first function argument
rsi is the second function argument

These are the same registers used by your foo function, as well as any other functions you call.

Calling Functions in Assembly

The instruction "call" is used to call another function. A called function can then return using "ret".

By default, the assembler assumes functions you call are defined later in the same file. To call an external function, such as NetRun's "print_long", or a standard C library function like "exit", you need to tell the assembler the function is "extern". "extern" isn't actually an instruction--it doesn't show up in the disassembly--it's just a message to the assembler (often called a pseudoinstruction). In C++ or C, header files tend to contain declarations of external functions, and so play the same role as extern in the assembler.

Here's how we'd call the standard C library function "exit", which ends the program:

extern exit ; tell assembler function is defined elsewhere
call exit ; call the function

(Try this in NetRun now!)

Here's how we'd call the standard C library function "getchar", which reads one ASCII character from standard input (the same stream C++ calls cin) and returns it in rax:

extern getchar
call getchar
ret

(Try this in NetRun now!)

Calling I/O Functions in Assembly

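The standard C library includes several functions for doing input/output (I/O). These require an "aligned stack": the stack pointer must be a multiple of 16 (an address ending in hex digit 0) at the moment of the call. Your function starts out 8 bytes off, because the call that ran it pushed an 8-byte return address, and each push moves the stack pointer by another 8 bytes. For our purposes, that means you need an *odd* number of pushes before calling these functions. If you don't have anything useful to push, you can push any register, like rdx. As usual, you need to pop everything you pushed before you return.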

The NetRun function "read_input" parses an integer from cin, and returns it in rax:

push rdx ; align stack
	extern read_input
	call read_input
	; rax has the read-in integer now
	add rax,10000
pop rdx ; clean up stack
ret

(Try this in NetRun now!)

To show integers, "print_long" takes one long integer in rdi, and prints it on the screen:

push rdx ; align stack
	mov rdi, 17 ; function argument goes into print_long in rdi
	extern print_long
	call print_long
pop rdx ; clean up stack
ret

(Try this in NetRun now!)
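
The two calls above can be chained: the value one function returns in rax becomes the next function's argument once it is copied into rdi. Here is a minimal sketch, assuming NetRun's "Inside a Function" mode and the same NetRun-provided read_input and print_long used above:

```nasm
push rdx ; align stack (one push covers both calls)
	extern read_input
	call read_input ; rax = the integer that was read in
	mov rdi, rax ; return value becomes print_long's first argument
	extern print_long
	call print_long ; prints the integer we just read
pop rdx ; clean up stack
ret ; rax is whatever print_long left there
```

A single push is enough here because the stack pointer does not move between the two calls.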

One caution: if you call any function, that function has a perfect right to use rax, rcx, rdx, rsi, rdi, and the other scratch registers. This makes it a little tricky to store data across a function call. For example, print_long will use rax, so this won't actually return 5:

push rdx ; align stack
	mov rax, 5 ; I hope to return this
	mov rdi, 3 ; function argument for print_long
	extern print_long
	call print_long
; CAUTION: print_long trashed our rax!
pop rdx ; clean up stack
ret

The solution is to save our register on the stack before calling a function that might trash it.

mov rax, 5 ; I hope to return this
push rax ; save it on the stack (also aligns the stack)
	mov rdi, 3 ; function argument for print_long
	extern print_long
	call print_long
; CAUTION: print_long trashed rax!
pop rax ; restore our rax (and clean up stack)
ret

(Try this in NetRun now!)
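
Saving two registers the same way would leave an even number of pushes, so the stack would be misaligned at the call. One way around this, sketched here under the same NetRun assumptions (the values 5 and 7 are only for illustration), is a third push that acts purely as padding:

```nasm
mov rcx, 5 ; first value to keep
mov rdx, 7 ; second value to keep
push rcx ; push 1: save rcx
push rdx ; push 2: save rdx
push rdx ; push 3: padding only, makes the push count odd
	mov rdi, 3 ; function argument for print_long
	extern print_long
	call print_long ; may trash rcx and rdx
pop rdx ; pop 3: throw away the padding
pop rdx ; pop 2: restore rdx (7)
pop rcx ; pop 1: restore rcx (5)
mov rax, rcx
add rax, rdx ; rax = 5 + 7 = 12
ret
```

The pops run in the reverse order of the pushes, because the stack hands back the most recently pushed value first.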

Summary of the rules for calling functions on Linux x86-64:

The stack needs to be aligned before calling a function (functions that do I/O will crash if this isn't true, but it's a good idea regardless)

Inside your function, that means pushing an odd number of times (1 push, 3 pushes, 5 pushes, etc.) before the call, so the stack is aligned at the moment the call happens.

A function's first argument arrives in rdi, and its second argument in rsi
The function is allowed to trash all scratch registers (rax, rcx, rdx, rdi, rsi, r8-r11) without warning
The function will not change preserved registers (rbx, rbp, r12-r15)
The function can push things, but it needs to pop them before returning.

Your function also obeys these rules!
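
The preserved-register rule gives a second way to keep a value across a call: park it in rbx, which the called function must leave alone. Since your function has to obey the same rule toward its own caller, it saves and restores rbx itself. A minimal sketch, again assuming NetRun's "Inside a Function" mode and its print_long:

```nasm
push rbx ; save our caller's rbx (this one push also aligns the stack)
	mov rbx, 5 ; I hope to return this
	mov rdi, 3 ; function argument for print_long
	extern print_long
	call print_long ; may trash rax, but must not change rbx
	mov rax, rbx ; rax = 5, the value survived the call
pop rbx ; restore our caller's rbx
ret
```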

Defining Functions in Assembly

A function defined in assembly looks just like a jump label, but then ends with a "ret". Here's an example:

mov rdi,7 ; pass a function parameter
call otherFunction  ; run the function below, until it does "ret"
add rax,1 ; modify returned value
ret

otherFunction: ; a function "declaration" in assembly
	mov rax,rdi ; return our function's only argument
	ret

(Try this in NetRun now!)

Notice how a function definition in assembly looks exactly like a goto label. The only real difference is that you can get back from a function by executing "ret", which returns to whoever called you.
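
A function you define yourself receives its arguments in the same registers as any other function. This sketch passes two arguments; the name subtractArgs is made up for illustration, and like the example above it skips the alignment push because nothing here does I/O:

```nasm
mov rdi, 10 ; first argument
mov rsi, 3 ; second argument
call subtractArgs
ret ; returns 7

subtractArgs: ; hypothetical two-argument function
	mov rax, rdi ; start with the first argument
	sub rax, rsi ; subtract the second argument
	ret ; result is in rax
```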

If you mix up call / ret and jmp / jmp, often parts of your code will get skipped over. For example, if we replace "call otherFunction" with "jmp otherFunction", then otherFunction's ret will return all the way back to main, and we never do the add.
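
Here is that mix-up written out, as a sketch of the earlier example with only the one instruction changed:

```nasm
mov rdi,7 ; pass a function parameter
jmp otherFunction ; MISTAKE: jmp does not save a return address
add rax,1 ; skipped: nothing ever comes back here
ret

otherFunction:
	mov rax,rdi ; rax = 7
	ret ; pops OUR caller's return address, so we leave with 7, not 8
```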

            Getting There     Getting Back
function    call somewhere    ret *
goto        jmp somewhere     jmp backToYou

* How does "ret" know where to return to? "call" stores the return address on the stack, like a push, before jumping to the function. "ret" pops that address and jumps back to it.
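
To make the footnote concrete, this sketch does by hand what "call" does in one step: it pushes a return address and then jumps. The label returnHere is made up for illustration, and the lea with "rel" is just one way to get a label's address into a register; in real code you would simply use "call".

```nasm
mov rdi,7 ; pass a function parameter
lea rax,[rel returnHere] ; rax = address of the instruction to come back to
push rax ; store the return address on the stack, as call would
jmp otherFunction ; then jump to the function
returnHere: ; otherFunction's ret pops the address above and lands here
add rax,1 ; modify returned value: 7 + 1 = 8
ret

otherFunction:
	mov rax,rdi ; return our function's only argument
	ret ; pops the pushed address and jumps to returnHere
```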

By default, functions you define in assembly are only visible in your file. You can make those functions visible from outside using the "global" directive, another pseudoinstruction like "extern". The two are mirror images: "extern" brings in a name defined in another file, while "global" makes a name defined in this file available to other files. In "Inside a Function" mode, NetRun automatically declares your function this way, but if you switch to "Whole Functions" mode, you can take the training wheels off and declare it yourself:

global foo ; make "foo" visible from outside this file
foo: ; a function is just a jump label
	mov rax,3
	ret

(Try this in NetRun now!)
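
Both directives can appear in the same file. This is a minimal sketch for "Whole Functions" mode, assuming NetRun still supplies print_long and calls your foo; the value 42 is arbitrary:

```nasm
global foo ; make "foo" visible from outside this file
extern print_long ; print_long is defined outside this file
foo:
	push rdx ; align stack
	mov rdi, 42 ; function argument for print_long
	call print_long
	pop rdx ; clean up stack
	mov rax, 0 ; return value
	ret
```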