Writing Non-64-bit x86 Assembly, Making Direct Syscalls

CS 301 Lecture, Dr. Lawlor

Non-64-bit x86 code

This whole semester, we've been writing x86 code in 64-bit mode, where registers and pointers are 64 bits long.

In previous years, I've taught this class in 32-bit mode, where registers and pointers are 32 bits long.  "rax" doesn't exist in 32-bit mode, but "eax" still does.  Virtually everything works exactly the same in 64-bit, 32-bit, or even the (much older) 16-bit mode.  In particular, almost all the data movement and arithmetic instructions are ancient.


|  | 64-bit | 32-bit | 16-bit |
|---|---|---|---|
| First Year | 2003 (AMD Athlon64) | 1985 (Intel 386) | 1978 (Intel 8086/8088) |
| Typical OS | Vista 64 | Windows 95 through XP | Windows 3.1, MS DOS, PC boot sector |
| Pointers | 64-bit (8 byte QWORD) | 32-bit (4 byte DWORD) | 16-bit (2 byte WORD) "near" pointer; add another 16-bit segment number for a "far" pointer |
| Registers | rax, r8-r15, etc. | scratch: eax, ecx, edx; preserved: ebx, esp, ebp, esi, edi | ax, bx, cx, dx, sp, bp, si, di; segment registers: ss, ds, cs, es |
| Stack | Must be 16-byte aligned | Only 4-byte aligned (watch out for SSE!) | No alignment required, but usually 2-byte aligned |
| Parameters | In rdi, rsi, etc | On stack (in most call conventions) | Totally up to you. |
| SSE | Yes, xmm0-xmm15 | Probably, but only xmm0-xmm7 | No |

For example, this function works fine in 32-bit or 64-bit mode:
mov eax,17
ret

(Try this in NetRun now!)

To call a function like print_int with the parameter 13, in 32-bit mode we need to push it onto the stack:
push 13 ; print_int's first parameter
extern print_int
call print_int
pop eax ; <- clean up stack before returning
ret

(Try this in NetRun now!)

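When the called function starts running, the return address is on top of the stack at [esp], so the first parameter is always at [esp+4].  The second parameter is deeper in the stack, at [esp+8], so curiously you must push the rightmost parameter first and the leftmost parameter last!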
push 1 ; number of floats to print (second argument)
push myFloat ; address of our float (first argument)
extern farray_print
call farray_print
add esp,8 ; Clean up stack

ret ; Done with function

myFloat: dd 1.234

(Try this in NetRun now!)
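
To see the other side of this convention, here is a minimal sketch of a 32-bit function that reads its own parameters off the stack.  The name add2 and the values 12 and 5 are made up for illustration, and the code assumes the same caller-cleans-up convention used in the examples above.

```nasm
; Caller: computes add2(12,5), leaving the result in eax
push 5 ; second parameter (rightmost), pushed first
push 12 ; first parameter (leftmost), pushed last
call add2
add esp,8 ; clean up both parameters (2 x 4 bytes)
ret ; eax is 17 here

; int add2(int a,int b) -- hypothetical example function
add2:
mov eax,[esp+4] ; a: sits just above the return address at [esp]
add eax,[esp+8] ; b: one 4-byte slot deeper
ret
```

Because add2 never moves esp, the offsets +4 and +8 stay valid for the whole function; a function that pushes anything of its own has to add 4 to those offsets for each push.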

Other than that, 32-bit assembly is pretty nearly identical to 64-bit assembly!
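
For comparison, here is a sketch of the print_int call from above rewritten for 64-bit mode.  It assumes the usual 64-bit Linux convention from the table: the first parameter goes in rdi, and the stack must be 16-byte aligned at each call.  Nothing gets pushed for the parameter, but the stack pointer still has to be adjusted for alignment.

```nasm
sub rsp,8 ; our return address left rsp off by 8; realign to 16 bytes
mov edi,13 ; print_int's first parameter (writing edi zeroes the top half of rdi)
extern print_int
call print_int
add rsp,8 ; undo the alignment adjustment
ret
```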

Make a direct syscall

Normally, to interact with the outside world (files, network, etc) you just call some function, usually the exact same function you'd call from C or C++.  But sometimes, such as when you're implementing a C library, or when there is no C library call to access the functionality you need, you want to talk to the OS kernel directly.  There's a special x86 "interrupt" instruction to do this, called "int". 

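On Linux, you talk to the OS by loading up values into registers then calling "int 0x80".  Register eax holds the system call number, which describes what to do (open a file, write data, etc), and ebx, ecx, edx, esi, and edi hold the parameters describing how to do it.  This register-based parameter passing is similar to how we call functions in 64-bit x86.  "int 0x80" is the 32-bit Linux interface; it also works from 64-bit code on kernels built with 32-bit emulation, but it uses the 32-bit syscall numbers and only looks at the low 32 bits of each register, so any pointer you pass must fit in 32 bits.  Native 64-bit Linux code normally uses the "syscall" instruction instead, which has its own numbering and takes its parameters in rdi, rsi, rdx, r10, r8, and r9.  Other operating systems like BSD store syscall parameters on the stack, like the 32-bit x86 call interface!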

Konstantin Boldyshev has a good writeup and examples of Linux, BSD, and BeOS x86 syscalls, and a list of common Linux syscalls.  (The full list of 32-bit Linux syscall numbers, which are the ones "int 0x80" uses, is in /usr/include/asm/unistd_32.h.)  Here's his Linux example adapted to 64-bit registers:
push rbx  ; <- we'll be using rbx below, and it's a saved register (hallelujah!)

; System calls are listed in "asm/unistd.h"
mov rax,4 ; the system call number of "write".
mov rbx,1 ; first parameter: 1, the stdout file descriptor
mov rcx,myStr ; data to write
mov rdx,3 ; bytes to write
int 0x80 ; Issue the system call

pop rbx ; <- restore rbx to its old value
ret

section .data
myStr:
db "Yo",0xa

(Try this in NetRun now!)
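
The example above goes through the older "int 0x80" interface.  As a sketch of the alternative, native 64-bit Linux code usually issues the same write with the "syscall" instruction, which uses a different numbering (write is 1 rather than 4, per asm/unistd_64.h) and takes its parameters in rdi, rsi, and rdx.  This assumes an x86-64 Linux machine; the label myStr2 is just a placeholder name.

```nasm
; Native 64-bit Linux system call: write(1,myStr2,3)
mov rax,1 ; the system call number of "write" in the 64-bit table
mov rdi,1 ; first parameter: 1, the stdout file descriptor
lea rsi,[rel myStr2] ; second parameter: address of the data to write
mov rdx,3 ; third parameter: bytes to write
syscall ; issue the system call (the CPU overwrites rcx and r11)
ret ; rax holds the number of bytes written, or a negative error code

section .data
myStr2:
db "Yo",0xa
```

No push and pop of rbx is needed here, since this interface doesn't touch rbx; the registers to watch are rcx and r11, which the "syscall" instruction itself destroys.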


Windows system call numbers change from one Windows release to the next, so direct system calls aren't at all easy to use on Windows.  (This is partly a security feature, to make it harder to write portable viruses...)