Accessing Parameters, and Linking Assembly with C/C++ Code
CS 301 Lecture, Dr. Lawlor
Here's some C++ code that calls an external function "bar". Note
that this code gives a link error when you try to run it in NetRun,
because "bar" is never defined.
extern "C"
int bar(int a,int b,int c);
int foo(void) {
return bar(0xA0B1C2D3, 0xE0E1E2E3, 0xF0F1F2F3);
}
(executable NetRun link)
We can actually write this "bar" function in assembly, like this:
global bar
bar:
mov eax,[esp+4]
ret
(executable NetRun link)
The "global" keyword in assembly tells the assembler to make a symbol visible from outside the file.
The "Link With:" box tells NetRun to link together two different projects, in this case one in C++ and the other in assembly.
The "extern "C"" in the C++ code tells the C++ compiler to look for a
plain C-style function named "bar", instead of a C++ function whose
name has been mangled to support overloading.
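If you want to try the C++ side without an assembler handy, here is a minimal sketch that swaps in a C++ stand-in for "bar". The stand-in body is an assumption made for illustration, not the lecture's assembly: it just returns its first argument, which is what "mov eax,[esp+4]" followed by "ret" does. The explicit casts are also an addition, because the hex constants are too big for a signed int.

```cpp
// Declaration as in the lecture: C linkage, so the linker looks for the
// plain symbol "bar" (the same name that "global bar" exports from assembly).
extern "C" int bar(int a, int b, int c);

int foo(void) {
    // The constants do not fit in a signed int, so convert them explicitly.
    return bar(static_cast<int>(0xA0B1C2D3u),
               static_cast<int>(0xE0E1E2E3u),
               static_cast<int>(0xF0F1F2F3u));
}

// Hypothetical C++ stand-in for the assembly version of bar:
// return the first argument and ignore the other two.
extern "C" int bar(int a, int /*b*/, int /*c*/) {
    return a;
}
```

Because both the declaration and the definition use extern "C", the symbol is named plain "bar" either way, so the assembly version could later replace the stand-in without touching foo.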
Frame Pointer
It's pretty common for compiler-generated code, or long human-written
assembly code, to use a "frame pointer". The problem the frame
pointer is trying to solve is that esp keeps moving around as you add
and remove stuff from the stack. So the frame pointer is just a
copy of the stack pointer from somewhere early in the function
execution.
For example, we can start with our argument-fetching assembly code from before:
global bar
bar:
mov eax,[esp+4]
ret
(executable NetRun link)
Say we need to make some space on the stack for an array. Now our code becomes:
global bar
bar:
sub esp,100
mov eax,[esp+104]
add esp,100
ret
(executable NetRun link)
Note that because esp moved down by 100 bytes, the first argument is now at [esp+104] instead of [esp+4]; every stack access has to be adjusted the same way.
If, instead, we make a copy of the "old" esp (for example in register ecx), then we have a fixed point of reference in memory:
global bar
bar:
mov ecx,esp ;<- backup copy of old stack pointer
sub esp,100
mov eax,[ecx+4] ;<- always our first argument, regardless of the current value of esp
add esp,100 ;<- "mov esp,ecx" would work here too!
ret
(executable NetRun link)
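The same bookkeeping can be checked with a toy model in C++. This is only a sketch: ToyStack, push, load, and read_first_arg are hypothetical names invented for illustration, and the "stack" is just a byte array with an esp offset, not real machine state.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model of a 32-bit stack: a byte array plus an "esp" offset into it.
struct ToyStack {
    std::vector<std::uint8_t> mem;
    std::uint32_t esp;
    ToyStack() : mem(1024, 0), esp(1024) {}

    void push(std::uint32_t v) {                     // like "push v"
        esp -= 4;
        std::memcpy(&mem[esp], &v, 4);
    }
    std::uint32_t load(std::uint32_t addr) const {   // like "mov reg,[addr]"
        std::uint32_t v;
        std::memcpy(&v, &mem[addr], 4);
        return v;
    }
};

// The first argument as seen three ways inside the callee.
struct Reads { std::uint32_t before, via_esp, via_copy; };

Reads read_first_arg(std::uint32_t first_arg) {
    ToyStack s;
    s.push(first_arg);                    // caller pushes the argument
    s.push(0xDEADBEEF);                   // "call" pushes a (fake) return address

    Reads r;
    r.before = s.load(s.esp + 4);         // mov eax,[esp+4]
    std::uint32_t ecx = s.esp;            // mov ecx,esp   (backup copy)
    s.esp -= 100;                         // sub esp,100
    r.via_esp  = s.load(s.esp + 104);     // mov eax,[esp+104]  (offset had to change)
    r.via_copy = s.load(ecx + 4);         // mov eax,[ecx+4]    (offset did not change)
    s.esp += 100;                         // add esp,100
    return r;
}
```

All three reads land on the same address; only the esp-relative offset had to be rewritten after the "sub".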
It's traditional to use register "ebp" (Extended Base Pointer) to store
the old value of the stack pointer. Compilers traditionally set up
register ebp this way in every function, unless you ask them to omit the
frame pointer (with gcc, "-fomit-frame-pointer").
Unfortunately, ebp is a "callee saved" register--you can't just start
using it like you can with the scratch registers eax, ecx, and edx; you
have to make sure you set it back to the old value before you return
(just like the stack pointer!). So it's traditional to push and pop ebp
at the start and end of your function, like this:
global bar
bar:
push ebp ;<- save the old ebp onto the stack (warning: this does change esp!)
mov ebp,esp ;<- backup copy of old stack pointer
sub esp,100
mov eax,[ebp+8] ;<- always our first argument, regardless of the current value of esp
add esp,100 ;<- or "mov esp,ebp"
pop ebp ; <- restore the old ebp, so we don't crash after we return
ret
(executable NetRun link)
The push-and-move at the start of the function is often called the
"function prologue", and the restore-and-pop at the end is often called
the "function epilogue". After the prologue, [ebp] holds the saved ebp
and [ebp+4] holds the return address, so your function arguments are
always at +8, +12, +16, ... bytes from ebp, and your local variables are
always at negative offsets from ebp. Because ebp is callee saved, you
don't have to worry that print_int is going to change your ebp value:
it's every function's responsibility to preserve esp and ebp for its
caller.
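To see why the arguments sit at +8, +12, +16, here is a rough C++ sketch of the prologue and epilogue on a toy machine. ToyCpu, FrameView, and call_with_frame are hypothetical names, and the starting ebp value 0x1234 is arbitrary; nothing here touches the real registers.

```cpp
#include <array>
#include <cstdint>

// Toy 32-bit machine: word-sized memory cells plus esp/ebp "registers".
struct ToyCpu {
    std::array<std::uint32_t, 64> mem{};   // 256 bytes of stack
    std::uint32_t esp = 256;               // the stack grows toward lower addresses
    std::uint32_t ebp = 0x1234;            // the caller's frame pointer (arbitrary value)

    void push(std::uint32_t v) { esp -= 4; mem[esp / 4] = v; }
    std::uint32_t pop() { std::uint32_t v = mem[esp / 4]; esp += 4; return v; }
    std::uint32_t load(std::uint32_t addr) const { return mem[addr / 4]; }
};

struct FrameView { std::uint32_t arg0, arg1, arg2, ebp_after; };

FrameView call_with_frame(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    ToyCpu cpu;
    cpu.push(c); cpu.push(b); cpu.push(a); // caller pushes the arguments, last one first
    cpu.push(0xDEADBEEF);                  // "call" pushes a (fake) return address

    cpu.push(cpu.ebp);                     // prologue: push ebp
    cpu.ebp = cpu.esp;                     //           mov ebp,esp
    cpu.esp -= 100;                        // locals:   sub esp,100

    FrameView v;
    v.arg0 = cpu.load(cpu.ebp + 8);        // [ebp] = saved ebp, [ebp+4] = return address
    v.arg1 = cpu.load(cpu.ebp + 12);
    v.arg2 = cpu.load(cpu.ebp + 16);

    cpu.esp = cpu.ebp;                     // epilogue: mov esp,ebp
    cpu.ebp = cpu.pop();                   //           pop ebp
    v.ebp_after = cpu.ebp;                 // back to the caller's value
    return v;
}
```

The +8 comes from the two 4-byte values sitting between ebp and the first argument: the saved ebp and the return address.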
Personally, I don't like ebp very much. I almost never use ebp
for short functions. Sometimes it does make things a bit simpler
for long functions, but it costs some performance and annoyance to set
the thing up. It's entirely optional, so you can decide for
yourself whether you'd like to use it in your own assembly
functions.
Array Indexing
x86 assembly language actually supports a really weird addressing mode
called "scaled register indirect", which is really useful for accessing
array elements.
Recall that to get to array index i, we start at the array base pointer
p, then move up by the array element size times the array index, like
"*(p+i)" in C (where the compiler multiplies i by the element size for
you). The hardware lets you scale the index register by 1, 2, 4, or 8.
So if ecx is the array base pointer, edx is the array index, and it's an
array of (4-byte) ints, you can use "[ecx + 4 * edx]" in assembly, like so:
push ebx ;<- ebx is callee saved, so save the caller's copy before we overwrite it
sub esp,100 ;<- claim 100 bytes on the stack (25 ints)
mov ecx,esp ;<- start of array is *lowest* address
mov ebx,0xF00Dbad ;<- value to store
mov edx,7 ;<- array index seven
mov [ecx + 4*edx],ebx ;<- copy ebx into ecx array at index edx
mov eax,[ecx + 4*edx] ;<- read ecx array at index edx
add esp,100 ;<- release our stack space
pop ebx ;<- restore the caller's ebx
ret
(executable NetRun link)
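For comparison, here is a small C++ sketch of the same address arithmetic. The function names address_matches and store_and_load are made up for this example; the point is only that C's p[i] and the byte address base + 4*i are the same location for 4-byte ints.

```cpp
#include <cstdint>
#include <cstring>

// True if element "index" of an int array lives at byte address base + 4*index,
// which is the address "[ecx + 4*edx]" computes when ecx=base and edx=index.
bool address_matches(unsigned index) {
    std::int32_t arr[25] = {};                    // 100 bytes, like "sub esp,100"
    if (index >= 25) return false;                // stay inside the array
    const unsigned char *base = reinterpret_cast<const unsigned char *>(arr);
    const unsigned char *elem = reinterpret_cast<const unsigned char *>(arr + index);
    return elem == base + 4 * index;
}

// Store through the scaled byte address, then read back with ordinary indexing.
std::uint32_t store_and_load(unsigned index) {
    std::uint32_t arr[25] = {};
    if (index >= 25) return 0;
    unsigned char *base = reinterpret_cast<unsigned char *>(arr);  // plays the role of ecx
    std::uint32_t value = 0xF00DBAD;                                // plays the role of ebx
    std::memcpy(base + 4 * index, &value, 4);     // mov [ecx + 4*edx],ebx
    return arr[index];                            // mov eax,[ecx + 4*edx]
}
```

In C++ the compiler does the "times 4" for you inside arr[index]; in assembly you write the scale factor yourself.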
Examples
This program makes a little array on the stack, fills it with values,
and prints those values. Note that it does use ebp as a frame
pointer:
global foo
foo:
push ebp ;<- save copy of ebp on stack
mov ebp,esp ;<- ebp is the "frame pointer" (a non-moving copy of esp)
sub esp,64 ;<- claim 64 bytes on the stack (16 ints: backups of edx, ebx, and the caller's ebx, plus a 13-int array)
mov [ebp-12],ebx ;<- ebx is callee saved: keep the caller's value so we can restore it
mov edx,esp ;<- start of array is *lowest* address
mov [ebp-4],edx ;<- backup copy of start of array
; Fill the array with values
mov ebx,0 ;<- index into array
my_fill_loop:
mov [edx+4*ebx],ebx ;<- in C, this would be "int *edx=...; edx[ebx] = ebx;"
add ebx,1
cmp ebx,13
jl my_fill_loop
; Print out the values in my array
mov ebx,0
my_print_loop:
mov eax,[edx+4*ebx]
mov [ebp-8],ebx ;<- save the latest copy of ebx on the stack
push eax ;<- print_int's argument
extern print_int
call print_int
pop eax ;<- remove the argument from the stack
mov ebx,[ebp-8] ;<- restore ebx from backup
mov edx,[ebp-4] ;<- restore edx from backup (print_int is allowed to overwrite edx)
add ebx,1
cmp ebx,13
jl my_print_loop
mov ebx,[ebp-12] ;<- restore the caller's ebx
mov esp,ebp ;<- restore old esp ("add esp,64" would work here too)
pop ebp ;<- restore old value of ebp
ret
(executable NetRun link)
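For reference, the assembly above does roughly what the following C++ sketch does. Note the assumptions: print_int_to is a hypothetical stand-in for NetRun's print_int that appends to a string (one value per line) so the result can be inspected, and the real print_int's output format may differ.

```cpp
#include <string>

// Hypothetical stand-in for print_int: append the value and a newline to "out".
static void print_int_to(std::string &out, int value) {
    out += std::to_string(value);
    out += '\n';
}

std::string foo_equivalent() {
    int arr[13];                        // the 13-int array on the stack
    for (int i = 0; i < 13; i++)        // my_fill_loop
        arr[i] = i;                     // mov [edx+4*ebx],ebx
    std::string out;
    for (int i = 0; i < 13; i++)        // my_print_loop
        print_int_to(out, arr[i]);      // push eax / call print_int / pop eax
    return out;
}
```

The C++ compiler handles all the stack bookkeeping here: it picks where arr and i live and keeps them safe across the call, which is exactly the work the assembly version does by hand with [ebp-4] and [ebp-8].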