Accessing Parameters, and Linking Assembly with C/C++ Code

CS 301 Lecture, Dr. Lawlor

Here's some C++ code that calls an external function "bar". Note that this code gives a link error when you try to run it in NetRun, because "bar" is never defined.

extern "C"
int bar(int a,int b,int c);

int foo(void) {
	return bar(0xA0B1C2D3, 0xE0E1E2E3, 0xF0F1F2F3);
}

(executable NetRun link)

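We can actually write this "bar" function in assembly. When "bar" starts running, [esp] holds the return address pushed by the call, so the first argument is at [esp+4]: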

global bar
bar:
  mov eax,[esp+4]
  ret

(executable NetRun link)

The "global" keyword in assembly tells the assembler to make a symbol visible from outside the file.

The "Link With:" box tells NetRun to link together two different projects, in this case one in C++ and the other in assembly.

The extern "C" in the C++ code tells the C++ compiler to look for a plain C-style function named "bar", with no C++ name mangling, instead of a fancy overloaded C++ function.
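To see the extern "C" declaration at work without leaving C++, here's a minimal sketch where "bar" is defined in C++ itself. That definition is only a stand-in I'm assuming for illustration; in NetRun the real definition comes from the linked assembly project. The explicit casts are also an addition, since these hex constants don't fit in a signed int.

```cpp
#include <cassert>

// Declaration exactly as in the lecture: a plain C-style symbol named "bar".
extern "C" int bar(int a, int b, int c);

// Hypothetical stand-in for the assembly version. Returning the first
// argument gives the same result as "mov eax,[esp+4]" followed by "ret".
extern "C" int bar(int a, int /*b*/, int /*c*/) {
    return a;
}

int foo(void) {
    // The constants are unsigned literals, so convert them explicitly.
    int result = bar(static_cast<int>(0xA0B1C2D3u),
                     static_cast<int>(0xE0E1E2E3u),
                     static_cast<int>(0xF0F1F2F3u));
    assert(result == static_cast<int>(0xA0B1C2D3u)); // bar hands back "a"
    return result;
}
```

If you delete the stand-in definition and don't link anything else, you get the same link error as before, because "bar" is again declared but never defined.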

Frame Pointer

It's pretty common for compiler-generated code, or long human-written assembly code, to use a "frame pointer". The problem the frame pointer is trying to solve is that esp keeps moving around as you add and remove stuff from the stack. So the frame pointer is just a copy of the stack pointer from somewhere early in the function execution.

For example, we can start with our argument-fetching assembly code from before:

global bar
bar:
  mov eax,[esp+4]
  ret

(executable NetRun link)

Say we need to make some space on the stack for an array. Now our code becomes:

global bar
bar:
  sub esp,100
  mov eax,[esp+104]
  add esp,100
  ret

(executable NetRun link)

Note that because esp moved down, we have to adjust our accesses to get to the same locations.
If, instead, we make a copy of the "old" esp (for example in register ecx), then we have a fixed point of reference in memory:

global bar
bar:
  mov ecx,esp ;<- backup copy of old stack pointer
  sub esp,100
  mov eax,[ecx+4] ;<- always our first argument, regardless of the current value of esp
  add esp,100 ;<- "mov esp,ecx" would work here too!
  ret

(executable NetRun link)

It's traditional to use register "ebp" (Extended Base Pointer) to store the old value of the stack pointer. Compilers traditionally set up ebp in every function (unless you ask them to omit the frame pointer, for example with gcc's "-fomit-frame-pointer"). Unfortunately, ebp is a "callee saved" register: you can't just start using it like you can with the scratch registers eax, ecx, and edx; you have to make sure you set it back to the old value before you return (just like the stack pointer!). So it's traditional to push and pop ebp at the start and end of your function, like this:

global bar
bar:
  push ebp ;<- save the old ebp onto the stack (warning: this does change esp!)
  mov ebp,esp ;<- backup copy of old stack pointer
  sub esp,100
  mov eax,[ebp+8] ;<- always our first argument, regardless of the current value of esp
  add esp,100 ;<- or "mov esp,ebp"
  pop ebp ; <- restore the old ebp, so we don't crash after we return
  ret

(executable NetRun link)

The push-and-move at the start of the function is often called the "function prologue", and the restore-and-pop at the end is often called the "function epilogue". After the prologue, [ebp] holds the saved ebp and [ebp+4] holds the return address, so your function arguments are always at +8, +12, +16, ... bytes from ebp, and your local variables are always at negative offsets from ebp. Because ebp is callee saved, you don't have to worry that print_int is going to change your ebp value: it's every function's responsibility to preserve esp and ebp for its caller.
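If the +8, +12, +16 offsets feel like magic, it can help to simulate the stack by hand. Here's a rough C++ sketch of a toy stack. The ToyStack type, the 1024-byte size, and the made-up return address and caller ebp values are all assumptions for illustration, not anything NetRun provides. It pushes three arguments and a return address, runs the prologue, and then reads the arguments relative to ebp.

```cpp
#include <cstdint>
#include <vector>

// Toy model (hypothetical): memory is an array of 4-byte words, and
// esp/ebp hold byte addresses into it. The stack grows downward.
struct ToyStack {
    std::vector<std::uint32_t> mem;
    std::uint32_t esp;

    explicit ToyStack(std::uint32_t bytes) : mem(bytes / 4, 0), esp(bytes) {}

    // "push v": move esp down 4 bytes, then store v there.
    void push(std::uint32_t v) { esp -= 4; mem[esp / 4] = v; }

    // "mov reg,[addr]": read the 4-byte word at a byte address.
    std::uint32_t load(std::uint32_t addr) const { return mem[addr / 4]; }
};

struct Args { std::uint32_t a, b, c; };

Args read_args_via_ebp() {
    ToyStack s(1024);

    // Caller: push arguments right to left; "call" pushes a return address.
    s.push(0xF0F1F2F3u);  // c
    s.push(0xE0E1E2E3u);  // b
    s.push(0xA0B1C2D3u);  // a
    s.push(0x11111111u);  // made-up return address

    // Callee prologue.
    s.push(0x22222222u);        // push ebp (made-up caller's ebp)
    std::uint32_t ebp = s.esp;  // mov ebp,esp
    s.esp -= 100;               // sub esp,100 (esp moves, ebp does not)

    // [ebp] is the saved ebp, [ebp+4] the return address, then the arguments.
    return Args{ s.load(ebp + 8), s.load(ebp + 12), s.load(ebp + 16) };
}
```

Reading the same first argument through esp after the "sub esp,100" would need offset 108 instead of 8, which is exactly the bookkeeping the frame pointer saves you.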

Personally, I don't like ebp very much. I almost never use ebp for short functions. Sometimes it does make things a bit simpler for long functions, but it costs some performance and annoyance to set the thing up. It's entirely optional, so you can decide for yourself whether you'd like to use it in your own assembly functions.

Array Indexing

x86 assembly language actually supports a really weird addressing mode called "scaled register indirect", which is really useful for accessing array elements.

Recall that to get to array index i, we start at the array base pointer p, then move up by the array element size times the array index, like "*(p+i)" in C. So if ecx is the array base pointer, edx is the array index, and it's an array of (4-byte) ints, you can use "[ecx + 4 * edx]" in assembly, like so:

  sub esp,100 ;<- claim 100 bytes on the stack (25 ints)
  mov ecx,esp ;<- start of array is *lowest* address

  mov eax,0xF00Dbad ;<- value to store (eax is a scratch register; ebx would need to be saved first)
  mov edx,7 ;<- array index seven
  mov [ecx + 4*edx],eax ;<- copy eax into the ecx array at index edx

  mov eax,[ecx + 4*edx] ;<- read the ecx array at index edx

  add esp,100 ;<- release our stack space
  ret

(executable NetRun link)
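For comparison, here's a small C++ sketch of the address arithmetic that "[ecx + 4*edx]" does in one instruction. The function names load_scaled and store_scaled are made up for this example, and the code uses sizeof(int) where the assembly hardcodes 4 (the two agree on typical x86 compilers, but that's an assumption about your platform).

```cpp
#include <cstddef>
#include <cstring>

// Read element "index" by computing the byte address by hand:
// address = base + (element size) * index, like [ecx + 4*edx].
int load_scaled(const int* base, std::size_t index) {
    const unsigned char* address =
        reinterpret_cast<const unsigned char*>(base) + sizeof(int) * index;
    int value;
    std::memcpy(&value, address, sizeof value);  // the actual memory read
    return value;
}

// Write element "index" the same way, like "mov [ecx + 4*edx],eax".
void store_scaled(int* base, std::size_t index, int value) {
    unsigned char* address =
        reinterpret_cast<unsigned char*>(base) + sizeof(int) * index;
    std::memcpy(address, &value, sizeof value);  // the actual memory write
}
```

In ordinary C++ you'd just write base[index], and the compiler does this scaling for you; the byte-level version is only here to make the multiply visible.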

Examples

This program makes a little array on the stack, fills it with values, and prints those values. Note that it does use ebp as a frame pointer:

global foo
foo:
  push ebp ;<- save copy of ebp on stack
  mov ebp,esp ;<- ebp is the "frame pointer" (a non-moving copy of esp)

  sub esp,60 ;<- claim 60 bytes on the stack (15 ints: edx, ecx, and a 13-int array)
  mov edx,esp ;<- start of array is *lowest* address
  mov [ebp-4],edx; <- backup copy of start of array

; Fill the array with values
  mov ecx,0 ;<- index into array (ecx is a scratch register; ebx is callee saved, so we avoid it)
my_fill_loop:
  mov [edx+4*ecx],ecx  ; <- in C, this would be "int *edx=...; edx[ecx] = ecx;"
  add ecx,1
  cmp ecx,13
  jl my_fill_loop

; Print out the values in my array
  mov ecx,0
my_print_loop:
  mov eax,[edx+4*ecx]

  mov [ebp-8], ecx ;<- save the latest copy of ecx on the stack (print_int may overwrite it)
  push eax ;<- print_int's argument
  extern print_int
  call print_int
  pop eax ;<- remove the argument from the stack
  mov ecx,[ebp-8]; <- restore ecx from backup
  mov edx,[ebp-4]; <- restore edx from backup

  add ecx,1
  cmp ecx,13
  jl my_print_loop
  jl my_print_loop

  mov esp,ebp ;<- restore old esp ("add esp,60" would work here too)
  pop ebp ;<- restore old value of ebp
  ret

(executable NetRun link)
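For reference, here's roughly what that assembly does, written as a C++ sketch. The names fill_array, print_array, and print_int_standin are made up for this example, and printing one value per line is only my assumption about what NetRun's print_int does.

```cpp
#include <array>
#include <cstddef>
#include <iostream>

constexpr std::size_t kCount = 13;  // same array length as the assembly

// Fill loop: values[i] = i, like "mov [edx+4*index],index".
std::array<int, kCount> fill_array() {
    std::array<int, kCount> values{};
    for (std::size_t i = 0; i < kCount; ++i) {
        values[i] = static_cast<int>(i);
    }
    return values;
}

// Hypothetical stand-in for NetRun's print_int.
void print_int_standin(int value) {
    std::cout << value << '\n';
}

// Print loop: read each element and pass it to the print function.
void print_array(const std::array<int, kCount>& values) {
    for (std::size_t i = 0; i < kCount; ++i) {
        print_int_standin(values[i]);
    }
}
```

Notice what the C++ version doesn't have to do: the compiler decides where the array and the loop index live, and it saves and restores registers around the call for you.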