Calling Assembly from C/C++

CS 301 Lecture, Dr. Lawlor
So you've got some source code.  You want to make an executable.  How does this happen?

1: Build object files

Step 1 is to compile the source code.  If you've got lots of different source files, you want to build each one not into an entire program, but into a little piece called an "object file".   Object files consist of compiled machine code, plus special hooks that allow them to be combined with other object files into a single executable.

For C++:
On Windows, object files have extension ".obj", and you make them with the "/c" flag.  You compile C++ with the "cl" program, and the "/TP /GR /EHsc" flags (that's C++, with dynamic_cast, and throw respectively). So this command creates "foo.obj" from "foo.cpp":
	cl /TP /GR /EHsc foo.cpp /c
On Linux, MacOS X, or other UNIX-like systems, object files normally have extension ".o", and you make them with the "-c" flag.  So this will create "foo.o" from "foo.cpp":
	g++ foo.cpp -c
(also see the Compiler Flags listed below)

Here's how you compile plain C (not C++). If calling from C++, be sure to use extern "C"!
On Windows, use the "/TC" flag to compile C.
	cl /TC foo.c /c
On UNIX, use gcc to compile plain C (not C++).
	gcc foo.c -c
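
As a minimal sketch of that extern "C" advice, here's how a C++ file might declare a function that was compiled as plain C.  The name "add_ints" is hypothetical.  In a real build its body would live in a separate foo.c compiled with gcc or "cl /TC"; it's defined in the same file here only so the sketch stands on its own.

```cpp
// Declaration as a C++ caller would see it. Without extern "C", the C++
// compiler would look for a name-mangled symbol and the link would fail.
// add_ints is a hypothetical name used only for illustration.
extern "C" int add_ints(int a, int b);

// Stand-in for the body that would normally live in foo.c.
// extern "C" gives it the same unmangled symbol name ("add_ints")
// that a C compiler would produce.
extern "C" int add_ints(int a, int b) {
    return a + b;
}

// Ordinary C++ code then calls it like any other function.
int call_from_cpp() {
    return add_ints(2, 3);
}
```

If the declaration lives in a header shared by C and C++ files, the usual trick is to wrap the extern "C" part in #ifdef __cplusplus, since plain C doesn't understand extern "C".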

For YASM assembly:
On Windows, use "win32" format (or win64 on a 64-bit OS)
	yasm -f win32 foo.S -o foo.obj
(Note: on 32-bit Windows, C symbol names like "foo" get an underscore prefix, so you have to write "_foo" or "_printf" from assembly.)
On UNIX, use "elf32" format (or elf64 on a 64-bit OS)
	yasm -f elf32 foo.S -o foo.o 

2: Link object files into libraries

Step 2 is optional.  If you've got a zillion object files that are related, you can put them together into a "library".  For now, we'll only look at statically linked libraries, not dynamically-linked libraries (DLLs).

On Windows, static library files have extension ".lib", and you make them with the "link /lib" tool.  So this creates "foo.lib" from "foo.obj":
	link /lib /out:foo.lib foo.obj
On Linux, static library files have extension ".a", and you make them with the "ar cr" tool.  So this will create "foo.a" from "foo.o":
	ar cr foo.a foo.o

Some machines, like MacOS X, require you to run "ranlib foo.a" after this.  Also, "foo.a" remembers several strange things like the order you added the files, and "ar cr" won't ever *remove* .o files from your .a; so it's a good idea to remove your .a's before running "ar cr"...

3: Build executable

Step 3 is to combine all your object files and libraries into a single executable.  This step is called "linking".

On Windows, executables have extension ".exe".  You list the object files and libraries you need on the command line:
	cl /o bar.exe bar.obj foo.lib
On Linux, executables have no filename extension.  You specify the executable name with the "-o" flag, and you MUST list all the needed libraries on the command line.  You can also build executables yourself with "ld", but it's trickier, especially for C++.
	g++ -o bar bar.o foo.a

Often, you don't call these programs yourself.  Instead, you let the IDE (e.g., MS Visual C++) call them for you.  Or you write a "Makefile" and let "make" call the programs needed.

Compiler/Linker Flags

There are a bunch of "flags" that you can pass to the compiler and linker to make various stuff happen.  Most of these are useful only once in a while, but when needed, they're really useful!

Name: -c
  • For: Compiler
  • Does: Compile only, don't link.  Makes an object file (.o or .obj) from a source code file.
  • Needed when: Compiling big programs with lots of pieces, because you can leave most code compiled as .obj files.  Also useful prior to building libraries.

Name: -Dmacro=value
  • Examples: -DUserID=17, -DMAXCRAP=99
  • For: Compiler
  • Does: Sets a macro (just like #define) from the compiler command line.
  • Needed when: Setting up configuration values, paths, etc.  Another alternative is to write a "config.h" file somewhere that sets the same macros; "config.h" can make the compiler command lines a lot more intelligible!

Name: -Ipath
  • Examples: -Ilibfoo/include, -I.
  • For: Compiler
  • Does: Adds a new directory to the "include path": the list of places the compiler looks for #included files.
  • Needed when: Compiling code that uses header files in some other directory.  (Subtle: #include "foo.h" works automatically if foo.h is in the current directory; but #include <foo.h> only works if you specify -I. to add the path to the header file.  Also consider using something like #include "libfoo/include/foo.h")

Name: -Lpath
  • Examples: -Llibfoo/lib, -L.
  • For: Linker
  • Does: Adds a new directory to the "library path": the list of directories the linker looks inside for libraries (.a or .lib files).
  • Needed when: Linking with almost any library other than the builtin system libraries.

Name: -lname
  • Examples: -lfoo
  • For: Linker
  • Does: Looks for a file named "libname.a" in all the known library directories.  UNIX-only.
  • Needed when: Linking almost any library on UNIX.
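
To make the -D flag concrete, here's a small sketch of the source-code side of the -DUserID=17 example above.  The function name "configured_user_id" and the fallback value are made up for illustration; the #ifndef guard is just one common way to give the macro a default when the flag isn't passed.

```cpp
// UserID is the macro from the -DUserID=17 example. If the build does
// not pass -DUserID=..., fall back to a default so the file still
// compiles. The default of 17 is an arbitrary choice for this sketch.
#ifndef UserID
#define UserID 17
#endif

// Returns whatever value was baked in at compile time.
// configured_user_id is a hypothetical name.
int configured_user_id() {
    return UserID;
}
```

Compiling this with "g++ -DUserID=42 -c foo.cpp" would make configured_user_id() return 42 instead, without touching the source.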

Alternative: "Inline" Assembly

So it's actually possible on most compilers to include a little bit of assembly code right inside your C or C++ file, called "inline assembly" because the assembly is inside the C/C++.  This is usually a bit faster (because no function call overhead) and simpler (less hassle at build time) than having a separate ".S" file that you run through YASM and then link to the C or C++ code.  However, for long stretches of assembly, a separate file still works better.

Microsoft Inline Assembly:

Here's an example in Microsoft Visual C or C++ inline assembly:
int foo(void) {
    __asm{
        mov eax,100
        leave
        ret
    };
}
Note that:
  • The keyword is __asm
  • The assembly code is wrapped in curly braces
  • The destination register is on the *left*, just like yasm.

Note also that I've used the "leave" instruction to clean up foo's stack frame (mov esp,ebp; pop ebp;) before returning.  The compiler secretly generates the corresponding function prologue at the start of the function.

Microsoft Outside Variable Access:

In Microsoft Visual C or C++, you can read and write variables from the program by just giving their names.

Simple example:
   void *frame; /* Frame pointer */
   __asm mov frame,ebp;
Complicated example:
int foo(void) {
    int joe=1234, fred;
    __asm{
        mov eax,joe ; eax = joe;
        add eax,2 ; eax += 2;
        mov fred,eax ; fred = eax
    };
    return fred;
}
This is clearly very convenient!  But what happens if we try to do the same thing with a variable named "al"?  (Remember, "al" is a register on x86!)

GCC Inline Assembly:

Here's an example of how to declare a little assembly snippet inside C or C++ code using the Linux/UNIX/MacOS gcc compiler:
int foo(void) {
    __asm__( /* Assembly function body */
        " mov $100,%eax\n" /* moves 100 into eax! */
        " leave\n"
        " ret\n"
    );
}

Note that:
  • The keyword is __asm__
  • The assembly code is wrapped in parentheses.
  • The assembly code shows up as a string
  • There are weird symbols in front of constants ($ means constant) and registers (% means register)
  • DYSLEXIA ALERT: GCC sasembly is abckwards.  The destination register goes at the *end* of the instruction.
I've linked the text to the NetRun version of this code.  Note that I've set the NetRun "Mode" to "Whole Subroutine"--this keeps NetRun from pasting in the start and end of the foo subroutine.

The bottom line is just to use the __asm__ keyword, which takes the assembly code as a big string.

GCC Outside Variable Access:

Accessing outside variables is truly hideous in gcc inline assembly.

Simple example:
    void *frame; /* Frame pointer */
   __asm__ ("mov %%ebp,%0":"=r"(frame));
Complicated example:
int foo(void) {
    int joe=1234, fred;
    __asm__(
        " mov %1,%%eax\n"
        " add $2,%%eax\n"
        " mov %%eax,%0\n"
        :"=r" (fred) /* %0: Out */
        :"r" (joe) /* %1: In */
        :"%eax" /* Overwrite */
    );
    return fred;
}
The __asm__ statement can take up to four sections, separated by colons:
  • The assembly code.  Now registers need to be prefixed with "%%", not just "%", to distinguish them from arguments.
  • A comma-separated list of output arguments.  These can go into registers ("=r"), memory ("=m"), etc.
  • A comma-separated list of input arguments.
  • A comma-separated list of overwritten registers ("trashed" registers).  The compiler then knows not to put anything important in these registers.
See page 227 of the text for more details, or the gcc manual for so many hideous details you'll want to cry.
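
Here's a slightly smaller sketch of the same "add 2" idea as the complicated example above, for comparison.  It isn't from the text: it assumes a GCC-compatible compiler and uses the "+r" constraint (one argument that is both read and written) plus a "cc" entry in the overwritten list.  The function name "add_two" and the plain C++ fallback for non-x86 machines are my own additions so the file builds anywhere.

```cpp
// Adds 2 to x. On x86 with a GCC-compatible compiler this uses inline
// assembly; elsewhere it falls back to plain C++ so the file still builds.
// add_two is a hypothetical name used only for illustration.
int add_two(int x) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__(
        "add $2,%0\n"   // %0 is whichever register the compiler picked for x
        : "+r" (x)      // %0: read and written ("+"), kept in a register ("r")
        :               // no separate inputs
        : "cc"          // the add instruction changes the condition flags
    );
#else
    x += 2;             // portable fallback for non-x86 targets
#endif
    return x;
}
```

Because the compiler picks the register itself, there's no hard-coded %%eax and nothing to list as trashed except the flags.  Calling add_two(1234) should give 1236 either way.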

GCC Whole Subroutine in Assembly

Partly because GCC's inline assembly syntax is so horrible, it's often easier to just write the whole subroutine (argument access, frame setup, and value return) in assembly.  There doesn't seem to be a way to do this in Visual C++, although (in either case) it's easy enough to separately compile a whole file full of pure assembly code and just link it in.

To write a subroutine in assembly, just:
  • Write a C function prototype.  In C++, make the prototype 'extern "C"' to avoid a link error.
  • Put your code in an "__asm__" block outside any subroutine.
  • Put the subroutine name at the start of the assembly block as a label.
  • If you want to call the subroutine from outside that file, use ".globl my_sub" to make the subroutine's name visible outside.
Here's a complete example, where my assembly subroutine just returns 100:
extern "C" int my_sub(void); /* Prototype */

__asm__( /* Assembly function body */
"my_sub:\n"
" mov $100,%eax\n"
" ret\n"
);

int foo(void) {
    return my_sub()+1;
}

This is actually a pretty clean way to do inline assembly in gcc, although you do have to remember the calling convention to find your arguments!  (Hint: the first argument is at 4(%esp) if you don't have a stack frame; 8(%ebp) if you do...)
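
To show that argument hint in action, here's a hedged sketch of an assembly subroutine that takes one int and returns it plus one.  The name "add_one" is made up.  The 32-bit branch reads the argument from 4(%esp) as described above; the 64-bit branch assumes the Linux (System V) calling convention, where the first integer argument arrives in %edi instead of on the stack.  The .pushsection/.popsection lines are an extra precaution of mine to make sure the code lands in the code section, and anything that isn't x86 Linux gets a plain C++ fallback.

```cpp
extern "C" int add_one(int x); /* Prototype; add_one is a hypothetical name */

#if defined(__linux__) && defined(__i386__)
/* 32-bit x86: no stack frame, so the first argument is at 4(%esp). */
__asm__(
    ".pushsection .text\n"
    "add_one:\n"
    " mov 4(%esp),%eax\n"
    " add $1,%eax\n"
    " ret\n"
    ".popsection\n"
);
#elif defined(__linux__) && defined(__x86_64__)
/* 64-bit x86 (System V convention): the first integer argument is in %edi. */
__asm__(
    ".pushsection .text\n"
    "add_one:\n"
    " mov %edi,%eax\n"
    " add $1,%eax\n"
    " ret\n"
    ".popsection\n"
);
#else
/* Anything else: plain C++ fallback so the sketch still links. */
extern "C" int add_one(int x) { return x + 1; }
#endif

/* Ordinary C++ caller. */
int foo_plus_one(int x) {
    return add_one(x);
}
```

The two assembly branches differ only in where they find the argument, which is exactly the part of the calling convention you have to remember yourself when you write the whole subroutine by hand.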