Inline Assembly

CS 301 Lecture, Dr. Lawlor

Most compilers let you include a little bit of assembly code right inside your C or C++ file.  This is usually a bit simpler than keeping a separate ".asm" file that you run through YASM and then link with the C or C++ code.  However, for long stretches of assembly, a separate file still works better.

GCC Inline Assembly:

Here's an example of how to declare a little assembly snippet inside C or C++ code using gcc:
int foo(void) {
    __asm__( /* Assembly function body */
        " mov $100,%eax\n"
        " leave\n"
        " ret\n"
    );
}

I've linked the text to the NetRun version of this code.  Note that I've set the NetRun "Mode" to "Whole Subroutine"--this keeps NetRun from pasting in the start and end of the foo subroutine.

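The bottom line is just to use the __asm__ keyword, which takes the assembly code as a big string.  C pastes adjacent string literals together, so each line of assembly can be its own quoted string, as long as it ends with "\n" to separate it from the next instruction.

Note also that I've used the "leave" instruction to clean up foo's stack frame before returning: "leave" copies %ebp back into %esp and then pops the saved %ebp, undoing the prologue ("push %ebp; mov %esp,%ebp") that the compiler generates at the start of foo.  This only works if the compiler really did set up a frame pointer; with frame pointers omitted (for example, gcc's -fomit-frame-pointer), the "leave" would trash the stack.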

GCC Outside Variable Access:

Accessing outside variables is truly hideous in gcc inline assembly.

Simple example:
    void *frame; /* Frame pointer */
    __asm__ ("mov %%ebp,%0":"=r"(frame));
Complicated example:
int foo(void) {
    int joe=1234, fred;
    __asm__(
        " mov %1,%%eax\n"
        " add $2,%%eax\n"
        " mov %%eax,%0\n"
        :"=r" (fred) /* %0: Out */
        :"r" (joe) /* %1: In */
        :"%eax" /* Overwrite */
    );
    return fred;
}
The __asm__ keyword can take up to four parts, separated by colons:
  • The assembly code, as a string.  Once you list any arguments, registers need to be prefixed with "%%", not just "%", to distinguish them from arguments like "%0".
  • A comma-separated list of output arguments.  These can go into registers ("=r"), memory ("=m"), etc.
  • A comma-separated list of input arguments.
  • A comma-separated list of overwritten registers ("trashed" registers).  The compiler then knows not to put anything important in these registers.
See page 227 of the text for more details, or the gcc manual for so many hideous details you'll want to cry.
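
To make the four parts concrete, here is a minimal sketch that uses a memory ("=m") output instead of a register output.  The function name add_two_mem is made up for illustration, and the sketch assumes gcc or a compatible compiler on x86 or x86-64; on anything else it just falls back to plain C.

```c
/* Hypothetical example: compute joe+2, writing the result straight to memory. */
int add_two_mem(int joe) {
    int fred;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__(
        " movl %1,%%ecx\n"   /* ecx = joe */
        " addl $2,%%ecx\n"   /* ecx += 2 */
        " movl %%ecx,%0\n"   /* fred = ecx (fred is a memory operand) */
        : "=m" (fred)        /* %0: Out, kept in memory */
        : "r" (joe)          /* %1: In, any register the compiler picks */
        : "%ecx", "cc"       /* Overwrite: ecx and the flags */
    );
#else
    fred = joe + 2;          /* Fallback for compilers or CPUs without this syntax */
#endif
    return fred;
}
```

Because ecx is listed as overwritten, the compiler should keep both joe and the address of fred out of ecx.  Calling add_two_mem(1234) is expected to give the same answer as the register version above.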

GCC Whole Subroutine in Assembly

Partly because GCC's inline assembly syntax is so horrible, it's often easier to just write the whole subroutine (argument access, frame setup, and value return) in assembly.  There doesn't seem to be a way to do this in Visual C++, although (in either case) it's easy enough to separately compile a whole file full of pure assembly code and just link it in.

To write a subroutine in assembly, just:
  • Write a C function prototype.  In C++, make the prototype 'extern "C"' to avoid a link error.
  • Put your code in an "__asm__" block outside any subroutine.
  • Put the subroutine name at the start of the assembly block as a label.
  • If you want to call the subroutine from outside that file, use ".globl my_sub" to make the subroutine's name visible outside.
Here's a complete example, where my assembly subroutine just returns 100:
extern "C" int my_sub(void); /* Prototype */

__asm__( /* Assembly function body */
    "my_sub:\n"
    " mov $100,%eax\n"
    " ret\n"
);

int foo(void) {
    return my_sub()+1;
}

This is actually a pretty clean way to do inline assembly in gcc, although you do have to remember the calling convention to find your arguments!  (Hint: on 32-bit x86, the first argument is at 4(%esp) if you don't have a stack frame; 8(%ebp) if you do...)
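
As a rough sketch of finding an argument, here is a whole-subroutine version that takes one int and returns it plus one.  The name add_one_asm is hypothetical, and the sketch assumes gcc on Linux.  The 32-bit branch uses the 4(%esp) hint above; the 64-bit branch is an addition that assumes the System V x86-64 convention, where the first integer argument arrives in %edi instead of on the stack.  Any other platform gets a plain C fallback.

```c
int add_one_asm(int x); /* Prototype; in C++, make it extern "C" */

#if defined(__GNUC__) && defined(__linux__) && defined(__i386__)
__asm__(
    ".pushsection .text\n"    /* make sure the code lands in the code section */
    ".globl add_one_asm\n"
    "add_one_asm:\n"
    " mov 4(%esp),%eax\n"     /* no stack frame: first argument is at 4(%esp) */
    " add $1,%eax\n"
    " ret\n"                  /* return value is in eax */
    ".popsection\n"
);
#elif defined(__GNUC__) && defined(__linux__) && defined(__x86_64__)
__asm__(
    ".pushsection .text\n"
    ".globl add_one_asm\n"
    "add_one_asm:\n"
    " mov %edi,%eax\n"        /* assumption: System V x86-64 passes x in edi */
    " add $1,%eax\n"
    " ret\n"
    ".popsection\n"
);
#else
int add_one_asm(int x) { return x + 1; } /* Fallback: plain C */
#endif
```

From C, add_one_asm(41) is called like any other function.  The ".pushsection"/".popsection" pair is a defensive choice here, so the assembly does not disturb whatever section the compiler was emitting into.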

Microsoft Inline Assembly:

Here's the same example in Microsoft Visual C or C++ inline assembly:
int foo(void) {
    __asm{
        mov eax,100 ; Moves 100 into eax!
        leave
        ret
    };
}
Note that now:
  • The keyword is __asm
  • The assembly code is wrapped in curly braces
  • The assembly code isn't inside a string
  • Registers and constants don't have weird characters in front of them.
  • The destination register is now on the *left*.
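
To see the two syntaxes side by side, here is a small sketch of one function written both ways and selected by the preprocessor.  The function name hundred is made up; the Microsoft branch assumes 32-bit Visual C++ (the _M_IX86 macro) and names the C variable directly inside the assembly, while the gcc branch assumes x86 or x86-64.  Anything else falls back to plain C.

```c
/* Hypothetical example: return 100 using whichever inline syntax is available. */
int hundred(void) {
    int result;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov eax, 100       ; destination on the left, no % or $ prefixes
        mov result, eax    ; C variable used by name
    }
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__(
        " mov $100,%0\n"   /* source first, destination last; $ marks a constant */
        : "=r" (result)    /* %0: Out */
    );
#else
    result = 100;          /* Fallback: no inline assembly */
#endif
    return result;
}
```

Unlike the foo examples, this version lets the compiler handle the return, so it does not depend on "leave" matching the compiler's stack frame.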

Microsoft Outside Variable Access:

In Microsoft Visual C or C++, you can read and write variables from the program by just giving their names.

Simple example:
    void *frame; /* Frame pointer */
    __asm mov frame,ebp;
Complicated example:
int foo(void) {
    int joe=1234, fred;
    __asm{
        mov eax,joe ; eax = joe;
        add eax,2 ; eax += 2;
        mov fred,eax ; fred = eax
    };
    return fred;
}
This is clearly very convenient!  But what happens if we try to do the same thing with a variable named "al"?  (Remember, "al" is a register on x86!)
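
For contrast, here is a sketch of the gcc side of that question.  In gcc's syntax the assembly only ever sees variables through numbered arguments like "%0" and "%1", and registers carry a "%%" prefix, so a C variable that happens to be named "al" should not collide with the register.  The function name al_plus_two is made up, and the sketch assumes gcc on x86 or x86-64, with a plain C fallback elsewhere.

```c
/* Hypothetical example: a C variable named "al" passed into gcc inline assembly. */
int al_plus_two(void) {
    int al = 1234; /* Same name as the x86 register, but this is just a C variable */
    int fred;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__(
        " mov %1,%%eax\n"  /* eax = al (the variable, via argument %1) */
        " add $2,%%eax\n"  /* eax += 2 */
        " mov %%eax,%0\n"  /* fred = eax */
        : "=r" (fred)      /* %0: Out */
        : "r" (al)         /* %1: In */
        : "%eax", "cc"     /* Overwrite */
    );
#else
    fred = al + 2;         /* Fallback: plain C */
#endif
    return fred;
}
```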