Inline Assembly: Mixing Assembly with C/C++

It's even possible on most compilers to include a little bit of assembly code right inside your C or C++ file, called "inline assembly" because the assembly is inside the C/C++.  How you do this depends on what compiler you're using.

The downside of inline assembly is that it ties you to one particular combination of hardware instruction set and compiler-specific syntax for reaching it.  The upside is that you get the full feature set of the bare hardware without having to write the entire program in assembly.

Microsoft Inline Assembly:

Here's a simple example in Microsoft Visual C++ inline assembly:
int foo(void) {
  __asm {
    mov eax,100 ; return value goes in eax
    leave       ; undo the compiler's function prologue
    ret
  }
}
Note that:
  • The keyword is __asm
  • The assembly code is wrapped in curly braces
  • The destination register is on the *left*, just like nasm, the "Intel syntax".

Note also that I've used the "leave" instruction to clean up foo's stack frame (mov esp,ebp; pop ebp;) before returning.  The compiler secretly generates the corresponding function prologue at the start of the function.

Microsoft Outside Variable Access:

In Microsoft Visual C or C++, you can read and write variables from the program by just giving their names.

Simple example:
   void *frame; /* Frame pointer */
   __asm mov frame,ebp;
Complicated example:
int foo(void) {
  int joe=1234, fred;
  __asm {
    mov eax,joe  ; eax = joe;
    add eax,2    ; eax += 2;
    mov fred,eax ; fred = eax
  }
  return fred;
}
This is clearly very convenient!  But what happens if we try to do the same thing with a variable named "al"?  (Remember, "al" is a register on x86!)

GCC Inline Assembly:

Here's an example of how to declare a little assembly snippet inside C++ code using gcc, the default compiler on most Linux/UNIX systems (clang, the default on MacOS, accepts the same syntax):
int foo(void) {
  __asm__( /* Assembly function body */
"  mov $100,%eax	\n"
"  ret	\n"
  );
  return 1; // never gets here, due to ret (this assumes the compiler set up no stack frame that needs undoing)
}

Note that:

  • The keyword is __asm__
  • The assembly code is wrapped in parentheses.
  • The assembly code shows up as a string
  • There are weird symbols in front of constants ($ means constant) and registers (% means register)
  • DYSLEXIA ALERT: GCC sasembly is abckwards.  The destination register goes at the *end* of the instruction, the "AT&T syntax".

The bottom line is just to use the __asm__ keyword, which takes the assembly code as a big string.  Because the string needs newlines (assembly is line-oriented), even the "macro stringification" trick doesn't help here.
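If your compiler supports C++11, a raw string literal is one way to carry the newlines without typing "\n" on every line.  Here's a minimal sketch, assuming gcc or clang on x86; the function name hundred is made up for illustration, and the "=r" output operand is borrowed from the variable-access syntax covered in the next section, so the compiler does the return itself instead of relying on a bare ret:

```cpp
// Sketch only: assumes gcc or clang targeting x86 (AT&T syntax).
// "hundred" is a hypothetical name, not part of the lecture code.
int hundred(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  int result;
  __asm__(R"(
    mov $100,%0
  )"                  /* raw string: the line breaks are part of the text */
    : "=r" (result)   /* %0: the compiler picks a register for result */
  );
  return result;
#else
  return 100;         /* plain C++ fallback for other compilers or CPUs */
#endif
}
```

Calling hundred() should hand back 100 either way; the only thing that changes is whether the value came from the assembly or from the fallback branch.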

GCC Outside Variable Access:

Accessing outside variables is truly hideous in gcc inline assembly.

Simple example:
   void *frame; /* Frame pointer */
   __asm__ ("mov %%ebp,%0":"=r"(frame)); /* 32-bit x86; on x86-64 the register is %%rbp */
Complicated example:
int foo(void) {
  int joe=1234, fred;
  __asm__(
    "  mov %1,%%eax\n"
    "  add $2,%%eax\n"
    "  mov %%eax,%0\n"
    : "=r" (fred) /* %0: Output variable list */
    : "r" (joe)   /* %1: Input variable list */
    : "%eax"      /* Overwritten registers ("Clobber list") */
  );
  return fred;
}

The __asm__ keyword can take up to four parts, separated by colons:
  • The assembly code, as a string.  Once you list any arguments, registers need to be prefixed with "%%", not just "%", to distinguish them from numbered arguments like %0 and %1.
  • A comma-separated list of output arguments.  These can go into registers ("=r"), memory ("=m"), etc.
  • A comma-separated list of input arguments.
  • A comma-separated list of overwritten registers ("trashed" registers).  The compiler then knows not to put anything important in these registers.
See the gcc manual for so many hideous details, you'll want to cry.  Seriously, this is so ugly it makes me doubt my faith in Linux.
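Two features from the gcc manual take some of the sting out.  The sketch below assumes gcc or clang on x86, and the function names are made up for illustration.  A "+r" constraint marks one variable as both input and output, so you don't have to name a scratch register at all, and %[name] lets you refer to arguments by name instead of by number:

```cpp
// Sketch only: assumes gcc or clang targeting x86 (AT&T syntax).
// add_two and add_two_named are hypothetical names.
int add_two(int value) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __asm__(
    "add $2,%0\n"      /* value += 2, in whatever register the compiler chose */
    : "+r" (value)     /* %0: read AND written */
    :                  /* no separate inputs */
    : "cc"             /* add changes the flags */
  );
#else
  value += 2;          /* plain C++ fallback for other compilers or CPUs */
#endif
  return value;
}

int add_two_named(int joe) {
  int fred;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __asm__(
    "mov %[in],%[out]\n"   /* fred = joe */
    "add $2,%[out]\n"      /* fred += 2 */
    : [out] "=r" (fred)
    : [in] "r" (joe)
    : "cc"
  );
#else
  fred = joe + 2;      /* plain C++ fallback */
#endif
  return fred;
}
```

Both functions compute the same thing as the complicated example above (1234 in, 1236 out), but neither one mentions %eax, so there is no register to list as overwritten.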

GCC Whole Function in Assembly:

Partly because GCC's inline assembly syntax is so horrible, it's often easier to just write the whole function (argument access, frame setup, and value return) in assembly.  Visual C++ doesn't seem to allow an __asm block outside a function (the closest thing is __declspec(naked), which on 32-bit x86 drops the compiler's prologue and epilogue so your __asm block is the entire function body), although (in either case) it's easy enough to separately compile a whole file full of pure assembly code and just link it in.

To write a function in assembly, just:
  • Write a C function prototype.  In C++, make the prototype 'extern "C"' to avoid a link error.
  • Put your code in an "__asm__" block outside any subroutine.
  • Put the function name at the start of the assembly block as a label.
  • If you want to call the function from outside that file, use ".globl my_fn" (with your own function's name) to make the subroutine's name visible outside.
Here's a complete example, where my assembly function just adds 100 to its input parameter:
extern "C" long my_fn(long in); /* Prototype */

__asm__( /* Assembly function body */
"my_fn:\n"
"  mov %rdi,%rax\n"
"  add $100,%rax\n"
"  ret\n"
);

int foo(void) {
   return my_fn(3);
}

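This is actually a pretty clean way to do inline assembly in gcc, although you do have to remember the calling convention to find your arguments!  On 64-bit Linux and MacOS, integer arguments arrive in %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order, and the return value goes in %rax.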
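Here's a sketch of the same recipe with two arguments and the ".globl" line included.  It assumes gcc or clang on 64-bit Linux (other platforms use different registers or symbol naming, so they fall back to plain C++), and add_pair is a made-up name:

```cpp
// Sketch only: assumes 64-bit Linux with gcc or clang.
// add_pair is a hypothetical name.
extern "C" long add_pair(long a, long b); /* Prototype */

#if defined(__GNUC__) && defined(__x86_64__) && defined(__linux__)
__asm__( /* Assembly function body, outside any subroutine */
  ".text\n"
  ".globl add_pair\n"    /* make the name visible to other files */
  "add_pair:\n"
  "  mov %rdi,%rax\n"    /* rax = first argument */
  "  add %rsi,%rax\n"    /* rax += second argument */
  "  ret\n"
);
#else
/* Plain C++ fallback for other platforms */
extern "C" long add_pair(long a, long b) { return a + b; }
#endif
```

Because this block has no argument lists, registers take a single "%" here, just like in the my_fn example.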

 


CS 301 Lecture Note, 2014, Dr. Orion Lawlor, UAF Computer Science Department.