Inline Assembly Language

CS 301: Assembly Language Programming Lecture, Dr. Lawlor

It's possible on most compilers to include a little bit of assembly code right inside your C or C++ file, called "inline assembly" because the assembly is inside the C/C++. How you do this depends on what compiler you're using.

The downside of inline assembly is you're tied to a particular combination of hardware instruction set, and software interface to access it. The upside is you can access the full feature set of the bare hardware, without needing to write the entire program in assembly.

Microsoft Inline Assembly:

Here's a simple example in Microsoft Visual C++ inline assembly:

int foo(void) {
  int x=3;
  __asm{
   rol x,5  // rotate the bits of x left by 5
  };
  return x;
}

Note that:

The keyword is __asm
The assembly code is wrapped in curly braces, and uses C++ comments, but is line-oriented
You can directly access C++ variables from assembly

A more complicated example:

int foo(void) {
  int joe=1234, fred;
  __asm{
   mov eax,joe  ; eax = joe;
   add eax,2    ; eax += 2;
   mov fred,eax ; fred = eax
  };
  return fred;
}

This is clearly very convenient! But what happens if we try to do the same thing with a variable named "al"? (Remember, "al" is a register on x86!)

Also, a given variable may be stored in a register, or in memory somewhere. If we use an instruction that will only accept registers, but the variable is actually stored in memory, the code may not compile ("invalid operand type"). The solution is to manually copy the variable into a register, like we did with eax above.
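As a hedged sketch of that register-copy workaround: the bswap instruction accepts only a register operand, so the variable is copied into eax, modified there, and copied back. The __asm branch applies only to 32-bit Visual C++ (selected with the predefined macros _MSC_VER and _M_IX86). The function name swap_bytes and the plain C++ fallback branch are assumptions added here so the example still builds with other compilers.

```cpp
// Reverse the byte order of a 32-bit value (assumes unsigned int is 32 bits).
unsigned int swap_bytes(unsigned int x) {
#if defined(_MSC_VER) && defined(_M_IX86)
  // 32-bit Visual C++: bswap takes only a register operand,
  // so copy x into eax, operate there, and copy the result back.
  __asm {
    mov eax, x
    bswap eax
    mov x, eax
  }
#else
  // Fallback for other compilers (plain C++, not inline assembly).
  x = (x >> 24) | ((x >> 8) & 0x0000FF00u) |
      ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
  return x;
}

int foo(void) {
  return (int)swap_bytes(0x11223344u); // 0x44332211
}
```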

The other big limitation: __asm only works in Microsoft Visual Studio, and only in 32-bit mode, so no "rax" registers, only "eax" registers. In 64-bit mode ("x64" in the drop down), Microsoft now wants you to use intrinsic functions, not assembly.
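For comparison, here is a minimal sketch of the intrinsic approach, redoing the earlier rotate example. On Visual C++, _rotl is declared in <stdlib.h> and works in both 32-bit and x64 builds. The wrapper name rotate_left and the shift-based fallback for other compilers are assumptions added for illustration.

```cpp
#include <stdlib.h> // declares _rotl on Visual C++

// Rotate the bits of x left by n positions (n must be 1..31 here).
unsigned int rotate_left(unsigned int x, int n) {
#if defined(_MSC_VER)
  return _rotl(x, n); // compiler intrinsic, no __asm block needed
#else
  // Fallback for other compilers: build the rotation from two shifts.
  return (x << n) | (x >> (32 - n));
#endif
}

int foo(void) {
  return (int)rotate_left(3, 5); // 3 rotated left by 5 bits is 96
}
```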

Defining Functions in Inline Assembly

You get much more control if you define the entire function in assembly language. This is also the only decent way to make inline assembly work on Mac OS X or Linux with the gcc/g++ compiler (the default GNU inline assembly syntax is terrible).

extern "C" long my_func(long x);

__asm__ (
".intel_syntax noprefix /* use good syntax */ \n\
.text /* make executable */  \n\
my_func:\n\
  mov rax,rdi\n\
  add rax,100\n\
  ret\n\
\n\
.att_syntax prefix\n"
);


long foo(void) {
	int x=5;
	return my_func(x);
}

(Try this in NetRun now!)

Notice that when we define the whole function in assembly:

We need to declare a function prototype in C++ as extern "C", so we can call the assembly function by its plain, unmangled name.

On many platforms the C compiler also adds an underscore before function names, so you'd declare "_my_func:\n" on those systems (e.g., OS X)

We need to obey the system's parameter passing conventions:

32-bit systems normally pass function parameters on the stack (push arg2; push arg1; call fn; pop arg1; pop arg2). Keep in mind that rax, rdi, etc. do not exist in 32-bit mode, so use eax, edi, etc.
64-bit Windows passes the first parameter in rcx, then rdx, r8, and r9. Also, rdi and rsi are not scratch registers here: a function that uses them must save and restore them.
64-bit OS X or Linux systems (like NetRun) pass the first parameter in rdi, then rsi, rdx, rcx, r8, and r9.
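The underscore rule can be handled without editing the assembly by hand. The sketch below assumes gcc or clang, which predefine the macro __USER_LABEL_PREFIX__ (it expands to _ on OS X and to nothing on Linux). The helper macros STR and STR2 are hypothetical names, and the plain C++ fallback is an assumption added so the example also builds on platforms where the assembly does not apply.

```cpp
// Turn a macro's expansion into a string literal
// (two levels, so the argument is expanded before being quoted).
#define STR2(x) #x
#define STR(x) STR2(x)

extern "C" long my_func(long x);

#if defined(__GNUC__) && defined(__x86_64__) && !defined(_WIN32)
// 64-bit Linux or OS X: the first parameter arrives in rdi.
// Adjacent string literals are joined, so the prefix lands before the name.
__asm__ (
  ".intel_syntax noprefix\n"
  ".text\n"
  ".globl " STR(__USER_LABEL_PREFIX__) "my_func\n"
  STR(__USER_LABEL_PREFIX__) "my_func:\n"
  "  mov rax,rdi\n"
  "  add rax,100\n"
  "  ret\n"
  ".att_syntax prefix\n"
);
#else
// Fallback for other platforms (plain C++, not assembly).
extern "C" long my_func(long x) { return x + 100; }
#endif

long foo(void) {
  return my_func(5); // 105
}
```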

It's usually a good idea to verify these things by building and calling some very simple functions before building something big and complex!
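One such simple check, sketched for 64-bit Linux with gcc: subtraction depends on operand order, so the result shows whether the first parameter really is in rdi and the second in rsi. The function name first_minus_second is hypothetical, and the plain C++ fallback is an assumption added so the example builds on other platforms too.

```cpp
extern "C" long first_minus_second(long a, long b);

#if defined(__GNUC__) && defined(__x86_64__) && defined(__linux__)
__asm__ (
  ".intel_syntax noprefix\n"
  ".text\n"
  ".globl first_minus_second\n"
  "first_minus_second:\n"
  "  mov rax,rdi\n"   // first parameter
  "  sub rax,rsi\n"   // minus second parameter
  "  ret\n"
  ".att_syntax prefix\n"
);
#else
// Fallback for other platforms (plain C++, not assembly).
extern "C" long first_minus_second(long a, long b) { return a - b; }
#endif

long foo(void) {
  // 7 if the parameters arrive in the expected registers;
  // -7 would mean the two registers are swapped.
  return first_minus_second(10, 3);
}
```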

In C++11 mode, you can use the raw string literal syntax to avoid the weird line continuations above. This is what the code looks like for Linux (no underscores) in 64-bit mode (parameters in rdi):

extern "C" long my_func(long x);

__asm__ ( R"ASM( .intel_syntax noprefix /* use good syntax */ 
.text /* make executable */  
my_func:
  mov rax,rdi
  add rax,100
  ret

.att_syntax prefix )ASM" );


long foo(void) {
	int x=5;
	return my_func(x);
}

(Try this in NetRun now!)

On 32-bit Windows, the function name will have an underscore in front of it in assembly, you need eax instead of rax, and your function parameters are passed on the stack. So for 32-bit Windows with Code::Blocks, you'd use:

extern "C" long my_func(long x);

__asm__ ( R"ASM( .intel_syntax noprefix /* use good syntax */ 
.text /* make executable */  
_my_func:
  mov eax, DWORD PTR [esp + 4] /* first parameter */
  add eax,100
  ret

.att_syntax prefix )ASM" );


long foo(void) {
	int x=5;
	return my_func(x);
}

The assembler I use on NetRun, nasm, doesn't require or allow the PTR keyword for memory accesses. Most other assemblers, including Microsoft MASM and the GNU assembler, require the keyword PTR between a size name and the memory operand, as in DWORD PTR [esp + 4]. (If you forget the PTR, Microsoft errors out; GNU silently moves the memory access to a different address!)
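Here is a small sketch of PTR in use with the GNU assembler's Intel syntax on 64-bit Linux: the function reads two adjacent long values through a pointer passed in rdi. The function name sum_pair is hypothetical, and the plain C++ fallback is an assumption added so the example builds on other platforms.

```cpp
extern "C" long sum_pair(const long *p);

#if defined(__GNUC__) && defined(__x86_64__) && defined(__linux__)
__asm__ (
  ".intel_syntax noprefix\n"
  ".text\n"
  ".globl sum_pair\n"
  "sum_pair:\n"
  "  mov rax, QWORD PTR [rdi]\n"      // load p[0]
  "  add rax, QWORD PTR [rdi + 8]\n"  // add p[1], 8 bytes further on
  "  ret\n"
  ".att_syntax prefix\n"
);
#else
// Fallback for other platforms (plain C++, not assembly).
extern "C" long sum_pair(const long *p) { return p[0] + p[1]; }
#endif

long foo(void) {
  long pair[2] = {40, 2};
  return sum_pair(pair); // 42
}
```

In nasm, the same loads would be written without PTR, for example mov rax, QWORD [rdi].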