Arithmetic In Assembly
CS 301 Lecture, Dr. Lawlor
Simple Assembly
Try out this code in NetRun:
mov eax, 1492 ; Comment
(executable NetRun link)
This is NASM assembly code for x86; you'll have to select both from the
NetRun dropdowns. You can also just click on the link above.
OK, so this returns 1492. But why? What the heck does this mean? Let's look at it piece by piece:
- mov is the assembler "mnemonic" (memorable human-readable string) for the x86 "move" instruction/opcode.
- eax is the register we're moving *to*. Note that the assignment target is the first thing listed. The register eax stores the return value of a function.
- 1492 is the thing we're moving: the constant. You can write constants in decimal (normally) or hex (by starting with 0x), exactly like C. You can also do arithmetic and bitwise manipulation in your constants, like "mov eax, 3+(15&7)". You CANNOT use registers inside a constant expression, though! The assembler works out the constant when it assembles the line, before the program ever runs.
- The semicolon begins a comment; it does NOT end the statement the way it does in C. Since comments are optional, semicolons are OPTIONAL too!
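Because constants are written exactly like C, you can check a constant expression in C before pasting it into assembly. Here's a minimal sketch; the function names are made up for illustration:

```c
#include <assert.h>

/* Mirrors "mov eax, 3+(15&7)" followed by "ret". */
int constant_example(void) {
    return 3 + (15 & 7); /* 15 & 7 is 7, so the result is 10 */
}

/* Small self-check: decimal and hex are two spellings of one number. */
void check_constants(void) {
    assert(0x5D4 == 1492);
    assert(constant_example() == 10);
}
```

Calling check_constants() should pass silently if the arithmetic above is right.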
How is this different from C? Well, like most assemblers,
- It's line-oriented. Try putting a newline after the
comma--it doesn't compile (er, assemble). Each line has one
instruction.
- It's not case sensitive. "MoV" and "eAx" work just as well as above.
All the assembler does is take this line, and spit out the corresponding machine code. NetRun disassembles the resulting machine code as:
Disassembly of section .text:
00000000 <foo>:
0: b8 d4 05 00 00 mov eax,0x5d4
5: c3 ret
6: c3 ret
The x86 "mov 32-bit immediate value into register eax" instruction
opcode is 0xB8. It's followed by the 32-bit (4-byte) value to
move, stored in the little-endian byte order that is standard on x86, so 1492 (decimal) becomes 0x000005D4 (hex), and the "0xD4" byte comes first, followed by the higher-value bytes.
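To see the byte order concretely, here's a short C sketch that builds the same bytes by hand. The function name encode_mov_eax_ret is an invention for this example, not part of NetRun or any library:

```c
#include <assert.h>
#include <stdint.h>

/* Writes the 6 bytes of "mov eax, value" followed by "ret" into out. */
void encode_mov_eax_ret(uint32_t value, unsigned char out[6]) {
    assert(out != 0);
    out[0] = 0xB8;                                   /* opcode: mov eax, imm32 */
    out[1] = (unsigned char)(value & 0xFF);          /* lowest byte goes first */
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)((value >> 16) & 0xFF);
    out[4] = (unsigned char)((value >> 24) & 0xFF);  /* highest byte goes last */
    out[5] = 0xC3;                                   /* opcode: ret */
}
```

Encoding 1492 this way should give b8 d4 05 00 00 c3, matching the first five bytes of the disassembly plus one ret.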
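You can *run* machine code like this from any C/C++ program by getting the
bytes into memory somewhere, and then executing the bytes. Here the constant is 0xBEEF instead of 1492:
unsigned char my_code[] = {
0xb8, 0xEF, 0xBE, 0x00, 0x00, /* mov eax, 0xBEEF (constant stored little-endian) */
0xc3 /* ret */
};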
int foo(void) {
typedef int (*fn_t)(void); /* define "fn_t" as a function pointer type */
fn_t my_fn=(fn_t)my_code; /* cast "my_code" array into a function pointer */
return my_fn(); /* execute the "my_code" array */
}
(executable NetRun link)
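It's not very common to write machine code by hand like this, but it's
fun! And there are situations where it's useful to write a
program that builds and calls a little piece of machine code, like a Just-In-Time compiler.
Be aware that many modern operating systems mark ordinary data as non-executable, so outside NetRun this trick may crash unless the bytes live in memory that is explicitly marked executable.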
Arithmetic In Assembly
Here's how you add two numbers in assembly:
- Put the first number into a register
- Put the second number into a register
- Add the two registers
- Return the result
Here's the C/C++ equivalent:
int a = 3;
int c = 7;
a += c;
return a;
And finally here's the assembly code:
mov eax, 3
mov ecx, 7
add eax, ecx
ret
(executable NetRun link)
Here are the x86 arithmetic instructions. Note that most of them
take just two operands, the destination and the source. The exceptions are "not", which takes one register, and "idiv", which takes only the divisor: it divides edx:eax by that operand, leaving the quotient in eax and the remainder in edx.
Opcode   Does   Example
add      +      add eax,ecx
sub      -      sub eax,ecx
imul     *      imul eax,ecx
idiv     /      idiv ecx
and      &      and eax,ecx
or       |      or eax,ecx
xor      ^      xor eax,ecx
not      ~      not eax
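Division is the odd one out, so here's a C sketch of what it computes. On x86, idiv takes a single operand (the divisor) and divides the 64-bit value in edx:eax by it, so 32-bit code normally runs "cdq" first to sign-extend eax into edx. The struct and function names below are hypothetical, and the model only covers that "cdq" then "idiv ecx" case:

```c
#include <assert.h>

/* Hypothetical model of "cdq" followed by "idiv ecx". */
struct idiv_result {
    int quotient;  /* what ends up in eax */
    int remainder; /* what ends up in edx */
};

struct idiv_result model_idiv(int eax, int ecx) {
    struct idiv_result r;
    assert(ecx != 0);        /* the real instruction faults on divide by zero */
    r.quotient = eax / ecx;  /* C division truncates toward zero, like idiv */
    r.remainder = eax % ecx; /* remainder takes the sign of the dividend */
    return r;
}
```

For example, model_idiv(-7, 2) gives a quotient of -3 and a remainder of -1.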
Be careful doing these! Assembly is *line* oriented, so you can't say:
add (sub eax,ecx),edx
but you can say:
sub eax,ecx
add eax,edx
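In C terms, the two-line version is just a nested expression unrolled into one statement per instruction. Here's a sketch, with C variables named after the registers purely for illustration:

```c
#include <assert.h>

/* Computes (eax - ecx) + edx one step at a time. */
int sub_then_add(int eax, int ecx, int edx) {
    eax -= ecx; /* sub eax,ecx */
    eax += edx; /* add eax,edx */
    return eax;
}

/* Small self-check: (10 - 3) + 5 is 12. */
void check_sub_then_add(void) {
    assert(sub_then_add(10, 3, 5) == 12);
}
```

Each C statement lines up with exactly one assembly line, which is the habit to get into when translating expressions.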
Reading Input in Assembly
You use the "call" instruction to call functions. You can
actually call cout if you're really dedicated, but the builtin NetRun
functions are designed to be a little easier to call. You first
need to tell the assembler that "read_input" is an external
function. All you do is say "extern read_input". Then you
run that function, with "call read_input", and the CPU will execute the
read_input function until it returns. Before read_input returns,
it puts the read-in value into eax, where you can grab it.
So this assembly program reads an integer and returns it:
extern read_input
call read_input
ret
(executable NetRun link)
Be careful, though! The read_input function can and will use all
the other registers for its own purposes. In particular, it's
tricky to call read_input twice to read two numbers, since you need to
stash the first number somewhere other than registers during the second
call!
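In C the compiler does this stashing for you: a local variable that has to survive a function call gets kept somewhere the called function won't clobber. Here's a sketch of the two-read case. The read_input_stub function is a made-up stand-in that hands back canned values; it is NOT NetRun's real read_input:

```c
#include <assert.h>

/* Hypothetical stand-in for read_input: returns 3, then 7. */
static int fake_inputs[] = {3, 7};
static int next_input = 0;

int read_input_stub(void) {
    assert(next_input < 2); /* only two canned values are available */
    return fake_inputs[next_input++];
}

/* Reads two numbers and returns their sum. */
int add_two_inputs(void) {
    int first = read_input_stub();  /* must be kept safe across the next call */
    int second = read_input_stub(); /* this call is free to reuse scratch registers */
    return first + second;
}
```

In assembly you have to do by hand what the compiler does for the variable named first here.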