Elsewhere: Calling Functions and Branching in Assembly
CS 301 Lecture, Dr. Lawlor
Reminder: Arithmetic In Assembly
Here's how you add two numbers in assembly:
- Put the first number into a register
- Put the second number into a register
- Add the two registers
- Return the result
Here's the C/C++ equivalent:
int a = 3;
int c = 7;
a += c;
return a;
And finally here's the assembly code:
mov eax, 3
mov ecx, 7
add eax, ecx
ret
(executable NetRun link)
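Here are the x86 arithmetic instructions. Most of them take two
operands, the destination and the source, and leave the result in the
destination. The exceptions are "not", which takes a single register,
and "idiv", which takes only the divisor: it divides the 64-bit value in
edx:eax by that operand, leaving the quotient in eax and the remainder
in edx (so set up edx first, for example with "cdq").

Opcode   Does   Example
add      +      add eax,ecx
sub      -      sub eax,ecx
imul     *      imul eax,ecx
idiv     /      idiv ecx
and      &      and eax,ecx
or       |      or eax,ecx
xor      ^      xor eax,ecx
not      ~      not eax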
Be careful doing these! Assembly is *line* oriented, so you can't say:
add (sub eax,ecx),edx
but you can say:
sub eax,ecx
add eax,edx
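As a rough C++ sketch of the same rule, a nested expression gets unrolled into one operation per statement. The function name sub_then_add is made up for illustration, and each parameter simply stands in for the register of the same name:

```cpp
// Hypothetical helper: computes (eax - ecx) + edx one operation at a time,
// mirroring the two-instruction assembly sequence above.
// The parameters are ordinary ints that stand in for registers.
int sub_then_add(int eax, int ecx, int edx) {
    eax -= ecx;   // sub eax,ecx
    eax += edx;   // add eax,edx
    return eax;   // the result is left in "eax"
}
```

Calling sub_then_add(10, 3, 5) works through 10 - 3 = 7, then 7 + 5 = 12.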
Calling Functions from Assembly
You use the "call" instruction to call functions. You can
actually call C++'s "cout" if you're sufficiently dedicated, but the builtin NetRun
functions are designed to be a little easier to call. You first
need to tell the assembler that "read_input" is an external
function. All you do is say "extern read_input". Then you
run that function, with "call read_input", and the CPU will execute the
read_input function until it returns. Before read_input returns,
it puts the read-in value into eax, where you can grab it.
So this assembly program reads an integer and returns it:
extern read_input
call read_input
ret
(executable NetRun link)
Be careful, though! The read_input function can and will use all
the other registers for its own purposes. In particular, it's
tricky to call read_input twice to read two numbers, since you need to
stash the first number somewhere other than registers during the second
call!
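For comparison, here is a sketch of the two-reads problem in C++, where the compiler does the stashing for you. The read_input below is only a stand-in that hands back canned values (an assumption for illustration, not NetRun's real function), and read_two_and_add is a made-up name:

```cpp
// Stand-in for NetRun's read_input: returns canned values instead of
// reading from the user. This is an assumption for illustration only.
static const int canned_inputs[] = {3, 7};
static int next_input = 0;

int read_input() {
    int value = canned_inputs[next_input % 2];
    next_input++;
    return value;
}

// Hypothetical example: read two numbers and add them.
int read_two_and_add() {
    int first = read_input();   // result of the first call
    int second = read_input();  // "first" has to survive this second call
    return first + second;
}
```

In C++ the local variable "first" quietly solves the problem: the compiler has to keep it somewhere the second call won't trash. In assembly, finding that safe place is your job.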
Jumps
A jump instruction, like "jmp", just switches the CPU to executing a
different piece of code. It's the assembly equivalent of "goto",
but unlike goto, jumps are not considered shameful in assembly.
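You say where to jump to using a "jump label", which is just a name
with a colon after it. (C/C++ uses the same syntax for goto labels.)
For example, load 3 into eax, "jmp" to a label placed right before
"ret", and put a "mov eax,999" between the jump and the label; the
C/C++ version does the same thing with "goto".
In both cases, we return 3, because we jump right over the 999
assignment. Jumping is somewhat useful for skipping over bad
code, but it really gets useful when you add conditional jumps...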
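The unconditional jump described above can be sketched in C++ with goto. The function name jump_over and the label name done are made up for illustration, and the comments show the assembly instruction each line roughly corresponds to:

```cpp
// Hypothetical example: an unconditional jump skips the 999 assignment.
int jump_over() {
    int x = 3;     // mov eax,3
    goto done;     // jmp done
    x = 999;       // mov eax,999   <- never executed
done:              // jump label: a name followed by a colon
    return x;      // ret
}
```

Here jump_over() returns 3, since control never reaches the 999 assignment.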
Conditional Jumps: Branching in Assembly
In assembly, all branching is done with two types of instruction:
- A compare instruction, like "cmp", compares two values.
- A conditional jump instruction, like "je" (jump-if-equal), does a goto somewhere if the two values satisfy the right condition.
Here's how to use compare and jump-if-equal ("je"):
mov eax,3
cmp eax,3 ; how does eax compare with 3?
je lemme_outta_here ; if it's equal, then jump
mov eax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
ret
(Try this in NetRun now!)
Here's compare and jump-if-less-than ("jl"):
mov eax,1
cmp eax,3 ; how does eax compare with 3?
jl lemme_outta_here ; if it's less, then jump
mov eax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
ret
(Try this in NetRun now!)
The C++ equivalent to compare-and-jump-if-whatever is "if (something) goto somewhere;".
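Spelled out as a sketch, the two assembly examples above look roughly like this in C++. The function names are made up, and the parameter x stands in for eax so you can try different starting values:

```cpp
// Hypothetical C++ version of the cmp / je example; x plays the role of eax.
int jump_if_equal(int x) {
    if (x == 3) goto lemme_outta_here;  // cmp eax,3 then je lemme_outta_here
    x = 999;                            // skipped when the jump is taken
lemme_outta_here:
    return x;                           // ret
}

// Hypothetical C++ version of the cmp / jl example.
int jump_if_less(int x) {
    if (x < 3) goto lemme_outta_here;   // cmp eax,3 then jl lemme_outta_here
    x = 999;                            // skipped when the jump is taken
lemme_outta_here:
    return x;                           // ret
}
```

With x = 3, jump_if_equal returns 3; with any other value it falls through and returns 999. Likewise jump_if_less returns x unchanged when x is below 3, and 999 otherwise.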
Also, check out the machine code generated for the conditional
jump--the jump destination is encoded as the number of bytes of machine
code to skip over. For example, the "jl" above gets encoded in
machine code like this:
   0:  b8 01 00 00 00    mov eax,0x1
   5:  83 f8 03          cmp eax,0x3
   8:  7c 05             jl  f <foo+0xf>
   a:  b8 e7 03 00 00    mov eax,0x3e7
   f:  c3                ret
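The distance to jump, the "05" byte right after the "7c" opcode, is
five bytes, because the code we're skipping over is five bytes long:
the jump is measured from the end of the "jl" instruction (address 0xa),
and 0xa + 5 = 0xf, the address of "ret". Note that a jump
label doesn't show up in machine code at all--it's just used by the
assembler to figure out how far to jump.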
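The arithmetic the CPU does with that byte can be sketched in C++. This assumes the two-byte "short" form of the jump shown in the listing (one opcode byte plus one signed displacement byte), and the function name short_jump_target is made up for illustration:

```cpp
#include <cstdint>

// Hypothetical helper: where does a two-byte short jump land?
// jump_address is the address of the jump instruction itself;
// displacement_byte is the byte that follows the opcode.
std::uint32_t short_jump_target(std::uint32_t jump_address,
                                std::uint8_t displacement_byte) {
    // The displacement is counted from the *next* instruction,
    // which starts two bytes later (opcode + displacement).
    const std::uint32_t next_instruction = jump_address + 2;
    // The byte is signed, so values like 0xfe mean "jump backward".
    const std::int8_t signed_offset =
        static_cast<std::int8_t>(displacement_byte);
    return next_instruction + signed_offset;
}
```

For the "jl" at address 0x8 with displacement 0x05, this gives 0x8 + 2 + 5 = 0xf, matching the listing. A displacement of 0xfe is -2, which sends a short jump right back to its own address.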