Conditionals: Goto and Branch Instructions

CS 301: Assembly Language Programming Lecture, Dr. Lawlor

A jump instruction, like "jmp", just switches the CPU to executing a different piece of code.  It's the assembly equivalent of "goto", but unlike the hated goto in C/C++, jumps are not considered shameful in assembly.  (Dijkstra wrote a letter in 1968 titled "Go To Statement Considered Harmful".  Since then, goto has generally been considered harmful, except in Linux.)

XKCD goto / velociraptor comic


You say where to jump to using a "jump label", which is just any name with a colon after it.  (The exact same syntax is used in C/C++.)

Assembly jump:

	mov rax,3
	jmp derp
	mov rax,999 ; <- not executed!
derp:
	ret

(Try this in NetRun now!)

C or C++ goto:

	int x=3;
	goto derp;
	x=999;
derp:
	return x;

(Try this in NetRun now!)

 
In both cases, we return 3, because we jump right over the 999 assignment. 
Jumping is somewhat useful for skipping over bad code, but it really gets useful when you add conditional jumps, like this C++:
    if (a >= c) goto summer;

In assembly, this gets split up into two instructions: "cmp rax,rcx" compares the two values (here a is in rax and c is in rcx), then "jge summer" jumps if the compare came out greater than or equal.

Here are the most useful jump instructions.  The conditional versions almost always happen after a "cmp" compare instruction.

Instruction   Useful to...
jmp           Always jump
je            Jump if cmp is equal
jne           Jump if cmp is not equal
jg            Signed >      (greater)
jge           Signed >=
jl            Signed <      (less than)
jle           Signed <=
ja            Unsigned >      (above)
jae           Unsigned >=
jb            Unsigned <      (below)
jbe           Unsigned <=
jrcxz         Jump if rcx is 0  (Seriously!?)
jc            Jump if carry: used for unsigned overflow, or multiprecision add
jo            Jump if there was signed overflow

There are also "n" NOT versions for each jump; for example "jno" jumps if there is NOT overflow.  See the full list of x86 jump instructions here.
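The signed/unsigned split in the table matters because the same bits can compare differently depending on which jump you pick.  Here's a small C++ sketch of that idea; the function names are made up for illustration, and the comments name the jump a compiler would typically pair with each comparison, which is an assumption about the generated code rather than a guarantee.

```cpp
#include <cstdint>

// Hypothetical helpers: each comparison is roughly what a "cmp"
// followed by the named conditional jump would test.
bool signed_less(std::int64_t a, std::int64_t b) {
    return a < b;   // signed compare, as with jl
}

bool unsigned_below(std::uint64_t a, std::uint64_t b) {
    return a < b;   // unsigned compare, as with jb
}

// Unsigned addition wraps around on overflow, and the wrapped sum is
// smaller than the first input.  That wraparound is the situation the
// carry flag (and so jc) reports after an add.
bool add_carries(std::uint64_t a, std::uint64_t b) {
    std::uint64_t sum = a + b;
    return sum < a;
}
```

The bit pattern for -1 is all ones, so as a signed value it is less than 1, but as an unsigned value it is the largest possible number and is not below 1.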

Conditional Jumps: Branching in Assembly

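In assembly, all branching is done using two types of instruction: a compare instruction, like "cmp", which compares two values; and a conditional jump instruction, like "je", which jumps only if the compare came out the right way.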

Here's how to use compare and jump-if-equal ("je"):

	mov rax,3
	cmp rax,3 ; how does rax compare with 3?
	je lemme_outta_here ; if it's equal, then jump
	mov rax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
	ret

(Try this in NetRun now!)

Here's compare and jump-if-less-than ("jl"):

	mov rax,1
	cmp rax,3 ; how does rax compare with 3?
	jl lemme_outta_here ; if it's less, then jump
	mov rax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
	ret

The C++ equivalent to compare-and-jump-if-whatever is "if (something) goto somewhere;".
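As a sketch of that equivalence, here are the two assembly examples above rewritten as C++ functions.  The function names and the "value" parameter are hypothetical (the assembly versions hardcode 3 and 1); the parameter is there so you can also watch the case where the jump is not taken.

```cpp
// Mirrors the "je" example: returns value if it equals 3, else 999.
long jump_if_equal(long value) {
    long result = value;                      // mov rax,<value>
    if (result == 3) goto lemme_outta_here;   // cmp rax,3 then je
    result = 999;                             // skipped when we jump
lemme_outta_here:
    return result;                            // ret
}

// Mirrors the "jl" example: returns value if it is below 3, else 999.
long jump_if_less(long value) {
    long result = value;                      // mov rax,<value>
    if (result < 3) goto lemme_outta_here;    // cmp rax,3 then jl
    result = 999;                             // skipped when we jump
lemme_outta_here:
    return result;                            // ret
}
```

Calling jump_if_equal(3) or jump_if_less(1) reproduces the assembly examples; any input that fails the test falls through to the 999 assignment.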

Loops

To loop, you just jump back to the start of the code.  Somewhere you do need a conditional, or you've made an infinite loop!

In this example, we count down on rdi until it hits zero.  We also indented the loop body, for readability.

; rdi is our first function argument
mov rax,0 ; sum added here
start: ; loop begins here
	add rax,10 ; add each time around the loop
	sub rdi,1 ; loop increment
	cmp rdi,0 ; loop test
	jg start ; continue loop if rdi>0
ret

(Try this in NetRun now!)

This is line-by-line equivalent to this C++ code: 

int foo(int bar) {
    int sum=0;
start:
        sum += 10;
        bar--;
        if (bar>0) goto start;
    return sum;
}
(Try this in NetRun now!)

 Of course, that's very ugly C++ code!  It's more idiomatic to write a "for" loop here:

int foo(int bar) {
	int sum=0;
	for (int count=bar; count>0; count--) 
		sum+=10;
	return sum;
}

(Try this in NetRun now!)
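The same jump-back-to-the-start pattern works for other loops too.  As one more sketch, here's a goto loop that adds up n + (n-1) + ... + 1.  The function name is made up, the assembly in the comments is just one plausible translation (assuming n arrives in rdi), and the loop assumes n is at least 1, because the test sits at the bottom and the body always runs once.

```cpp
// Sum of n + (n-1) + ... + 1, written with a bottom-tested goto loop.
// Assumes n >= 1.
long sum_to_n(long n) {
    long sum = 0;             // mov rax,0
start:                        // loop begins here
    sum += n;                 // add rax,rdi
    n -= 1;                   // sub rdi,1
    if (n > 0) goto start;    // cmp rdi,0 then jg start
    return sum;               // ret
}
```

For example, sum_to_n(4) adds 4+3+2+1.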

More Complex Control Flow: C--

You can actually write a very peculiar variant of C++, where "if" statements only contain "goto" statements.  My joke name for this assembly-style C++ is "C--": you only use "+=" and "*=" arithmetic, and "if (simple test) goto somewhere;" flow control.

For example this is perfectly legal C++ in the "C--" style:

#include <iostream>

int main() {
	int i=0;
	if (i>=10) goto byebye;
	std::cout<<"Not too big!\n";
byebye:
	return 0;
}

This way of writing C++ is quite similar to assembly--in fact, there's a nearly one-to-one correspondence between lines of C++ code written this way and assembly instructions (an "if (simple test) goto somewhere;" line becomes a compare followed by a conditional jump).  More complicated C++, like the "for" construct, expands out to many lines of assembly.

	int i, n=10;
	for (i=0;i<n;i++) {
		std::cout<<"In loop: i=="<<i<<"\n";
	}

Here's one expanded version of this C/C++ "for" loop:

	int i=0, n=10;
start:
	std::cout<<"In loop: i=="<<i<<"\n";
	i++;
	if (i<n) goto start;

(executable NetRun link)

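You've got to convince yourself that this is really equivalent to the "for" loop in all cases.  Careful--if n is a parameter, it's not!   (What if i>=n before the loop starts, such as when n is 0?  The "for" loop runs zero times, but the expanded version still runs the body once.)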
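One way to check the equivalence is to count how many times the body runs in each version.  This is a minimal sketch with hypothetical function names; the cout line is replaced by a counter so the result is easy to compare.

```cpp
// The ordinary "for" loop: tests i<n before the first pass.
int count_for(int n) {
    int iterations = 0;
    for (int i = 0; i < n; i++) iterations++;
    return iterations;
}

// The expanded version from above: the test only happens at the bottom.
int count_expanded(int n) {
    int iterations = 0;
    int i = 0;
start:
    iterations++;            // loop body
    i++;
    if (i < n) goto start;
    return iterations;
}
```

The two agree whenever n is 1 or more, but for n of 0 or less the "for" loop runs no times at all while the expanded version runs exactly once.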

All C flow-control constructs can be written using just "if" and "goto", which usually map one-to-one to a compare-and-jump sequence in assembly.  Some of these are a little tricky to figure out though.

Normal C                  Expanded C

if (A) {                  if (!A) goto END;
  ...                     {
}                           ...
                          }
                          END:

if (!A) {                 if (A) goto END;
  ...                     {
}                           ...
                          }
                          END:

if (A&&B) {               if (!A) goto END;
  ...                     if (!B) goto END;
}                         {
                            ...
                          }
                          END:

if (A||B) {               if (A) goto STUFF;
  ...                     if (B) goto STUFF;
}                         goto END;
                          STUFF:
                          {
                            ...
                          }
                          END:

while (A)  {              goto TEST;
  ...                     START:
}                         {
                            ...
                          }
                          TEST: if (A) goto START;

do {                      START:
  ...                     {
} while (A);                ...
                          }
                          if (A) goto START;

for (i=0;i<n;i++)         i=0;         /* Version A */
{                         goto TEST;
  ...                     START:
}                         {
                            ...
                          }
                          i++;
                          TEST: if (i<n) goto START;

for (i=0;i<n;i++)         i=0;          /* Version B */
{                         START: if (i>=n) goto END;
  ...                     {
}                           ...
                          }
                          i++;
                          goto START;
                          END:
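The && and || rows are the trickiest ones, so here's a sketch you can use to convince yourself they work.  The function names are hypothetical, and a flag stands in for the "..." body; each function reports whether the body would have run.

```cpp
// Expanded form of: if (A&&B) { ... }
bool body_runs_and(bool A, bool B) {
    bool ran = false;
    if (!A) goto END;     // A is false: skip the body without looking at B
    if (!B) goto END;
    ran = true;           // stands in for the { ... } body
END:
    return ran;
}

// Expanded form of: if (A||B) { ... }
bool body_runs_or(bool A, bool B) {
    bool ran = false;
    if (A) goto STUFF;    // A is true: run the body without looking at B
    if (B) goto STUFF;
    goto END;
STUFF:
    ran = true;           // stands in for the { ... } body
END:
    return ran;
}
```

Trying all four combinations of A and B shows these match the ordinary && and || results.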


Note that the last two translations of the "for" concept (labelled Version A and Version B) both compute the same thing.  Which one is faster? If the loop iterates many times, I claim version (A) is normally faster, since there's only one (conditional) goto each time around the loop, instead of two gotos in version (B)--one conditional and one unconditional.  But version (B) is probably faster if n is often 0, because in that case it quickly jumps to END (in one conditional jump). 
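To make that comparison concrete, here's a sketch that counts how many goto statements each version executes (a conditional goto counts once whether or not it ends up jumping).  The struct and function names are hypothetical, and counting gotos is only a rough stand-in for speed: a real CPU's branch prediction and the compiler's own rearrangements can change the picture.

```cpp
struct LoopStats {
    int iterations;   // times the loop body ran
    int jumps;        // goto statements executed, taken or not
};

LoopStats version_a(int n) {
    LoopStats s = {0, 0};
    int i = 0;
    s.jumps++;                 // the unconditional "goto TEST"
    goto TEST;
START:
    s.iterations++;            // loop body
    i++;
TEST:
    s.jumps++;                 // the conditional goto
    if (i < n) goto START;
    return s;
}

LoopStats version_b(int n) {
    LoopStats s = {0, 0};
    int i = 0;
START:
    s.jumps++;                 // the conditional goto
    if (i >= n) goto END;
    s.iterations++;            // loop body
    i++;
    s.jumps++;                 // the unconditional "goto START"
    goto START;
END:
    return s;
}
```

For n iterations, Version A executes n+2 gotos and Version B executes 2n+1, so B only wins when n is 0 (one goto instead of two), and they tie at n of 1.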

CAUTION: Philosophical Content

Nuclear weapons are so powerful, it's generally considered a bad idea to use them for most problems.

Similarly, goto statements are so powerful, it's generally considered a bad idea to use them for most problems.  The problem is that when you see a goto, you can't tell if it's being used to simulate a for, while, function call, or some stranger hybrid of these.  Thus goto may be easy enough to write, but it's more difficult to read.  Since code in big projects only gets written once but gets read many times by many different people over the years, it's more important to make your code easy to read.

I notice most self-taught programmers (myself included) tend to prefer goto, or its slightly classier cousin the while loop, because they're more general.  But after you mess up a "while" loop enough times, such as by leaving off the loop increment and inadvertently making an infinite loop, you eventually default to using "for" loops, because the compiler-mandated syntax "for (int i=0;i<n;++i)" is designed to remind you to include each of the parts you need, and give you a handy compile error if you forget any part of it.