Goto and Branch Instructions

A jump instruction, like "jmp", just switches the CPU to executing a different piece of code.  It's the assembly equivalent of "goto", but unlike goto, jumps are not considered shameful in assembly.  (Dijkstra published a letter in 1968 titled "Go To Statement Considered Harmful".  Since then, goto has generally been considered harmful, except in Linux.)

[Comic: XKCD on goto and velociraptors]


You say where to jump to using a "jump label", which is just any name with a colon after it.  (C and C++ use exactly the same syntax for goto labels.)

Assembly jump:

	mov eax,3
	jmp lemme_outta_here
	mov eax,999 ; <- not executed!
lemme_outta_here:
	ret

(Try this in NetRun now!)

C++ goto:

	int x=3;
	goto quiddit;
	x=999;
quiddit:
	return x;

(Try this in NetRun now!)

 
In both cases, we return 3, because we jump right over the 999 assignment.  Jumping is somewhat useful for skipping over bad code, but it really gets useful when you add conditional jumps.  These are used after a "cmp" instruction to compare two values.

Instruction   Useful to...
jmp           Always jump
ja            Unsigned >
jae           Unsigned >=
jb            Unsigned <
jbe           Unsigned <=
jc            Unsigned overflow, or multiprecision add
jecxz         Compare ecx with 0 (Seriously!?)
je            Equality
jg            Signed >
jge           Signed >=
jl            Signed <
jle           Signed <=
jne           Inequality
jo            Signed overflow

There are also "n" (NOT) versions of the conditional jumps; for example, "jno" jumps if there is NOT overflow, and "jne" in the table above is the NOT version of "je".

Conditional Jumps: Branching in Assembly

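In assembly, all branching is done using two types of instruction: a compare instruction, like "cmp", which compares two values; and a conditional jump, like "je", which jumps only if that comparison came out the right way.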

Here's how to use compare and jump-if-equal ("je"):

	mov eax,3
	cmp eax,3 ; how does eax compare with 3?
	je lemme_outta_here ; if it's equal, then jump
	mov eax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
	ret

(Try this in NetRun now!)

Here's compare and jump-if-less-than ("jl"):

	mov eax,1
	cmp eax,3 ; how does eax compare with 3?
	jl lemme_outta_here ; if it's less, then jump
	mov eax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
	ret

(Try this in NetRun now!)

The C++ equivalent to compare-and-jump-if-whatever is "if (something) goto somewhere;".

Loops

To loop, you just jump back to the start of the code.  Somewhere you do need a conditional, or you've made an infinite loop!

In this example, we count down on edi until it hits zero. 

mov eax,0 ; sum added here

start: ; loop begins here
  add eax,10 ; add each time around the loop

  sub edi,1 ; loop increment
  cmp edi,0 ; loop test
  jg start ; continue loop if edi>0

ret

(Try this in NetRun now!)

This is line-by-line equivalent to this C++ code: 

int foo(int bar) {
	int sum=0;
	
start:
	sum += 10;
	bar--;
	if (bar>0) goto start;

	return sum;
}
(Try this in NetRun now!)

 Of course, that's very ugly C++ code!  It's more idiomatic to write a "for" loop here:

int foo(int bar) {
	int sum=0;
	for (int count=bar; count>0; count--) 
		sum+=10;
	return sum;
}

(Try this in NetRun now!)

More Complex Control Flow: C--

You can actually write a very peculiar variant of C++, where "if" statements only contain "goto" statements.  My joke name for this assembly-style C++ where you only use "+=" and "*=" arithmetic, and "if (simple test) goto somewhere;" flow control is "C--"!

For example this is perfectly legal C++ in the "C--" style:

#include <iostream>

int main() {
	int i=0;
	if (i>=10) goto byebye;
	std::cout<<"Not too big!\n";
byebye: return 0;
}

This way of writing C++ is quite similar to assembly--in fact, there's a one-to-one correspondence between lines of C code written this way and machine language instructions.  More complicated C++, like the "for" construct, expands out to many lines of assembly.

	int i, n=10;
	for (i=0;i<n;i++) {
		std::cout<<"In loop: i=="<<i<<"\n";
	}

Here's one expanded version of this C/C++ "for" loop:

	int i=0, n=10;
start:	std::cout<<"In loop: i=="<<i<<"\n";
	i++;
	if (i<n) goto start;

(executable NetRun link)

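You've got to convince yourself that this is really equivalent to the "for" loop in all cases.  Careful--if n is a parameter, it's not!   (What if n<=i at the start, for example n==0?  The "for" loop never runs its body, but the expanded version runs it once before testing.)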

All C flow-control constructs can be written using just "if" and "goto", which usually map one-to-one to a compare-and-jump sequence in assembly. 

Normal C                  Expanded C
------------------------  ------------------------------
if (A) {                  if (!A) goto END;
  ...                     {
}                           ...
                          }
                          END:

if (!A) {                 if (A) goto END;
  ...                     {
}                           ...
                          }
                          END:

if (A&&B) {               if (!A) goto END;
  ...                     if (!B) goto END;
}                         {
                            ...
                          }
                          END:

if (A||B) {               if (A) goto STUFF;
  ...                     if (B) goto STUFF;
}                         goto END;
                          STUFF:
                          {
                            ...
                          }
                          END:

while (A)  {              goto TEST;
  ...                     START:
}                         {
                            ...
                          }
                          TEST: if (A) goto START;

do {                      START:
  ...                     {
} while (A);                ...
                          }
                          if (A) goto START;

for (i=0;i<n;i++)         i=0;         /* Version A */
{                         goto TEST;
  ...                     START:
}                         {
                            ...
                          }
                          i++;
                          TEST: if (i<n) goto START;

for (i=0;i<n;i++)         i=0;          /* Version B */
{                         START: if (i>=n) goto END;
  ...                     {
}                           ...
                          }
                          i++;
                          goto START;
                          END:


Note that the last two translations of the "for" concept (labelled Version A and Version B) both compute the same thing.  Which one is faster? If the loop iterates many times, I claim version (A) is faster, since there's only one (conditional) goto each time around the loop, instead of two gotos in version (B)--one conditional and one unconditional.  But version (B) is probably faster if n is often 0, because in that case it quickly jumps to END (in one conditional jump).

CAUTION: Philosophical Content

Nuclear weapons are so powerful, it's generally considered a bad idea to use them for most problems.

Similarly, goto statements are so powerful, it's generally considered a bad idea to use them for most problems.  The problem is when you see a goto, you can't tell if it's being used to simulate a for, while, function call, or what!  Thus goto may be easy enough to write, but it's more difficult to read, and since code in big projects only gets written once but gets read many times by many different people over the years, it's more important to make your code easy to read.

I notice most self-taught programmers (myself included) tend to prefer goto, or its slightly classier cousin the while loop, because they're more general.  But after you mess up a "while" loop enough times, such as by leaving off the loop increment and inadvertently making an infinite loop, you eventually default to using "for" loops, because the compiler-mandated syntax ("for (int i=0;i<n;++i)") is designed to remind you to include each of the parts you need.

 


CS 301 Lecture Note, 2014, Dr. Orion Lawlor, UAF Computer Science Department.