A jump instruction, like "jmp", just switches the CPU to executing a different piece of code. It's the assembly equivalent of "goto", but unlike the hated goto in C/C++, jumps are not considered shameful in assembly. (Dijkstra's 1968 letter on the subject was published under the title "Go To Statement Considered Harmful". Since then, goto has generally been considered harmful, except in Linux.)
You say where to jump to using a "jump label", which is just any
string name with a colon after it. (The same exact syntax is
used in C/C++.)
Assembly jump | C or C++ goto |
mov rax,3 | int x=3; |
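Here's a minimal sketch of what an unconditional jump looks like on the C++ side. The function name skip_assignment and the label done are hypothetical, picked just for illustration; the comments show the assembly instruction each line roughly corresponds to.

```cpp
// Sketch only: skip_assignment and done are made-up names.
int skip_assignment() {
    int x = 3;      // roughly: mov rax,3
    goto done;      // roughly: jmp done
    x = 999;        // never executed, because the goto jumps over it
done:
    return x;       // roughly: ret
}
```

Calling skip_assignment() gives back 3, since the assignment of 999 is skipped.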
Jumping is somewhat useful for skipping over bad code, but it
really gets useful when you add conditional jumps, like this C++:
if (month >= 5) goto summer;
In assembly,
this gets split up into two instructions: "cmp rax,rcx" compares
the two values (say, with month in rax and 5 in rcx), then "jge summer"
jumps if the compare came out greater than or equal.
In assembly, all branching is done using two types of instruction: a compare instruction, like "cmp", and a conditional jump instruction, like "je" (jump-if-equal).
Here's how to use compare and jump-if-equal ("je"):
mov rax,3
cmp rax,3 ; how does rax compare with 3?
je lemme_outta_here ; if it's equal, then jump
mov rax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
ret
Here's compare and jump-if-less-than ("jl"):
mov rax,1
cmp rax,3 ; how does rax compare with 3?
jl lemme_outta_here ; if it's less, then jump
mov rax,999 ; <- not executed *if* we jump over it
lemme_outta_here:
ret
The C++ equivalent to compare-and-jump-if-whatever is "if (something) goto somewhere;".
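As a rough sketch, here are the two assembly examples above rewritten in that style. The function names are hypothetical, and the parameter named rax is just an ordinary C++ variable standing in for the register so the starting value can be varied.

```cpp
// Sketch only: "rax" here is a plain variable, not the real register.

// Mirrors: cmp rax,3 / je lemme_outta_here
long jump_if_equal(long rax) {
    if (rax == 3) goto lemme_outta_here;  // compare, then jump if equal
    rax = 999;                            // not executed if we jump over it
lemme_outta_here:
    return rax;                           // ret
}

// Mirrors: cmp rax,3 / jl lemme_outta_here
long jump_if_less(long rax) {
    if (rax < 3) goto lemme_outta_here;   // compare, then jump if less
    rax = 999;                            // not executed if we jump over it
lemme_outta_here:
    return rax;                           // ret
}
```

Labels are scoped to their function in C++, so both functions can reuse the same label name. Starting with 3, jump_if_equal takes the jump and returns 3; starting with anything else it falls through and returns 999.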
Here are the
most useful jump instructions. The conditional versions
almost always happen after a "cmp" compare instruction.
Instruction | Useful to... |
jmp | Always jump |
je | Jump if cmp is equal |
jne | Jump if cmp is not equal |
jg | Signed > (greater) |
jge | Signed >= |
jl | Signed < (less than) |
jle | Signed <= |
ja | Unsigned > (above) |
jae | Unsigned >= |
jb | Unsigned < (below) |
jbe | Unsigned <= |
jrcxz | Jump if rcx is 0 (Seriously!?) |
jc | Jump if carry: used for unsigned overflow, or multiprecision add |
jo | Jump if there was signed overflow |
There are also "n" NOT versions of these conditional jumps; for
example "jno" jumps if there is NOT overflow. See the full list of
x86 jump instructions here.
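The signed versus unsigned split in the table matters because the same bits can compare differently. Here's a small sketch in C++, where the type picks the flavor of comparison; the function names are hypothetical, and exactly which instruction a compiler emits for each is an assumption that depends on the compiler and its settings.

```cpp
#include <cstdint>

// Signed comparison: the kind of test a "jl" makes after a cmp.
bool signed_less(std::int64_t a, std::int64_t b) {
    return a < b;
}

// Unsigned comparison of the same bit patterns: the kind of test a "jb" makes.
bool unsigned_below(std::int64_t a, std::int64_t b) {
    return static_cast<std::uint64_t>(a) < static_cast<std::uint64_t>(b);
}
```

With a = -1 and b = 1, signed_less is true, but unsigned_below is false, because -1 converted to an unsigned 64-bit value is the largest possible value.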
To loop, you just jump back to the start of the code. Somewhere you do need a conditional, or you've made an infinite loop!
In this example, we count down on rdi until it hits zero.
; rdi is our first function argument
This is line-by-line equivalent to this C++ code:
int foo(int bar) { int sum=0; start: sum += 10;
Of course, that's very ugly C++ code! It's more idiomatic to write a "for" loop here:
int foo(int bar) { int sum=0; for (int count=bar; count>0; count--) sum+=10; return sum; }
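Here's a sketch that puts the goto form next to the "for" form so you can compare them. The goto version in these notes is cut off after "sum += 10;", so the decrement and the "bar > 0" test below are an assumption about how it continues, and the names foo_goto and foo_for are hypothetical.

```cpp
// Sketch only: the last two lines of the goto loop are assumed, not from the notes.
int foo_goto(int bar) {
    int sum = 0;
start:
    sum += 10;                 // loop body
    bar--;                     // assumed: count down
    if (bar > 0) goto start;   // assumed: keep looping while positive
    return sum;
}

// The idiomatic version from the text.
int foo_for(int bar) {
    int sum = 0;
    for (int count = bar; count > 0; count--) sum += 10;
    return sum;
}
```

The two agree whenever bar is at least 1. They differ when bar is 0: the goto form tests at the bottom, so it runs the body once before checking, while the "for" form runs it zero times.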
You can
actually write a very peculiar variant of C++, where "if"
statements only contain "goto" statements. My joke name for
this assembly-style C++ is "C--": you only use "+=" and "*="
arithmetic, and "if (simple test) goto somewhere;" flow control.
For example this is perfectly legal C++ in the "C--" style:
#include <iostream>
int main() {
int i=0;
if (i>=10) goto byebye;
std::cout<<"Not too big!\n";
byebye: return 0;
}
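This way of writing C++ is quite similar to assembly--in fact, there's close to a one-to-one correspondence between lines of C code written this way and machine language instructions (each "if (simple test) goto somewhere;" typically becomes a compare plus a conditional jump). More complicated C++, like the "for" construct, expands out to many lines of assembly. For example, take this loop: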
int i, n=10;
for (i=0;i<n;i++) {
std::cout<<"In loop: i=="<<i<<"\n";
}
Here's one expanded version of this C/C++ "for" loop:
int i=0, n=10;
start: std::cout<<"In loop: i=="<<i<<"\n";
i++;
if (i<n) goto start;
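You've got to convince yourself that this is really equivalent to
the "for" loop in all cases. Careful--if n is a parameter,
it's not! (What if n<=0, so that i>=n before the first pass? The "for"
loop runs zero times, but the expanded version still runs the body once.)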
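One way to convince yourself is to count iterations in both forms and try a case where the loop shouldn't run at all. This is a minimal sketch; count_for and count_expanded are hypothetical names, and the loop body is replaced by a counter.

```cpp
// The ordinary "for" loop: tests before the first pass.
int count_for(int n) {
    int iterations = 0;
    for (int i = 0; i < n; i++) iterations++;
    return iterations;
}

// The expanded goto version from above: tests only at the bottom.
int count_expanded(int n) {
    int iterations = 0;
    int i = 0;
start:
    iterations++;              // stands in for the loop body
    i++;
    if (i < n) goto start;
    return iterations;
}
```

For n = 10 both run ten times. For n = 0 the "for" loop runs zero times, but the expanded version runs once, because its only test comes after the body.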
All C flow-control constructs can be written using just "if" and
"goto", which usually map one-to-one to a compare-and-jump
sequence in assembly. Some of these are a little tricky to
figure out though.
Normal C | Expanded C |
if (A) { ... } | if (!A) goto END; { ... } END: |
if (!A) { ... } | if (A) goto END; { ... } END: |
if (A&&B) { ... } | if (!A) goto END; if (!B) goto END; { ... } END: |
if (A||B) { ... } | if (A) goto STUFF; if (B) goto STUFF; goto END; STUFF: { ... } END: |
while (A) { ... } | goto TEST; START: { ... } TEST: if (A) goto START; |
do { ... } while (A); | START: { ... } if (A) goto START; |
for (i=0;i<n;i++) { ... } | i=0; /* Version A */ goto TEST; START: { ... } i++; TEST: if (i<n) goto START; |
for (i=0;i<n;i++) { ... } | i=0; /* Version B */ START: if (i>=n) goto END; { ... } i++; goto START; END: |
Note that the last two translations of the "for" concept (labelled
Version A and Version B) both compute the same thing. Which
one is faster? If the loop iterates many times, I claim version
(A) is normally faster, since there's only one (conditional) goto
each time around the loop, instead of two gotos in version
(B)--one conditional and one unconditional. But version (B)
is probably faster if n is often 0, because in that case it
quickly jumps to END (in one conditional jump).
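To see that the two versions really compute the same thing, here's a sketch that fills in the loop body with "sum += i" for both. The function names and the choice of body are hypothetical; the goto structure follows Version A and Version B from the table.

```cpp
// Version A: one jump into the test up front, then one conditional goto per pass.
int sum_version_a(int n) {
    int sum = 0;
    int i = 0;
    goto test;
start:
    sum += i;                  // loop body
    i++;
test:
    if (i < n) goto start;
    return sum;
}

// Version B: a conditional goto plus an unconditional goto on every pass.
int sum_version_b(int n) {
    int sum = 0;
    int i = 0;
start:
    if (i >= n) goto end;
    sum += i;                  // loop body
    i++;
    goto start;
end:
    return sum;
}
```

Both return 0 for n = 0 and 10 for n = 5 (0+1+2+3+4). This only checks that the results match; which one is actually faster on a given machine would have to be measured.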
Nuclear weapons are so powerful, it's generally considered a bad idea to use them for most problems.
Similarly, goto statements are so powerful, it's generally considered a bad idea to use them for most problems. The problem is that when you see a goto, you can't tell if it's being used to simulate a for, while, function call, or some stranger hybrid of these. Thus goto may be easy enough to write, but it's more difficult to read. Since code in big projects only gets written once but gets read many times by many different people over the years, it's more important to make your code easy to read.
I notice most self-taught programmers (myself included) tend to prefer goto, or its slightly classier cousin the while loop, because they're more general. But after you mess up a "while" loop enough times, such as by leaving off the loop increment and inadvertently making an infinite loop, you eventually default to using "for" loops, because the compiler-mandated syntax "for (int i=0;i<n;++i)" is designed to remind you to include each of the parts you need, and give you a handy compile error if you forget any part of it.