Looping and Branching in Assembly

CS 301 Lecture, Dr. Lawlor

You can actually write a very peculiar variant of C, where "if" statements only contain "goto" statements.  This is perfectly legal C/C++:
#include <iostream>

int main() {
  int i=0;
  if (i>=10) goto byebye;
  std::cout<<"Not too big!\n";
byebye: return 0;
}
This way of writing C is quite similar to assembly--in fact, there's a one-to-one correspondence between lines of C code written this way and machine language instructions.  More complicated C, like the "for" construct, expands out to many lines of assembly.

Here's an expanded version of a C/C++ "for" loop:
int i=0;
start:
  std::cout<<"In loop: i=="<<i<<"\n";
  i++;
  if (i<10) goto start;
return i;

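You've got to convince yourself that this is really equivalent to the "for" loop in all cases.  Careful--it's not!  This version only tests the condition after the loop body, so the body always runs at least once, even when the starting value of i already fails the test.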
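To check that claim, here is a minimal sketch that counts passes through the body for an ordinary "for" loop and for the bottom-test goto expansion in the listing above.  The function names count_for and count_bottom_test, and the variable bound n in place of the constant 10, are made up for illustration.

```cpp
// Ordinary "for" loop: the test runs before the first pass through the body.
int count_for(int n) {
    int passes = 0;
    for (int i = 0; i < n; i++) passes++;
    return passes;
}

// Bottom-test goto expansion, same shape as the listing above:
// the body runs once before the condition is ever checked.
int count_bottom_test(int n) {
    int passes = 0;
    int i = 0;
start:
    passes++;               // stands in for the loop body
    i++;
    if (i < n) goto start;
    return passes;
}
```

For n=10 both functions return 10.  For n=0 the "for" loop returns 0, but the goto version returns 1, because it runs the body before it ever looks at the condition.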

All C flow-control constructs can be written using just "if" and "goto", which usually map one-to-one to something in assembly.  For example, in funk_emu, the if/goto combination can be implemented with a compare and jump-if-less instruction sequence--the exact same sequence found in most real assembly languages!

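Normal C            | Expanded C
--------------------+----------------------------
if (A) {            | if (!A) goto END;
  ...               | {
}                   |   ...
                    | }
                    | END:
--------------------+----------------------------
if (!A) {           | if (A) goto END;
  ...               | {
}                   |   ...
                    | }
                    | END:
--------------------+----------------------------
if (A&&B) {         | if (!A) goto END;
  ...               | if (!B) goto END;
}                   | {
                    |   ...
                    | }
                    | END:
--------------------+----------------------------
if (A||B) {         | if (A) goto STUFF;
  ...               | if (B) goto STUFF;
}                   | goto END;
                    | STUFF:
                    | {
                    |   ...
                    | }
                    | END:
--------------------+----------------------------
while (A) {         | goto TEST;
  ...               | START:
}                   | {
                    |   ...
                    | }
                    | TEST: if (A) goto START;
--------------------+----------------------------
do {                | START:
  ...               | {
} while (A);        |   ...
                    | }
                    | if (A) goto START;
--------------------+----------------------------
for (i=0;i<n;i++)   | i=0;  /* Version A */
{                   | goto TEST;
  ...               | START:
}                   | {
                    |   ...
                    | }
                    | i++;
                    | TEST: if (i<n) goto START;
--------------------+----------------------------
for (i=0;i<n;i++)   | i=0;  /* Version B */
{                   | START: if (i>=n) goto END;
  ...               | {
}                   |   ...
                    | }
                    | i++;
                    | goto START;
                    | END: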
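The && and || expansions are worth trying out, because they show where C's short-circuit evaluation comes from: the second condition is only tested if the first one didn't already decide things.  Here's a sketch; the Trace struct and the function names are hypothetical, added only to record what happened.

```cpp
// Records whether the guarded body ran and how many conditions got tested.
struct Trace { bool ran; int tests; };

// Expanded form of:  if (A&&B) { ... }
Trace and_expanded(bool A, bool B) {
    Trace t = {false, 0};
    t.tests++; if (!A) goto END;
    t.tests++; if (!B) goto END;
    { t.ran = true; }        // stands in for the "..." body
END:
    return t;
}

// Expanded form of:  if (A||B) { ... }
Trace or_expanded(bool A, bool B) {
    Trace t = {false, 0};
    t.tests++; if (A) goto STUFF;
    t.tests++; if (B) goto STUFF;
    goto END;
STUFF:
    { t.ran = true; }        // stands in for the "..." body
END:
    return t;
}
```

With A false, and_expanded tests only one condition before jumping to END; with A true, or_expanded tests only one condition before jumping to STUFF.  In every case the body runs exactly when the original "if" would have run it.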

Note that the last two translations of the "for" concept (labelled Version A and Version B) both compute the same thing.  Which one is faster? If the loop iterates many times, I claim version (A) is faster, since there's only one (conditional) goto each time around the loop, instead of two gotos in version (B)--one conditional and one unconditional.  But version (B) is probably faster if n is often 0, because in that case it quickly jumps to END (in one conditional jump).
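You can check the jump counts directly by instrumenting both versions.  This is only a sketch: the JumpCount struct and the function names are made up, and it counts jump statements executed (taken or not), which is a rough stand-in for real timing, not a measurement of it.

```cpp
// How many conditional and unconditional gotos were executed.
struct JumpCount { int conditional; int unconditional; };

// Version A: one unconditional jump up front, then one conditional per test.
JumpCount version_a(int n) {
    JumpCount c = {0, 0};
    int i = 0;
    c.unconditional++; goto TEST;
START:
    { /* loop body */ }
    i++;
TEST:
    c.conditional++; if (i < n) goto START;
    return c;
}

// Version B: a conditional at the top and an unconditional at the bottom.
JumpCount version_b(int n) {
    JumpCount c = {0, 0};
    int i = 0;
START:
    c.conditional++; if (i >= n) goto END;
    { /* loop body */ }
    i++;
    c.unconditional++; goto START;
END:
    return c;
}
```

For n=10, version A executes 12 jumps total (1 unconditional plus 11 conditional), while version B executes 21 (11 conditional plus 10 unconditional).  For n=0, version A executes 2 jumps and version B executes just 1.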

Flags

The "cmp" instruction tells the subsequent conditional jump about the comparison via the "EFLAGS" register.

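The "EFLAGS" register on x86 stores a bunch of flags, as shown on page 73 of the Intel arch manual Volume 1.  The important flags include:
CF--the carry flag, set when an operation carries or borrows out of the top bit (unsigned overflow).
ZF--the zero flag, set when the result is zero.
SF--the sign flag, a copy of the top bit of the result (set when the result is negative as a signed number).
OF--the overflow flag, set on signed overflow.
PF--the parity flag, set when the low byte of the result has an even number of 1 bits.
AF--the auxiliary carry flag, set on a carry out of bit 3 (used for BCD arithmetic).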
You've also got to be aware of which instructions set which flags.  For example, the "cmp", "sub", and "add" instructions set all the flags; "and" (bitwise AND) sets ZF, SF, and PF from its result but always clears CF and OF; "inc" (increment by 1) and "dec" (decrement by 1) set everything but CF; while "mov" and all the jump instructions don't mess with the flags.  It's easy to accidentally overwrite flags you care about if you leave too much stuff between the time the flag is set and the time it's read!

You can actually look at the flags with the "lahf" instruction, which copies the important bits of EFLAGS into register ah--that is, bits 8-15 of eax get EFLAGS(SF:ZF:0:AF:0:PF:1:CF).
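Once "lahf" has put the flags in ah, picking them apart is just shifting and masking.  Here's a sketch that decodes a byte in the SF:ZF:0:AF:0:PF:1:CF layout, reading from the top bit down to the bottom bit.  It runs on plain integers rather than a real register, and the AhFlags struct and both function names are hypothetical.

```cpp
#include <cstdint>

// The five flags "lahf" exposes, one bool each.
struct AhFlags { bool sf, zf, af, pf, cf; };

// ah is the second-lowest byte of eax.
std::uint8_t ah_of(std::uint32_t eax) {
    return static_cast<std::uint8_t>((eax >> 8) & 0xFF);
}

// Decode SF:ZF:0:AF:0:PF:1:CF, where SF is bit 7 and CF is bit 0.
AhFlags decode_ah(std::uint8_t ah) {
    AhFlags f;
    f.sf = (ah >> 7) & 1;
    f.zf = (ah >> 6) & 1;
    f.af = (ah >> 4) & 1;
    f.pf = (ah >> 2) & 1;
    f.cf = (ah >> 0) & 1;
    return f;
}
```

For example, an ah value of 0x46 (binary 01000110) decodes to ZF and PF set with SF, AF, and CF clear, which is the pattern you'd expect right after comparing two equal values.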

The various funky jump instructions, like "jc" (jump if CF is set), or "jo" (jump if OF is set), also read the EFLAGS register.
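To see what "jc" and "jo" are actually looking at, it helps to model the flags in software.  The sketch below computes CF, ZF, SF, and OF the way a 32-bit "add" would set them.  It's a model of the hardware behavior, not real access to EFLAGS, and the Flags struct and function names are made up.

```cpp
#include <cstdint>

struct Flags { bool cf, zf, sf, of; };

// Model of how a 32-bit "add" sets four of the flags.
Flags flags_after_add(std::uint32_t a, std::uint32_t b) {
    std::uint32_t r = a + b;                  // wraps around modulo 2^32
    Flags f;
    f.cf = r < a;                             // carry out of the top bit
    f.zf = (r == 0);
    f.sf = (r >> 31) & 1;                     // copy of the result's top bit
    f.of = (((a ^ r) & (b ^ r)) >> 31) & 1;   // both inputs differ in sign from the result
    return f;
}

// "jc" jumps exactly when CF is set; "jo" jumps exactly when OF is set.
bool jc_taken(const Flags &f) { return f.cf; }
bool jo_taken(const Flags &f) { return f.of; }
```

Adding 0xFFFFFFFF and 1 sets CF and ZF but not OF: as unsigned numbers that overflows, but as signed numbers it's just -1 + 1 = 0.  Adding 0x7FFFFFFF and 1 sets OF and SF but not CF, so "jo" would jump and "jc" wouldn't.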

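Note that standard C/C++ gives you NO way to get at the flags, or to directly call the flag-using instructions!  None!  Signed integer overflow is undefined behavior and unsigned overflow silently wraps around, so C/C++ compilers never check for it on their own.  To detect overflow you have to check by hand, or step outside the portable language with inline assembly or a compiler extension such as GCC and Clang's __builtin_add_overflow.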
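The flags themselves stay out of reach, but you can test by hand for the same condition OF would report, as long as you do it before the add rather than after.  Here's a minimal sketch in portable C++; the function name add_would_overflow is made up.

```cpp
#include <climits>

// True when a + b would not fit in an int.  The check happens before
// any addition, so the overflowing add itself is never performed.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;   // would go past the top
    if (b < 0) return a < INT_MIN - b;   // would go past the bottom
    return false;                        // adding zero never overflows
}
```

This costs a couple of extra compares and branches per add, where the hardware version would just be an "add" followed by a "jo".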