# Course Review and Cheat Sheet

CS 301 Lecture, Dr. Lawlor

Briefly, you should know both the how and the why of everything we did in class or on the homeworks.

## Writing Numbers in Binary, Decimal, Hexadecimal

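| Place/bit Number | i | ... | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|
| Decimal: Base-10 | 10^i | ... | 10000 | 1000 | 100 | 10 | 1 |
| Binary: Base-2 | 2^i | ... | 16 = 2^4 | 8 = 2^3 | 4 = 2^2 | 2 | 1 |
| Hex: Base-16 | 16^i | ... | 65536 = 16^4 | 4096 = 16^3 | 256 = 16^2 | 16 | 1 |
| Base-n | n^i | ... | n^4 | n^3 | n^2 | n | 1 = n^0 |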
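The place values above can be checked by peeling digits off a number one place at a time. This is a minimal sketch and not part of the original notes: `toBase` is a hypothetical helper name, and it assumes a base between 2 and 16.

```cpp
#include <string>

// Hypothetical helper: write value in the given base (assumed 2..16),
// producing one digit per place, lowest place first.
std::string toBase(unsigned long value, unsigned base) {
	const char digits[] = "0123456789abcdef";
	if (value == 0) return "0";
	std::string out;
	while (value != 0) {
		out.insert(out.begin(), digits[value % base]); // digit for this place
		value /= base; // move up to the next place
	}
	return out;
}
```

For example, `toBase(4096,16)` gives "1000", which matches 4096 = 16^3.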

## Bitwise Operations

| Name | Purpose | C++ | Assembly |
|---|---|---|---|
| AND | Turn bits off, extract bits. | `x=x&0xf0;` | `and rax,0xf0` |
| OR | Turn bits on, combine fields. | `x=x\|0xf0;` | `or rax,0xf0` |
| NOT | Invert all bits. | `x=~x;` | `not rax` |
| XOR | Invert selected bits, encryption. | `x=x^0xf0;` | `xor rax,0xf0` |
| left shift | Multiply by 2^n. Reassemble bits. | `x=x<<2;` | `shl rax,2` |
| right shift (unsigned) | Divide unsigned by 2^n. Brings in zero bits. | `unsigned x; x=x>>2;` | `shr rax,2` |
| right shift (signed) | Divide signed by 2^n. Brings in copies of the sign bit. | `int x; x=x>>2;` | `sar rax,2` |
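As a rough illustration of how the operations listed above combine in practice, here is a sketch that extracts, replaces, and inverts a 4-bit field. The function names are hypothetical, chosen only for this example.

```cpp
// Hypothetical helpers, not from the original notes.

// AND + right shift: pull out the 4-bit field that starts at bit 4.
unsigned getHighNibble(unsigned x) { return (x & 0xf0) >> 4; }

// AND clears the old field, left shift positions the new bits, OR combines them.
unsigned setHighNibble(unsigned x, unsigned nibble) {
	return (x & ~0xf0u) | ((nibble & 0xf) << 4);
}

// XOR inverts only the selected bits; applying it twice restores the original.
unsigned flipHighNibble(unsigned x) { return x ^ 0xf0; }
```

The values are kept unsigned here so every shift brings in zero bits.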

## Sizes of Stuff

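| Name | C++ | Asm Register | Asm Memory | Asm Data | Bits | Bytes (8 bits) | Hex Digits (4 bits) | Unsigned Range | Signed Range |
|---|---|---|---|---|---|---|---|---|---|
| Bit | n/a | n/a | n/a | n/a | 1 | less than 1 | less than 1 | 0..1 | -1..0 |
| Byte | char | al | BYTE | db | 8 | 1 | 2 | 0..255 | -128 .. 127 |
| short | short | ax | WORD | dw | 16 | 2 | 4 | 0..65535 | -32768 .. +32767 |
| 32-bit int | int | eax | DWORD | dd | 32 | 4 | 8 | >4 billion | -2G .. +2G |
| 64-bit int | "long" (or "long long") | rax | QWORD | dq | 64 | 8 | 16 | >18 quintillion | about -9 quintillion .. +9 quintillion |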

You can determine the usable range of a value by experimentally measuring the overflow point, for example with code like this:

```
unsigned int value=1; /* value to test, starts at first (lowest) bit; unsigned so overflow is well-defined */
int bit=0;
while (value!=0) {
	value=value+value; /* moves over by one bit (value=value<<1 would work too) */
	bit++;
}
return bit;
```

## Signed versus Unsigned Integers

Signed integers are stored in two's complement.  The highest bit is the sign bit: it is set for negative values.  To flip the sign, flip all the bits and then add one.
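The flip-and-add-one rule can be tried directly. This is a small sketch with hypothetical function names; it assumes 32-bit ints and does the arithmetic on unsigned values so the wraparound is well-defined (converting back to signed is guaranteed to wrap only since C++20, though common compilers have always behaved this way).

```cpp
#include <cstdint>

// Negate by hand: flip all the bits, then add one.
std::int32_t negateByHand(std::int32_t x) {
	std::uint32_t bits = static_cast<std::uint32_t>(x);
	bits = ~bits + 1u; // flip every bit, then add one
	return static_cast<std::int32_t>(bits);
}

// The sign bit is the highest bit: bit 31 of a 32-bit int.
bool signBit(std::int32_t x) {
	return (static_cast<std::uint32_t>(x) >> 31) != 0;
}
```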

## Concept of Machine Code

Or, why this code works, and returns 73:

```
const unsigned char commands[]={ /* unsigned: 0xb0 and 0xc3 don't fit in a signed char */
	0xb0,73, /* mov al,73: load a value to return */
	0xc3 /* ret: return from the current function */
};
int foo(void) {
	typedef int (*fnptr)(void); // pointer to a function returning an int
	fnptr f=(fnptr)commands; // typecast the command array to a function
	return f(); // call the new function!
}
```

## The Registers

Here's the full list of x86 registers.  The 64-bit names (rax and friends), and all of r8 through r15, were added in 64-bit mode.

| Notes | 64-bit | 32-bit | 16-bit | 8-bit |
|---|---|---|---|---|
| Values are returned from functions in this register.  Multiply instructions put the low bits of the result here too. | rax | eax | ax | ah and al |
| Typical scratch register.  Some instructions use it as a counter (such as bit shifts). | rcx | ecx | cx | ch and cl |
| Scratch register.  Multiply instructions put the high bits of the result here, and divide demands that this contain zero. | rdx | edx | dx | dh and dl |
| Preserved register: don't use it without saving it! | rbx | ebx | bx | bh and bl |
| The stack pointer.  Points to the top of the stack. | rsp | esp | sp | spl |
| Preserved register.  Sometimes used to store the old value of the stack pointer, or the "base". | rbp | ebp | bp | bpl |
| Scratch register.  Also used to pass function argument #2 (on 64-bit Linux). | rsi | esi | si | sil |
| Scratch register.  Function argument #1 (on 64-bit Linux). | rdi | edi | di | dil |
| Scratch register.  These were added in 64-bit mode, so the names are slightly different. | r8 | r8d | r8w | r8b |
| Scratch register. | r9 | r9d | r9w | r9b |
| Scratch register. | r10 | r10d | r10w | r10b |
| Scratch register. | r11 | r11d | r11w | r11b |
| Preserved register. | r12 | r12d | r12w | r12b |
| Preserved register. | r13 | r13d | r13w | r13b |
| Preserved register. | r14 | r14d | r14w | r14b |
| Preserved register. | r15 | r15d | r15w | r15b |

## Conditional Jumps

These only work after a "cmp" instruction (or another instruction that sets the flags properly).

| Instruction | Useful to... | Flags |
|---|---|---|
| jmp | Always jump | None |
| ja | Unsigned >   (signed "x>n \|\| x<0") | CF=0 and ZF=0 |
| jae | Unsigned >= (signed "x>=n \|\| x<0") | CF=0 |
| jb | Unsigned <   (signed "x<n && x>=0") | CF=1 |
| jbe | Unsigned <=  (signed "x<=n && x>=0") | CF=1 or ZF=1 |
| jc | Unsigned overflow checking | CF=1 |
| jecxz | Compare ecx with 0 | ecx=0 |
| je or jz | Equality | ZF=1 |
| jg | Signed > | ZF=0 and SF=OF |
| jge | Signed >= | SF=OF |
| jl | Signed < | SF!=OF |
| jle | Signed <= | ZF=1 or SF!=OF |
| jne or jnz | Inequality | ZF=0 |
| jo | Signed overflow checking | OF=1 |
| jp or jpe | Parity check (even) | PF=1 |
| jpo | Parity check (odd) | PF=0 |
| js | Jump if negative | SF=1 |

There are also "n" NOT versions for each jump; for example "jno" jumps if there is NOT overflow.

Often you can avoid a branch entirely using funky arithmetic, like "blended = (1-doit)*A+doit*B;".
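Here is the blending trick written out next to the branch it replaces. This is only a sketch: the function names are hypothetical, and it assumes `doit` is exactly 0 or 1.

```cpp
// Branch-free select: doit must be exactly 0 or 1.
// doit==0 gives 1*A + 0*B = A;  doit==1 gives 0*A + 1*B = B.
int blend(int doit, int A, int B) {
	return (1 - doit) * A + doit * B;
}

// The same selection with an ordinary branch, for comparison.
int blendBranchy(int doit, int A, int B) {
	if (doit) return B;
	return A;
}
```

Whether the arithmetic version is actually faster depends on the machine and compiler, so it is worth timing both.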

Every C flow-control construct can be written using just "if" and "goto", which usually map one-to-one to a compare-and-jump sequence in assembly.

| Normal C | Expanded C |
|---|---|
| `if (A) { ... }` | `if (!A) goto END; { ... } END:` |
| `if (!A) { ... }` | `if (A) goto END; { ... } END:` |
| `if (A&&B) { ... }` | `if (!A) goto END; if (!B) goto END; { ... } END:` |
| `if (A\|\|B) { ... }` | `if (A) goto STUFF; if (B) goto STUFF; goto END; STUFF: { ... } END:` |
| `while (A) { ... }` | `goto TEST; START: { ... } TEST: if (A) goto START;` |
| `do { ... } while (A)` | `START: { ... } if (A) goto START;` |
| `for (i=0;i<n;i++) { ... }` | `i=0; START: if (i>=n) goto END; { ... } i++; goto START; END:` |

`switch (x) { case 0: ...; break; case 1: ...; break; case 2: ...; break; }` becomes a jump table:

```
jmp [table + rax*8]
label0: ...
    jmp end
label1: ...
    jmp end
label2: ...
    jmp end
end:
table: dq label0, label1, label2
```
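To see the expansion at work, here is one "for" loop written both ways. This is a sketch with hypothetical function names; the labels follow the START/END naming used above.

```cpp
// Sum 0..n-1 with a normal for loop.
int sumLoop(int n) {
	int sum = 0;
	for (int i = 0; i < n; i++) sum += i;
	return sum;
}

// The same loop using only "if" and "goto".
// Each "if (...) goto" corresponds to a compare-and-jump in assembly.
int sumGoto(int n) {
	int sum = 0;
	int i = 0;
START:
	if (i >= n) goto END; // loop test, inverted
	sum += i;             // loop body
	i++;
	goto START;
END:
	return sum;
}
```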

## Call, Return, and The Stack

• push == subtract 8 from the stack pointer, and move a QWORD there.
• pop == move a QWORD off the stack, and add 8 to the stack pointer.  Must match with a push!
• call == push the address to come back to, and jmp to the new function.
• ret == pop that address back off the stack, and jmp to it.

Note that if you leave junk on the stack, "ret" will happily try to return there, usually resulting in a crash.

Typical use of push and pop to save a preserved register:
```
push rbp
; ... happily trash rbp in my function ...
pop rbp ; restore his version
ret ; <- he'll have no idea that I trashed his register!
```

Or, to preserve a scratch register across a function call:
```
push rcx
call .... ; he'll trash rcx
pop rcx ; bring my version back!
```

You can save as many registers as you want, but you need to pop them in the opposite order--otherwise you've flipped their values around!

## Accessing Memory in Arrays

• Array element i of the "int" array arr is stored at [arr+4*i].
• Array element i of the "long" array arr is stored at [arr+8*i].
• Character i of the "char" array arr is stored at [arr+i].
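The element-size scaling can be confirmed from C++ by measuring the byte distance between the start of an array and element i. This is a sketch only: `elementOffset` is a hypothetical helper, and it assumes i is less than 16 (the size of its internal test array).

```cpp
#include <cstddef>

// Hypothetical helper: byte offset of element i from the start of an array of T.
// Assumes i < 16.
template <typename T>
std::size_t elementOffset(std::size_t i) {
	static T arr[16];
	const char *start = reinterpret_cast<const char *>(&arr[0]);
	const char *elem = reinterpret_cast<const char *>(&arr[i]);
	return static_cast<std::size_t>(elem - start); // distance in bytes
}
```

The result is always `sizeof(T)*i`. For example, `elementOffset<int>(3)` is 12 on machines where an int is 4 bytes.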

## Accessing Memory in a Class or Struct

A pointer to a class is a pointer to the first thing inside the class.  You can figure out the "offset" of anything inside the class by counting the size of the stuff before it, plus padding inserted for alignment.  Or you can call "offsetof(classname,fieldname)" and let the compiler do the work.

Alignment: a pointer to an object using n bytes must be a multiple of n.  E.g., a pointer to an int (4 bytes) will be a multiple of 4, so 0x1234 is OK, 0x1235 isn't!
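Both ideas can be checked from C++. In this sketch the struct `Example` and the helper `isAligned` are hypothetical names made up for illustration. The exact padding is up to the compiler, so the comments describe what is typical rather than guaranteed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration.
struct Example {
	char tag;   // first field, so it sits at offset 0
	int count;  // typically preceded by padding so it lands on an int boundary
	char flag;
};

// True if the address is a multiple of n.
bool isAligned(std::uintptr_t address, std::size_t n) {
	return address % n == 0;
}
```

With these, `offsetof(Example,count)` reports where the compiler actually put `count`, `isAligned(0x1234,4)` is true, and `isAligned(0x1235,4)` is false.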

## Memory Allocation: Static, Malloc, and the Stack

Static allocation with "dd" (data DWORD, permanently reserves program memory):
```
mov DWORD[myInt],7 ; overwrite our int
mov eax,DWORD[myInt] ; copy the modified int into eax
ret

section .data
myInt:
	dd 2 ; "data DWORD" containing this value
```

Dynamic allocation with malloc (call "free" to deallocate the space when you're done):

```
mov rdi, 4 ; malloc's first (and only) parameter: number of bytes to allocate
extern malloc
call malloc
; on return, rax points to our newly-allocated memory
mov DWORD [rax],7 ; write constant into memory
mov eax,DWORD [rax] ; read it back from memory
ret
```


Dynamic allocation on the stack (be sure to hand the space back afterwards!):

```
sub rsp,8 ; I claim the next eight bytes in the name of... me!
mov DWORD [rsp],1492 ; store an integer into our stack space
mov eax,DWORD [rsp] ; read integer from where we stored it
add rsp,8 ; Hand back the stack space
ret
```

The stack must always be 8-byte aligned, which is why I'm grabbing 8 bytes even though I only need 4.  Actually, if you call any functions that do floating-point work, the stack needs to be 16-byte aligned once you're inside the function!

## Defining Functions in Assembly, mixing C++ and Assembly

To make an assembly function visible from outside, from your assembly say:

```
global myfunc
myfunc:
```

You may need to add underscores, or capital letters, or some other crud depending on the system.

To call this from C++, declare a function prototype:

extern "C" int myfunc(int someParameter);

Your function will find its parameters in registers (64-bit mode) or on the stack (32-bit mode).  The exact registers depend on the machine's calling convention:

| | 64-bit Linux, Mac | 64-bit Windows | 32-bit |
|---|---|---|---|
| Return value | rax | rax | eax |
| First parameter | rdi | rcx | DWORD[esp+4] |
| Second parameter | rsi | rdx | DWORD[esp+8] |
| Third parameter | rdx | r8 | DWORD[esp+12] |
| More parameters | see docs | see docs | see docs |

You can also intermix assembly language inside your C++ code using "inline assembly".  The syntax is nice on Windows: __asm { mov eax,13 }, but hideous and dyslexic on Linux or Mac.