Assembly Code!

CS 441 Lecture, Dr. Lawlor (copied from my CS 301 lecture notes)

So here's some assembly language:
Machine Code:              Assembly Code:
Address  Bytes             Instruction  Operands
 0:      55                push         ebp
 1:      89 e5             mov          ebp,esp
 3:      b8 07 00 00 00    mov          eax,0x7
 8:      5d                pop          ebp
 9:      c3                ret
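
To see how the addresses in the listing above come about, here is a minimal Python sketch that walks the same ten bytes and prints one row per instruction. The lookup table KNOWN and the function list_instructions are made up for this illustration; they recognize only these five instructions and are nowhere near a real x86 decoder.

```python
# Hypothetical lookup table: first byte -> (total length in bytes, assembly text).
# It covers only the five instructions in the listing, not real x86 decoding.
KNOWN = {
    0x55: (1, "push ebp"),
    0x89: (2, "mov ebp,esp"),  # assumes the second byte is 0xe5, as in the listing
    0xB8: (5, "mov eax,"),     # followed by a 4-byte little-endian constant
    0x5D: (1, "pop ebp"),
    0xC3: (1, "ret"),
}

def list_instructions(code: bytes):
    """Walk the bytes and return (address, raw bytes, assembly text) tuples."""
    rows = []
    address = 0
    while address < len(code):
        first = code[address]
        length, text = KNOWN[first]
        raw = code[address:address + length]
        if first == 0xB8:
            # The constant is stored lowest byte first (little-endian).
            text += hex(int.from_bytes(raw[1:], "little"))
        rows.append((address, raw, text))
        address += length  # the next instruction starts right after this one
    return rows

machine_code = bytes.fromhex("5589e5b8070000005dc3")
for address, raw, text in list_instructions(machine_code):
    raw_hex = " ".join(f"{b:02x}" for b in raw)
    print(f"{address:2x}: {raw_hex:<15} {text}")
```

Each address is just the sum of the lengths of the instructions before it, which is why the addresses jump from 3 to 8 across the five-byte mov.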
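Here's the terminology we'll be using for the rest of the semester: an "address" is the position of a byte in memory (the number before the colon above); "machine code" is the raw bytes the CPU actually executes; "assembly code" is the human-readable text for those same bytes; an "instruction" names the operation (push, mov, ret); and the "operands" are the registers or constants the instruction works on.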
Here's a typical line of assembly code.  It's one CPU instruction, with a comment:
	mov eax,1234 ;  I'm returning 1234, like the homework says...

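There are several parts to this line: "mov" is the instruction (also called the opcode or mnemonic); "eax" is the destination operand, a register; "1234" is the source operand, a constant; and the semicolon starts a comment that runs to the end of the line.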
Unlike C/C++, assembly is line-oriented, so the following WILL NOT WORK:
	mov eax,
1234 ; I'm returning 1234, like the homework says...
Yup, line-oriented stuff is indeed annoying.  Be careful that your editor doesn't mistakenly add newlines!
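
As a rough illustration of what "line-oriented" means, here is a small Python sketch that splits one line into its instruction, operands, and comment. The function parse_line is hypothetical and far simpler than a real assembler; it only assumes that a semicolon starts a comment and that operands are separated by commas.

```python
def parse_line(line: str):
    """Split one line of assembly into (instruction, operands, comment)."""
    code, _, comment = line.partition(";")  # everything after ';' is a comment
    words = code.split(None, 1)             # instruction name, then the rest
    if not words:
        return None, [], comment.strip()    # blank or comment-only line
    instruction = words[0]
    operands = []
    if len(words) > 1:
        operands = [op.strip() for op in words[1].split(",")]
    if any(op == "" for op in operands):
        # A trailing comma means the operand was pushed onto the next line.
        raise ValueError(f"missing operand in line: {line!r}")
    return instruction, operands, comment.strip()

print(parse_line("\tmov eax,1234 ;  I'm returning 1234, like the homework says..."))

try:
    parse_line("\tmov eax,")
except ValueError as err:
    print("error:", err)
```

Because every line is handled on its own, the sketch has no way to find the "1234" sitting on the following line. A real assembler would additionally reject that second line, since "1234" is not an instruction.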

Instructions

A list of all possible x86 instructions can be found in Intel's instruction set reference manuals.  The really important opcodes are listed in my cheat sheet.  Most programs can be written with mov, the arithmetic instructions (add/sub/mul), the function call instructions (call/ret), the stack instructions (push/pop), and the compare and jump instructions (cmp/jmp/jl/je/jg/...).   We'll learn about these over the next few weeks!

Registers

Here are the commonly-used x86 registers:
There are some other older or newer and much more rarely-used x86 registers:
Size    Register names                           Meaning (note: not the official meanings!)           Introduced in
8-bit   al,ah, bl,bh, cl,ch, dl,dh               "Low" and "High" parts of bigger registers           1972, Intel 8008
16-bit  ax, bx, cx, dx, si, di, sp, bp           "eXtended" versions of the original 8-bit registers  1978, Intel 8086/8088
32-bit  eax, ebx, ecx, edx, esi, edi, esp, ebp   "Extended eXtended" registers                        1985, Intel 80386
64-bit  rax, rbx, rcx, rdx, rsi, rdi, rsp, rbp,  "Really eXtended" registers                          2003, AMD Opteron / Athlon64
        r8, r9, r10, r11, r12, r13, r14, r15                                                          2004, Intel EM64T CPUs

x86 is unusual in that all the smaller registers from bygone eras are still right there as *part* of the new, longer registers.  So for example, this code returns 0x0000AB00, because 0xAB is put into the next-to-lowest byte of eax:
	mov eax,0 ; Clear eax
	mov ah,0xAB ; Move 0xAB into the next-to-lowest byte of eax
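
The overlap between eax, ax, ah, and al can be modeled with a little bit masking. This Python sketch is only a toy: the class Register32 and its method names are invented here, and it stores one 32-bit value that the smaller "registers" write into.

```python
class Register32:
    """Toy model of eax and the smaller registers that live inside it."""

    def __init__(self):
        self.value = 0  # the full 32-bit contents of eax

    def set_eax(self, v):
        self.value = v & 0xFFFFFFFF  # all 32 bits

    def set_ax(self, v):
        # ax is the low 16 bits; the high 16 bits are left alone
        self.value = (self.value & 0xFFFF0000) | (v & 0xFFFF)

    def set_ah(self, v):
        # ah is bits 8..15, the next-to-lowest byte
        self.value = (self.value & 0xFFFF00FF) | ((v & 0xFF) << 8)

    def set_al(self, v):
        # al is bits 0..7, the lowest byte
        self.value = (self.value & 0xFFFFFF00) | (v & 0xFF)


eax = Register32()
eax.set_eax(0)     # mov eax,0
eax.set_ah(0xAB)   # mov ah,0xAB
print(f"0x{eax.value:08X}")
```

Writing ah changes only one byte of the larger register, which is exactly why the assembly version returns 0x0000AB00.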

PowerPC

All of the above is for ordinary x86 machines (Intel Pentiums, etc.)  What about for PowerPC machines, like old Macs, the Xbox360 or the PlayStation 3?  Well, the assembly code is very different in the gory details, but in the abstract it is absolutely identical:
Machine Code:              Assembly Code:
Address  Bytes             Instruction  Operands
 0:      38 60 00 07       li           r3,7
 4:      4e 80 00 20       blr
Like x86, PowerPC machine code consists of bytes, with addresses, that represent assembly instructions and operands.  PowerPC machine code also spends most of its time manipulating values in registers.
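
To make the "bytes that represent instructions and operands" idea concrete, here is a minimal Python sketch that pulls the two instructions above apart. The function decode_ppc is hypothetical and handles only these two cases. It relies on "li rD,value" being the assembler's shorthand for "addi rD,0,value" (primary opcode 14), and on blr being the fixed 32-bit word 0x4E800020.

```python
def decode_ppc(code: bytes):
    """Decode big-endian 4-byte PowerPC words into (address, text) pairs."""
    rows = []
    for address in range(0, len(code), 4):  # every instruction is 4 bytes
        word = int.from_bytes(code[address:address + 4], "big")
        primary = word >> 26                # top 6 bits: primary opcode
        if primary == 14:                   # addi rD,rA,SIMM
            rd = (word >> 21) & 0x1F        # destination register number
            ra = (word >> 16) & 0x1F        # source register number (0 means "no register")
            simm = word & 0xFFFF            # 16-bit signed constant
            if simm >= 0x8000:
                simm -= 0x10000
            if ra == 0:
                text = f"li r{rd},{simm}"
            else:
                text = f"addi r{rd},r{ra},{simm}"
        elif word == 0x4E800020:
            text = "blr"
        else:
            text = "(not handled by this sketch)"
        rows.append((address, text))
    return rows

for address, text in decode_ppc(bytes.fromhex("386000074e800020")):
    print(f"{address:2x}: {text}")
```

Because every instruction is the same size, the loop can step through the code four bytes at a time without knowing what each instruction is.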

Unlike x86, PowerPC has 32 general-purpose registers, but they have uninteresting names (they're called r0 through r31).  The names of the instructions are different; "li" in PowerPC (Load Immediate) is about like a "mov" in x86; "blr" (Branch to Link Register) serves the same purpose as "ret" in x86.  PowerPC machine code always uses four bytes for every instruction (it's RISC), while x86 uses anywhere from one to fifteen bytes per instruction (it's CISC).   Here's a good but long retrospective article on the RISC-vs-CISC debate, which got pretty intense during the 1990s.