PowerPC Assembly Language

CS 301 Lecture, Dr. Lawlor

What about PowerPC machines, like pre-2006 Macs, the Xbox 360, or the PlayStation 3?  Well, the assembly code is very different in the gory details, but in the abstract it is absolutely identical:
Address   Machine Code    Assembly Code
                          Instruction   Operands
 0:       38 60 00 07     li            r3,7
 4:       4e 80 00 20     blr
Like x86, PowerPC machine code consists of bytes, with addresses, that represent assembly instructions and operands.  PowerPC machine code also spends most of its time manipulating values in registers.
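
To see how the bytes in the table map onto the assembly, here is a minimal Python sketch (not PowerPC code) that pulls the fields out of the first instruction word. It assumes the standard PowerPC "D-form" layout: a 6-bit opcode, two 5-bit register fields, and a 16-bit immediate. The function name decode_d_form is made up for this illustration.

```python
def decode_d_form(word):
    """Split a 32-bit PowerPC D-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # top 6 bits: primary opcode
    rd = (word >> 21) & 0x1F       # next 5 bits: destination register
    ra = (word >> 16) & 0x1F       # next 5 bits: source register field
    imm = word & 0xFFFF            # low 16 bits: immediate
    if imm & 0x8000:               # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rd, ra, imm

# The four bytes at address 0 in the table, read big-endian as one word.
word = int.from_bytes(bytes([0x38, 0x60, 0x00, 0x07]), "big")
print(decode_d_form(word))  # (14, 3, 0, 7)
```

Opcode 14 is "addi", and "li r3,7" is just shorthand for "addi r3,0,7": destination register 3, no source register, immediate 7.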

Unlike x86, PowerPC has 32 integer registers, and they have refreshingly uninteresting names: they're called r0 through r31.
The instruction names are different too: "li" in PowerPC (Load Immediate) does about the same job as "mov" in x86, and "blr" (Branch to Link Register) serves the same purpose as "ret" in x86.  The return value goes in r3, so you can return the integer 7 like this:
li r3, 7
blr

(Try this in NetRun now!)

Add, like most arithmetic on PowerPC, takes *three* registers: the destination first, then the two sources.
li r8, 8
li r9, 1000
add r3,r8,r9
blr

(Try this in NetRun now!)

There's a separate instruction named "addi" (add immediate) to add a constant; plain "add" only works on registers.
li r8, 8
addi r3,r8,1000
blr

(Try this in NetRun now!)

PowerPC machine code always uses four bytes for every instruction (it's RISC), while x86 uses anywhere from one to 15 bytes per instruction (it's CISC).  Here's a good but long retrospective article on the RISC-vs-CISC war, which got pretty intense during the 1990s.  Nowadays, RISC machines compress their instructions (like CISC), while CISC machines decode their instructions into fixed-size blocks (like RISC), so the war ended in the best possible way: both sides have basically joined forces!

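One effect of fixed-size instructions is that you can't load a 32-bit constant in a single instruction.  The whole instruction is only 32 bits long, and "li" leaves just 16 of those bits for a signed constant: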
li r3, 0xabcdef  ; ERROR!  out of range!
blr

(Try this in NetRun now!)
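
As a quick check of which constants "li" can take, here is a small Python sketch of the range test, assuming the immediate is a signed 16-bit field. The helper name fits_li is made up for this illustration.

```python
def fits_li(value):
    """True if value fits in the signed 16-bit immediate field used by li."""
    return -0x8000 <= value <= 0x7FFF

print(fits_li(7))         # True
print(fits_li(1000))      # True
print(fits_li(0xabcdef))  # False
```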

Instead, you break the 32-bit constant into two 16-bit pieces.  PowerPC has a dedicated load-and-shift instruction, "lis", to load the high half:
lis r3, 0xab       ; "load immediate shifted" (the high half)
ori r3,r3, 0xcdef ; "or immediate" (the low half)
blr

(Try this in NetRun now!)
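
The arithmetic behind the split can be sketched in Python. This is only a model of what "lis" and "ori" do to a 32-bit register, and the function names are made up for this illustration.

```python
def split_constant(value):
    """Return the (high, low) 16-bit halves to hand to lis and ori."""
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return high, low

def lis_ori(high, low):
    """Model 'lis r3,high' followed by 'ori r3,r3,low' on a 32-bit register."""
    r3 = (high << 16) & 0xFFFFFFFF  # lis: high half in the top 16 bits, zeros below
    r3 |= low                       # ori: OR in the low half, with no sign extension
    return r3

high, low = split_constant(0xabcdef)
print(hex(high), hex(low))      # 0xab 0xcdef
print(hex(lis_ori(high, low)))  # 0xabcdef
```

The low half goes in with "ori" and not "addi" because "addi" sign-extends its immediate: 0xcdef has its top bit set, so "addi" would treat it as a negative number and corrupt the high half.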

Accessing Memory

Memory is accessed with the "lwz" (load word and zero) and "stw" (store word) instructions.  Unlike x86, loads and stores like these are the *only* instructions that access memory; you can't do an "add" with one operand in memory!

lwz r3, 0(r1) ; load register r3 from the stack 
blr

(Try this in NetRun now!)

Here I'm writing an integer out to the stack, then reading it in again.

li r7, 123
stw r7, 0(r1) ; store register r7 to the stack
lwz r3, 0(r1) ; load register r3 from the stack
blr

(Try this in NetRun now!)

There are "updating" variants of load and store called "lwzu" and "stwu".  These also change the value of the register used as the address.  For example, this one instruction does two things:
    stwu r7, -4(r1)

  1. Store r7 into memory at address (r1-4).
  2. Modify r1 = r1-4.
Together, this forms the PowerPC equivalent of a "push": it stores to memory, and updates the stack pointer.

Here's an example:

li r7, 123
stwu r7, -16(r1) ; store register r7 to the stack (with push)
lwzu r3, 0(r1) ; load register r3 from the stack
addi r1,r1,16 ; clean up the stack
blr

(Try this in NetRun now!)
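
The store-then-update behavior of "stwu" can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: registers are a dictionary, memory is a dictionary keyed by address, and the starting stack pointer of 1000 is made up.

```python
def stwu(regs, memory, src, offset, base):
    """Model 'stwu src, offset(base)': store, then update the base register."""
    address = regs[base] + offset
    memory[address] = regs[src]  # 1. store the source register at base+offset
    regs[base] = address         # 2. write the new address back into the base

regs = {"r1": 1000, "r7": 123}   # hypothetical starting stack pointer
memory = {}
stwu(regs, memory, "r7", -16, "r1")
print(regs["r1"], memory[984])   # 984 123
```

After the call, r1 points at the value that was just stored, which is why the example above can read it back from 0(r1).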

Array indexing mostly has to be done manually.  If r5 is the start of the array, and r6 is the index, you have to do something like this:

ori r5,r1,0 ; array pointer==stack pointer
li r6,2 ; array index
mulli r8,r6,4; array index*4
add r8,r8,r5; add base pointer
lwz r3,0(r8); access memory there
blr

(Try this in NetRun now!)

You can combine the add and lwz with a "lwzx":

ori r5,r1,0 ; array pointer==stack pointer
li r6,2 ; array index
mulli r8,r6,4; array index*4
lwzx r3,r5,r8; access memory at base + index
blr

(Try this in NetRun now!)
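
The address arithmetic is the same in both versions: base pointer plus index times four. Here is a Python sketch that models "lwzx" on a byte buffer. It assumes a big-endian machine, and the array contents are made up for this illustration.

```python
import struct

def lwzx(memory, base, offset):
    """Model 'lwzx rD,rA,rB': load the big-endian 32-bit word at base+offset."""
    return struct.unpack_from(">I", memory, base + offset)[0]

# A hypothetical array of four 32-bit integers, stored big-endian.
memory = struct.pack(">4I", 10, 20, 30, 40)
r5 = 0       # array base (here, an offset into the buffer)
r6 = 2       # array index
r8 = r6 * 4  # mulli r8,r6,4: scale the index by the element size
print(lwzx(memory, r5, r8))  # 30
```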

Calling Functions

You can get into a function pretty easily, with a "b" (branch, like "jmp") instruction:
li r3,99
b _print_int
blr

(Try this in NetRun now!)

Here, _print_int ends with its own "blr", which jumps straight back to main and skips us, so our own "blr" never runs.  Getting control back from a function is much trickier.  The problem is that a function ends with "blr" (Branch to the Link Register), and the Link Register can only hold one value at a time.  So if you just overwrite the Link Register with your own return address, you lose main's, and you can't return to main!

So this "bl" (Branch and Link) will return control back to you, but then *keep* returning control back to you, in an infinite loop:
li r3,99
bl _print_int
blr ; Oops! We trashed LR with the "bl" above!

(Try this in NetRun now!)

The sequence of events here is:

  1. Main calls us with "bl foo".  "bl" will overwrite LR to point back to main.
  2. We call "bl _print_int".  "bl" will overwrite LR to point back to us.
  3. _print_int returns with "blr".  That transfers control to the address in LR, which is us.
  4. We return with "blr", but LR still points at us, so that just transfers control back to us again!
  5.    ... repeat forever ...
The fix is to save the link register before calling any functions, and restore main's value before returning.  Let's try using a preserved register, just to see if it'll work:
mflr r28 ; save main's link register

li r3,99
bl _print_int ; "bl" will overwrite LR, so print_int can return here

mtlr r28 ; restore main's link register
blr ; now this works... sorta

(Try this in NetRun now!)

OK!  Everybody returns correctly now, but main complains that we overwrote its preserved data: r28 is a preserved register, so any function that changes it has to put the old value back before returning.

So now we save the old link register onto the stack:
mflr r0 ; save main's link register...
stwu r0,-32(r1); ... onto the stack

li r3,99
bl _print_int ; "bl" will overwrite LR, so print_int can return here

lwz r0,0(r1); grab main's link register from the stack
addi r1,r1,32 ; restore the stack
mtlr r0 ; restore main's link register
blr ; finally, this works correctly!

(Try this in NetRun now!)

Whew!  The x86 "call" and "ret" are looking a lot better now!
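
To watch the link register problem and its fix play out, here is a toy Python simulation. It is a sketch, not a PowerPC emulator: the addresses are made up, and the "save" and "restore" operations stand in for the mflr/stwu and lwz/addi/mtlr sequences above.

```python
def run(program, max_steps=20):
    """Run a toy program; return True if it reaches 'halt' within max_steps."""
    pc = 0
    lr = None
    stack = []
    for _ in range(max_steps):
        op, target = program[pc]
        if op == "bl":         # branch and link: LR = address of next instruction
            lr = pc + 4
            pc = target
        elif op == "blr":      # branch to whatever address LR holds
            pc = lr
        elif op == "save":     # stand-in for mflr + stwu: push LR on the stack
            stack.append(lr)
            pc += 4
        elif op == "restore":  # stand-in for lwz + addi + mtlr: pop LR back
            lr = stack.pop()
            pc += 4
        elif op == "halt":     # main got control back
            return True
    return False               # still running: we are stuck in a loop

# main is at 0, our function at 100, the print function at 200.
broken = {0: ("bl", 100), 4: ("halt", None),
          100: ("bl", 200), 104: ("blr", None),
          200: ("blr", None)}
fixed = {0: ("bl", 100), 4: ("halt", None),
         100: ("save", None), 104: ("bl", 200),
         108: ("restore", None), 112: ("blr", None),
         200: ("blr", None)}

print(run(broken))  # False
print(run(fixed))   # True
```

In the broken version, the "blr" at address 104 keeps jumping to 104, because that is the last value "bl" put into LR. In the fixed version, LR holds main's return address again by the time the final "blr" runs.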

More Info

The IBM 32-Bit PowerPC Programming Environment gives all the instructions in chapter 8.1 and a good overview in chapter 4.  The IBM Compiler Writer's Guide gives the calling conventions.