next up previous
Next: About this document Up: Pipelining and Optimization Previous: Delayed Branch

Code Optimization

The MAL assembler is capable of analyzing the dependencies between instructions and can identify instructions which do not affect branch tests. The most basic form of optimization performed by the assembler is to reorder code to minimize data and control dependencies while producing the same results.

Consider the following MAL code to copy an array in memory:

Loop:   beq     $10, $11, End
        nop
        lw      $12, 0($10)
        sw      $12, 0($14) 
        addi    $14, $14, 4
        addi    $10, $10, 4
        b       Loop
        nop
End:

There is a data dependency between the lw and sw instructions and no-op instructions have been inserted in the delay slots following the beq and b instructions.

The code may be reordered by moving the addi instructions into the delay slots following the branch instructions. For example, the first addi could be moved to fill the delay slot after the b instruction. By moving the second addi instruction between the lw and sw instructions, the load/store data dependency may be eliminated.

The delay slot following the beq is more difficult to fill. There are no other instructions in the loop which can be reordered to fill the delay slot without introducing new data dependencies. An addi instruction on register $0 is added to the delay slot to perform a no-op.

Loop:   beq     $10, $11, End
        addi    $0, $0, 0
        lw      $12, 0($10)
        addi    $10, $10, 4
        sw      $12, 0($14) 
        b       Loop
        addi    $14, $14, 4
End:

The reordered code loop contains 7 instructions and no data dependencies. The original code, including delay slots, requires 8 instructions and contains a data dependency which would add 2 extra clock cycles per iteration. Therefore, the reordering provides a 30% reduction in the execution time of the loop.



CS 301 Class Account
Mon Sep 13 11:15:41 ADT 1999