Out-of-Order Execution (OoOE)
The Problem:
CPUs have multiple logic units which can run at the same time. However, in a strictly in-order pipeline, if an operand isn't available for an instruction, the CPU "stalls": that instruction and every instruction behind it must wait, wasting valuable processing cycles.
The Solution:
Create a queue that holds an instruction while its operands aren't yet available, so that later instructions that don't depend on those operands can execute in the meantime.
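As a rough illustration of the idea, here is a minimal Python sketch of such a queue. Everything in it is an assumption made for the example: the `Instr` record, the `execute_order` function, the register names, and the `arrivals` table that stands in for a slow load. It executes at most one instruction per cycle and ignores hazards caused by reused register names, so it is a toy model rather than a description of any real processor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str    # label used only for printing (hypothetical)
    dest: str    # register written by the instruction
    srcs: tuple  # registers read by the instruction


def execute_order(program, available, arrivals, max_cycles=100):
    """Return (cycle, name) pairs in the order instructions execute.

    program   : list of Instr in program order
    available : registers whose values are ready at cycle 0
    arrivals  : dict of cycle -> registers that become ready in that cycle
    """
    window = list(program)       # the queue of waiting instructions
    available = set(available)
    order = []
    for cycle in range(max_cycles):
        if not window:
            break
        available |= set(arrivals.get(cycle, ()))
        # Scan oldest-first for an instruction whose operands are all ready.
        for instr in window:
            if all(src in available for src in instr.srcs):
                window.remove(instr)
                available.add(instr.dest)  # result is usable from the next cycle
                order.append((cycle, instr.name))
                break
    return order


program = [
    Instr("add", "r3", ("r1", "r2")),  # r1 comes from a slow load
    Instr("mul", "r6", ("r4", "r5")),  # independent of the add
    Instr("sub", "r7", ("r3", "r6")),  # needs both earlier results
]
order = execute_order(program, available={"r2", "r4", "r5"}, arrivals={3: ["r1"]})
print(order)  # [(0, 'mul'), (3, 'add'), (4, 'sub')]
```

In this run the `mul` executes at cycle 0 while the `add` is still waiting for `r1`. An in-order machine would have held the `mul` behind the stalled `add` until cycle 3.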
How is it done?
There are several tricks used to allow Out-of-Order Execution to be an effective way of speeding up a processor.
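- Register renaming: Registers which are the destinations of instructions are renamed, thus allowing multiple versions of the same register name to be in use at once. This removes false (write-after-read and write-after-write) dependencies between instructions that merely reuse a register name.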
- Instruction Window: This is the queue that holds our instructions waiting to be executed.
- Enhanced Issue Logic: The Issue Logic must be enhanced in order to determine when an instruction should be issued, depending on the readiness of its operands.
- Reservation Station: Holds the "records" for the instructions waiting to be executed. Also used in register renaming.
- Scoreboarding: Keeps track of the data dependencies of each instruction so that it can determine when the next instruction can be safely executed.
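To make register renaming more concrete, the following Python sketch rewrites a short instruction sequence so that every destination gets a fresh location. The `rename` function, the `p0`, `p1`, ... physical names, and the three-instruction program are all hypothetical, and the sketch assumes an unlimited supply of physical registers (real hardware has a finite pool that it recycles).

```python
from itertools import count


def rename(program, arch_regs):
    """Rename (dest, src1, src2, ...) tuples from architectural to physical names."""
    fresh = count()
    # Each architectural register starts out mapped to its own physical register.
    mapping = {reg: f"p{next(fresh)}" for reg in arch_regs}
    renamed = []
    for dest, *srcs in program:
        # Sources read whichever version of the register is current.
        new_srcs = tuple(mapping[s] for s in srcs)
        # The destination always gets a brand-new location.
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], *new_srcs))
    return renamed


program = [
    ("r1", "r2", "r3"),  # r1 = r2 op r3
    ("r4", "r1", "r2"),  # r4 = r1 op r2  (really needs the r1 above)
    ("r1", "r3", "r3"),  # r1 = r3 op r3  (only reuses the name r1)
]
renamed = rename(program, ["r1", "r2", "r3", "r4"])
for instr in renamed:
    print(instr)
# ('p4', 'p1', 'p2')
# ('p5', 'p4', 'p1')
# ('p6', 'p2', 'p2')
```

After renaming, the first and third instructions write to `p4` and `p6` respectively, so the third no longer has to wait for the first two even though all three mention `r1`. The true dependence of the second instruction on the first is preserved through `p4`.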
The New Pipeline
Now, the pipeline works in these three steps:
- Issue
  - Instructions are initially kept in the standard "first-in first-out" order.
  - Instructions are then decoded to determine which functional unit each one needs.
  - If a reservation station at the required functional unit is available, the instruction is sent there.
  - Check whether the operands for the instruction are available; if so, execute.
  - "Rename" registers so that multiple instructions can write to the "same" register when, in actuality, they're writing to separate locations.
- Execute (at the reservation station)
  - Wait for any missing operands to arrive at the station.
  - Once all operands have arrived, compute using the function associated with the station.
- Write Result
  - Broadcast the result over the Common Data Bus so that other reservation stations can use it, then write it to the destination register, or to memory if requested.
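The Execute and Write Result steps can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Station` class, the `run` function, and the `rs1`/`rs2` tags are invented for the example, a single shared bus carries one result per loop iteration, and the Issue step and all timing details are left out.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass
class Station:
    op: Callable   # function associated with the station
    dest_tag: str  # tag under which the result is broadcast
    vals: list     # operand values (None while still waiting)
    tags: list     # tag of the producer for each waiting operand, else None

    def ready(self):
        return all(tag is None for tag in self.tags)

    def capture(self, tag, value):
        # Grab a broadcast value if this station is waiting for that tag.
        for i, waiting_on in enumerate(self.tags):
            if waiting_on == tag:
                self.vals[i] = value
                self.tags[i] = None


def run(stations):
    """Execute stations as operands arrive; return results keyed by tag."""
    results = {}
    pending = list(stations)
    while pending:
        ready = [s for s in pending if s.ready()]
        if not ready:
            raise RuntimeError("an operand is waiting on a tag nobody produces")
        station = ready[0]                 # one broadcast per iteration
        pending.remove(station)
        value = station.op(*station.vals)  # Execute
        results[station.dest_tag] = value  # Write Result
        for other in pending:              # broadcast on the shared bus
            other.capture(station.dest_tag, value)
    return results


stations = [
    Station(operator.add, "rs1", [None, 10], ["rs2", None]),  # waits on rs2
    Station(operator.mul, "rs2", [3, 4], [None, None]),       # ready at once
]
results = run(stations)
print(results)  # {'rs2': 12, 'rs1': 22}
```

The multiply in `rs2` runs first even though it was issued second. Its broadcast fills the missing operand of `rs1`, which can then execute.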
Sources
http://camino.rutgers.edu/cs505/lecture5.html
http://en.wikipedia.org/wiki/Out_of_order_execution
http://www.tom.womack.net/computing/ooo.html
http://www.cs.uaf.edu/2007/fall/cs441/lecture/10_02_superscalar.html