CPU and Instruction Set Design

CS 441/641 Lecture, Dr. Lawlor

CISC vs RISC vs VLIW

Back in the 1970's, when bits were expensive, the typical CPU encoding used exactly as many bytes as each instruction needed and no more.  For example, a "return" instruction might use one byte (0xc3), while a "load a 32-bit constant" instruction might use five bytes (0xb8 <32-bit constant>).  These variable-sized instructions are (retroactively) called Complex Instruction Set Computing (CISC), and x86 is basically the last surviving CISC machine. 
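
To see why variable-length decoding is more work, here's a tiny C++ sketch of a decoder walking that kind of byte stream.  The opcode bytes 0xc3 (ret) and 0xb8 (mov eax, 32-bit constant) are the real x86 values from above; everything else is deliberately simplified.

    // Sketch of a variable-length (CISC-style) decoder.
    // Opcodes 0xc3 (ret) and 0xb8 (mov eax,imm32) are real x86 byte values;
    // the rest is simplified for illustration.
    #include <cstdio>

    int main() {
        unsigned char code[]={0xb8, 0x07,0x00,0x00,0x00,  0xc3}; // mov eax,7; ret
        unsigned int pc=0;
        while (pc<sizeof(code)) {
            unsigned char op=code[pc];
            if (op==0xc3) { // ret: just one byte
                printf("%2u: ret\n",pc);
                pc+=1;
            }
            else if (op==0xb8) { // mov eax, 32-bit constant: five bytes
                int imm=code[pc+1]|(code[pc+2]<<8)|(code[pc+3]<<16)|(code[pc+4]<<24);
                printf("%2u: mov eax,%d\n",pc,imm);
                pc+=5; // the length isn't known until we've examined the opcode
            }
            else { printf("%2u: unknown opcode 0x%02x\n",pc,op); pc+=1; }
        }
        return 0;
    }

Note how the decoder can't even find the start of the next instruction until it has looked at the current opcode byte--that's the price of the compact encoding.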

During the 1980's, folks realized they could build simpler CPU decoders if all the instructions took the same number of bytes, usually four bytes.  This idea is called Reduced Instruction Set Computing (RISC), and was built into MIPS, PowerPC, SPARC, DEC Alpha, and other commercial CPUs.  Here's a good but long retrospective article on the RISC-vs-CISC war, which got pretty intense during the 1990's.   Nowadays, RISC machines might compress their instructions (like CISC), while CISC machines usually decode their instructions into fixed-size blocks (like RISC), so the war ended in the best possible way--both sides have basically joined forces!
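
With fixed-size instructions, decoding turns into plain bit slicing--every field lives at a known position.  The 8-bit opcode / 8-bit register number / 16-bit constant layout below is invented purely for illustration; real encodings like MIPS or ARM differ in the details.

    // Sketch of fixed-length (RISC-style) decoding: pure bit slicing.
    // This field layout is made up for illustration, not any real RISC encoding.
    #include <cstdio>

    int main() {
        unsigned int insn=0x01020007; // opcode=1, register=2, constant=7 (made-up format)
        unsigned int opcode=(insn>>24)&0xff;
        unsigned int reg   =(insn>>16)&0xff;
        unsigned int imm   = insn     &0xffff;
        printf("opcode=%u reg=%u constant=%u\n",opcode,reg,imm);
        return 0;
    }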

During the late 1980's and early 1990's, several companies created even longer instruction machines, called Very Long Instruction Word (VLIW), where basically each part of the CPU has corresponding bits in every instruction.  This makes for very simple decoding, and allows some interesting parallel tricks, but each instruction might be a hundred bytes or more!  Modern graphics processors are typically VLIW internally, and there are several strange digital signal processor chips that are VLIW, but the concept hasn't really caught on for the main CPU.  And any company that produced a successful VLIW chip would have a big problem building an improved processor, since each instruction specifically describes what should happen on each part of the old chip.
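
Here's the flavor of a VLIW encoding written as a C++ struct--one bundle carries a separate operation for every functional unit.  These particular fields are invented for illustration; real VLIW formats are machine-specific (and much wider).

    #include <cstdio>

    // Sketch of the VLIW idea: one instruction word carries a slice of bits for
    // every functional unit, so decoding is trivial but each instruction is huge.
    // Field names and widths are invented for illustration.
    struct vliw_instruction {
        unsigned alu_op, alu_src1, alu_src2, alu_dest;  // bits steering the integer ALU
        unsigned mem_op, mem_addr_reg, mem_data_reg;    // bits steering the load/store unit
        unsigned branch_op, branch_target;              // bits steering the branch unit
    };

    int main() {
        printf("one instruction = %d bytes\n",(int)sizeof(vliw_instruction));
        return 0;
    }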

Assembly Basics

So here's some assembly language:
         Machine Code:             Assembly Code:
Address                            Instruction  Operands
 0:      55                        push         ebp
 1:      89 e5                     mov          ebp,esp
 3:      b8 07 00 00 00            mov          eax,0x7
 8:      5d                        pop          ebp
 9:      c3                        ret
Here's a typical line of assembly code.  It's one CPU instruction, with a comment:
	mov eax,1234 ;  I'm returning 1234, like the homework says...
(executable NetRun link)

Hardware Implementation

Each of the features of assembly language above corresponds to a hardware circuit structure.  I claim it's useful to know how these work.
The simplest place to start is just to look at the registers and arithmetic.  We need control lines, highlighted in purple here, to activate the different parts of the computation.  Click this image to get a runnable Logisim .circ file.
Manually controlled registers and processor.
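
Here's a rough software analogue of those control lines, just to fix the idea: a register only changes when its write-enable line is asserted.  (Purely illustrative--the real version is the Logisim circuit.)

    // Software analogue of the manually controlled datapath:
    // the ALU always computes, but only registers whose control line
    // (write enable) is high actually latch the result.
    #include <cstdio>

    int main() {
        int regA=0, regB=0;               // two registers
        bool writeA=true, writeB=false;   // the purple control lines
        int alu_result=regA+7;            // ALU output
        if (writeA) regA=alu_result;      // only enabled registers change
        if (writeB) regB=alu_result;
        printf("regA=%d  regB=%d\n",regA,regB); // prints regA=7  regB=0
        return 0;
    }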

Here's the two-register, 12-bit instruction machine we designed in class today.  Conceptually, this is the same as above, except I'm driving all the little control lines from a ROM; this stored program dramatically simplifies the user interface.
Circuit diagram simple CPU: instructions are fetched, decoded, and executed.
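
In software terms, the ROM-driven machine is just a fetch/decode/execute loop like the sketch below.  The 12-bit encoding here (4-bit opcode, 2-bit destination register, 6-bit constant) is invented for illustration--the real format is whatever we drew in class.

    // Sketch of the fetch/decode/execute loop a stored-program machine runs.
    // The 12-bit instruction format below is invented for illustration.
    #include <cstdio>

    int main() {
        unsigned short rom[]={ // the stored program
            0x147,  // opcode 1 (load constant): reg[1]=7
            0x201,  // opcode 2 (add):           reg[0]+=reg[1]
            0xF00   // opcode 15: halt
        };
        unsigned int reg[2]={0,0};
        for (unsigned int pc=0; pc<sizeof(rom)/sizeof(rom[0]); pc++) {
            unsigned int insn=rom[pc]&0xfff;    // fetch a 12-bit instruction
            unsigned int op =(insn>>8)&0xf;     // decode the fields
            unsigned int dst=(insn>>6)&0x3;
            unsigned int imm= insn    &0x3f;
            if (op==1) reg[dst]=imm;            // execute
            else if (op==2) reg[0]+=reg[1];
            else if (op==15) break;
        }
        printf("reg[0]=%u  reg[1]=%u\n",reg[0],reg[1]); // prints reg[0]=7  reg[1]=7
        return 0;
    }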

Here's a slightly more complex four-register machine.  The instruction set uses 16 bits per instruction, and is designed for future compatibility with up to 16 registers.  This is vaguely similar to ARM.
Circuit diagram for simple four-register CPU.
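
The 16-bit version works the same way, but with 4-bit register fields: the field can name 16 registers even though only 4 are built, which is where the future compatibility comes from.  Again, this particular field layout is invented for illustration.

    // Sketch: 16-bit instruction with 4-bit fields, so register numbers
    // 0..15 are encodable even on a 4-register machine.  Layout is invented.
    #include <cstdio>

    int main() {
        unsigned short insn=0x3215;       // opcode=3, dest=2, source=1, constant=5 (made up)
        unsigned int op  =(insn>>12)&0xf;
        unsigned int dest=(insn>> 8)&0xf; // 0..15, though only registers 0..3 exist today
        unsigned int src =(insn>> 4)&0xf;
        unsigned int imm = insn     &0xf;
        printf("op=%u dest=%u src=%u constant=%u\n",op,dest,src,imm);
        return 0;
    }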

Notice that both CPUs above are RISC architectures, which makes the instruction decoder very simple: every field sits at a fixed place in a fixed-size instruction.  In a CISC architecture, the decoder has to keep fetching bytes until it has one complete instruction.