CS 441 Project 0

Project Timeline & Deliverables

   February 2018      
Su Mo Tu We Th Fr Sa  
             1  2  3  
 4  5  6  7  8  9 10  
11 12 13 14 15 16 17  <- Project 0 topic (in class)
18 19 20 21 22 23 24  <- Project 0 rough draft
25 26 27 28 

     March 2018       
Su Mo Tu We Th Fr Sa  
             1  2  3  
 4  5  6  7  8  9 10  <- Project 0 presentations
11 12 13 14 15 16 17  <- Spring break
18 19 20 21 22 23 24  <- Project 0 final draft
25 26 27 28 29 30 31

First, I'd like you to describe your project topic in class. We do these out loud so everybody can hear each other's ideas, which makes it easier to form group projects if you'd like to do so.

Next, I'd like your rough draft code, which should work and do most of what you want, but need not do everything you have planned or be fully polished or tuned.

The presentation is a short, 10-minute talk in class (time yourself beforehand!). Your presentation should clearly describe WHO you are, WHAT you did, HOW you did it, and WHY you chose to do it that way. Bring a laptop to project your code, demo, slides, and/or figures, or, if you'd like to present from my laptop, email me your presentation materials the day before.

The final code should be fully debugged, polished, tuned, and commented, and should include at least a short README explaining what it is and what its results mean. You'll be graded on a combination of ambition, correctness, completeness, and comments/style. Style and clean code count! (This is scheduled well after the presentation so you can follow up on any suggestions or ideas you get during the presentation, and so we get past spring break.)

Typical grade breakdown: project grade = 25% rough draft + 25% presentation + 50% final code

Example Project Topics

Research projects:

Learn about the rationale, history, and advantages/disadvantages of any current deep hardware topic, such as:

Pipelining, especially the very deep pipelines of the Pentium 4 compared to the shallower pipelines of the more recent Core i7.
Out-of-order execution
Register renaming
Branch prediction, branch history, and execution speculation
Cache prefetching and out-of-order loads and stores
SIMD parallelism
Multi-core, SMP, SMT parallelism
Describe how the design limitations and goals of nonstandard computing platforms differ from conventional computing, such as:

High-performance computing systems, such as Blue Gene or anything on the Top500 list.
Consumer game consoles, such as the PlayStation 4 or Xbox One.
Embedded systems, such as cell phones or microwave ovens.

Pick a hardware-related article from Ars Technica. Explain what they're talking about in detail.
Pick a CPU architecture from sandpile.org. Compare this architecture's hardware design, in terms of achievable performance, with competing architectures.
Describe performance counters, which are useful for understanding code performance and pipelining (see PCL)
Describe a strange fabrication substrate or nonstandard computing scheme, such as Biological Computing, Quantum Computing, self-organizing polymer nanofabrication, etc.
Describe a novel data storage architecture, such as perpendicular bit recording, MLC flash, magnetoresistive memory, or nanowire memory.
Describe a semiconductor or PCB fabrication process in detail, such as the problems encountered as we approach nanometer photolithography, solutions such as extreme UV lithography, or the interrelationship between planarization and metal layers in CMOS fabrication.
Describe the historical evolution of some computer architecture, such as SPARC or Motorola's 68000.
Explore the decline and fall of some computer architecture, such as the 1960s Burroughs B5000, the early 1980s VAX ("All the world's a VAX!" Or, er, it was...), or the desktop PC.

Applied projects:

Build an interesting circuit: extend your HW1 CPU, build a superscalar dependency detection unit, etc.
Hardware performance analysis: benchmark some test programs that demonstrate some aspect of modern hardware, such as:

Out-of-order execution (e.g., reorder instructions manually, compare to automatic reordering)
Branch prediction and execution speculation (e.g., reverse-engineer x86 branch hardware, such as by comparing always-taken branch performance with even-odd branch performance)
Dependency tracking (e.g., benchmark performance benefit from decreasing dependency tree depth)
Cache prefetching and out-of-order loads and stores (e.g., compare cached loads with cached loads matching a previous store)
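
As a starting point for the branch prediction item above, here is a minimal C++ sketch that times the same counting loop over an always-taken, an even-odd, and a pseudo-random branch pattern. All names here (`Pattern`, `make_pattern`, `count_taken`, `time_ms`, `run_branch_demo`) are hypothetical, and the xorshift generator is only an assumed stand-in for any source of hard-to-predict bits. An optimizing compiler may replace the branch with a conditional move or SIMD code, so check the generated assembly before trusting the timings.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Branch patterns to compare (hypothetical names, for illustration only).
enum class Pattern { AlwaysTaken, EvenOdd, Random };

// Build n branch outcomes: 1 means "taken", 0 means "not taken".
std::vector<std::uint8_t> make_pattern(Pattern kind, std::size_t n) {
    assert(n > 0);
    std::vector<std::uint8_t> out(n);
    std::uint32_t state = 2463534242u;  // arbitrary nonzero xorshift seed
    for (std::size_t i = 0; i < n; ++i) {
        switch (kind) {
        case Pattern::AlwaysTaken:
            out[i] = 1;
            break;
        case Pattern::EvenOdd:
            out[i] = static_cast<std::uint8_t>(i & 1);
            break;
        case Pattern::Random:
            state ^= state << 13;  // one xorshift32 step
            state ^= state >> 17;
            state ^= state << 5;
            out[i] = static_cast<std::uint8_t>((state >> 16) & 1);
            break;
        }
    }
    return out;
}

// The loop under test: one data-dependent branch per element.
std::uint64_t count_taken(const std::vector<std::uint8_t>& pattern, int repeats) {
    std::uint64_t taken = 0;
    for (int r = 0; r < repeats; ++r)
        for (std::uint8_t p : pattern)
            if (p) ++taken;
    return taken;
}

// Wall-clock time in milliseconds for one call to count_taken.
double time_ms(const std::vector<std::uint8_t>& pattern, int repeats,
               std::uint64_t& result) {
    auto start = std::chrono::steady_clock::now();
    result = count_taken(pattern, repeats);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Print one timing per pattern; call this from main().
void run_branch_demo() {
    const std::size_t n = 1 << 20;
    const int repeats = 50;
    const Pattern kinds[] = {Pattern::AlwaysTaken, Pattern::EvenOdd, Pattern::Random};
    const char* names[] = {"always-taken", "even-odd", "random"};
    for (int k = 0; k < 3; ++k) {
        std::uint64_t taken = 0;
        double ms = time_ms(make_pattern(kinds[k], n), repeats, taken);
        std::printf("%-12s %8.2f ms (%llu taken)\n", names[k], ms,
                    static_cast<unsigned long long>(taken));
    }
}
```

Printing the taken count alongside each time keeps the compiler from discarding the loop and doubles as a correctness check. A real project would repeat each measurement several times and report the spread, since a single wall-clock sample is noisy.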

Define a new instruction set, with a software or circuit simulator.
Write and benchmark some code to perform any interesting task quickly on a particular architecture:

Use bitwise operations to do something simple faster, or do something simple in a fiendishly complex way.
Use assembly language or your knowledge of branch prediction, caching, etc. to improve the performance of some program.
Write a dynamic binary translator for any architecture.
Use SSE or AVX instructions to speed up some code with the power of SIMD.
Use OpenMP or pthreads to speed up some code with the power of multicore. (But you must get the right answer!)
Use MPI or sockets to speed up some code with the power of clustering.
Use CUDA or OpenCL to speed up some code with the power of the GPU.
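
For the bitwise-operations item above, one possible sketch counts the set bits of a 32-bit word three ways: a plain loop, Kernighan's clear-lowest-bit trick, and a branch-free SWAR (SIMD within a register) reduction. The function names are hypothetical. Which version is fastest depends on the compiler and CPU, so the sketch only establishes that all three agree, which is the precondition for benchmarking them.

```cpp
#include <cassert>
#include <cstdint>

// Baseline: test each of the 32 bits in turn.
int popcount_loop(std::uint32_t x) {
    int count = 0;
    for (int i = 0; i < 32; ++i)
        count += static_cast<int>((x >> i) & 1u);
    return count;
}

// Kernighan's method: x & (x - 1) clears the lowest set bit,
// so the loop runs once per set bit instead of once per bit.
int popcount_kernighan(std::uint32_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

// SWAR: sum bits in parallel within the word, in fields of
// 2, then 4, then 8 bits, then add the four bytes with a multiply.
int popcount_swar(std::uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return static_cast<int>((x * 0x01010101u) >> 24);
}

// Sanity check: all three versions must agree before timing means anything.
void check_popcounts(std::uint32_t x) {
    assert(popcount_loop(x) == popcount_kernighan(x));
    assert(popcount_loop(x) == popcount_swar(x));
}
```

Timing each version over a large array, and comparing against a compiler-provided count such as C++20 `std::popcount`, would be one way to turn this into a rough draft.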

Your starting code can be something completely new, something you found on the net (with a citation), an extension of any homework, an example from the lecture notes, etc.