Course Review for Final Exam
CS 441 Lecture, Dr. Lawlor
You should know:
- Parallel programming interfaces: multiple processes, threads, OpenMP, MPI, and GPU programming
- Parallel hardware: superscalar execution, SSE, multicore, shared memory & memory coherence, networking
- How to parallelize sequential applications across arrays, tasks, loops, etc.
- How to estimate & improve the performance of parallel applications
- The physical and practical limits of hardware: heat, power, fabrication, failure rates, programming models
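For the performance-estimation item above, a minimal sketch of the two standard figures of merit: speedup (serial time divided by parallel time) and efficiency (speedup divided by the number of workers). The function names and the timings are hypothetical and chosen only for illustration; they are not measurements from the course.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup = serial wall-clock time / parallel wall-clock time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Efficiency = speedup / workers; 1.0 would be perfect scaling."""
    return speedup(serial_seconds, parallel_seconds) / workers


# Hypothetical timings (assumed for illustration, not measured):
serial = 8.0      # seconds on 1 core
parallel = 2.5    # seconds on 4 cores
workers = 4

print(f"speedup    = {speedup(serial, parallel):.2f}x")
print(f"efficiency = {efficiency(serial, parallel, workers):.0%}")
```

An efficiency well below 1.0 usually points at overhead worth investigating, such as communication, synchronization, or load imbalance.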
Interesting modern hardware developments:
- Oak Ridge National Lab's Jaguar supercomputer uses 150K Opteron cores to hit 1,382 teraflops.
- Los Alamos National Lab's IBM Roadrunner supercomputer uses
12K Opteron + 12K Cell cores to hit 1,457 teraflops. Note that
the Cells have 8 SPEs each, so the total number of compute units
is around 100K, the same order of magnitude as Jaguar.
- A recent SuperComputing 2008 paper used autotuning on a stencil-type
computation to hit 2.6 gigaflops (GF) on an Intel CPU, 7 GF on an
Opteron, 5 GF on a Sun CPU, 16 GF on a Cell, and an amazing 36 GF
on a GPU. The GPU was also the clear winner in flops/watt.
- IBM's Power6 CPU (typical superscalar RISC) recently reached 5 GHz.
It's not clear this means the GHz race is starting up again, though,
because Intel's and AMD's recent CPUs have stayed in the 2-3 GHz range.
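The arithmetic behind the supercomputer and stencil figures above can be checked with a short sketch. It assumes "K" means 1,000, counts each Cell as its 8 SPEs (as the Roadrunner note does), and treats all compute units as equal. That makes the per-unit numbers rough averages, not hardware peak rates. The variable names are illustrative.

```python
# Figures quoted in the bullets above ("K" is assumed to mean 1,000).
jaguar_cores = 150_000
jaguar_tflops = 1382

roadrunner_opterons = 12_000
roadrunner_cells = 12_000
spes_per_cell = 8
roadrunner_tflops = 1457

# Total Roadrunner compute units: Opteron cores plus SPEs.
roadrunner_units = roadrunner_opterons + roadrunner_cells * spes_per_cell

# Average gigaflops per compute unit (1 teraflop = 1000 gigaflops).
jaguar_gflops_per_core = jaguar_tflops * 1000 / jaguar_cores
roadrunner_gflops_per_unit = roadrunner_tflops * 1000 / roadrunner_units

print(f"Roadrunner compute units: {roadrunner_units}")
print(f"Jaguar:     {jaguar_gflops_per_core:.2f} GF per core")
print(f"Roadrunner: {roadrunner_gflops_per_unit:.2f} GF per unit")

# Stencil results from the autotuning paper, relative to the Intel CPU.
stencil_gflops = {
    "Intel CPU": 2.6,
    "Opteron": 7.0,
    "Sun CPU": 5.0,
    "Cell": 16.0,
    "GPU": 36.0,
}
baseline = stencil_gflops["Intel CPU"]
relative = {name: gf / baseline for name, gf in stencil_gflops.items()}
for name, ratio in relative.items():
    print(f"{name:10s} {ratio:5.1f}x the Intel CPU")
```

With these inputs, Roadrunner comes out at 108,000 compute units, which is where the "around 100K" estimate comes from, and the GPU stencil result is roughly 14 times the Intel CPU figure.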