# Parallel Computing Theory

CS 441 Lecture, Dr. Lawlor

Fact: Amdahl's Law limits the speedup of a partially parallelized program to 1/F, where F is the fraction of the program's runtime that is still sequential, even assuming everything else is so parallel it takes no time at all.
speedup = old time / new time = sequential time / parallel time
= 1 / ( F + (1-F)/P ) => 1/F     (as number of processors P approaches infinity)

Question: what fraction of real programs MUST be sequential?  I claim it's only a vanishingly small amount--the user interface probably should be sequential in places, since there's only one user (one mouse, one button).  So Amdahl's law is in some sense true but useless--if the sequential part of the code is a 2-ns button click, a 2-second program has F=1/billion, so the speedup is limited to one billion.  Big deal!

However, it really takes programming effort to parallelize some parts of the program--some algorithms are sequential (you need new algorithms), some hardware like I/O is sequential (you need better hardware), etc.

Mathematically, since
F proportional to 1/pain
then Ahmdal's law really says
speedup is proportional to pain

Which is both a true and useful statement.

## I/O Performance

One recurring problem in parallel computing is I/O: talking to the disk, network, or output devices.  Oddly, most real hardware has wildly different latency (time for one individual operation) and bandwidth (total bytes per second)--you might expect latency = 1/bandwidth, but in practice that first byte is way more expensive than the next bytes.  For example, loading a single byte from RAM requires the whole row to be addressed and sensed, so reading adjacent bytes gets much cheaper.  Sending one byte over the network requires an entire packet, so sending multiple bytes at once decreases the cost substantially.

 Storage Device Latency (sec/request) Requests/sec Bandwidth (bytes/sec) Source Floppy Drive 500ms 2 5KB/s - Horrible Hazy memories Modem 100ms (round trip) 10 5KB/s - Horrible " DVD-ROM 150ms (seek time) 8 5MB/s - Sad Experiment (in class) Flash Drive 0.5ms 2000 13MB/s - Sad Experiment (USB flashdrive) Hard Disk 10ms 100 50MB/s - Decent Experiment (laptop hard drive) Gigabit Ethernet 100us 10,000 100MB/s - Decent Experiment (powerwall) RAM 50ns 20 million 2GB/s - Great Experiment (netrun Pentium 4) Cache 1ns 1 billion 4GB/s - Great Experiment (netrun Pentium 4)