Powerwall Accounts & Parallelism
CS 441 Lecture, Dr. Lawlor
I've given all of you accounts on the UAF Bioinformatics Powerwall
up in Chapman 205. You can SSH in directly from on campus, but
from off campus the UAF firewall will block you unless you SSH to port
80 (normally the web port). On UNIX systems, the command looks like
"ssh -p 80 fsxxx@powerwall0.cs.uaf.edu".
Once you're logged into the powerwall, copy over the "mandel" example program with:
cp -r ~olawlor/441_demo/mandel .
Now build and run the "mandel" program like so:
cd mandel
make
mpirun -np 2 ./mandel
You can edit the source code with pico, a friendly little editor (use the arrow keys; press Ctrl-X to exit):
pico main.cpp
You can make a backup copy of the code with:
cp main.cpp bak_v1_original.cpp
If you have a UNIX machine, you can view the resulting fractal image with:
xv out.ppm
Or you can copy the files off the powerwall for local display; you may prefer to
convert out.ppm out.jpg
to get a normal JPEG.
Parallelism Generally
Some applications are quite easy to parallelize. Others aren't.
- "Naturally Parallel" applications don't need any
communication. For example, Mandelbrot set rendering is naturally
parallel, because each Mandelbrot pixel is a stand-alone problem you
can solve independently of all the other Mandelbrot pixels. GPUs
can handle naturally parallel applications in a single pass. Other
common terms for this are "Trivially Parallel" and "Embarrassingly
Parallel", which make it sound like a bad thing, but parallelism is both natural and good!
- "Neighbor" applications communicate with their immediate
neighbors, and that's it. For example, heat flow in a plate can
be computed based solely on each cell's immediate neighbors (the new
value at arr[i] is a function of the old values of arr[i-1], arr[i], and
arr[i+1]). Neighbor applications naturally fit into the "ghost
exchange" communication pattern, and for large problem sizes they can
usually be tweaked to get good performance.
- "Other" applications have a weirder, often collective
communication pattern. Depending on the structure of the problem,
such applications can sometimes have relatively good performance, but
are often network bound.
- "Sequential" applications might have multiple threads or
processes, but they don't have any parallelism--only one thread/process
can do useful work at a time. Many I/O-limited applications are
basically sequential, since ten CPUs can wait on one disk no faster
than one CPU!
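To make the "naturally parallel" idea concrete, here is a minimal sketch in plain C++ (no MPI calls, so it is not the code in the "mandel" demo). The function names, the image region, and the row-based split are all assumptions made for illustration. The point is only that each pixel's value depends on its own coordinates and nothing else, so any block of rows can be rendered without looking at any other block.

```cpp
#include <cassert>
#include <complex>
#include <vector>

// Iteration count for one pixel. It depends only on c, never on
// any other pixel, which is what makes the problem naturally parallel.
// (Hypothetical helper, not taken from the course demo.)
int mandel_iters(std::complex<double> c, int max_iters) {
    std::complex<double> z(0.0, 0.0);
    int n = 0;
    // std::norm is the squared magnitude, so this tests |z| <= 2.
    while (n < max_iters && std::norm(z) <= 4.0) {
        z = z * z + c;
        ++n;
    }
    return n;
}

// Render rows [row_begin, row_end) of a wid x ht image covering the
// (assumed) region [-2,1] x [-1.5,1.5]. Each process could call this
// on its own row range with no communication at all.
std::vector<int> render_rows(int wid, int ht, int row_begin, int row_end,
                             int max_iters) {
    assert(0 <= row_begin && row_begin <= row_end && row_end <= ht);
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(row_end - row_begin) * wid);
    for (int y = row_begin; y < row_end; ++y) {
        for (int x = 0; x < wid; ++x) {
            std::complex<double> c(-2.0 + 3.0 * x / wid,
                                   -1.5 + 3.0 * y / ht);
            out.push_back(mandel_iters(c, max_iters));
        }
    }
    return out;
}
```

With two processes, one might call render_rows(wid, ht, 0, ht/2, ...) on the first and render_rows(wid, ht, ht/2, ht, ...) on the second; stacking the two results gives the same image as rendering every row in one place.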
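The "ghost exchange" pattern for neighbor applications can be sketched the same way. This is an illustration under stated assumptions, not the course's code: the update rule (a plain three-point average) is just one possible "function of the old values", the two pieces live in one program instead of on two machines, and heat_step and exchange_ghosts are hypothetical names. Each piece keeps one extra "ghost" cell holding a copy of its neighbor's edge value, refreshes it before every step, and then updates its own cells purely locally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One time step: each interior cell i becomes the average of the old
// values at i-1, i, and i+1 (an assumed update rule). The first and
// last cells are left unchanged; they act as fixed boundaries or ghosts.
std::vector<double> heat_step(const std::vector<double>& old) {
    assert(old.size() >= 2);
    std::vector<double> next = old;
    for (std::size_t i = 1; i + 1 < old.size(); ++i) {
        next[i] = (old[i - 1] + old[i] + old[i + 1]) / 3.0;
    }
    return next;
}

// Ghost exchange between two neighboring pieces of the array.
// Layout assumed here:
//   left  = [boundary, owned..., ghost]
//   right = [ghost, owned..., boundary]
// On a real cluster each assignment would be a message between
// neighboring processes rather than a direct copy.
void exchange_ghosts(std::vector<double>& left, std::vector<double>& right) {
    assert(left.size() >= 2 && right.size() >= 2);
    left.back() = right[1];                 // copy of right's first owned cell
    right.front() = left[left.size() - 2];  // copy of left's last owned cell
}
```

Calling exchange_ghosts and then heat_step on each piece gives the same owned-cell values as one heat_step over the whole array, and only two numbers cross between the pieces per step no matter how big each piece is. That is why neighbor applications tend to do well at large problem sizes.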