CS 441 - Computer Architecture

Meeting time: TR 9:45-11:15am
Room 106 Chapman

University of Alaska Fairbanks

UAF CS F441-F01
3.0 Credits, Fall 2012
Prerequisites: CS 321 (OS), EE 341

Instructor: Dr. Orion Lawlor
lawlor@alaska.edu, 474-7678
Office: 201E Chapman
Hours: 11:30-1:00 TR, by appointment, or just drop by!

Utterly OPTIONAL Textbook:
Computer Organization and Design: The Hardware/Software Interface, David Patterson and John Hennessy, Morgan Kaufmann, 3rd Edition.

Course Website: http://www.cs.uaf.edu/2012/fall/cs441


ADA Compliance: I will work with the Office of Disability Services (208 Whitaker Bldg, 474-5655) to provide reasonable accommodation to students with disabilities.

Course Goals and Requirements

By the end of the course, you will be able to understand both the present and future of computer design for performance: parallelism. Specifically, we will cover circuit-level parallelism via circuit simulators; instruction-level transparent parallelism including pipelining, multi-issue superscalar, and out-of-order execution; vector parallelism including SWAR, SIMD, and GPU programming; as well as coarser-grained parallelism including multicore, multi-thread, and distributed-memory network and cloud computing. To understand this, you will need to know at least the following topics from the course prerequisites:


Calendar

Last day to drop: Friday, September 14. Midterm exam: Tuesday, October 16. Last day to withdraw: Friday, October 26. Thanksgiving Break: Thursday, November 22. Last class: Thursday, December 6. Final exam: 8am (!) Thursday, December 13.

Student Resources

Academic Help: Rasmuson Library, Academic Advising Center (509 Gruening, 474-6396), Math Lab (Chapman Room 305), English Writing Center (801 Gruening Bldg).

Grading

Your work will be evaluated on correctness, rationale, and insight. Grades for each assignment and test may be curved up or down if needed. Your grade is then computed based on five categories of work:

  1. HW: Homeworks and machine problems, to be distributed through the semester.

  2. PROJ1: a paper and in-class presentation on an architecture topic of your choice, due in October.

  3. PROJ2: a software development or hardware performance analysis project, due in December.

  4. MT: Midterm Exam, Tuesday, October 16.

  5. FINAL: Final Exam (comprehensive), 8am Thursday, December 13.

Your overall score is then calculated as:
GRADE = 15% HW + 15% PROJ1 + 15% PROJ2 + 25% MT + 30% FINAL
This percentage score is transformed into a plus-minus letter grade via these cutoffs: A >= 93%; A- 90%; B+ 87%; B 83%; B- 80%; C+ 77%; C 70%; D+ 67%; D 63%; D- 60%; F. The grades “C-”, “F+”, and “F-” will not be given. “A+” is reserved for truly extraordinary work.
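
As a rough illustration of the arithmetic above, here is a minimal sketch of the weighted score and the letter-grade lookup. The names WEIGHTS, CUTOFFS, overall_score, and letter_grade are hypothetical, and the sketch assumes each category is already scored from 0 to 100. It does not model curving, discretionary rounding near a boundary, or the A+ reserved for extraordinary work.

```python
# Category weights from the GRADE formula (they sum to 100%).
WEIGHTS = {"HW": 0.15, "PROJ1": 0.15, "PROJ2": 0.15, "MT": 0.25, "FINAL": 0.30}

# Letter cutoffs in descending order; anything below 60% is an F.
# There is deliberately no C- entry, matching the syllabus.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (70, "C"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def overall_score(scores):
    """Weighted percentage from per-category scores (each assumed 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percent):
    """Map a percentage to the first letter whose cutoff it meets."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


# Hypothetical student, for illustration only.
example = {"HW": 90, "PROJ1": 80, "PROJ2": 85, "MT": 88, "FINAL": 92}
total = overall_score(example)
print(f"{total:.2f}% -> {letter_grade(total)}")
```

For the hypothetical scores shown, the weighted sum is 13.5 + 12 + 12.75 + 22 + 27.6 = 87.85%, which falls in the B+ band.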


Students taking the stacked graduate section, CS 641, will (1) have additional reading assignments, (2) have additional homework assignments, (3) complete more complex projects, and (4) produce a publishable journal-quality scientific paper as part of their projects.


Course Rules

At my discretion, I may round your grade up if it is near a grading boundary. Homeworks are due at midnight on the day they are due. Late homeworks will receive no grade credit, but you'll sleep better knowing you did them anyway. At my discretion, I may allow late work without penalty when due to circumstances beyond your control. Everything you turn in must be your own work--violations of the UAF Student Code of Conduct will result in a minimum penalty equal to THAT ENTIRE SECTION OF YOUR GRADE (e.g., one plagiarized homework question will negate an otherwise perfect grade on all homeworks). However, even substantial reuse of other people's work is fine (and not plagiarism) iff it is clearly cited; you'll be graded on what you've added to others' work. Group projects (NOT homeworks) are acceptable iff you clearly label who did what work; but I do expect a two-person group project to represent twice as much work as a one-person project. Department policy does not allow tests to be taken early; but when necessary I may allow them to be taken late. In extraordinary circumstances, such as an ice storm or zombie outbreak, classes may be held electronically via Blackboard/Elluminate Live.


Course Outline (Tentative)

(September: 1950 through 2000 AD)

Physical Parallelism

  • Semiconductors

  • Small circuit simulation with logisim

  • Photolithography: PCB, semiconductor


Performance background:

  • Timing your code in NetRun

Instruction-level Parallelism

  • Pipelining

  • Operand forwarding (register file bypass)

  • Pipeline hazards and data dependencies

  • Superscalar execution (wide issue)

  • Out-of-order execution

  • Branch prediction & speculation


(October: technology post 2000 AD)

Multicore Parallelism

  • SMP, SMT, multicore hardware

  • Shared-memory programming with threads in OpenMP

  • Locks and memory race conditions


Speeding Up Memory

  • Cache hardware design, thrashing

  • Cache hit ratio, performance modeling

  • Cache coherence in a multicore world

  • False sharing and multicore cache thrashing

Project 1 Presentations

(Midterm To Thanksgiving)

Vector Parallelism

  • SWAR, SIMD, SSE, and AVX

  • SSE branch instructions & quantum superposition


GPU Programming

  • Barrel processors, MTA

  • OpenGL Shading Language (GLSL)

  • CUDA and OpenCL

  • Non-graphics code on the GPU: GPGPU

(Thanksgiving to End)

Distributed-memory Parallelism

  • Fork & mmap on multicore machines

  • Network interfacing via sockets

  • Clients, Servers, and Peer-to-Peer

  • Clusters and cloud computing

  • MPI, the Message Passing Interface

  • CUDA and EPGPU (Dr. Lawlor research!)


Project 2 Presentations