CS 641 - Advanced Computer Architecture

Meeting time: TR 9:45-11:15am
Room 104 Chapman Building
University of Alaska Fairbanks

3.0 Credits, Spring 2009
Prerequisites: CS 441 (arch)

Instructor: Dr. O. Lawlor
ffosl@uaf.edu, 474-7678
Office: 210C Chapman
Hours: 3-4 TR, by appointment, or just drop by!

Recommended Textbook:
Computer Organization and Design: The Hardware/Software Interface, David Patterson and John Hennessy, Morgan Kaufmann, 3rd Edition.

ADA Compliance: I will work with the Office of Disabilities Services (203 WHIT, 474-7043) to provide reasonable accommodation to students with disabilities.

Course Website: http://www.cs.uaf.edu/2009/spring/cs641

Course Goals and Requirements

By the end of the course, you will understand both the present and future of computer design for performance. Specifically, we will cover SIMD, out-of-order execution, and speculation, and we will spend extensive time on coarser-grained parallelism, including multicore, multi-thread, distributed-memory, and GPU system design and programming. To understand this, you will need to know at least the following topics from the course prerequisites:


Last day to drop: February 6.  Spring break: March 7-15. Last day to withdraw: March 27. Midterm exam: 9:45am on Thursday, March 5.  Last day of class: Thursday, April 30. Final exam: 8am on Saturday, May 9.

Student Resources

Academic Help: Google, Rasmuson Library, Academic Advising Center (509 Gruening, 474-6396), Math Lab (Chapman Room 305), English Writing Center (801 Gruening Bldg, 478-5246).


You'll get better grades by attending class, doing homework, and understanding the material than by cramming before the exam. Your overall grade comes from:

  1. HW: Homeworks and machine problems, to be distributed through the semester.

  2. PROJ: Two research/development projects, with in-class presentations.

  3. MT: Midterm Exam

  4. FINAL: Final Exam (comprehensive)

Your overall score is then calculated as:
GRADE = 15% HW + 30% PROJ + 25% MT + 30% FINAL
This percentage score is transformed into a plus-minus letter grade via these cutoffs: A >= 93%; A- 90%; B+ 87%; B 80%; C+ 77%; C 73%; C- 70%; D+ 67%; D 63%; D- 60%; F. The grades “B-”, “F+”, and “F-” will not be given. “A+” is reserved for truly extraordinary work.
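
As a rough illustration of the formula and cutoffs above, here is a minimal Python sketch. The names overall_score and letter_grade and the sample scores are hypothetical, made up for this example. It assumes every cutoff is inclusive (as written for "A >= 93%"), and it does not model discretionary rounding or the "A+" grade.

```python
# Weights from the syllabus formula:
# GRADE = 15% HW + 30% PROJ + 25% MT + 30% FINAL
WEIGHTS = {"HW": 0.15, "PROJ": 0.30, "MT": 0.25, "FINAL": 0.30}

# Letter cutoffs from the syllabus, highest first.
# Assumption: each cutoff is inclusive (score >= cutoff earns the letter).
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (80, "B"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def overall_score(scores):
    """Weighted percentage from per-component percentages (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percent):
    """Map a percentage to a letter; anything below 60% is an F."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


# Hypothetical student: these numbers are invented for illustration.
example = {"HW": 90, "PROJ": 85, "MT": 80, "FINAL": 88}

if __name__ == "__main__":
    total = overall_score(example)
    print(round(total, 1), letter_grade(total))
```

For the sample scores, the weighted total works out to about 85.4%, which falls between the B cutoff (80%) and the B+ cutoff (87%), so the letter grade is a B.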

At my discretion, I may round your grade up if it is near a grading boundary. Homeworks are due at midnight on their due date. Late homeworks will receive no credit, although at my discretion I may allow late work without penalty when it is due to circumstances beyond your control. Projects that are up to two days late may be accepted at a 50% grade penalty (e.g., on-time grade: 86%; late grade: 43%).

Everything you turn in must be your own work. Violations of the UAF Honor Code will result in a minimum penalty equal to THAT ENTIRE SECTION OF YOUR GRADE (e.g., one plagiarized homework question will negate an otherwise perfect grade on all homeworks). However, even substantial reuse of other people's work is fine (and not plagiarism) if it is clearly cited; you'll be graded on what you've added to others' work. Group projects (NOT homeworks) are acceptable iff you clearly label who did what work, but I do expect a two-person group project to represent twice as much work as a one-person project.

Department policy does not allow tests to be taken early, but in extraordinary circumstances they may be taken late.

Course Topics To Be Chosen From

  • Physical Reality

    • Speed Of Light (SOL)

    • Photolithography: PCB, semiconductor

    • VHDL, Verilog, SPICE simulators

  • Tools to Fight the Von Neumann Bottleneck

    • Pipelining, operand forwarding

    • Dependencies and hazards

    • Branch prediction

    • Wide issue, Out-of-order, Speculation

    • Vector processing, SIMD

    • Cache design, hit rates, pitfalls

  • High-performance software design & measurement

  • Parallel programming models

    • SMP, SMT, multicore / threads, tasks, processes

  • Parallel Hardware / Programming Languages

    • Clusters, MPP / sockets, MPI

    • GPUs, barrel processors, MTA / OpenGL Shading Language

    • FPGAs / VHDL, Verilog

  • Threading serial applications

    • Amdahl's Law

    • Parallel algorithm design

    • Locks, lock contention, deadlock

    • Memory consistency, fences

    • Parallel cache thrashing