| CS 411 Fall 2012 > Outline & Supplementary Notes for Wednesday, October 3, 2012 |
Example code: quicksort.cpp.
Two important optimizations for Quicksort are not mentioned in the Levitin text: tail-recursion elimination on the larger recursive call, and switching to Heap Sort if the recursion depth is too large.
Quicksort makes two recursive calls. Whichever call is done second can be written as a tail call, and therefore easily replaced with a loop. Each recursive call targets one part of the list; standard practice is to do the call on the smaller part first, then do tail-recursion elimination on the larger call.
When this optimization is done, the remaining actual recursive call will be given a list at most half the size of the original list. Thus, the worst-case recursion depth becomes \(O(\log\,n)\), rather than the former \(O(n)\). And since the only significant space used by Quicksort is that required for the recursive calls, this optimization makes the additional space usage \(O(\log\,n)\), rather than \(O(n)\)—a much nicer situation.
See quicksort.cpp for an implementation of Quicksort that includes this tail-recursion-elimination optimization (along with median-of-three pivot selection).
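The structure described above might be sketched as follows. This is a hypothetical simplification, not the course's quicksort.cpp: it uses a last-element Lomuto partition rather than median-of-three, and it recurses on the smaller part while looping on the larger.

```cpp
#include <algorithm>  // for std::swap
#include <cstddef>    // for std::size_t
#include <vector>

// Sort v[first .. last-1] with Quicksort, doing the actual recursive
// call on the smaller part and replacing the larger call with a loop.
// Sketch only; pivot selection is simplified to the last element.
void quicksortTRE(std::vector<int> & v, std::size_t first, std::size_t last)
{
    while (last - first > 1)
    {
        // Lomuto partition with the last element as the pivot
        int pivot = v[last-1];
        std::size_t mid = first;
        for (std::size_t i = first; i != last-1; ++i)
            if (v[i] < pivot)
                std::swap(v[i], v[mid++]);
        std::swap(v[mid], v[last-1]);
        // Now [first, mid) < pivot, v[mid] == pivot, [mid+1, last) >= pivot

        if (mid - first < last - (mid + 1))
        {
            quicksortTRE(v, first, mid);  // recurse on smaller (left) part
            first = mid + 1;              // loop handles the larger part
        }
        else
        {
            quicksortTRE(v, mid + 1, last);  // recurse on smaller (right) part
            last = mid;
        }
    }
}
```

Since the recursive call always gets the smaller part, each level of actual recursion at least halves the list size, giving the \(O(\log\,n)\) depth bound discussed above.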
The Achilles' heel of Quicksort, particularly in these days of web-based applications and malicious users, is its worst-case performance: \(O(n^2)\) comparisons.
An optimization that eliminates this problem was published in 1997 by David Musser. He called it “introspection”. The idea is that Quicksort is slow only when its recursion depth is large. (For purposes of determining depth, we count the tail calls eliminated above as recursive calls.) So we run Quicksort while keeping track of the recursion depth. If the depth exceeds some limit (Musser suggested \(2\,\log_2 n\)), then we sort the current sublist with a sorting algorithm having \(O(n\,\log\,n)\) worst-case time.
The usual secondary algorithm is Heap Sort, which we have not covered yet—although you should have seen it in CS 311. For now, all we need to know is that Heap Sort requires \(O(n\,\log\,n)\) time and constant additional space.
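For a quick look at Heap Sort in action, the C++ Standard Library's heap algorithms can be combined into one. This is just a demonstration using library calls, not an implementation of the algorithm, which we cover later.

```cpp
#include <algorithm>  // for std::make_heap, std::sort_heap
#include <vector>

// Heap Sort via the standard heap operations: O(n log n) worst-case
// time, constant additional space beyond the sequence itself.
void heapSort(std::vector<int> & v)
{
    std::make_heap(v.begin(), v.end());  // arrange v as a max-heap: O(n)
    std::sort_heap(v.begin(), v.end());  // repeatedly remove the max: O(n log n)
}
```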
Musser named the resulting algorithm Introsort (for “introspective sorting”). It requires \(O(n\,\log\,n)\) time in both the worst and average cases, and it uses \(O(\log\,n)\) additional space in both cases. It is not particularly good for nearly sorted data, although with intelligent pivot selection it is not particularly bad, either. It is practical only for random-access data, due to its pivot selection and its use of Heap Sort. And it is not stable.
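The introspection idea can be sketched as follows. This is a hypothetical simplification, not Musser's code or the course's quicksort.cpp: it uses a last-element Lomuto partition, always recurses on the left part with a loop on the right, and falls back to the standard library's heap algorithms when the depth limit is reached.

```cpp
#include <algorithm>  // for std::swap, std::make_heap, std::sort_heap
#include <cmath>      // for std::log2
#include <cstddef>    // for std::size_t
#include <vector>

// Sort v[first .. last-1] with Quicksort, but when depthLimit reaches
// zero, finish the current sublist with Heap Sort. The loop iterations
// count against the limit, since they stand in for eliminated tail calls.
void introsortHelper(std::vector<int> & v, std::size_t first,
                     std::size_t last, int depthLimit)
{
    while (last - first > 1)
    {
        if (depthLimit == 0)
        {
            // Too deep; fall back to Heap Sort on this sublist
            std::make_heap(v.begin()+first, v.begin()+last);
            std::sort_heap(v.begin()+first, v.begin()+last);
            return;
        }
        --depthLimit;

        // Lomuto partition with the last element as the pivot (simplified)
        int pivot = v[last-1];
        std::size_t mid = first;
        for (std::size_t i = first; i != last-1; ++i)
            if (v[i] < pivot)
                std::swap(v[i], v[mid++]);
        std::swap(v[mid], v[last-1]);

        introsortHelper(v, first, mid, depthLimit);  // recurse on left part
        first = mid + 1;                             // loop on right part
    }
}

void introsort(std::vector<int> & v)
{
    // Musser's suggested limit: roughly 2 log2 n
    int limit = 2 * int(std::log2(double(v.size() + 1)));
    introsortHelper(v, 0, v.size(), limit);
}
```

Even on an adversarial input that makes every partition lopsided, the depth limit is hit after \(O(\log\,n)\) levels, and Heap Sort finishes the job in \(O(n\,\log\,n)\) time.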
ggchappell@alaska.edu