| CS 411 Fall 2012 > Outline & Supplementary Notes for Wednesday, October 3, 2012 |
Example code: quicksort.cpp.
Two important optimizations for Quicksort are not mentioned in the Levitin text: tail-recursion elimination on the larger recursive call, and switching to Heap Sort if the recursion depth is too large.
Quicksort makes two recursive calls. Whichever call is done second can be written as a tail call, and therefore easily replaced with a loop. Each recursive call targets one part of the list; standard practice is to do the call on the smaller part first, then do tail-recursion elimination on the larger call.
When this optimization is done, the remaining actual recursive call will be given a list at most half the size of the original list. Thus, the worst-case recursion depth becomes \(O(\log\,n)\), rather than the former \(O(n)\). And since the only significant space used by Quicksort is that required for the recursive calls, this optimization makes the additional space usage \(O(\log\,n)\), rather than \(O(n)\)—a much nicer situation.
See quicksort.cpp for an implementation of Quicksort that includes this tail-recursion-elimination optimization (along with median-of-three pivot selection).
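The structure described above might be sketched as follows. This is a hypothetical simplification, not the course's quicksort.cpp: it uses a last-element Lomuto partition rather than median-of-three, and it recurses on the smaller part while looping on the larger.

```cpp
#include <algorithm>  // for std::swap
#include <cstddef>    // for std::size_t
#include <vector>

// Sort v[first .. last-1] with Quicksort, doing the actual recursive
// call on the smaller part and replacing the larger call with a loop.
// Sketch only; pivot selection is simplified to the last element.
void quicksortTRE(std::vector<int> & v, std::size_t first, std::size_t last)
{
    while (last - first > 1)
    {
        // Lomuto partition with the last element as the pivot
        int pivot = v[last-1];
        std::size_t mid = first;
        for (std::size_t i = first; i != last-1; ++i)
            if (v[i] < pivot)
                std::swap(v[i], v[mid++]);
        std::swap(v[mid], v[last-1]);
        // Now [first, mid) < pivot, v[mid] == pivot, [mid+1, last) >= pivot

        if (mid - first < last - (mid + 1))
        {
            quicksortTRE(v, first, mid);  // recurse on smaller (left) part
            first = mid + 1;              // loop handles the larger part
        }
        else
        {
            quicksortTRE(v, mid + 1, last);  // recurse on smaller (right) part
            last = mid;
        }
    }
}
```

Since the recursive call always gets the smaller part, each level of actual recursion at least halves the list size, giving the \(O(\log\,n)\) depth bound discussed above.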
The Achilles' heel of Quicksort, particularly in these days of web-based applications and malicious users, is its worst-case performance: \(O(n^2)\) comparisons.
An optimization that eliminates this problem was published in 1997 by David Musser. He called it “introspection”. The idea is that Quicksort is slow only when its recursion depth is large. (For purposes of determining depth, we count the tail calls eliminated above as recursive calls.) So we run Quicksort while keeping track of the recursion depth. If the depth exceeds some limit (Musser suggested \(2\,\log_2 n\)), then we sort the current sublist with a sorting algorithm having \(O(n\,\log\,n)\) worst-case time.
The usual secondary algorithm is Heap Sort, which we have not covered yet—although you should have seen it in CS 311. For now, all we need to know is that Heap Sort requires \(O(n\,\log\,n)\) time and constant additional space.
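For a quick look at Heap Sort in action, the C++ Standard Library's heap algorithms can be combined into one. This is just a demonstration using library calls, not an implementation of the algorithm, which we cover later.

```cpp
#include <algorithm>  // for std::make_heap, std::sort_heap
#include <vector>

// Heap Sort via the standard heap operations: O(n log n) worst-case
// time, constant additional space beyond the sequence itself.
void heapSort(std::vector<int> & v)
{
    std::make_heap(v.begin(), v.end());  // arrange v as a max-heap: O(n)
    std::sort_heap(v.begin(), v.end());  // repeatedly remove the max: O(n log n)
}
```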
Musser named the resulting algorithm Introsort (for “introspective sorting”). It requires \(O(n\,\log\,n)\) time in both the worst and average cases, and it uses \(O(\log\,n)\) additional space in both cases. It is not particularly good for nearly sorted data, although with intelligent pivot selection it is not particularly bad, either. It is practical only for random-access data, due to its pivot selection and its use of Heap Sort. And it is not stable.
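The introspection idea can be sketched as follows. This is a hypothetical simplification, not Musser's code or the course's quicksort.cpp: it uses a last-element Lomuto partition, always recurses on the left part with a loop on the right, and falls back to the standard library's heap algorithms when the depth limit is reached.

```cpp
#include <algorithm>  // for std::swap, std::make_heap, std::sort_heap
#include <cmath>      // for std::log2
#include <cstddef>    // for std::size_t
#include <vector>

// Sort v[first .. last-1] with Quicksort, but when depthLimit reaches
// zero, finish the current sublist with Heap Sort. The loop iterations
// count against the limit, since they stand in for eliminated tail calls.
void introsortHelper(std::vector<int> & v, std::size_t first,
                     std::size_t last, int depthLimit)
{
    while (last - first > 1)
    {
        if (depthLimit == 0)
        {
            // Too deep; fall back to Heap Sort on this sublist
            std::make_heap(v.begin()+first, v.begin()+last);
            std::sort_heap(v.begin()+first, v.begin()+last);
            return;
        }
        --depthLimit;

        // Lomuto partition with the last element as the pivot (simplified)
        int pivot = v[last-1];
        std::size_t mid = first;
        for (std::size_t i = first; i != last-1; ++i)
            if (v[i] < pivot)
                std::swap(v[i], v[mid++]);
        std::swap(v[mid], v[last-1]);

        introsortHelper(v, first, mid, depthLimit);  // recurse on left part
        first = mid + 1;                             // loop on right part
    }
}

void introsort(std::vector<int> & v)
{
    // Musser's suggested limit: roughly 2 log2 n
    int limit = 2 * int(std::log2(double(v.size() + 1)));
    introsortHelper(v, 0, v.size(), limit);
}
```

Even on an adversarial input that makes every partition lopsided, the depth limit is hit after \(O(\log\,n)\) levels, and Heap Sort finishes the job in \(O(n\,\log\,n)\) time.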
ggchappell@alaska.edu