CS 311 Fall 2020
Midterm Exam Review Problems, Part 3

This is the third of three sets of review problems for the Midterm Exam. For all three sets, see the class web page.

Problems

Review problems are given below. Answers are in the Answers section of this document. Do not turn these in.

  1. Algorithms faster than linear time are rare. Why?
     
  2. One might think that, as computers get faster, we would not have to worry about efficient algorithms any more. And yet, your instructor mentioned this quote from Lloyd N. Trefethen.

    The fundamental law of computer science: As machines become more powerful, the efficiency of algorithms grows more important, not less.

    Explain the thinking behind this quote.
     
  3.
    a. Insertion Sort is one of the slower sorting algorithms. Nonetheless, it is good for something. What is it good for?
    b. How does Insertion Sort work as part of other algorithms?

     
  4.
    a. What does std::move do (the one-parameter version in <utility>)?
    b. When do we use std::move?
    c. When we use std::move, we are making a promise. What are we promising?

     
  5.
    a. What is a comparison sort?
    b. Name a sorting algorithm that is not a comparison sort.

     
  6. Explain the notations \(\Omega\) and \(\Theta\) and their relationship to big-\(O\).
     
  7. In the context of algorithms, what is Divide and Conquer?
     
  8.
    a. Suppose an algorithm uses Divide and Conquer. It takes as input a list of size \(n\). It splits this list into 3 pieces, and performs 3 recursive calls, each taking one of the pieces. In addition, the algorithm performs \(\Theta(1)\) additional work. What is the order of this algorithm?
    b. Suppose the algorithm from part a instead does \(\Theta(n)\) additional work. What is the order of the algorithm?
    Note. I do not expect you to have the Master Theorem memorized. If a question like this is on an exam, then a statement of the Master Theorem will be provided.
     
  9.
    a. When using big-\(O\), we generally do not write “\(O(\log_2 n)\)” or “\(O(\log_{10} n)\)” or “\(O(\ln n)\)”. Rather, we just write “\(O(\log n)\)”, with no base specified for the logarithm. Explain.
    b. Under what circumstances do algorithms typically have a “\(\log\)” in their order?

     
  10. In each part, indicate the order of the given operation, using big-\(O\). Use \(n\) to denote the length of the list. If no algorithm is specified, assume that a good algorithm is used.
    a. Printing the first element of an array.
    b. Printing every element of an array.
    c. Sequential Search on a list.
    d. Binary Search on a sorted array.
    e. Getting the value of item \(k\) in an array.
    f. Sorting an array.
    g. Sorting a Linked List.
    h. Sorting an array with Quicksort.
    i. Sorting an array in which each item is at most 10 positions away from its final position after sorting.

     
  11.
    a. Categorize each of the following sorting algorithms as \(O(n\log n)\) or \(O(n^2)\): Bubble Sort, Insertion Sort, Introsort, Merge Sort, Quicksort.
    b. For one of the algorithms from part a, the category it goes in might be considered “surprising”. Which one, and why?

     
  12.
    a. What does it mean for a sorting algorithm to be stable?
    b. Characterize each of the following algorithms as stable or not (when written to maximize efficiency): Bubble Sort, Insertion Sort, Introsort, Merge Sort, Quicksort.
    c. Stability is obviously a nice property for a sorting algorithm to have. Why, then, do many libraries implement a sorting algorithm that is not stable?

     
  13.
    a. There is no general-purpose comparison sort that lies in any time efficiency category faster than ... what?
    b. Briefly explain how we know that the answer to the previous part is true.
    c. Insertion Sort can sort a nearly sorted list in linear time. Explain why this fact does not contradict the answers to the previous parts.

     
  14. Historically, Merge Sort has generally been found to be slower than Introsort. Why, then, has Merge Sort been preferred in some situations? Give two reasons.
     
  15. Quicksort with median-of-three pivot selection is a fine sorting algorithm for most data. Many people point out that the kind of data that makes this algorithm give poor performance is extremely unlikely. And your instructor agrees with these people. Why, then, does your instructor say that using Quicksort is often a bad idea?
     
  16. Suppose we analyze Quicksort using the Master Theorem. The input is split into two pieces [\(b = 2\)], two recursive calls are made [\(a = 2\)], and linear-time extra work is done: pivot selection and partition [\(f(n)\) is \(\Theta(n)\), so \(d = 1\)]. Since \(a = b^d\), we are in case 2, and so the algorithm is \(\Theta(n^d\log n)\), that is, log-linear time.

    But Quicksort is actually quadratic-time, not log-linear time. What is wrong with the above reasoning?
     

  17. What important distinctions between arrays and Linked Lists do we need to consider when analyzing the efficiency of algorithms (for example, sorting algorithms) that deal with them?
     
  18. Suppose we have the following declarations.
    class Foo {
    public:
        void nonConFunc();
        void conFunc() const;
    };
    
    Foo nonConObj;
    const Foo conObj;
    
    Foo * nonConPtr;
    const Foo * conPtr;
    In each part, indicate whether the given statement will compile. You may assume that all variables have been properly initialized.
    a. nonConObj.nonConFunc();
    b. nonConObj.conFunc();
    c. conObj.nonConFunc();
    d. conObj.conFunc();

    e. nonConPtr = nonConPtr;
    f. nonConPtr = conPtr;
    g. conPtr = nonConPtr;
    h. conPtr = conPtr;

    i. nonConPtr = &nonConObj;
    j. nonConPtr = &conObj;
    k. conPtr = &nonConObj;
    l. conPtr = &conObj;

     
  19.
    a. Briefly explain how the Introsort algorithm works.
    b. What is the order of Introsort?
    c. What is the major advantage of Introsort, over other sorting algorithms?
    d. List four important disadvantages of Introsort, compared with other sorting algorithms.

     
  20.
    a. What kinds of data does the Radix Sort algorithm work on?
    b. Briefly explain how Radix Sort works.

     

Answers

  1. Algorithms faster than linear time are rare, because an algorithm takes at least \(n\) steps to read all of its input. The only algorithms that can be faster than \(\Theta(n)\) are those that do not need to read all of their input.
  2. It is true that as computers get faster, problems that used to take a long time now take only a little time. However, we want to use our faster computers to solve bigger problems. An efficient, scalable algorithm is generally one that can take advantage of increased power to solve significantly bigger problems. And as problems get bigger, the advantages of efficient, scalable algorithms increase.
  3.
    a. Despite its slowness in general, Insertion Sort has very good performance on small lists. Insertion Sort is also fast [\(O(n)\)] for nearly sorted data: for example, data in which no item is more than some constant distance from where it needs to be, or data in which only a constant number of items are out of place.
    b. Well written sorting functions often switch to Insertion Sort when sorting small lists (with length less than 20, perhaps).

      Insertion Sort is also used as a finishing step in an optimized Quicksort. We write a recursive Quicksort so that it does nothing when given a small list. Calling such a function on a large list results in a nearly sorted list, which we can then finish sorting using Insertion Sort. This procedure is also used in Introsort, which is based on Quicksort.
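
      For concreteness, here is a minimal Insertion Sort sketch in C++; the function name and the use of std::vector<int> are just for illustration.

        #include <cstddef>  // For std::size_t
        #include <utility>  // For std::swap
        #include <vector>

        // insertionSort: sort v in ascending order.
        // Linear time on nearly sorted data, since the inner loop does
        // little work when each item is already close to its final spot.
        void insertionSort(std::vector<int> & v)
        {
            for (std::size_t i = 1; i < v.size(); ++i)
                // Swap item i leftward until it is in order
                for (std::size_t j = i; j != 0 && v[j] < v[j-1]; --j)
                    std::swap(v[j], v[j-1]);
        }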

  4.
    a. std::move casts its argument to an Rvalue.
    b. We use std::move when we need to copy a variable (or other Lvalue) whose current value will not be used again. Using std::move means that the copy operation can turn into a move operation, which may be much faster.
    c. When we use std::move, we are promising not to use the current value of its argument again.
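
    A small illustration of all three points; the variable names are made up.

        #include <string>
        #include <utility>  // For std::move
        #include <vector>

        void demo()
        {
            std::string s = "some long string";
            std::vector<std::string> v;
            v.push_back(std::move(s));
                // The copy becomes a move. We are promising not to use
                // the current value of s again; s now holds a valid but
                // unspecified value.
        }
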
  5.
    a. A comparison sort is a sorting algorithm that gets its information using a comparison function, which can compare any two items in the list to be sorted, and indicate which should come first.
    b. Two non-comparison sorts that we discussed are Pigeonhole Sort and Radix Sort. You may also have heard of Bucket Sort and Burst Sort.
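
    For example, the C++ Standard Library's std::sort is a comparison sort: all the information it gets about items comes through a comparison function (operator< by default, or one that we pass in).

        #include <algorithm>  // For std::sort
        #include <vector>

        void demo(std::vector<int> & v)
        {
            // Default comparison: operator<
            std::sort(v.begin(), v.end());

            // Custom comparison: sort in descending order
            std::sort(v.begin(), v.end(),
                      [](int a, int b) { return a > b; });
        }
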
  6. We say \(g(n)\) is \(O(f(n))\) if \(g(n)\) is at most some constant multiple of \(f(n)\) for all but small values of \(n\).

    For \(\Omega\), we replace “at most” with “at least”: we say \(g(n)\) is \(\Omega(f(n))\) if \(g(n)\) is at least some constant multiple of \(f(n)\), for all but small values of \(n\).

    Finally, \(g(n)\) is \(\Theta(f(n))\) if it is both \(O(f(n))\) and \(\Omega(f(n))\), that is, if it lies between two constant multiples of \(f(n)\), for all but small values of \(n\).
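
    Stated with quantifiers, these are the standard definitions:

    \[ g(n) \text{ is } O(f(n)) \iff \exists c > 0,\ \exists N,\ \forall n \ge N:\ g(n) \le c\,f(n) \]

    \[ g(n) \text{ is } \Omega(f(n)) \iff \exists c > 0,\ \exists N,\ \forall n \ge N:\ g(n) \ge c\,f(n) \]

    \[ g(n) \text{ is } \Theta(f(n)) \iff g(n) \text{ is both } O(f(n)) \text{ and } \Omega(f(n)) \]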

  7. Divide and Conquer is an algorithmic strategy. It means to split the input into parts and handle each part separately, often via recursion.
  8.
    a. We use the Master Theorem. The algorithm splits its input into 3 pieces. Thus, \(b = 3\). It makes 3 recursive calls. Thus, \(a = 3\). It does \(\Theta(1)\), that is, \(\Theta(n^0)\), extra work. Thus, \(d = 0\). And \(3 > 3^0\); that is, \(a > b^d\). Thus, we are in case 3 of the Master Theorem. The order of the algorithm is \(\Theta(n^k)\), where \(k = \log_b a = \log_3 3 = 1\); that is, the algorithm is \(\Theta(n)\).
    b. \(b\) and \(a\) are as before. Now we do \(\Theta(n^1)\) extra work. Thus, \(d = 1\). And \(3 = 3^1\); that is, \(a = b^d\). Thus, we are in case 2 of the Master Theorem. The order of the algorithm is \(\Theta(n^d\log n)\), that is, \(\Theta(n\log n)\).
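
    In recurrence form, writing \(T(n)\) for the running time on input of size \(n\), the two parts read:

    \[ T(n) = 3\,T(n/3) + \Theta(1) \quad\Longrightarrow\quad T(n) \text{ is } \Theta(n) \]

    \[ T(n) = 3\,T(n/3) + \Theta(n) \quad\Longrightarrow\quad T(n) \text{ is } \Theta(n\log n) \]
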
  9.
    a. When we use big-\(O\), we do not care about multiplication by a constant. For example, we do not write “\(O(20 n)\)”, but rather “\(O(n)\)”; we can ignore the “\(20\)”. And a logarithm to one base is always a constant multiple of a logarithm to any other base. For example, \(\log_{10} n = \log_{10} 2 \times \log_2 n\). Note that \(\log_{10} 2\) is a constant. Thus, when using big-\(O\), we can ignore the base of logarithms, and so we generally write logarithms without specifying a base.
    b. Algorithms with a “\(\log\)” in their order [\(O(\log n)\), \(O(n\log n)\), etc.] are often those that use Divide and Conquer (or the related strategy Decrease and Conquer).
  10.
    a. \(O(1)\).
    b. \(O(n)\).
    c. \(O(n)\).
    d. \(O(\log n)\).
    e. \(O(1)\).
    f. \(O(n\log n)\). Use Merge Sort, Heap Sort, or Introsort.
    g. \(O(n\log n)\). Use Merge Sort.
    h. \(O(n^2)\).
    i. \(O(n)\). Use Insertion Sort.
  11.
    a. \(O(n\log n)\): Introsort, Merge Sort. \(O(n^2)\): Bubble Sort, Insertion Sort, Quicksort.
    b. It might be surprising that Quicksort is \(O(n^2)\), the slower category, since for a long time many people considered it the fastest and best sort. However, although Quicksort has a terrible worst-case time, it has a very good average-case time. In the meantime, Introsort has surpassed it, and these issues are now mostly moot.
  12.
    a. A sorting algorithm is stable if it does not reverse the order of equivalent items. (Two values are equivalent if neither is less than the other.)
    b. Stable: Bubble Sort, Insertion Sort, Merge Sort. Not stable: Introsort, Quicksort. Note. Quicksort can be written in a stable form, but doing this greatly decreases its efficiency, making it less efficient than Merge Sort, on average. Thus, we generally do not consider Quicksort to be a stable sorting algorithm.
    c. Stability is nice, but speed is often nicer. The fastest sorting algorithms (e.g., Introsort) are not stable. Thus, some libraries include multiple sorting functions: a fast one and a stable one (see the sketch below).
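
    Here is what that looks like in the C++ Standard Library, which offers both std::sort (fast, not guaranteed stable) and std::stable_sort (guaranteed stable, typically slower).

        #include <algorithm>  // For std::sort, std::stable_sort
        #include <string>
        #include <utility>    // For std::pair
        #include <vector>

        void demo(std::vector<std::pair<int, std::string>> & v)
        {
            // Compare by the int only; pairs with equal ints are
            // "equivalent" in the sense of part a
            auto byFirst = [](const auto & a, const auto & b)
                           { return a.first < b.first; };

            std::sort(v.begin(), v.end(), byFirst);
                // Equivalent pairs may end up in any order

            std::stable_sort(v.begin(), v.end(), byFirst);
                // Equivalent pairs keep their original relative order
        }
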
  13.
    a. There is no general-purpose comparison sort that lies in any time efficiency category faster than \(\Omega(n\log n)\).
    b. There are \(n!\) possible orderings of \(n\) items. Since the result of a comparison cuts the number of possible orderings at best in half, determining which ordering is the correct one may require \(\log_2(n!)\) comparisons, which is \(\Theta(n\log n)\). (See the bounds worked out below.)
    c. We know that around \(n\log n\) comparisons are needed in the worst case. Nearly sorted data are not the worst case for Insertion Sort. To put it another way, sorting only nearly sorted data is not “general”, in the sense of “general-purpose comparison sort”.
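
    To see why \(\log_2(n!)\) is \(\Theta(n\log n)\): \(n!\) is a product of \(n\) factors, each at most \(n\), and its largest \(n/2\) factors are each at least \(n/2\). Thus,

    \[ \log_2(n!) \le \log_2(n^n) = n\log_2 n, \qquad \log_2(n!) \ge \log_2\!\left[(n/2)^{n/2}\right] = \frac{n}{2}\log_2\frac{n}{2}, \]

    and both bounds are constant multiples of \(n\log_2 n\), up to lower-order terms.
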
  14. Merge Sort has two properties that Introsort does not have:
    • Merge Sort is stable.
    • Merge Sort works well with data that are not random-access.
    Thus, if stability is needed, or if data to be sorted are in a Linked List or other sequential-access data structure, then Merge Sort is preferable to Introsort.

    Note. A third advantage of Merge Sort, which did not come up in the first half of the semester, is that its design can make it work much better than Introsort with data accessed over a slow connection, for example data stored in a disk file.

  15. Using Quicksort is often a bad idea for two reasons.
    • First, since the rise of the Web, malicious users have become increasingly common. Data that make certain algorithms exhibit poor performance may be unlikely to be generated at random, while being easy for a malicious user to generate.
    • Second, even randomly occurring poor performance is increasingly important. When a lone user is writing his own programs, he can probably handle occasional poor performance. But when the proper operation of software is critical to the operations of a company, government, etc., or perhaps a pacemaker, even occasional poor performance can be unacceptable.
    Lastly, we should note that the worst-case behavior of Quicksort is now easy to eliminate: use Introsort.
  16. The Master Theorem only applies when the input is broken into nearly equal-sized parts. Quicksort does not necessarily do this.
  17. There are two important distinctions between arrays and Linked Lists, when we deal with algorithmic efficiency.
    • Arrays are random-access. Given the index of an array item, we can find it quickly (constant time). In a Linked List, on the other hand, we must follow the chain of pointers (linear time).
    • Insertion and deletion of items in the middle of an array is slow (linear time). In a Linked List, however, once we have pointers to the appropriate nodes, we can insert and delete single items in constant time.
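    A quick illustration of the random-access distinction, using std::vector (array-like) and std::list (a doubly Linked List):

        #include <cstddef>  // For std::size_t
        #include <list>
        #include <vector>

        // Assumes k is a valid position in both containers
        int demo(const std::vector<int> & a, const std::list<int> & li,
                 std::size_t k)
        {
            int x = a[k];   // Array: constant time, regardless of k

            auto it = li.begin();
            for (std::size_t i = 0; i != k; ++i)
                ++it;       // Linked List: follow k pointers; linear time
            int y = *it;

            return x + y;
        }
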
  18.
    a. Compiles.
    b. Compiles.
    c. DOES NOT COMPILE.
    d. Compiles.
    e. Compiles.
    f. DOES NOT COMPILE.
    g. Compiles.
    h. Compiles.
    i. Compiles.
    j. DOES NOT COMPILE.
    k. Compiles.
    l. Compiles.
  19.
    a. Introsort is based on Quicksort with smart pivot selection (median-of-three, for example) and with tail recursion eliminated on the larger recursive call. In addition, Introsort keeps track of the recursion depth. If this exceeds some bound (around \(2\log_2 n\)), the algorithm switches to Heap Sort for the given recursive call. The final Insertion-Sort pass optimization may also be used. (A sketch appears below.)
    b. Introsort is log-linear time.
    c. When Introsort has an advantage, that advantage is speed. For some common circumstances, Introsort is considered to be the fastest known sorting algorithm.
    d. On the downside, Introsort is not stable, requires random-access data, uses significant extra space (logarithmic), and gets no performance benefit from nearly sorted data.
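
    A rough sketch of the scheme from part a, in C++. The cutoff values, helper names, and use of std::vector<int> are illustrative choices, not a tuned implementation.

        #include <algorithm>  // For std::make_heap, std::sort_heap
        #include <cmath>      // For std::log2
        #include <cstddef>    // For std::size_t
        #include <utility>    // For std::swap
        #include <vector>

        // introHelper: mostly sort [lo, hi) in v; pieces shorter than
        // SMALL are left for a final Insertion Sort pass.
        void introHelper(std::vector<int> & v, std::size_t lo,
                         std::size_t hi, int depth)
        {
            const std::size_t SMALL = 16;  // Cutoff; a tuning choice
            while (hi - lo > SMALL)
            {
                if (depth == 0)
                {   // Recursion too deep; switch to Heap Sort here
                    std::make_heap(v.begin()+lo, v.begin()+hi);
                    std::sort_heap(v.begin()+lo, v.begin()+hi);
                    return;
                }
                --depth;

                // Median-of-three pivot selection; median ends up at lo
                std::size_t mid = lo + (hi-lo)/2;
                if (v[mid] < v[lo])   std::swap(v[lo], v[mid]);
                if (v[hi-1] < v[lo])  std::swap(v[lo], v[hi-1]);
                if (v[hi-1] < v[mid]) std::swap(v[mid], v[hi-1]);
                std::swap(v[lo], v[mid]);

                // Partition about the pivot v[lo] (Lomuto scheme)
                std::size_t p = lo;
                for (std::size_t i = lo+1; i != hi; ++i)
                    if (v[i] < v[lo]) std::swap(v[++p], v[i]);
                std::swap(v[lo], v[p]);

                // Recurse on the smaller piece; loop on the larger one
                // (this is the tail-recursion elimination)
                if (p - lo < hi - (p+1))
                {
                    introHelper(v, lo, p, depth);
                    lo = p+1;
                }
                else
                {
                    introHelper(v, p+1, hi, depth);
                    hi = p;
                }
            }
        }

        // introSort: sort v in ascending order.
        void introSort(std::vector<int> & v)
        {
            if (v.size() > 1)
                introHelper(v, 0, v.size(),
                            2*int(std::log2(double(v.size()))));

            // Final pass: Insertion Sort the now nearly sorted data
            for (std::size_t i = 1; i < v.size(); ++i)
                for (std::size_t j = i; j != 0 && v[j] < v[j-1]; --j)
                    std::swap(v[j], v[j-1]);
        }
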
  20.
    a. Radix Sort sorts lists of strings (positive integers may be considered as strings of digits, if padded on the left with zeroes).
    b. Radix Sort proceeds in a number of passes. Each pass uses a Pigeonhole Sort to sort the list by one of the characters in each string. The first pass sorts by the least-significant character. The second pass sorts by the next-to-least-significant character, and so on, with the last pass sorting by the most-significant character.
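
    Here is a sketch of the idea in C++, for fixed-length strings of lowercase letters; the function name and the fixed-alphabet assumption are just for illustration.

        #include <array>
        #include <cstddef>  // For std::size_t
        #include <string>
        #include <vector>

        // radixSort: sort strings, each of length len, consisting of
        // lowercase letters 'a'..'z'. One Pigeonhole-Sort pass per
        // character position, least-significant position first. Each
        // pass is stable, so it does not undo the earlier passes.
        void radixSort(std::vector<std::string> & v, std::size_t len)
        {
            for (std::size_t pos = len; pos != 0; --pos)
            {
                // Pigeonhole pass on character pos-1: 26 pigeonholes
                std::array<std::vector<std::string>, 26> holes;
                for (const auto & s : v)
                    holes[s[pos-1] - 'a'].push_back(s);

                // Read the strings back out, in pigeonhole order
                v.clear();
                for (const auto & h : holes)
                    for (const auto & s : h)
                        v.push_back(s);
            }
        }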