MCS-375 Take-home Test 2 Sample Solutions (Fall 2011)

You will analyze AnotherScopedMap class. Unless I let any bugs slip in, it is functionally equivalent to the BadScopedMap and GoodScopedMap classes you worked with previously.

For this problem, you may use the following simplified cost model for Python, which is different from the one you used in the homework. Each use of a dot (.) costs one unit, regardless of whether it is used on the left of an assignment or the right, and regardless of whether it is used to select a method or an attribute. Using a dot on the left of an update assignment operator such as += still only costs one unit, even though an old value needs to be fetched and a new value stored. If the dot operator appears in the body of an if statement, it only incurs one unit of cost if it is actually executed. If the dot operator appears in a loop, it incurs one unit of cost each time it is executed.

Because there are five dots in the initialization method for ScopedMapNode, creating a ScopedMapNode costs five units. This means that the total cost of the put method is 11, because there are also six uses of dots directly in put. This cost of 11 is both the best case and the worst case because there is no variability.

In answering the following questions, you should assume that exitScope is never called at level 0. That is, each call to exitScope matches an earlier enterScope.
1. For each of the following methods, give the best-case and worst-case costs, which may be the same: isLocal, enterScope, retrieve, and getNestingLevel.
  
  Sample solution: isLocal: best = 2, worst = 4; enterScope: best = worst = 1; retrieve: best = 2, worst = 3; getNestingLevel: best = worst = 1.
2. Let's use the variable m to indicate the number of prior method invocations on a particular scoped map, not counting the __init__ method done when the scoped map was created. Give the best-case and worst-case costs, expressed as functions of m, for exitScope.
  
  Sample solution: best = 2; worst = 12(m-1)+2
3. Let's use the variable n to indicate the number of distinct ScopedMapNodes reachable starting from a particular scoped map. Give the best-case and worst-case costs, expressed as functions of n, for exitScope.
  
  Sample solution: best = 2 if n = 0, otherwise 5; worst = 12n+2
4. Let's use the variable k to indicate the number of distinct ScopedMapNodes reachable starting from a particular scoped map, counting only nodes that have the same level as the scoped map does. Give the best-case and worst-case costs, expressed as functions of k, for exitScope.
  
  Sample solution: best = 12k+2; worst = 12k+5
5. Professor de Javue has decided to use the potential-function method to analyze the amortized cost of these methods. She is sure that a suitable potential function will have one of the following forms: Cm, Cn, or Ck, where in each case, C is a positive constant that she will choose later in her analysis. The variables m, n, and k are as defined in the previous parts of this problem. Which form should she use for the potential function? Why?
  
  Sample solution: The professor should use a function of the form Cn. Given the limitation to three options, one way to see this is the correct one is by ruling out each of the other two. A function of the form Cm can't work because it always increases; there is never a drop in potential available to help pay for an expensive exitScope. A function of the form Ck likewise can't be counted on to drop when an exitScope is performed: the newly current level after the exitScope might have just as many nodes, or even more, than the exited level did. In that case the potential would stay the same or even go up.
6. Using the form of potential function you chose in the previous part, which methods change the potential? For each of these methods, give a formula for how much the potential changes. Each formula should involve C and may also involve one of the variables m, n, and k. Each formula should have a positive value if the potential increases and a negative value if the potential decreases.
  Sample solution: The method put creates one more accessible node, so it would change the potential by C. The method exitScope makes nodes k nodes inaccessible, so it causes a potential change of −Ck
7. What would be a suitable value for C so that the amortized cost of each method has a constant bound?
  
  Sample solution: In order to get a constant amortized cost bound for exitScope, we must have C = 12.
A two-person game is played using a sequence of positive integers. The two players take turns until the sequence is empty. When it is each player's turn, he chooses either to take the first integer from the sequence or to take the first two integers from the sequence. Each player's goal is to maximize the sum of the integers he takes over the full game. We will call the value of a game the difference between the first player's sum and the second player's sum, assuming that both players play optimally. For example, if the sequence is 4, 3, 2, 1, then the first player's optimal choice is to take two integers (4 and 3); the second player's optimal response is also to take two integers (2 and 1), which ends the game. The value of this game is (4+3)−(2+1), which equals 4.
1. Give an example sequence of positive integers in which the first player gets a worse sum by being greedy and taking two integers than he would by taking one integer. Assume that after this first move, both players play optimally.
  
  Sample solution: 5, 4, 1, 9
2. I have linked a recursive, top-down version of gameValue procedure. Write either a bottom-up dynamic programming version or a memoized top-down version. Your version should produce the same values as mine does for short lists of positive integers. However, your version should complete in a reasonable amount of time even for longer lists.
  
  Sample solution: see gameValue-soln.py
Modify the DisjointSets implementation we programmed so that there is one new method, size. If d is an instance of the DisjointSets class, then d.size(v) should return the size of the set to which the value v belongs. If v is not in any set, a KeyError should be raised. In addition to writing the new method, you may change the stored representation and any other methods. However, you should not change the asymptotic running time of any of the existing methods.

Sample solution: see DisjointSets.py
Suppose the SkipList class we programmed was changed so that it kept track of one more node in addition to the start and end nodes; this node is the last node. This would just be an additional pointer to one of the nodes on the list, the one that comes immediately before the end node. (Note that if the list is empty, the last node is the same as the start node.)

To be sure you understood the previous discussion, note that the following property should always hold for any SkipList s: s.last.successors[0]==s.end
1. Which method or methods can now be written more efficiently by making use of last?
  
  Sample solution: max
2. Which method or methods would now need to be do extra work in order to keep last up to date?
  
  Sample solution: __setitem__, __delitem__, and __clear__; depending on how "keep ... up to date" is understood, perhaps also __init__
3. For each method you listed in the first two parts of this problem, what would the new asymptotic running time of the method be? (Give big-Theta answers in terms of n, the number of key/value pairs in the SkipList.) Is each of your answers any different than the running time before the change? Explain your answers. Your explanations may include modified Python code, but they need not.
  
  Sample solution: The expected time for __setitem__ and __delitem__ methods would remain Θ(lg n), and clear would remain Θ(1). In the case of clear the new last node is always the start node. In the case of insertion with __setitem__ and deletion with __delitem__, the inserted or deleted node, its predecessor, and its successor are all already accessible in constant time within the method. Thus, testing whether the inserted or deleted node is the last one and updating the last pointer can both be done in constant time. The max method would be reduced from Θ(lg n) expected time to Θ(1) because it no longer needs to traverse the various levels of lists; it can go directly to the relevant node.