MC28 Lab 5: Red-Black Tree Dictionaries (Spring 1997)

Due: April 28, 1997

Prelab

In this lab you will be using the dictionary datatype defined in section 13.4 of the text, which builds on the red-black tree datatype from section 13.3. Therefore, you will need to be familiar with that material. It will also be important to review the material from chapter 8 on using ``onto'' list arguments to avoid append in doing tree traversals.

In lab

You will build a dictionary that allows a user to find the first and last names and login name for all users with a given last name, or with a given ``wildcarded'' last name. A wildcarded last name is one that for which only the first part is specified, with a * indicating that more can follow. For example, the wildcarded last name Hans* matches not only the name Hans, but also Hanson and Hansen. The names will always be specified as strings, i.e., between double quote marks.

The lab assignment is to become familiar with the provided code, make the few additions to it that are needed, and make timing comparisons between it and the (provided) alternative of simply filtering through a list. The details of what you are supposed to do are given in a comment at the end of the file containing the provided code. That file is in ~max/www-docs/courses/S97/MC28/labs/lab5/lab5.scm, a copy of which is attached. Here is another copy of what you are supposed to do:

  1. You should write the dictionary-insert! procedure and test it by inserting some entries (such as the first few from names) into names-by-last-name and then displaying the dictionary's red-black tree by doing
    (display-ranked-btree (red-black-tree names-by-last-name))
    
    Note that the dictionary-insert! procedure should not be large or complex: it just calls red-black-insert! with the appropriate arguments.

  2. Since you'll want to do tests with varying numbers of names in the red-black tree, you should write a procedure that takes an argument specifying the number of names, n, to be used and does a reset-dictionary! on names-by-last-name and then inserts the first n elements of names into names-by-last-name using the dictionary-insert! procedure you wrote, in conjunction with for-each (as described in section 13.4) and the first-elements-of procedure.

  3. Write binary-search-retrieve as described in 13.4, and then define dictionary-retrieve to call red-black-retrieve on the dictionary's red-black tree using the dictionary's key comparator and extractor. Test this using names-by-last-name. You should be able to, for example, find all the users with your last name, or find all users with last names of the form "Hans*", i.e., starting with Hans.

  4. Now do timing tests comparing dictionary-retrieve with list-retrieve as the number of entries being searched is increased. You will get the most distinct results if the search is one that matches very few names. However, you may also want to experimentally see what effect the number of matches has on the time for each of the two retrieval methods.

Postlab

As usual, write up a lab report that explains all of what you did to an audience generally knowledgeable about Scheme. You should present any data graphically (by, for instance, plotting running time as a function of number of entries being searched) to make it easy to interpret. Be sure not only to address the programming aspect of your work, but also the asymptotic performance comparison you did between retrieval from a list and a dictionary.


Course web site: http://www.gac.edu/~max/courses/S97/MC28/
Instructor: Max Hailperin <max@gac.edu>
Other lab instructor: Karl Knight <karl@gac.edu>