In this lab, you will complete a lexical analyzer for Jay, starting from a skeleton. You will also test your analyzer by running it on various input files and seeing the individual tokens printed out one by one, with their types and values.
Along the way, you will not only learn about the manual construction of a lexical analyzer, you will also improve your ability to read and write Java. For example, you will work with loops that test for exit conditions in the middle or at the end, rather than at the beginning, and you will need to use several classes from the Java API, which presumably will entail consulting the on-line documentation.
Start by downloading and reading the skeleton version of
TokenStream.java. This is
derived from Figure 2.5 on page 26 of the textbook
and from the version on the textbook's supporting web site, but
contains several design improvements and modernizations, which I will
discuss in class. (I later provided a further revised version; see
the class notes for 2006-02-13.)
Your goal is to replace each instance of
... with the
missing portion of Java code. However, you should not try doing all
the changes at once. Instead, plan to incrementally alternate between
programming and testing. What are the minimal changes that it would
take to have something testable? Make those changes first,
commenting out the other missing portions of code, and test. You
might, for example, have an initial version that classifies every input
character as a token of type
OTHER. This is of course
incorrect (in general), but suffices to show that you have some of the
basic machinery working.
To compile and test the lexical analyzer, you would give the
following two commands in a shell, assuming that you have your testing
input in a file called
javac TokenStream.java java TokenStream testInput1
The result of this test should be one line of output for each token, displaying the type and value of the token.
Once the basic machinery is working, add features one by one, testing each as you add it. Keep in mind that you may want to add features from simplest to most difficult, rather than in the order they appear in the file. That way you will have a chance to build up your skills before tackling the harder problems. For example, analyzing two-character operators, which appears quite early in the file, is one of the most challenging features, and hence might best be left for later. You could comment out that whole brace-enclosed block of code until you are ready for that challenge.
Your goal in testing is to show that each feature works correctly, but not to distract yourself with lots of useless repetitions of the same basic test. For example, the Jay program in Figure 2.19 on page 43 would make a poor test input. In contains several repetitions of essentially similar tokens, but has a very small variety of different situations. (For example, notice that all five integer literals in this program are a single digit in length.)
Note that the
isSeparator method in the provided file, taken from
the textbook's web site, has a few extra options beyond those listed
in the textbook's specification of Jay. Apparently this is so it can be used
also later in the book when additional features are added to Jay for
arrays and object-oriented programming. You are welcome to either
leave this as-is or correct it to match the published specification of Jay.
Please turn in a printout of your final version of
should also turn in evidence of your testing of this final version. Remember that your goal
in the tests is to show that each case works, but not to test the same