MC97 Lab 1: A First Compiler (Spring 1997)

Due: February 24, 1997


In this lab, you'll familiarize yourself with the tools we'll be using.

First you'll run a simple compiler I provide, called c0, and you'll re-build it from its source files.

Then you'll evolve c0 into a marginally more capable compiler, c1, which can compile larger programs than c0, and which has an input syntax compatible with C/C++. This is just a first step in a progression that will continue all semester long, ending with c6 in May.

Setting your CLASSPATH

The Java system we'll be using (on the SGIs and Linux PCs) uses an environment variable called CLASSPATH to tell it where to look for Java classes. If you don't set it, only the main system-wide class library is searched. In order to be able to access the packages I'm providing, you'll need to put ~max/JavaLib into your CLASSPATH. Furthermore, it will be most convenient if you make a directory for your own packages and put that in your CLASSPATH as well. Assuming you don't have anything else you want in the CLASSPATH other than these two, and that you call your directory JavaLib, you could set your CLASSPATH as follows:
setenv CLASSPATH ~max/JavaLib
You can do this in a shell window, but you'll find it convenient to put it in your .login file so it is done automatically every time.

Trying out c0

Assuming you've got your CLASSPATH set, you can run my c0 as follows. Put an arithmetic expression in a file (let's say foo). The expression can use numbers, parentheses, and the operators +, -, *, /, and %. There can be white space, but no comments. There shouldn't be anything but the expression (for example, no semicolon). Now compile foo into MIPS assembly language, with the result going into foo.s, by executing
java EDU.gac.max.MC97.S97.c0.Main <foo >foo.s
Now you can look at foo.s, and you can load it into xspim and run it. (Note that I don't recommend that you type the input expression directly in, but rather that you put it in a file. If you do type it directly in, you'll need to type ctrl-d repeatedly at the end.)

Rebuilding c0

Copy the c0 package into your own JavaLib (or whatever you've called it). The easiest way is with a shell command like
(cd ~max/JavaLib; tar cf - EDU/gac/max/MC97/S97/c0) \
 | (cd ~/JavaLib; tar xf -)

Now change directory to your EDU/gac/max/MC97/S97/c0 and rebuild the classes from the source files as follows:

  1. Run the java_cup parser generator on the parser.cup source file, producing the machine-generated source files and
    java java_cup.Main <parser.cup
  2. Run the JavaLex lexical analyzer generator on the scanner.lex source file, producing the machine-generated source file
    java JavaLex.Main scanner.lex
    Note: JavLex is unreasonably slow; don't be surprised when the above command takes a while.
  3. Finally, compile the three .java files resulting from the above two steps, as well as the handwritten file:
    javac *.java
Assuming you put your JavaLib directory ahead of mine in your CLASSPATH, as I suggested, you should now wind up running your own rebuilt c0 if you run c0 again.

Evolving c0 into c1

Now that you have taken all the tools out for a test drive, it is time for you to make a c1 package, copy the three source files into it, change the package lines at their tops to reflect the move, and then evolve the program

The language c1 accepts should allow programs of the following form:

   print_int(7+-2*2);  // a comment could be here
(Note that the gcc C compiler will also handle these same programs. For that reason, I suggest you put a .c extension on the filenames of your test input files, to make it easy to compare results. Of course, you'll need to link the C version with a file that provides print_int.)

The difference from c0 are as follows:

Evolving into c1 will require changes in both the parser (parser.cup) and the lexical analyzer (scanner.lex). In understanding what parser.cup does, you'll need some awareness of the interfaces of two support classes, Register and RegisterAllocator, but you needn't look at their implementations. (The implementations may well change, and even the interfaces are likely to change in the course of the semester -- hopefully in upwardly compatible ways.) If you want to look at the implementations in and, you can, of course.


Write a lab report showing your c1 parser and scanner, highlighting the differences from the code for c0. You should also describe the tests you ran, and any problems uncovered.

Course web site:
Instructor: Max Hailperin <>