MCS-388 Lab 1: Generating Code from ASTs (Spring 2014)

Due: March 5, 2014

Objective

In this lab, you will produce the core of a compiler by defining classes for an object-oriented Abstract Syntax Tree (AST) that can generate assembly language code. (In MCS-287 you worked with ASTs, but they provided methods for evaluating themselves, not for translating themselves into assembly language. That is, you developed an interpreter rather than a compiler.)

Working environment

My default assumption is that you will work within Eclipse on one of the MCS Department computers running OS X. If this is correct, you can stop reading this section.

If you would rather not work within Eclipse, you can unzip the c1.zip file rather than importing it into Eclipse. In a command-line terminal window, change directory into the c1 directory. To build and test the compiler, use the ant command. If you want to specify a different target for ant, you can do that on the command line, for example ant doc

If you would rather work on your own computer, I don't promise I will support you if you run into technical difficulties. In addition to ant, you'll need the clang compiler, which is packaged with the OS X developer tools and is also freely available on the web for other systems.

If you are using a system other than OS X, you will need to edit one of my souce files, Target.java, as described by a comment within that file.

If you are using Windows, it would be best to edit build.xml so that the definition (near the top of the file) for executable is set to testProgram.exe instead of testProgram. Without this, the normal build works fine but the if you use the "clean" target, the executable doesn't get deleted.

Starting point

Rather than starting from a completely blank slate, download c1.zip and import it into Eclipse as an existing project. You will get a project called c1 that contains a build.xml file and a src directory that contains various source files, mostly written in Java. The build.xml file provides rules that the ant tool can use to automate the building and testing of your compiler. I briefly describe this in the next section and will be explaining it more (with a demonstration) at the beginning of the first lab period.

The Java code in the src directory is documented using the javadoc tool, which produces HTML web pages of documentation from the Java code, taking into account specially formatted comments. I've got the documentation linked here as a convenience, though the next section describes how you can generate your own copy. (This will be important once you start modifying the compiler. My copy of the documentation won't include your changes, whereas yours will.) You should read through this documenation as well as the source code and then try it out as described in the next section. If any of the code is mysterious to you, ask me questions; we've got a nice small class. The TestAST.java file contains the main program for this very primitive compiler; the other files provide supporting classes. If you look at the main program, you will see that one way in which this compiler is very primitive is that it has no way to read in the program it is compiling. Instead, it has hardwired into it a single program to compile, created by invocations of the constructors for AST classes. In your testing, you will want to make changes to what program gets compiled by changing what constructors are called. Currently the program consists of the following: print out a line containing the constant 17. You'll see that the relevant classes for this are defined, but no others. For example, there are no AST classes for arithmetic expressions such as sums and products, just for constants and procedure calls.

Building and running the initial compiler

In Eclipse, select build.xml, pop up the context-sensitive menu, and from "Run As" select "Ant Build...". In the "Refresh" tab, check the box for "Refresh resources upon completion." You should be able to leave everything else alone; in particular, the "Targets" tab should be set to the default target, test_ast. Click the "Run" button. You should see output describing the whole multi-step process; near the end, the value 17 should be printed out by running the generated code.

When testing out a compiler, it is never enough just to run the generated code and see that it behaves correctly. You should also manually inspect the code and make sure it looks right; sometimes you will catch a bug in your compiler that didn't wind up causing a misbehavior in testing. In this case, after running ant, you should look at the LLVM assembly language output in testProgram.ll.

You can also use ant to perform other tasks by selecting a different target. In particular, the doc target will produce a doc directory containing javadoc files generated from the Java source code. To look at this documentation, open the file doc/index.html in a web browser.

Changes you need to make

You should add five new AST classes for sum, difference, product, quotient, and remainder expressions, called SumExpr, DifferenceExpr, ProductExpr, QuotientExpr, and RemainderExpr. The constructor for each of these should take two Expr arguments, the left and right operands. Each should be able to generate the appropriate assembly code; this process will involve recursively generating the assembly code for the two operands. Note that your generated code should be very plodding: it should contain the assembly code for one operand, the code for the other operand, and then an arithmetic operation. You might recognize that the arithmetic could all be done at compile time and a single constant emitted. That is an important optimization, known as constant folding. We'll get to it later. The straightforward code is a better model for what is needed when you add variables.

A typical LLVM instruction to add two values together would be as follows:

    %t3 = add i32 %t1, %t2

The instructions for the other arithmetic operations are similar, using the LLVM opcodes sub, mul, sdiv, and srem.

If you write these five new classes directly as subclasses of Expr, you will find they are very similar. To avoid replicated code, factor out the common essence into a new class that they all have as a superclass, and which itself is a subclass of Expr.

Test your additions out by making suitable changes in TestAST.java. You should try programs that contain nested operations, such as a sum of two products. You can even try out all five operations in a single test program.

What you should turn in

Please upload to moodle a zip file of your modified project. Your new classes should include javadoc comments, similar to those in the files I provided.


Course web site: https://gustavus.edu/+max/courses/S2014/MCS-388/
Instructor: Max Hailperin <max@gustavus.edu>