In this lab, you will add variables to your compiler's source programming language, with the ability to declare variables, assign values to them, and use them in expressions.
c2 directory from the previous lab as
c3, because your new
compiler will build on your work from the prior lab. If you had
difficulty with the previous lab, you should talk with me to make sure
you have a solid foundation to build on.
Initially you should add just simple declaration statements and assignment statements to your language. A declaration statement at this point should contain only a single variable and not have any initializer. An example would be
An assignment statement assigns the value of an expression to a variable. An example would be
fellerNumber = 10 + 7;
In order to add these features to the source language, you will
need to make additions to the lexical analyzer and parser, as well as
adding two new subclasses of
Stmt for declaration
statements and assignment statements.
generate method for a declaration statement should
return an empty string of assembly code, or possibly a string that
just consists of a comment, if you want to have something visible to
humans. (A comment in MIPS assembly language starts with
# and extends through the end of the line.)
generate method for a declaration
statement returns empty assembly code, that doesn't mean that this
method has no effect. The method should insert the declared name into
a symbol table along with a register allocated to hold that variable.
(For now, you can allocate a
$t register in the same way
as in other parts of the compiler.)
Because the symbol table is used to communicate between
declarations and uses, such as in assignment statements, it needs to
be accessible to all these AST nodes. The simplest reasonable approach to the symbol table would to use a
SymbolTable, which you define with just
static methods (for declaring a variable and for looking up its
register) and a static variable that holds the table itself. Using
the modern Java generics, a declaration for the table might show that
it maps strings (names of variables) to strings (names of registers):
private static Map<String,String> table = new HashMap<String,String>();
(This presumes you have
imported the names
You can temporarily ignore the possibilities that a program might declare the same variable twice or might use a variable (in an assignment statement) without a previous declaration.
You should test your compiler at this stage in its
evolution. Because you have not yet added the ability to use
variables in expressions, you will need to look at the assembly
language output (in
output.s) to see if assignment
statements are correctly moving their expressions' values into the
The remaining portions of this lab are independent of each other and so can be done in any order. You should test your work as you complete each section.
Having paused for testing at the first point where the language was
at all usable, you can now proceed to make the language actually
useful. Modify the grammar to allow a variable to serve as an
expression and create a corresponding new subclass of
Expr. You should now be able to test using programs that
declare variables, assign values to them, and then make use of the
variables in further computations.
Now you need to cope with the possibility that a program being compiled might contain repeated declarations for the same variable and might contain uses of undeclared variables. You should report an error message for each of these situations.
Error messages should be specific, which means they should include the name of the variable in question and also the position within the input at which the variable was redeclared or used undeclared. To get the position, you will need to pass additional information from the parser into the AST node constructor. In the parser, if a grammar production refers to a terminal symbol using something like
then in the accompanying Java action code, you can not only make
name to refer to
the attribute of the
IDENTIFIER token, you can also use
nameleft to refer to the character position at which that
Reporting up to five error messages per run of your compiler is
reasonable. After the fifth error message, the compilation should be
terminated without checking for any further errors. If any errors at
all are found (even less than five), the compilation should terminate
without producing any assembly output and with a non-zero exit code,
which will cause
ant to stop without running
spim. These features will be easiest to add if you
provide a central error-reporting method, perhaps a static method in
If a variable is reported as undeclared, the user will generally not be interested in seeing further error messages regarding that same variable. The easiest way to suppress further messages for the same variable is to pretend the variable is declared.
Rather than only allowing variables to be declared one at a time, and rather than forcing initialization to take place in separate assignment statements, it would be nice to support a syntax like
int x, fellerNumber = 10 + 7, answer = fellerNumber + 25;
This change can be made purely in your lexical analyzer and parser; you can generate the same AST as if the declaration statement had been broken apart into
int x; int fellerNumber; fellerNumber = 10 + 7; int answer; answer = fellerNumber + 25;
Instead of using the
$t registers for variables as
well as other purposes, you can use
$s registers for the
variables. At its core, this change is an easy matter of using a
different name allocator. However, you should also change the
generate method in the
Procedure class so
that it saves into the stack frame all the
that are used and at the end restores them all. The amount of stack
space allocated and deallocated should also be adjusted accordingly.
Please turn in a printout of the final version of any files you modified or added. You should also turn in evidence of your testing of the final version.