c3
will compile the exact same programs as
c2
. Instead, the goal will be to generate higher quality
assembly language output, by using a more sophisticated approach to
code generation. Rather than generating the assembly language code as
the parsing is being done, an Abstract Syntax Tree (AST) data
structure will be built during the parsing, and then the AST object
will be asked to generate the code. The code it generates should have
the following quality improvements:
2+3*4
to a variable should generate
assembly output that simply loads 14 into the register. (This is
known as constant folding.)
add
, without the i on the end, and the
MIPS assembler (or xspim) is smart enough to figure out that with the
right operand a number, instead of a register, you mean an immediate
instruction. What it actually generates at the machine language level
will vary.)
$a0
argument register. Instead, it should be computed directly
into the variable's register or the argument register. (This is known
as targeting.)
$s
registers
and $ra
, only those of these registers that are actually
stored into by the procedure should be saved and restored. Similarly,
the stack frame size should be reduced from 40 to whatever is actually
needed, and if no stack frame is needed at all (because none of these
registers are stored into), the subtraction from and addition to
$sp
should be completely eliminated.
~max/JavaLib/edu/gac/max/mcs388/s2001
I have provided
the following files as a starting point; you can start from your own
c2
files if you'd rather.
The basic idea embodied in these files is that the parser builds an AST, and then the Main simply asks that AST to generate its assembly language code. As you can see from reading the parser, the AST is represented using three abstract classes: Expr (for expressions), Stmt (for statements), and Proc (for procedures). Each of these has a number of concrete subclasses. For example, when an assignment statement is parsed, a new AssignStmt is created, where AssignStmt is a subclass of Stmt. Similarly, when a variable is used as an expression, a VarExpr is created, which is a kind of Expr. There is no reason why these subclasses need to be direct subclasses: if there is some commonality you would like to factor out (for example, among the various kinds of arithmetic expressions, or among those arithmetic expressions with commutative operators), you might want to introduce one or more intermediate levels of class hierarchy. Also be aware that the abstract classes (e.g., Expr) needn't be purely abstract: if there is some functionality you can implement at that level of hierarchy, go right ahead.
From the Main.java file you can see that the Proc class (or its
concrete subclass, MainProc), needs to support a gen method that
produces a String of assembly language code. Presumably this will be
generated by calling a similar gen method on the body statement.
This gen method of the Stmt class should not only return the assembly
language code, but also information about what $s
or
$ra
registers are set by that code, so that the MainProc
class can surround the statement code with suitable prolog and epilog
code that does the right saving and restoring. One reasonable way to
do this is to have the Stmt class gen method take a Hashtable
argument, into which it stores all the registers that need
saving/restoring for its sake. Then the prolog and epilog can be
produced by iterating through the contents of that table.
The gen method of the AssignStmt and PrintIntStmt classes will need to in turn call upon an Expr class gen method to generate code for the expression. In order to achieve targeting, this Expr class gen method should take a destination argument. However, because of nested subexpressions, not all expressions will have a particular target destination. One reasonable solution is to always pass in an instance of a RegisterDestination class, which is a container that can hold a Register (when targeting is desired) or null (when not). If the RegisterDestination object holds null, then the Expr is responsible for choosing a suitable destination register on its own, and then storing it into the RegisterDestination holder. That way the caller gets not only the assembly language code (as the String returned by the gen method) but also the destination register used by that code, as the new contents of the RegisterDestination.
For some expressions, when no specific target register is specified,
an existing register can be used as the destination, such as the
$zero
register for the constant 0, or a $s
register for variables. (In this case, the generated code string
returned would be the empty string.) For other expressions, a
temporary register will need to be chosen for the destination.
Subexpressions may also need temporary registers. In order to support
this, one reasonable design is to have the Expr class's gen method
take a second argument (in addition to the RegisterDestination),
namely the RegisterAllocator from which to allocate temporaries. Any
temporaries that it allocates it should also deallocate before
returning, except the one it stores into the
RegisterDestination.
In order to support constant folding and the use of immediates, it would probably help for the Expr class to also have methods for determining whether an Expr has a constant value and if so, what that value is. (Note that Exprs that aren't explicitly ConstExprs may well have constant values. For example, a SumExpr has a constant value if both of its operands do. The alternative of ensuring that a ConstExpr is produced whenever the value is known to be a constant is probably more trouble than it is worth.)
c3
modules. You should
also describe the tests you ran, and any problems uncovered. Be sure
to be thorough with your testing, because there is lots of potential
in this lab for special cases that need individual testing. You
shouldn't do excessively many tests or excessively complex ones, but
you should carefully choose the tests to cover all your code's cases.