MC97 Lab 4: Control Flow and Scoping (Spring 1997)

Due: April 21, 1997

Objective

This time you'll keep your compiler's general form intact as you evolve it from c3 to c4, but you'll provide for some important additions to the source language:

Of these, while and if statements introduce interesting control flow, for the first time allowing programs that do something other than continue in a straight-line path. Compound statements are important in conjunction with while and if statements, in order to allow a loop body (for example) to perform more than one action. Moreover, introducing compound statements gives us a natural context for variable scoping. The comparison operators are the only addition of no real substance: they are fundamentally similar to the arithmetic operators, and could equally well have been in the compiler the whole time. The only reason to add them now is that they become particularly handy to have now that we have loops and conditional statements.

Some notes on the new language features

We will follow the lead of C, rather than Java, in using integers to represent truth values, rather than having a separate boolean type. If the condition of a while or if evaluates to a non-zero value it counts as true, while a zero value counts as false. The comparison operations should all produce 1 for true and 0 for false.

The syntax of while and if statements should be as in C (and C++ and Java). For example, the controlling expression in each case must be parenthesized, and there is no keyword (like then or do) between the parenthesized expression and the following statement.

Each brace-enclosed compound statement constitutes a new scope, nested inside the surrounding scope. It is legal to redeclare a variable that was declared in some outer, surrounding scope, but not legal to redeclare a variable in the exact same scope. Each variable reference refers to a declaration that textually precedes the reference and is in a scope that has not yet ended at the point of the reference. If there is no such declaration, the variable reference is in error. If there is more than one such declaration, the one in the innermost scope is used. To determine whether a declaration textually precedes a reference, consider any declaration that has an initializer to take place at the end of the initializer. For example, in the program:

main(){
  int x;          // declaration 1
  x = 3;          // reference (a)
  while(x){       // reference (b)
    x = x - 1;    // references (c) and (d)
    int x = x + 5;// reference (e) and declaration 2
    print_int(x); // reference (f)
    }
  {
    print_int(x); // reference (g)
  }
  {
    int x = 2;    // declaration 3
  }
  print_int(x);   // reference (h)
  }
The x declared by declaration 1 is used by references (a), (b), (c), (d), (e), (g), and (h), while the x declared by declaration 2 is used only by reference (f), and the x declared by declaration 3 isn't used at all.

A constant-folding subtlety

It is important for your compiler not to die if the program contains divisions by zero, because they might be in a non-executed part of the program. Thus, you'll have to not constant-fold any divisions where the divisor is zero.

Worse yet, it turns out that SPIM (our MIPS simulator) botched this issue, and won't load in an assembly language program that has a division instruction with an immediate 0 operand. Thus you'll need to be careful not to generate such instructions. (You can use register $zero instead.)

Quality expectations for generated code

Your compiler should include the optimizations from the prior lab. Additionally, it would be desirable to turn relational operators around when appropriate to be able to take advantage of immediate operands. For example, if the source code contains 3>x, it would be better to turn this around to x<3 so that you can generate an instruction like slt $t0, $s0, 3.

For while loops, you should generate code that only executes a (taken) conditional branch each time around the loop, rather than both a (non-taken) conditional branch and an unconditional jump.

For if statements that have no else part, you shouldn't generate code to jump around the non-existent else part at the conclusion of the ``then'' part. It would be ideal to not jump around empty else parts as well.

Don't worry if there are unusual circumstances under which your code will jump (or branch) to a jump instruction, or jump (or branch) to the next instruction, so long as you've taken care to not generate routinely stupid code. (Routinely stupid means something like jumping around a non-existent else part.)

Postlab

Write a lab report showing your c4 modules, highlighting the differences from the code for c3. You should also describe the tests you ran, and any problems uncovered.

Addenda

Sample code from prior lab

Some students had some difficulty with the prior lab. So that you all have a level playing field for this lab, you are welcome to use the following sample solution code from the prior lab. You are also welcome to use your own code!

Sample documentation for scoped Identifier class

To help get you off on a good start with the scoping, I'm providing the interface documentation for my scoped version of the Identifier class. This corresponds to the code I showed on the overhead screen. Note that this Identifier class is nearly, but not quite, compatible with the lab3 Identifier class. In particular, the declare method now no longer allows you to register an arbitrary Object with the Identifier, but instead only ones that implement the Declaration interface. This ensures that there is the leavingScope callback. The LocalInt class should now be changed to implement the Declaration interface; its leavingScope method would deallocate the Register back into the RegisterAllocator.


Course web site: http://www.gac.edu/~max/courses/S97/MC97/
Instructor: Max Hailperin <max@gac.edu>