main
procedure
and the predefined print_int
procedure), and to include
procedure calls and returns.
Most of the features you introduced in your previous c4
compiler are independent from the new material on procedures. Thus,
if you had problems with the control flow statements or comparison
operations in c4
, you can build c5
from
c3
rather than c4
. (Of course, procedures
are less interesting without conditionals, since you can't do
non-infinite recursion without the ability to test for a base case.)
The only c4
feature you need is scoping. (Even that only
needs to work for two levels, rather than fully generally.)
void
, i.e.,
that don't return any value, together with new statement types for
calling such procedures and returning from them.
The details are explained below, but first let's look at an example
program:
void foo(int x); void bar(){ foo(3); } void main(){ bar(); } void print_int(int val); void foo(int count){ if(count == 0) return; foo(count - 1); print_int(count); }The assembly language code corresponding to this example program is given in the later section on generated code.
The overall structure of a program should now be a list of zero or
more items, each of which is either a procedure declaration or a
procedure definition. (Note that main
is no longer
special; its definition is just like any other procedure definition.
The lexical analyzer should stop recognizing main
, and
similarly print_int
, as a keyword, and instead treat it
as just a normal identifier.)
A procedure declaration specifies the name, return type, and parameters of a procedure, but doesn't provide any body code for it. Here is an example procedure declaration:
void foo(int one, int another);There are two reasons for procedure declarations. One is to provide type information for library procedures not defined in the source file. For example, you'll probably want to use the declaration
void print_int(int value);The other reason for a declaration is provide early information about a procedure that will be defined later in the source file. The reason why this is important is that we will impose a rule that no procedure can be called before it has been either declared or defined. Thus if we want to have two mutually-recursive procedures (only one of which can be defined before the other, but each of which calls the other), we will need to use a procedure declaration.
A procedure definition looks just like a procedure declaration, except that in place of the semicolon it has a brace-enclosed block of statements. It serves as a declaration, but additionally provides for the actual code. We will have a rule that each procedure may only be defined once, but it can be declared as many times as you want, so long as all the declarations are consistent with one another and with the definition (if any). We'll treat procedure definitions as declarations for the procedure before the body of the procedure; that way self-recursive procedures are legal without needing a separate declaration of the procedure first.
You should eliminate one kind of statement, the special
print_int
statement, and in its place introduce a general
procedure call statement. A procedure call is written as the name of
the procedure, followed by a parenthesized list of comma-separated
argument expressions, and then a semicolon. (Thus the previous
print_int
statements become legal procedure-call
statements.) A procedure call is only legal if the procedure has been
declared or defined and the number and type of arguments matches the
parameters in the declaration or definition. (Currently the only
possible type is int
.) Additionally, for procedure calls
used as statements, the declaration or definition must show a return
type of void
.
You should also add one other kind of statement, a return
statement, with the syntax (for now) being that it is just the keyword
return
followed by a semicolon. Execution of this
statement causes the procedure to immediately return.
For simplicity, we will restrict procedures to take at most four
arguments, so they can be passed in the $a
registers.
You may also simplify your compiler, at this first, minimal level of
functionality, by leaving out some of the error checking. (However,
this is just an expedient for getting something working quickly; I
do want you to eventually add in the error checking, and
some of it may be easier to design in from the beginning.)
int
as well as
void
. This will be accompanied by needing a way to call
and return from this new kind of procedure.
To call these procedures, you'll need a procedure call expression,
rather than just a procedure call statement. However, for this level
of functionality (and difficulty), you should restrict the language
syntax in such a way that procedure call expressions can only
occur as top-level expressions, not nested inside any other expression
or inside a procedure call statement. That way you can be sure that
none of the caller-saves registers will be live at the time of the
procedure call, which simplifies matters. Procedure calls appearing
as expressions look just like those appearing as statements; the error
checking is also the same except that now the procedure must be one
that returns an int
.
To return a value from these procedures, you'll add a second form of
return statement, in which an expression appears between the
return
keyword and the semicolon. Ideally you should
check that this form of return
statement is only used in
procedures that are declared as returning int
, and that
the expressionless form is used only in procedures that are declared
as returning void
. Don't worry about checking for
procedures that are declared as returning int
but might
fall off the end without executing a return
statement;
that requires more sophisticated technology to check for, which we'll
discuss in class.
Below is a second example program, using the features described above. Again, the assembly language translation is given in the section on generated code.
int id(int x){ return x; } void print_int(int x); int main(){ if(id(0)) return 1; int three = id(3); print_int(three); return 0; }
bar: subu $sp, $sp, 4 sw $ra, 0($sp) li $a0, 3 jal foo bar_epilog: lw $ra, 0($sp) addu $sp, $sp, 4 jr $ra main: subu $sp, $sp, 4 sw $ra, 0($sp) jal bar main_epilog: lw $ra, 0($sp) addu $sp, $sp, 4 jr $ra foo: subu $sp, $sp, 8 sw $ra, 0($sp) sw $s0, 4($sp) move $s0, $a0 seq $t0, $s0, 0 beqz $t0, if_0 j foo_epilog if_0: sub $a0, $s0, 1 jal foo move $a0, $s0 jal print_int foo_epilog: lw $ra, 0($sp) lw $s0, 4($sp) addu $sp, $sp, 8 jr $ra
print_int
should still be emitted at the beginning.)
For a procedure definition, you should emit the procedure's name as a
label, then the procedure's prolog, body, and labelled epilog. The
general structure is like that of the main
procedure in
prior labs. Upon entry to a procedure, the procedure should move its
arguments from the $a
registers they arrive in to
$s
registers, and thereafter treat them as normal local
variables living in the $s
registers. The epilog should
have a label at the beginning for return
statements to
jump to. (Remember that each procedure will need to use a different
label for this, so that multiple procedures can co-exist.) Another
novelty is that the placing of 0 into the $v0
register
should be omitted from the epilog. Instead, if someone wants to
return 0, they will need to use an explicit return 0;
statement, which places the 0 into $v0
and then jumps to
the epilog.
$v0
and then jump to the epilog
label. There are two ways the return statement could find its
procedure's epilog label:
$a0
, $a1
,
etc.), and then a jal
instruction with the called
procedure's name. Remember that the jal
instruction
implicitly writes into the $ra
register. Thus you will
need to add this register to the set of registers that needs saving.
(The saving/restoring will get done in the calling procedure's prolog
and epilog.)
$v0
to the specified target register, if any. (If no
target register was specified, the value can be provided in
$v0
. This might be relevant if the procedure call were
serving as the test expression for a while
loop or
if
statement.) For example, here is the translation of
the second example program given earlier (again, without print_int):
id: subu $sp, $sp, 4 sw $s0, 0($sp) move $s0, $a0 move $v0, $s0 j id_epilog id_epilog: lw $s0, 0($sp) addu $sp, $sp, 4 jr $ra main: subu $sp, $sp, 8 sw $ra, 0($sp) sw $s0, 4($sp) li $a0, 0 jal id beqz $v0, if_0 li $v0, 1 j main_epilog if_0: li $a0, 3 jal id move $s0, $v0 move $a0, $s0 jal print_int li $v0, 0 j main_epilog main_epilog: lw $ra, 0($sp) lw $s0, 4($sp) addu $sp, $sp, 8 jr $ra
sw
instructions to save to the
stack frame any $t
or $a
registers that are
currently in use. (The latter could arise if this procedure call were
nested inside another one, e.g., f(x, g(y))
.) After the
procedure call, you will need to generate corresponding
lw
instructions. Additionally, somewhere you will need
to allocate space in the stack frame. The simplest approach is to
subtract from the stack pointer before the stores and add back to it
after the loads. (Another approach is pre-allocate space in the
procedure prolog, so that the only adjusting of the stack pointer is
in the prolog and epilog.)
c5
modules, highlighting
the differences from the code for c4
(or
c3
). You should also describe the tests you ran, and any
problems uncovered.