Tucker and Noonan show Abstract Syntax Tree classes that are just structures that can hold data, without any methods operating on that data. In order to perform operations such as checking validity, determining types, or (later in the book) evaluating expressions, they use separate procedures that operate on the data. This is a very old-fashioned style of programming that was already well supported by languages such as Pascal and C. We'll look at a simpler example of this style and consider some of its shortcomings. Then we'll look at alternative rewrites of this example into more modern styles, which are well supported by languages like Java. In subsequent days we'll continue this examination and also briefly consider some yet-more-modern styles that other languages beyond Java support.
We can call Tucker and Noonan's style a "structures + procedures" style, or just "procedural" for short. Using this procedural style, we can define a simple type of expressions with three subtypes: sums (of two subexpressions), products (of two subexpressions), and integer constants; see procedural/Expr.java. Two separate collections of procedures can operate on this same collection of data structures: one to evaluate expressions (procedural/Evaluator.java) and one to convert them into Scheme notation (procedural/Converter.java). A test program, procedural/Test.java, shows how an AST could be built up and then both converted and evaluated.
In the procedural style, the dispatching methods such as evaluate
need to explicitly test which kind of Expr is being operated on. In
Java, this can be done using instanceof, as shown in the example
code. In earlier languages, such as C and Pascal, it can be
accomplished by tagging each structure with an explicit type tag.
Because it is possible for the dispatching procedure to distinguish
among the various structures, with the main Expr type being the union
of all of them, this kind of structure is called a discriminated union.
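As a rough illustration of this dispatching, here is a minimal sketch of the procedural style. It is not the code from procedural/Expr.java or procedural/Evaluator.java: the field names (left, right, value), the choice of exception, and the ProceduralDemo class are assumptions made for the example.

```java
// Plain data structures: public fields only, no operations of their own.
abstract class Expr { }

class Sum extends Expr { Expr left, right; }
class Product extends Expr { Expr left, right; }
class Constant extends Expr { int value; }

// A separate collection of procedures that operates on the structures.
class Evaluator {
    static int evaluate(Expr e) {
        // The dispatcher must test explicitly which kind of Expr it was given.
        if (e instanceof Sum) {
            Sum s = (Sum) e;
            return evaluate(s.left) + evaluate(s.right);
        } else if (e instanceof Product) {
            Product p = (Product) e;
            return evaluate(p.left) * evaluate(p.right);
        } else if (e instanceof Constant) {
            return ((Constant) e).value;
        } else {
            // Nothing prevents a new kind of Expr from reaching this point at run time.
            throw new IllegalArgumentException("Unknown kind of Expr");
        }
    }
}

// Hypothetical driver, named only for this sketch.
class ProceduralDemo {
    // Builds the AST for 2 + 3 * 4 by creating "empty" structures and filling them in.
    static Expr example() {
        Constant two = new Constant();
        two.value = 2;
        Constant three = new Constant();
        three.value = 3;
        Constant four = new Constant();
        four.value = 4;
        Product product = new Product();
        product.left = three;
        product.right = four;
        Sum sum = new Sum();
        sum.left = two;
        sum.right = product;
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(Evaluator.evaluate(example())); // prints 14
    }
}
```

A converter to Scheme notation would have the same shape: another class with one static procedure containing the same chain of instanceof tests.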
One alternative approach would be an object-oriented style,
embodying the so-called "composite pattern". In this style, the AST
classes (shown in composite/Expr.java) directly embody evaluate and
convert methods. The test program, composite/Test.java, invokes those
methods.
I've taken the opportunity to clean up how the AST is constructed: instead of creating the structures "empty" and then assigning values to the instance variables, the structures are constructed in a meaningful state using constructor procedures. (The instance variables can also now be private.) This change could have been made on its own, without fundamentally deviating from the procedural style. (If the instance variables were private, accessor methods would need to be provided for use by the external evaluation and conversion procedures.)
The more fundamental change is the switch
from external procedures acting on the structures to methods within
the objects. Note in particular that the chains of ifs with
instanceof tests are gone, and that Java's static
type checking now ensures that there is a way to evaluate and convert
each kind of expression: the code to generate runtime error messages
saying "Unknown kind of Expr" is gone.
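The composite style described above might look roughly like the following sketch. It is not a copy of composite/Expr.java: the use of an abstract class rather than an interface, the field names, the exact Scheme output format, and the CompositeDemo class are all assumptions made for illustration.

```java
// Every kind of expression must supply both operations; the compiler enforces this.
abstract class Expr {
    abstract int evaluate();
    abstract String convert();
}

class Sum extends Expr {
    private final Expr left, right;

    Sum(Expr left, Expr right) {
        this.left = left;
        this.right = right;
    }

    int evaluate() { return left.evaluate() + right.evaluate(); }

    String convert() { return "(+ " + left.convert() + " " + right.convert() + ")"; }
}

class Product extends Expr {
    private final Expr left, right;

    Product(Expr left, Expr right) {
        this.left = left;
        this.right = right;
    }

    int evaluate() { return left.evaluate() * right.evaluate(); }

    String convert() { return "(* " + left.convert() + " " + right.convert() + ")"; }
}

class Constant extends Expr {
    private final int value;

    Constant(int value) { this.value = value; }

    int evaluate() { return value; }

    String convert() { return Integer.toString(value); }
}

// Hypothetical driver, named only for this sketch.
class CompositeDemo {
    public static void main(String[] args) {
        // Objects are constructed in a meaningful state; no fields are assigned afterwards.
        Expr e = new Sum(new Constant(2), new Product(new Constant(3), new Constant(4)));
        System.out.println(e.evaluate()); // prints 14
        System.out.println(e.convert());  // prints (+ 2 (* 3 4))
    }
}
```

If a new subclass of Expr omitted either method, the program would fail to compile, which is why no "Unknown kind of Expr" branch is needed.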
This design is a very suitable one when the number of operations
(such as evaluation and conversion) remains small and fixed, whereas
the number of kinds of data (kinds of expressions) is large and
subject to growth. Unfortunately, that doesn't characterize
programming language processing very well. The abstract syntax of a
programming language is generally quite stable, whereas new analysis,
optimization, and translation procedures can be invented more
readily. So as not to have to keep adding new methods to all the
classes (analogous to the evaluate and convert methods), it would be
nice if we could group all the evaluation methods together in one
separate Evaluator class, as in the original procedural approach.
Similarly, we would group all the conversion methods together in a
separate Converter class, and likewise for any other operations we
wanted to add. Yet we still want to retain the advantages of the
object-oriented approach. This leads to the so-called "visitor"
pattern, which we can examine next.
We start by generalizing from the notion of an evaluator or a
converter to a visitor, defined with the interface visitor/Visitor.java. Notice that
this is a generic interface, where the type parameterization is used to
allow different visitors to return different types of results as they
visit each AST node in turn. (An
evaluator returns integers whereas a converter returns strings.) The
two specific visitors can be defined as classes implementing
that interface, namely visitor/Evaluator.java and visitor/Converter.java. The AST
classes in visitor/Expr.java are freed
from any knowledge of these specific visitors (or any others that
might be added); instead, they just have a general method to accept
any visitor. This method is used (among other places) in the main
visitor/Test.java
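To make the arrangement concrete, here is one way the generic visitor interface, the accept method, and the two visitors could fit together. This is a sketch under assumptions, not the contents of the visitor/ files: the method names visitSum, visitProduct and visitConstant, the accessor names, and the Scheme output format are invented for the example.

```java
// Generic interface: T is the type of result a particular visitor produces.
interface Visitor<T> {
    T visitSum(Sum s);
    T visitProduct(Product p);
    T visitConstant(Constant c);
}

// The AST classes know only about visitors in general, not about any specific one.
abstract class Expr {
    abstract <T> T accept(Visitor<T> v);
}

class Sum extends Expr {
    private final Expr left, right;
    Sum(Expr left, Expr right) { this.left = left; this.right = right; }
    Expr getLeft() { return left; }
    Expr getRight() { return right; }
    <T> T accept(Visitor<T> v) { return v.visitSum(this); }
}

class Product extends Expr {
    private final Expr left, right;
    Product(Expr left, Expr right) { this.left = left; this.right = right; }
    Expr getLeft() { return left; }
    Expr getRight() { return right; }
    <T> T accept(Visitor<T> v) { return v.visitProduct(this); }
}

class Constant extends Expr {
    private final int value;
    Constant(int value) { this.value = value; }
    int getValue() { return value; }
    <T> T accept(Visitor<T> v) { return v.visitConstant(this); }
}

// All the evaluation code is grouped in one class and returns integers.
class Evaluator implements Visitor<Integer> {
    public Integer visitSum(Sum s) {
        return s.getLeft().accept(this) + s.getRight().accept(this);
    }
    public Integer visitProduct(Product p) {
        return p.getLeft().accept(this) * p.getRight().accept(this);
    }
    public Integer visitConstant(Constant c) {
        return c.getValue();
    }
}

// All the conversion code is grouped in another class and returns strings.
class Converter implements Visitor<String> {
    public String visitSum(Sum s) {
        return "(+ " + s.getLeft().accept(this) + " " + s.getRight().accept(this) + ")";
    }
    public String visitProduct(Product p) {
        return "(* " + p.getLeft().accept(this) + " " + p.getRight().accept(this) + ")";
    }
    public String visitConstant(Constant c) {
        return Integer.toString(c.getValue());
    }
}

// Hypothetical driver, named only for this sketch.
class VisitorDemo {
    public static void main(String[] args) {
        Expr e = new Sum(new Constant(2), new Product(new Constant(3), new Constant(4)));
        System.out.println(e.accept(new Evaluator())); // prints 14
        System.out.println(e.accept(new Converter())); // prints (+ 2 (* 3 4))
    }
}
```

In this sketch, adding a new operation means writing one new class that implements Visitor, with no change to the AST classes. Adding a new kind of expression, by contrast, means adding a method to Visitor and to every class that implements it.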