One possibility is that a variable always names a single cell, which sometimes contains a reference to an array object (sequence of additional cells), rather than a number. See the following figure.
This seems simple and uniform, because variables always name the same thing: a cell. In this indirect model, we can change what array object is stored in a variable's cell, just as we can change what number is stored in a variable's cell. We can also change what is stored in the array object's constituent cells, unlike numbers, which we can't change. Thus, although uniformity brings one form of simplicity, we pay with complexity of a different kind: there are two kinds of array change.
Indirect model: x is a variable containing 5 and a is a variable containing a reference to an array object containing 3 and 1. Note that when a cell (i.e., a variable or array element) contains a number, I've shown an arrow referring to the number for consistency, while EOPL puts the number in the cell for conciseness. This is an irrelevant distinction, because it doesn't matter whether two cells refer to the same number 5 or each have their own 5: one 5 is the same as any other.
Changing which array a variable refers to and changing the contents of the array are conceptually quite different, but sometimes they can seem confusingly similar. Suppose we observe that the variable a refers to a two-element array containing 3 and 1. A little later, we observe that a refers to a two-element array containing 2 and 7. What happened? There are two possibilities. One is that a still refers to the same array object as before, but the array's cells have had 2 and 7 stored into them. The other possibility is that a now refers to a different array object, which might have contained 2 and 7 the whole time.
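The two possibilities can be told apart if we kept a second reference to the original array object. The following is a minimal sketch using ordinary Scheme vectors as array objects; the names original, b, and b-original are made up for this illustration.

```scheme
;; Possibility 1: a keeps referring to the same array object,
;; and 2 and 7 are stored into that object's cells.
(define a (vector 3 1))
(define original a)              ; second reference to the same object
(vector-set! a 0 2)
(vector-set! a 1 7)
(define same-after-mutation (eq? a original))     ; still the same object

;; Possibility 2: b comes to refer to a different array object,
;; and the old object is left untouched.
(define b (vector 3 1))
(define b-original b)
(set! b (vector 2 7))
(define same-after-rebinding (eq? b b-original))  ; a different object now
```

In both cases the variable ends up showing 2 and 7, but only in the first case does the other reference (original) also show 2 and 7; b-original still shows 3 and 1.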
Notice also that with the indirect model we can quite easily have anonymous arrays. (Consider the Scheme expression (vector-length (make-vector 5)), which evaluates to 5 without ever naming the vector.) Although anonymous arrays may fit naturally into this model, some people find them confusing, providing a secondary reason to consider an alternative model.
In order to avoid having two changeable layers (with the need to keep straight which is changed), we can instead use a direct model, where a variable is always the name for some collection of cells, but that collection can either be of size 1 (a "scalar variable," capable of holding a single number) or of any positive size (an "array variable," capable of holding a sequence of numbers). See the following figure.
Note that it is tempting in this model to consider scalar variables and one-element array variables to be the same thing, which wouldn't make sense in the indirect model. Some languages succumb to this temptation and others don't.
Direct model: x is a scalar variable containing 5 and a is an array variable containing 3 and 1. Note that a is a name for the entire array, even though it happens to be positioned over the first cell of the array. In the notes for Section 6.2 we will need to name individual cells within arrays, and will introduce a separate notation for doing so.
Also, notice that this direct model, using array variables rather than array objects, does not naturally accommodate anonymous arrays: an array comes into existence as part of a variable declaration. This explains why EOPL's language contains the letarray (and definearray) construct, rather than something analogous to Scheme's make-vector. For consistency, they use this approach even with the indirect model, where letarray a[2] in body can be thought of as essentially an abbreviation for let a = makearray(2) in body. With the direct model, letarray is more fundamental, since the variable a is the array, rather than just containing a reference to the array.
Notice that an assignment like x := y always means to copy what is in the variable called y into the variable called x. In the indirect model, the variables called y and x are always single cells, and so the assignment always means copying the content of a single cell into another single cell. The cell content that is copied may be a reference to an array object, in which case x comes to refer to the same array object as y, as shown below:
Indirect array assignment
With the direct model, x and y may be array variables rather than scalar variables; that is, they may be sequences of several cells. The assignment x := y still means to copy what is in the variable y into the variable x, but this now means copying the contents of each of the group of cells named y into the corresponding cell in the group named x, as shown below:
Direct array assignment
The difference isn't what assignment means so much as what a variable is; in both cases we copied y's contents into x.
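To make the contrast concrete, here is a small sketch of both kinds of assignment. Scheme itself follows the indirect model, so the direct model is only simulated here by copying cell by cell and never rebinding the variable; copy-cells! is a hypothetical helper written for this illustration, and p and q stand in for the direct-model x and y.

```scheme
;; Indirect model: x := y copies a single cell's content,
;; which here is a reference to an array object.
(define y (vector 3 1))
(define x (vector 0 0))
(set! x y)                   ; x and y now refer to the same array object
(vector-set! x 0 9)          ; this change is visible through y as well

;; Direct model (simulated): p := q copies each cell of q
;; into the corresponding cell of p.
(define copy-cells!          ; hypothetical helper, not from EOPL
  (lambda (dest source)
    (let loop ((i 0))
      (if (< i (vector-length source))
          (begin
            (vector-set! dest i (vector-ref source i))
            (loop (+ i 1)))
          'done))))

(define q (vector 3 1))
(define p (vector 0 0))
(copy-cells! p q)            ; p now holds its own copies of 3 and 1
(vector-set! p 0 9)          ; this change does not affect q
```

After this runs, y contains 9 and 1 because x and y share one array object, while q still contains 3 and 1 because p has cells of its own.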
Both array models show up in real programming languages. Scheme and Java both use the indirect model. Pascal uses the direct model, but has an explicit "pointer" data type that can be used to simulate the indirect model, at the expense of extra notational machinery. C and C++ also use the direct model and have explicit pointers that can be used to simulate the indirect model. However, they also provide automatic conversions between arrays and pointers, which allow much of the extra notational machinery to be avoided, so that much of the time it looks as though you were using an indirect array model. Also, C and C++ do not allow assignment of direct arrays, only of the pointers, though other kinds of aggregate variables can be assigned, which results in copying of each element.
With this one simple statement, and the preceding material on the two array models and what assignment means, you should be able to figure out what call-by-value means for each of the two array models. In particular, Figure 6.16 (page 185) can be deduced by considering the following analogue of Figure 6.15:

    letarray u[3]; v[2]
    in begin
         u[0] := 5; u[1] := 6; u[2] := 4;
         v[0] := 3; v[1] := 8;
         letarray x[3]
         in begin
              x := u; x[1] := 7;
              x := v; x[1] := 9
            end
       end

Note that the assignment x := v is a bit odd (even illegal in some languages) because the two arrays involved have different sizes. (EOPL's version of the direct model calls for copying v's elements into the corresponding part of x, leaving the rest of x alone.) The essential aspects of this example would be unchanged, however, if we extended v to a three-element array so as to avoid this issue. And in any case, the above code, which involves no procedure calling at all, is completely equivalent to what happens when Figure 6.15 is evaluated using call-by-value. This is true with either the indirect model (variables containing references to array objects) or the direct model (array variables).
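As a check on your own deduction, the program above can be traced in Scheme under each model. This is only a sketch: vectors stand in for arrays, the direct model is simulated by copying cells, the initial contents of x are assumed to be 0, and copy-prefix! is a hypothetical helper that follows the rule of copying the source's elements into the corresponding part of the destination.

```scheme
;; Hypothetical helper: copy each cell of source into the
;; corresponding cell of dest, leaving the rest of dest alone.
(define copy-prefix!
  (lambda (dest source)
    (let loop ((i 0))
      (if (< i (vector-length source))
          (begin
            (vector-set! dest i (vector-ref source i))
            (loop (+ i 1)))
          'done))))

;; Indirect model: x := u makes x refer to u's array object.
(define indirect-result
  (let ((u (vector 5 6 4))
        (v (vector 3 8))
        (x (make-vector 3 0)))   ; initial contents assumed to be 0
    (set! x u)
    (vector-set! x 1 7)          ; changes the object shared with u
    (set! x v)
    (vector-set! x 1 9)          ; changes the object shared with v
    (list u v x)))

;; Direct model: x := u copies u's cells into x's own cells.
(define direct-result
  (let ((u (vector 5 6 4))
        (v (vector 3 8))
        (x (make-vector 3 0)))
    (copy-prefix! x u)
    (vector-set! x 1 7)          ; changes only x
    (copy-prefix! x v)           ; overwrites x[0] and x[1], keeps x[2]
    (vector-set! x 1 9)
    (list u v x)))
```

Under the indirect model the list ends up holding u = (5 7 4), v = (3 9), and x referring to the same object as v. Under the direct model u = (5 6 4) and v = (3 8) are unchanged, and x = (3 9 4).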
Call-by-value is used in many real programming languages, including Scheme and Java. Pascal and Ada allow the programmer to specify call-by-value, as well as other options. (The parameter passing method can be individually chosen for each parameter.) C uses call-by-value, and so does C++ by default, but the same notes apply here as in the previous subsection. Since direct assignment of arrays is not permitted in C and C++, arrays can't actually be passed as arguments. However, there are automatic conversions to pointers, so without doing anything extra, you can get an effect much like indirect-model call-by-value, even though what is actually happening is the passing (by value) of an explicit pointer to a direct-model array. Also, non-array aggregate variables can directly be passed by value, copying each element.
from-to-do procedure from Concrete Abstractions. I've got a definition of it linked here.
I prefer to divide the Array ADT into two layers: a core layer, providing the basic functionality in terms of some underlying representation, and a layer of "extra" procedures, written in terms of the core layer and providing only additional convenience.
Recall that an array is a collection of cells. I consider the Array ADT's core layer to consist of the following four procedures:
(make-array length)
(array? object)
(array-length array)
(array-cell array i)
I have provided two different implementations of this core layer, making different design choices:
vector-set! is never used in this representation: the vector is serving just as an immutable "glue" to hold the cells together, with the only potential for mutation being within each cell.
vector-set!, and hence essentially act like cells. Thus, we can avoid having individual cells separate from the vector, by switching to a new representation of the Cell ADT, where a cell is a position within a vector.
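To give a feel for the first design choice (a vector used only as glue around separate cells), here is a rough sketch of what such a core layer might look like. It is not the implementation linked above: the tags, the tagged-vector Cell ADT (make-cell, cell-ref, cell-set!), and the initial cell content of 0 are all assumptions made for this illustration.

```scheme
;; Hypothetical Cell ADT: a cell is a tagged one-slot container.
;; All mutation happens here, inside an individual cell.
(define make-cell (lambda (value) (vector '*cell* value)))
(define cell-ref (lambda (cell) (vector-ref cell 1)))
(define cell-set! (lambda (cell value) (vector-set! cell 1 value)))

;; Core Array ADT: an array is a tag plus a vector of cells.
;; The vector of cells is built once and never mutated afterwards.
(define make-array
  (lambda (len)
    (let loop ((i 0) (cells '()))
      (if (< i len)
          (loop (+ i 1) (cons (make-cell 0) cells))  ; fresh cell each time
          (vector '*array* (list->vector cells))))))

(define array?
  (lambda (object)
    (and (vector? object)
         (= (vector-length object) 2)
         (eq? (vector-ref object 0) '*array*))))

(define array-length
  (lambda (array)
    (vector-length (vector-ref array 1))))

(define array-cell
  (lambda (array i)
    (vector-ref (vector-ref array 1) i)))

;; Usage: store 3 and 1 by going through the cells.
(define demo (make-array 2))
(cell-set! (array-cell demo 0) 3)
(cell-set! (array-cell demo 1) 1)
```

Because array-cell hands back the cell itself, anything that can be done to a scalar variable's cell can be done to an array element in exactly the same way.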
Using the four core procedures, we can write the "extras":

    ;; Fetch the value stored in the cell at the given index.
    (define array-ref
      (lambda (array index)
        (cell-ref (array-cell array index))))

    ;; Store a value into the cell at the given index.
    (define array-set!
      (lambda (array index value)
        (cell-set! (array-cell array index) value)))

    ;; Copy each element of source-array into the corresponding
    ;; element of dest-array; the source must not be longer.
    (define array-whole-set!
      (lambda (dest-array source-array)
        (let ((dest-len (array-length dest-array))
              (source-len (array-length source-array)))
          (if (> source-len dest-len)
              (error "Array too long for assignment:" source-array)
              (from-to-do 0 (- source-len 1)
                (lambda (i)
                  (array-set! dest-array i (array-ref source-array i))))))))

    ;; Make a fresh array with the same length and contents.
    (define array-copy
      (lambda (array)
        (let ((new-array (make-array (array-length array))))
          (array-whole-set! new-array array)
          new-array)))

These provide the same functionality as the like-named procedures in EOPL, but do so in a cleaner way, since they are defined in terms of the four abstract interface procedures of the core Array ADT, rather than in terms of a particular representation. Writing directly in terms of the representation is an optimization, one that is possible with my two representations as well. However, it seems out of character to emphasize optimization of low-level details over clarity in EOPL.
All the above code (core and extras) replaces Figure 6.1.1 on page 181. Figures 6.1.2 and 6.1.3 can remain unchanged. Figure 6.1.4 on page 184 should be altered by fixing one bug in denoted-value-assign!: it shouldn't be possible to store an array into a scalar variable:

    (define denoted-value-assign!
      (lambda (den-val exp-val)
        (cond
          ((and (not (array? den-val)) (not (array? exp-val)))
           (cell-set! den-val exp-val))
          ((and (array? den-val) (array? exp-val))
           (array-whole-set! den-val exp-val))
          (else
           (error "Incompatible assignment:" den-val exp-val)))))

One final change is at the bottom of page 184. This is a version of array-set! that prevents an array from being stored into one of the elements of an array:

    (define array-set!
      (lambda (array index value)
        (if (array? value)
            (error "Cannot assign array to array element:" value)
            (cell-set! (array-cell array index) value))))

Instructor: Max Hailperin