Side effects

Vectors as mutable structures

A vector is a mutable data structure: It is possible to reach into a vector and replace one of its elements with a different value, just as one can take out the contents of a container and put in something else instead. It's still the same vector after the replacement, just as the container retains its identity no matter how often its contents are changed.

The particular values that a vector contains at some particular moment constitute its state. One could summarize the preceding paragraph by saying that the state of a vector can change and that state changes do not affect the underlying identity of the vector.

Unlike any of the operations that we have seen up to this point, replacing an element inside a vector is a destructive operation: The value that is replaced is gone for good, unless we have kept an extra copy outside of the vector, so the usual practice is to carry out the replacement only when the information contained in the displaced element is invalid, unwanted, or duplicated elsewhere. Procedures that have the potential to replace constituents in data structures are called mutators. In Scheme, it is conventional to give them names ending in exclamation points, as a visible reminder of the possibility of losing data irretrievably when one applies them. In effect, the exclamation point means ``Proceed with caution!''

For instance, the procedure that replaces one element of a vector is called vector-set!. It takes three arguments -- a vector vec, a natural number k (which must be less than the length of vec), and a Scheme value obj -- and replaces the element of vec that is currently in the position indicated by k with obj.

The Revised^5 Report on the Algorithmic Language Scheme does not specify what value vector-set! should return, and different implementations of Scheme decide the matter differently. DrScheme returns a special ``void'' value, which it usually does not even bother to print in response to a command. (It is possible to trick DrScheme into printing it by storing it in a data structure; in that case, it appears as #<void>.) The idea is that no Scheme programmer should rely on any particular return value; you should write all of your programs in such a way that it makes no difference what value vector-set! returns.

This is another general convention for mutators: One invokes them for their side effects on the states of the data structures they modify, not because they compute and return useful values.

To demonstrate the operation of vector-set!, then, we have to create a vector, apply the procedure to it, and then inspect its contents again, noticing what has changed:

> (define sample-vector (vector 'alpha 'beta 'gamma 'delta 'epsilon))
> (vector-set! sample-vector 2 'zeta)
> sample-vector  ; same vector, now with changed contents
#(alpha beta zeta delta epsilon)
> (vector-set! sample-vector 0 "foo")
> sample-vector  ; changed contents again
#("foo" beta zeta delta epsilon)
> (vector-set! sample-vector 2 -38.72)
> sample-vector  ; and again
#("foo" beta -38.72 delta epsilon)
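
The distinction between identity and state can be made visible by giving one vector a second name. The following is only a minimal sketch: the names original, alias, and look-alike are hypothetical, and it relies on eq? to test whether two values are the very same object and on equal? to compare contents.

```scheme
;; Two names bound to one and the same vector (hypothetical names).
(define original (vector 'alpha 'beta 'gamma))
(define alias original)

;; A separate vector that merely starts out with the same contents.
(define look-alike (vector 'alpha 'beta 'gamma))

;; Change the state of the vector through one of its names.
(vector-set! original 1 'zeta)

(eq? original alias)          ; #t: still the same vector
(vector-ref alias 1)          ; zeta: the change is visible through both names
(equal? original look-alike)  ; #f: the look-alike kept its old contents
```

The mutation changes what the vector contains, not which vector the two names refer to.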

In theory, vectors introduced into a Scheme program as literal constants, by means of the mesh-and-parentheses notation, are ``immutable'' -- applying vector-set! (or any other mutator) to such a vector is an error, and the contents of such vectors cannot be modified. However, many implementations of Scheme, including DrScheme, do not enforce this rule. Here is what an attempt to change an immutable vector looks like under an implementation that does enforce it (Scheme48):

> (define new-vector '#(sipos igin andras ormis))
; no values returned
> (vector-set! new-vector 2 'dva)

Error: exception
       (vector-set! '#(sipos igin andras ormis) 2 'dva)
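
One way to stay clear of this error, sketched here under the assumption that a fresh copy is acceptable, is to build a newly allocated vector with the same contents and mutate that instead. The helper name mutable-copy and the name working-vector are hypothetical; the sketch relies only on vector->list and list->vector.

```scheme
;; mutable-copy (hypothetical helper): return a newly allocated vector
;; containing the same elements as VEC.
(define mutable-copy
  (lambda (vec)
    (list->vector (vector->list vec))))

(define new-vector '#(sipos igin andras ormis))
(define working-vector (mutable-copy new-vector))

;; Mutating the copy is safe; the literal constant is never touched.
(vector-set! working-vector 2 'dva)

working-vector  ; #(sipos igin dva ormis)
new-vector      ; #(sipos igin andras ormis)
```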

Scheme provides a second vector mutator, vector-fill!. It takes two arguments, a vector and a replacement value, and destructively changes the state of the vector, replacing each of the elements it formerly contained with the replacement value.

> (define sample-vector (vector 'rho 'sigma 'tau 'upsilon))
> (vector-fill! sample-vector 'kappa)
> sample-vector  ; same vector, now with changed contents
#(kappa kappa kappa kappa)

As a mutator, the vector-fill! procedure is invoked only for its side effect, and the value that it returns is unspecified (that is, it might be anything).

The imperative model

When we introduce mutators into our programs, it becomes more difficult to reason about them. Up to this point, we could confidently expect that, under any particular collection of bindings, repeated evaluation of the same expression would always yield the same value. For instance, if clock-vector is bound to the vector #(12 3 6 9 12 3 6 9), then the value of the expression (vector-ref clock-vector 4) is 12 no matter where that expression occurs. Now that vectors have mutable states, we have to step more carefully:

> (define clock-vector (vector 12 3 6 9 12 3 6 9))
> (vector-ref clock-vector 4)
12
> (vector-fill! clock-vector 0)
> (vector-ref clock-vector 4)
0

The programming that we've done previously uses a model of computation that is sometimes called the pure functional model: Provided that we understand all the kinds of values that there are and keep track of which identifiers are bound to which values, we know everything we need to know about the evaluation of expressions. We are now making a transition to a slightly more complicated model of programming, one in which some of our values -- vectors -- have states that we also need to keep track of. In this imperative model, the order in which expressions are evaluated becomes extremely important; as we just saw, the value of a call to vector-ref may depend on whether it is made before or after a call to a mutator.
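
To make the point about ordering concrete, here is a small sketch in which the same two mutations are applied, in opposite orders, to two vectors that start out with identical contents. The names first-vector and second-vector are hypothetical.

```scheme
;; Two distinct vectors with the same initial contents (hypothetical names).
(define first-vector (vector 1 2 3))
(define second-vector (vector 1 2 3))

;; Order A: fill the whole vector, then set position 0.
(vector-fill! first-vector 0)
(vector-set! first-vector 0 'x)

;; Order B: set position 0, then fill the whole vector.
(vector-set! second-vector 0 'x)
(vector-fill! second-vector 0)

first-vector   ; #(x 0 0)
second-vector  ; #(0 0 0)
```

In the second ordering the fill overwrites the earlier replacement, so the two vectors end up in different states even though exactly the same operations were carried out.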

Begin-expressions and one-armed if-expressions

The Scheme structures that we have so far encountered give us only a little control over the order in which expressions are evaluated. In particular, the arguments in a Scheme procedure call can be evaluated in any order whatever, and possibly even in different orders in different calls. The same is true of the binding expressions in a let-expression.

Although if-, cond-, and-, or-, and let*-expressions do impose some conditions on the order of evaluation of their subexpressions, in different ways, we haven't yet seen the simplest possible control structure -- one that just says, in effect, ``Do this, then do that, then do this other thing, ...''

A begin-expression causes each of its subexpressions to be evaluated once, and only once, in sequence. The value of the last subexpression is the value of the entire begin-expression; the values of the other subexpressions are discarded. Consequently, begin-expressions are useful only in connection with procedures that have side effects. (There is no point whatever in calling a procedure that has no side effects and then throwing away what it returns.)
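
As a small illustration, offered as a sketch rather than a pattern to copy, the begin-expression here performs two mutations in sequence and then yields the value of its last subexpression. The name begin-value is hypothetical.

```scheme
(define sample-vector (vector 'rho 'sigma 'tau))

(define begin-value
  (begin
    (vector-set! sample-vector 0 'kappa)  ; evaluated first; value discarded
    (vector-set! sample-vector 2 'omega)  ; evaluated second; value discarded
    (vector-ref sample-vector 0)))        ; evaluated last; supplies the value

begin-value    ; kappa
sample-vector  ; #(kappa sigma omega)
```

Because the subexpressions are evaluated in order, the final vector-ref is guaranteed to see the effect of the first vector-set!.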

As a first example of the use of the begin-expression, let's write our own version of the vector-fill! procedure. This might even be useful, since a few older implementations of Scheme do not support this procedure as a primitive.

;;; our-vector-fill!: destructively replace each of the elements of a given
;;; vector with a given value

;; Givens:
;;   VEC, a vector.
;;   OBJ, a value.

;; Results:
;;   None.

;; Preconditions:
;;   None.

;; Postcondition:
;;   Every element of VEC is OBJ.

(define our-vector-fill!
  (lambda (vec obj)
    (let ((size (vector-length vec)))
      (let kernel ((position 0))
        (if (< position size)
            (begin
              (vector-set! vec position obj)
              (kernel (+ position 1))))))))

In other words: Let size be the number of elements in the vector vec. Starting at position 0, put obj into each position within vec, overwriting the value previously stored in that position. Increase position by 1 after each such overwriting step. When position becomes equal to size, stop.

Notice the structure of the if-expression. When the test is true, there are two things that we want to do: First, replace the value in the current position of vec, and second, go on to deal with the remaining positions. By combining these two operations into a single begin-expression, we ensure that both of them will be performed when the test is true.

What happens if the test is false? In this case, we have nothing at all to do -- all the side effects that we set out to do are done, and there's no particular value that we want our-vector-fill! to return. As the alternate in the if-expression, we could try writing a dummy begin-expression with no subexpressions (strictly speaking, the report's syntax for begin calls for at least one subexpression, so not every implementation will accept this), thus:

(if (< position size)
    (begin
      (vector-set! vec position obj)
      (kernel (+ position 1)))
    (begin))

However, for this case, Scheme provides a slightly more elegant solution: The alternate in an if-expression can simply be omitted. If the value of the test in an if-expression with no alternate turns out to be #f, the value of the if-expression is unspecified (that is, it might be anything). Since our-vector-fill! is invoked only for its side effect, we don't care what value it returns in the base case, where position is equal to size, so we're content to return the unspecified value of the one-armed if-expression.
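
The same pattern, a one-armed if-expression wrapped around a side effect, can be sketched in a second mutator. The procedure name clear-negatives! and the name readings are hypothetical, and the sketch assumes that every element of the vector is a real number.

```scheme
;;; clear-negatives! (hypothetical): destructively replace each negative
;;; element of VEC with 0. Assumes every element of VEC is a real number.
(define clear-negatives!
  (lambda (vec)
    (let ((size (vector-length vec)))
      (let kernel ((position 0))
        ;; Outer one-armed if: nothing to do once position reaches size.
        (if (< position size)
            (begin
              ;; Inner one-armed if: mutate only when the element is negative.
              (if (negative? (vector-ref vec position))
                  (vector-set! vec position 0))
              (kernel (+ position 1))))))))

(define readings (vector 3 -1 4 -5))
(clear-negatives! readings)
readings  ; #(3 0 4 0)
```

Both if-expressions omit the alternate because, in each case, there is simply nothing to do when the test fails.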

Generating vectors more efficiently

Some procedures can be written to use the computer's resources more efficiently once mutation is permitted. For instance, the version of the vector-generator procedure in the reading on vectors builds up a list containing the values that we eventually want to put into the vector, and then applies list->vector to convert the list to a vector. It would be faster to put the elements directly into the vector, bypassing the intermediate data structure. Here's how to do just that:

;;; vector-generator: given a unary procedure that takes a natural number
;;; as its argument, return a procedure that constructs vectors of any
;;; specified length by applying the given procedure to position numbers

;; Given:
;;   PROC, a unary procedure.

;; Result:
;;   VECTOR-MAKER, a unary procedure.

;; Precondition:
;;   PROC can be applied to any natural number and returns one value when
;;   so applied.

;; Postconditions:
;;   (1) VECTOR-MAKER, when applied to any natural number SIZE, returns a
;;       vector VEC containing SIZE elements.
;;   (2) For every natural number POS less than SIZE, the element at
;;       position POS in VEC is the result of applying PROC to POS.

(define vector-generator
  (lambda (proc)
    (lambda (size)
      (let ((vec (make-vector size)))
        (let kernel ((position 0))
          (if (= position size)
              vec
              (begin
                (vector-set! vec position (proc position))
                (kernel (+ position 1)))))))))

In English: When the procedure that vector-generator constructs is invoked, it first creates a vector of the specified size, then traverses it, pausing at each position to apply proc to the position number and to store the result at the specified position in the vector. When each position has been filled, it returns the completed vector as its value.

Although it uses mutation internally, vector-generator is not itself a mutator, since the vector that it is modifying is one that it created, not one that it received as an argument.

You may have noticed that in this procedure definition the make-vector procedure is invoked with only one argument. The second argument to make-vector is optional; if you omit it, the value that initially occupies each of the positions in the vector is left unspecified. Various implementations of Scheme fill them up in different ways, so you should omit the second argument of make-vector only when you intend to replace the contents of the vector right away, as we do in this case.
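
The same technique applies to other procedures that build a new vector. The following sketch uses a hypothetical procedure, vector-reversed, which creates its result with a one-argument call to make-vector and overwrites every position immediately. Like vector-generator, it uses vector-set! internally but is not a mutator, since the vector it receives as an argument is left untouched.

```scheme
;;; vector-reversed (hypothetical): return a new vector containing the
;;; elements of VEC in reverse order. VEC itself is not modified.
(define vector-reversed
  (lambda (vec)
    (let* ((size (vector-length vec))
           (result (make-vector size)))  ; contents unspecified for now
      (let kernel ((position 0))
        (if (= position size)
            result
            (begin
              ;; Every position of RESULT is overwritten before it is returned.
              (vector-set! result
                           position
                           (vector-ref vec (- size position 1)))
              (kernel (+ position 1))))))))

(define letters (vector 'a 'b 'c))
(vector-reversed letters)  ; #(c b a)
letters                    ; #(a b c)
```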

I am indebted to Professor Ben Gum for his contributions to the development of this reading.