Recursion with lists

As we've already seen, it is commonplace for the body of a procedure to include calls to another procedure, or even to several others. Direct recursion is the special case of this construction in which the body of a procedure includes one or more calls to the very same procedure -- calls that deal with simpler or smaller arguments.

For instance, let's define a procedure called sum that takes one argument, a list of numbers, and returns the result of adding all of the elements of the list together:

> (sum (list 91 85 96 82 89))
443
> (sum (list -17 17 12 -4))
8
> (sum (list 19/3))
19/3
> (sum null)
0

Because the list to which we apply sum may have any number of elements, we can't just pick out the numbers using list-ref and add them up -- there's no way to know in general whether an element even exists at the position specified by the second argument to list-ref. One thing we do know about lists, however, is that every list is either (a) empty, or (b) composed of a first element and a list of the rest of the elements, which we can obtain with the car and cdr procedures.
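
Here is a minimal sketch of that two-case view, using only the standard procedures null?, car, and cdr. The name grades is made up for illustration, and '() is the standard way of writing the empty list that this lab calls null.

```scheme
;; GRADES is a hypothetical example list, not part of the lab.
(define grades (list 91 85 96))

(null? grades)             ; #f -- the list is not empty
(car grades)               ; 91 -- the first element
(cdr grades)               ; (85 96) -- the list of the rest of the elements
(cdr (cdr (cdr grades)))   ; () -- after three cdrs, nothing is left
(null? '())                ; #t -- the empty list
```

Every list, however long, eventually reaches the empty list if we keep taking its cdr.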

Moreover, we can use the predicate null? to distinguish between the (a) and (b) cases, and conditional evaluation to make sure that only the expression for the appropriate case is chosen. So the structure of our definition is going to look something like this:

(define sum
  (lambda (ls)
    (if (null? ls)
        --- Compute the sum of the empty list here. ---
        --- Compute the sum of a non-empty list here. --- )))

And we know that in computing the sum of a non-empty list, we can use (car ls), which is the first element, and (cdr ls), which is the rest of the list.

The sum of the empty list is easy -- since there's nothing to add, the total is 0. So the problem is to find the sum of a non-empty list, given the first element and the rest of the list. Well, the rest of the list is one of those ``simpler or smaller'' arguments that I mentioned above. Since Scheme supports direct recursion, we can invoke the sum procedure within its own definition to compute the sum of the elements of the rest of a non-empty list. Add the first element to this sum, and we're done!

;;; sum: find the sum of the elements of a given list of numbers

;;; Given:
;;;   LS, a list of numbers.

;;; Result:
;;;   TOTAL, a number.

;;; Preconditions:
;;;   None.

;;; Postcondition:
;;;   TOTAL is the result of adding together all of the elements of LS.

(define sum
  (lambda (ls)
    (if (null? ls)
        0
        (+ (car ls) (sum (cdr ls))))))

At first, this may look strange or magical, like a circular definition: If Scheme has to know the meaning of sum before it can process the definition of sum, how does it ever get started?

The answer is that what Scheme learns from a procedure definition is not so much the meaning of a word as the algorithm, the step-by-step method, for solving a problem. Sometimes, in order to solve a problem, you have to solve another, somewhat simpler problem of the same sort. There's no difficulty here as long as you can eventually reduce the problem to one that you can solve directly.

That's how Scheme proceeds when it deals with a call to a recursive procedure -- say, (sum (cons 38 (cons 12 (cons 83 null)))). First, it checks to find out whether the list it is given is empty. In this case, it isn't. So we need to determine the result of adding together the value of (car ls), which in this case is 38, and the sum of the elements of (cdr ls) -- the rest of the given list.

The rest of the list at this point is the value of (cons 12 (cons 83 null)). How do we compute its sum? We call the sum procedure again. This list of two elements isn't empty either, so again we wind up in the alternate of the if-expression. This time we want to add 12, the first element, to the sum of the rest of the list. By ``rest of the list,'' this time, we mean the value of (cons 83 null) -- a one-element list.

To compute the sum of this one-element list, we again invoke the sum procedure. A one-element list still isn't empty, so we head once more into the alternate of the if-expression, adding the car, 83, to the sum of the elements of the cdr, null. The ``rest of the list'' this time around is empty, so when we invoke sum yet one more time, to determine the sum of this empty list, the test in the if-expression succeeds and the consequent, rather than the alternate, is selected. The sum of null is 0.

We now have to work our way back out of all the procedure calls that have been waiting for arguments to be computed. The sum of the one-element list, you'll recall, is 83 plus the sum of null, that is, 83 + 0, or just 83. The sum of the two-element list is 12 plus the sum of (cons 83 null), that is, 12 + 83, or 95. Finally, the sum of the original three-element list is 38 plus the sum of (cons 12 (cons 83 null)), that is, 38 + 95, or 133.

Here's a summary of the steps in the evaluation process.

    (sum (cons 38 (cons 12 (cons 83 null)))) 
--> (+ 38 (sum (cons 12 (cons 83 null))))
--> (+ 38 (+ 12 (sum (cons 83 null))))
--> (+ 38 (+ 12 (+ 83 (sum null))))
--> (+ 38 (+ 12 (+ 83 0)))
--> (+ 38 (+ 12 83))
--> (+ 38 95)
--> 133

The process is exactly the same, by the way, regardless of whether we construct the three-element list using cons, as in the example above, or as (list 38 12 83) or '(38 12 83). Since we get the same list in each case, sum takes it apart in exactly the same way no matter what mechanism was used to build it.
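
One way to check this claim, sketched with the standard predicate equal?, which compares lists element by element. The three names defined here are hypothetical, and '() stands for the empty list that this lab calls null.

```scheme
;; Three ways of building the same three-element list.
(define built-with-cons (cons 38 (cons 12 (cons 83 '()))))
(define built-with-list (list 38 12 83))
(define built-with-quote '(38 12 83))

;; equal? finds no difference between them.
(equal? built-with-cons built-with-list)    ; #t
(equal? built-with-list built-with-quote)   ; #t
```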

The method of recursion works in this case because each time we invoke the sum procedure, we give it a list that is a little shorter and so a little easier to deal with, and eventually we reach the base case of the recursion -- the empty list -- for which the answer can be computed immediately.

If, instead, the problem became harder or more complicated on each recursive invocation, or if it were impossible ever to reach the base case, we'd have a runaway recursion -- a programming error that shows up in DrScheme not as a diagnostic message printed in red, but as an endless wait for a result. The designers of DrScheme's interface provided a Break button above the definition window so that you can interrupt a runaway recursion: Move the mouse pointer onto it and click the left mouse button, and DrScheme will abandon its attempt to evaluate the expression it's working on.
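
For illustration only, here is a sketch of how such an error might arise. The procedure runaway-sum is a hypothetical, deliberately broken variant of sum, and '() stands for the empty list that this lab calls null.

```scheme
;; The recursive call is given LS itself rather than (cdr ls), so the
;; argument never gets any shorter, and for a non-empty list the base
;; case is never reached.
(define runaway-sum
  (lambda (ls)
    (if (null? ls)
        0
        (+ (car ls) (runaway-sum ls)))))   ; BUG: should be (cdr ls)

;; The base case itself still works:
(runaway-sum '())   ; 0

;; This call would not finish on its own, so it is left commented out:
;; (runaway-sum (list 1 2 3))
```

Comparing the broken line with the corresponding line of sum shows how small the difference between a working recursion and a runaway one can be.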


Exercise 1

Define and test a Scheme procedure product that takes a list of numbers as its argument and returns the result of multiplying them all together. Warning: (product null) should not be 0. It should be the identity for multiplication, just as (sum null) is the identity for addition. Explain why.


Exercise 2

Define and test a Scheme procedure square-each-element that takes a list of numbers as its argument and returns a list of their squares.

> (square-each-element (list -7 3 12 0 4/5))
(49 9 144 0 16/25)

Hint: For the base case, consider what the procedure should return when given a null list; for the other case, separate the car and the cdr of the given list and consider how to operate on them so as to construct the desired result.


Exercise 3

Define and test a Scheme procedure lengths that takes a list of lists as its argument and returns a list of their lengths:

> (lengths (list (list 'alpha 'beta 'gamma)
                 (list 'delta)
                 null
                 (list 'epsilon 'zeta 'eta 'theta 'iota 'kappa)))
(3 1 0 6)

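Often the computation for a non-empty list involves making another test. Suppose, for instance, that we want to define a procedure that takes a list of integers and ``filters out'' the negative ones, so that if, for instance, we give it a list consisting of -13, 63, -1, 0, 4, and -78, it returns a list consisting of 63, 0, and 4. We can use direct recursion to develop such a procedure. If the given list is empty, there is nothing to filter out, so the result is the empty list. If the given list is non-empty and its first element is negative, we drop that element and return the result of filtering the rest of the list. Otherwise, we keep the first element, using cons to attach it to the front of the result of filtering the rest of the list.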

Translating this algorithm into Scheme yields the following definition:

(define filter-out-negatives
  (lambda (ls)
    (if (null? ls)
        null
        (if (negative? (car ls))
            (filter-out-negatives (cdr ls))
            (cons (car ls) (filter-out-negatives (cdr ls)))))))

Exercise 4

Define and test a Scheme procedure named filter-out-skips that takes a list of symbols as its argument and returns a list that does not contain the symbol skip, but is otherwise identical to the given list. (Use the predicate eq? to test whether two symbols are alike.)

> (filter-out-skips (list 'hop 'skip 'jump 'skip 'and 'skip 'again))
(hop jump and again)

The example illustrates the intended effect of the procedure. By itself, however, it's not an adequate test of your procedure. It would be a good idea to test the case in which the given list is empty, a case in which it contains only skips, and one in which it contains only symbols other than skip.

I recommend that you test the procedures you create very thoroughly. In most cases, testing does not reveal any errors in your procedures; but finding and correcting the errors that testing exposes is one of the most productive and rewarding uses of a programmer's time.
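
As a sketch of what such a set of tests might look like, here are analogous cases for filter-out-negatives, whose definition is copied from earlier in this lab (with '() written in place of null). Comparing each result to the expected list with equal? is just one possible way of checking; each comparison below should yield #t.

```scheme
;; filter-out-negatives, as defined earlier in this lab.
(define filter-out-negatives
  (lambda (ls)
    (if (null? ls)
        '()
        (if (negative? (car ls))
            (filter-out-negatives (cdr ls))
            (cons (car ls) (filter-out-negatives (cdr ls)))))))

;; The empty list.
(equal? (filter-out-negatives '()) '())

;; A list containing only negative numbers.
(equal? (filter-out-negatives (list -5 -1 -9)) '())

;; A list containing no negative numbers.
(equal? (filter-out-negatives (list 0 7 2)) (list 0 7 2))

;; A mixed list, the example from the text.
(equal? (filter-out-negatives (list -13 63 -1 0 4 -78)) (list 63 0 4))
```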


Exercise 5

Define and test a Scheme procedure named tally-occurrences that takes two arguments, a symbol and a list of symbols, and determines how many times the given symbol occurs in the given list.

Hint: Use direct recursion. Here are the questions that you must resolve: What is the base case? What value should the procedure return in that case? How can you simplify the problem in order to recursively invoke the procedure being defined? What do you need to do with the value of the recursive procedure call in order to obtain the final result?

> (tally-occurrences 'apple (list 'pear 'apple 'cranberry 'banana 'apple))
2
> (tally-occurrences 'apple (list 'oak 'elm 'maple 'spruce 'pine))
0

Sometimes the problem that we need an algorithm for doesn't apply to the empty list, even in a vacuous or trivial way, and the base case for a direct recursion instead involves singleton lists -- that is, lists with only one element. For instance, suppose that we want an algorithm that finds the greatest element of a given non-empty list of real numbers.

> (greatest-of-list (list -17 38 62/3 -14/9 204/5 26 19))
204/5

The assumption that the list is not empty is a precondition for the meaningful use of this procedure, just as a call to Scheme's built-in quotient procedure requires that the second argument, the divisor, be non-zero. You should form the habit of noting and detailing such preconditions as you write the initial comment for a procedure:

;;; greatest-of-list: find the greatest element of a given list of real
;;; numbers

;;; Given:
;;;   LS, a list of real numbers.

;;; Result:
;;;   GREATEST, a real number.

;;; Precondition:
;;;   LS is not empty.

;;; Postconditions:
;;;   (1) GREATEST is an element of LS.
;;;   (2) GREATEST is greater than or equal to every element of LS.

If a list of real numbers is a singleton, the answer is trivial -- its only element is its greatest element. Otherwise, we can take the list apart into its car and its cdr, invoke the procedure recursively to find the greatest element of the cdr, and use Scheme's built-in procedure max to compare the car to the greatest element of the cdr, returning whichever is greater.

We can test whether the given list is a singleton by checking whether its cdr is an empty list. The value of the expression (null? (cdr ls)) is #t if ls is a singleton, #f if ls has two or more elements.
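
A small sketch of that test in action. The name singleton? is hypothetical, introduced here only to give the expression a name; like greatest-of-list, it assumes that its argument is not empty.

```scheme
;; singleton?: determine whether a non-empty list has exactly one element.
;; Precondition: LS is not empty.
(define singleton?
  (lambda (ls)
    (null? (cdr ls))))

(singleton? (list 42))        ; #t -- one element, so the cdr is empty
(singleton? (list 42 17))     ; #f -- two elements
(singleton? (list 42 17 8))   ; #f -- three elements
```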

Here, then, is the procedure definition:

(define greatest-of-list
  (lambda (ls)
    (if (null? (cdr ls))
        (car ls)
        (max (car ls) (greatest-of-list (cdr ls))))))

If someone who uses this procedure happens to violate its precondition, applying the procedure to the empty list, DrScheme notices the error and prints out a diagnostic message:

CDR: expects argument of type <pair>; given ()


Exercise 6

Using the disparity procedure defined in the lab on conditional evaluation, define and test a Scheme procedure named gaps that takes a non-empty list of real numbers as its argument and returns a list of the disparities between numbers that are adjacent on the given list.

> (gaps (list 30 16 21 9 42))
(14 5 12 33)

Note that gaps always returns a list one element shorter than the one it is given.


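When we define a predicate that uses direct recursion on a given list, the definition is usually a little simpler if we use and- and or-expressions rather than if-expressions. For instance, consider a predicate all-even? that takes a given list of integers and determines whether all of them are even. As usual, we consider the cases of the empty list and non-empty lists separately. If the given list is empty, it has no elements at all, so in particular it has none that fails to be even, and the answer is #t. If the given list is non-empty, all of its elements are even exactly when its first element is even and all of the elements of the rest of the list are even.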

Thus all-even? should return #t when the given list either is empty or has an even first element and all even elements after that. This yields the following definition:

;;; all-even?: determine whether all of the elements of a list of
;;; integers are even

;;; Given:
;;;   LS, a list of integers.

;;; Result:
;;;   RESULT, a Boolean.

;;; Preconditions:
;;;   None.

;;; Postconditions:
;;;   RESULT is #T if all of the elements of LS are even, #F if any
;;;   of them is not even.

(define all-even?
  (lambda (ls)
    (or (null? ls)
        (and (even? (car ls))
             (all-even? (cdr ls))))))

When ls is the empty list, all-even? applies the first test in the or-expression, finds that it succeeds, and stops, returning #t. In any other case, the first test fails, so all-even? proceeds to evaluate the first test in the and-expression. If the first element of ls is odd, the test fails, so all-even? stops, returning #f. However, if the first element of ls is even, the test succeeds, so all-even? goes on to the recursive procedure call, which checks whether all of the remaining elements are even, and returns the result of this recursive call, however it turns out.
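
For comparison, here is a sketch of how the same predicate might look with nested if-expressions. The name all-even-with-if? is hypothetical, and '() stands for the empty list that this lab calls null.

```scheme
;; The same tests as in all-even?, but each Boolean result has to be
;; written out explicitly as #t or #f.
(define all-even-with-if?
  (lambda (ls)
    (if (null? ls)
        #t
        (if (even? (car ls))
            (all-even-with-if? (cdr ls))
            #f))))

(all-even-with-if? '())            ; #t
(all-even-with-if? (list 2 4 6))   ; #t
(all-even-with-if? (list 2 5 6))   ; #f
```

The two versions perform the same tests in the same order; the or/and form simply avoids spelling out the #t and #f results.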


Exercise 7

Define and test a Scheme predicate all-in-range? that takes a list of real numbers as its argument and determines whether all of its elements are in the range from 0 to 100, inclusive.


Exercise 8

Define and test a Scheme predicate element? that takes two arguments, a symbol and a list, and determines whether the given symbol is an element of the given list.


Optional exercises for fast workers

The following exercises illustrate some other variations on direct list recursion. Since these exercises are optional, it's unlikely that I'll use class time to discuss them. I have therefore provided an on-line solution to each one, so that you can check your work or see how the trick is done.


Exercise 9

Define and test a Scheme procedure shuffle that takes two lists as arguments and returns a list that results from ``shuffling'' the given lists together, like two halves of a deck of cards: The first element of the result list should come from the first given list, the second from the second, the third from the first list again, the fourth from the second, and so on. Once the shorter of the given lists is exhausted, all the rest of the elements of the result list should come from the other list.

> (shuffle (list 'a 'b 'c 'd 'e) (list 'x 'y 'z))
(a x b y c z d e)

Hint: Separate off two ``base cases,'' one for each list.

Solution


Exercise 10

Define and test a Scheme procedure add-pairs that takes a list of numbers of even length as its argument and returns a list of the results of adding the elements of the given list, two at a time.

> (add-pairs (list 7 3 6 8 4 2 1 9))
(10 14 6 10)

Solution


Exercise 11

Define and test a Scheme procedure unshuffle that takes a list as argument and returns a list of two lists, one comprising the elements in even-numbered positions in the given list, the other comprising the elements in odd-numbered positions.

> (unshuffle (list 'a 'b 'c 'd 'e 'f 'g 'h 'i))
((a c e g i) (b d f h))

Hint: Define a separate ``helper'' procedure that takes the car of the given list and the result of the recursive call as its arguments and rearranges the pieces as necessary to obtain the final result.

Solution


This document is available on the World Wide Web as

http://www.cs.grinnell.edu/~stone/courses/scheme/recursion-with-lists.html

created September 2, 1997
last revised March 17, 2000

John David Stone (stone@cs.grinnell.edu)