Fundamentals of Computer Science I: Media Computing (CS151.02 2007F)

Recursion Basics

This reading is also available in PDF.

Summary: In many algorithms, you want to do things again and again and again. For example, you might want to do something with each value in a list. In general, the term used for doing things again and again is called repetition. While you have learned a few techniques for repeating code, these techniques only work with specific data structure. Hence, you should also know a more general technique, that will work with any instance in which you need to repeat code. In Scheme, the primary technique used for repetition is called recursion, and involves having procedures call themselves.

Contents:

Introduction

In your work so far, you've seen two basic types that ground repeated code: You can do something for each position in an image, or you can do something for each element of a list. You've also seen two general strategies for what you do at each step: You can either update something (such as an image), as is the case for image.map!, region.compute-pixels!, or foreach!. Alternately, you can build a new compound value by applying a function to each value in the original compound value, as in the case of image.map or map.

But what if you want to do other kinds of repetition? For example, what if you want to do something for every integer between 1 and 100, of if you want to do something for every third element of a list? What if you want to compute a list that contains only some elements of an original list? What if you want to compute a value that is based on all of the elements of an original list, such as the average of a list? Since it is not possible to predict every way that a programmer will want to repeat code, most languages provide a general mechanism for repeating instructions. In Scheme, the most general mechanism for repeating instructions is called recursion, and involves having a procedure call itself.

Rethinking Procedure Calls

But what does it mean to have a procedure call itself?

As we've already seen, it is commonplace for the body of a procedure to include calls to another procedure, or even to several others. For example, we might write instructions to find the average component as

(define average-component
  (lambda (color)
    (/ (+ (rgb.red color) (rgb.green color) (rgb.blue color))
       3)))

Here, there are calls to division, addition, and the three rgb component procedures in the definition of average-component.

Direct recursion is the special case of this construction in which the body of a procedure includes one or more calls to the very same procedure -- calls that deal with simpler or smaller arguments.

An Example: Summation

For instance, let's define a procedure called sum that takes one argument, a list of numbers, and returns the result of adding all of the elements of the list together:

> (sum (list 91 85 96 82 89))
443
> (sum (list -17 17 12 -4))
8
> (sum (list 9.3))
9.3
> (sum null)
0

Because the list to which we apply sum may have any number of elements, we can't just pick out the numbers using list-ref and add them up -- there's no way to know in general whether an element even exists at the position specified by the second argument to list-ref. One thing we do know about lists, however, is that every list is either (a) empty, or (b) composed of a first element and a list of the rest of the elements, which we can obtain with the car and cdr procedures.

Moreover, we can use the predicate null? to distinguish between the (a) and (b) cases, and conditional evaluation to make sure that only the expression for the appropriate case is chosen. So the structure of our definition is going to look something like this:

(define sum
  (lambda (numbers)
    (if (null? numbers)
        ; The sum of an empty list
        ; The sum of a non-empty list
    )))

The sum of the empty list is easy -- since there's nothing to add, the total is 0.

And we know that in computing the sum of a non-empty list, we can use (car numbers), which is the first element, and (cdr numbers), which is the rest of the list. So the problem is to find the sum of a non-empty list, given the first element and the rest of the list. Well, the rest of the list is one of those simpler or smaller arguments mentioned above. Since Scheme supports direct recursion, we can invoke the sum procedure within its own definition to compute the sum of the elements of the rest of a non-empty list. Add the first element to this sum, and we're done! We'd express that portion as follows.

        (+ (car numbers) (sum (cdr numbers)))

As we put it together into a complete procedure, we'll also add some documentation.

;;; Procedure:
;;;   sum
;;; Parameters:
;;;   numbers, a list of numbers.
;;; Purpose:
;;;   Find the sum of the elements of a given list of numbers
;;; Produces:
;;;   total, a number.
;;; Preconditions:
;;;   All the elements of numbers must be numbers.
;;; Postcondition:
;;;   total is the result of adding together all of the elements of numbers.
;;;   If all the values in numbers are exact, total is exact.
;;;   If any values in numbers are inexact, total is inexact.
(define sum
  (lambda (numbers)
    (if (null? numbers)
        0
        (+ (car numbers) (sum (cdr numbers))))))

At first, this may look strange or magical, like a circular definition: If Scheme has to know the meaning of sum before it can process the definition of sum, how does it ever get started?

The answer is what Scheme learns from a procedure definition is not so much the meaning of a word as the algorithm, the step-by-step method, for solving a problem. Sometimes, in order to solve a problem, you have to solve another, somewhat simpler problem of the same sort. There's no difficulty here as long as you can eventually reduce the problem to one that you can solve directly.

Another way to think about it is in terms of the way we normally write instructions. We often say go back to the beginning and do the steps again. Given that we've named the steps in the algorithm, the recursive call is, in one sense, a way to tell the computer to go back to the beginning.

Watching Sum Work

The strategy of repeatedly solving simpler problems is how Scheme proceeds when it deals with a call to a recursive procedure -- say, (sum (cons 38 (cons 12 (cons 83 null)))). First, it checks to find out whether the list it is given is empty. In this case, it isn't. So we need to determine the result of adding together the value of (car ls), which in this case is 38, and the sum of the elements of (cdr ls) -- the rest of the given list.

The rest of the list at this point is the value of (cons 12 (cons 83 null)). How do we compute its sum? We call the sum procedure again. This list of two elements isn't empty either, so again we wind up in the alternate of the if-expression. This time we want to add 12, the first element, to the sum of the rest of the list. By rest of the list, this time, we mean the value of (cons 83 null) -- a one-element list.

To compute the sum of this one-element list, we again invoke the sum procedure. A one-element list still isn't empty, so we head once more into the alternate of the if-expression, adding the car, 83, to the sum of the elements of the cdr, null. The rest of the list this time around is empty, so when we invoke sum yet one more time, to determine the sum of this empty list, the test in the if-expression succeeds and the consequent, rather than the alternate, is selected. The sum of null is 0.

We now have to work our way back out of all the procedure calls that have been waiting for arguments to be computed. The sum of the one-element list, you'll recall, is 83 plus the sum of null, that is, 83 + 0, or just 83. The sum of the two-element list is 12 plus the sum of the (cons 83 null), that is, 12 + 83, or 95. Finally, the sum of the original three-element list is 38 plus the sum of (cons 12 (cons 83 null)) that is, 38 + 95, or 133.

Here's a summary of the steps in the evaluation process.

   (sum (cons 38 (cons 12 (cons 83 null)))) 
=> (+ 38 (sum (cons 12 (cons 83 null)))))
=> (+ 38 (+ 12 (sum (cons 83 null))))
=> (+ 38 (+ 12 (+ 83 (sum null))))
=> (+ 38 (+ 12 (+ 83 0)))
=> (+ 38 (+ 12 83))
=> (+ 38 95)
=> 133

Talk about delayed gratification! That's a while to wait before we can do the first addition.

The process is exactly the same, by the way, regardless of whether we construct the three-element list using cons, as in the example above, or as (list 38 12 83) or '(38 12 83). Since we get the same list in each case, sum takes it apart in exactly the same way no matter what mechanism was used to build it.

Base Cases

The method of recursion works in this case because each time we invoke the sum procedure, we give it a list that is a little shorter and so a little easier to deal with, and eventually we reach the base case of the recursion -- the empty list -- for which the answer can be computed immediately.

If, instead, the problem became harder or more complicated on each recursive invocation, or if it were impossible ever to reach the base case, we'd have a runaway recursion -- a programming error that shows up in DrScheme not as a diagnostic message printed in red, but as an endless wait for a result. You'll need to restart DrFu to recover from infinite recursion. The good news is that you can hit the Stop button to stop and save yoru code, so you can easily reload it after you restart.

As you may have noted, there are three basic parts to these kinds of recursive functions.

You'll come back to these three parts for each recursive function you write.

Filtering Lists

Often the computation for a non-empty list involves making another test. Suppose, for instance, that we want to define a procedure that takes a list of colors and filters out the dark ones. Let's assume that we've already defined a predicate, rgb.dark?, that determines whether or not a color is dark. For most definitions of rgb.dark?, if we started with the list of colors red, yellow, dark grey, green, blue, black, and white, we would end up with red, yellow, green, and white. We can use direct recursion to develop such a procedure.

Translating this algorithm into Scheme yields the following definition:

(define rgb.filter-out-dark
  (lambda (colors)
    (if (null? colors)
        null
        (if (rgb.dark? (car colors))
            (rgb.filter-out-dark (cdr colors))
            (cons (car colors) (rgb.filter-out-dark (cdr colors)))))))

Of course, when you see nested if expressions, you may instead prefer to use a cond. We can express the same idea as follows:

(define rgb.filter-out-dark
  (lambda (colors)
    (cond
      ((null? colors) null)
      ((rgb.dark? (car colors)) (rgb.filter-out-dark (cdr colors)))
      (else (cons (car colors) (rgb.filter-out-dark (cdr colors)))))))

 

History

 

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Mon Dec 3 09:54:16 2007.
The source to the document was last modified on Fri Sep 28 10:31:57 2007.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS151/2007F/Readings/recursion-basics-reading.html.

You may wish to validate this document's HTML ; Valid CSS! ; Creative Commons License

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright © 2007 Janet Davis, Matthew Kluber, and Samuel A. Rebelsky. (Selected materials copyright by John David Stone and Henry Walker and used by permission.) This material is based upon work partially supported by the National Science Foundation under Grant No. CCLI-0633090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.