Association lists

Consider the organization of a simple telephone directory for on-campus telephones: a sequence of entries, each consisting of a name and a four-digit telephone number. In Scheme, it's natural to use strings for names; it turns out that telephone numbers should also be represented as strings, since string operations make a useful kind of sense when applied to telephone numbers and integer operations do not. (For instance, (string-append "269-" extension) does something useful if the value of extension is a string, but not if it is an integer.)
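As a small illustration of the point about strings, the sketch below builds a full number from an extension. The names sample-extension and full-number are invented for this example; the "269-" prefix is the one used above.

```scheme
;; Hypothetical example: sample-extension and full-number are names
;; invented for this sketch, not part of the lab.
(define sample-extension "4208")

;; Prefix a four-digit extension (a string) with the exchange "269-".
(define full-number
  (lambda (extension)
    (string-append "269-" extension)))

(full-number sample-extension)   ; => "269-4208"
```

If the extension were stored as the integer 4208 instead, it would have to be converted with number->string before string-append could accept it.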

To represent each individual entry in a telephone directory, we can use a pair, such as ("Henry Walker" . "4208") or ("John Stone" . "3181"), with the name as the car of the entry and the telephone number as the cdr. An entire directory, then, would be a list of such entries:

(define science-chairs-directory
  (list (cons "Bruce Voyles" "3038")
        (cons "Diane Robertson" "3039")
        (cons "Martin Minelli" "3007")
        (cons "Arnold Adelberg" "4201")
        (cons "Mark Schneider" "3018")
        (cons "Janet Gibson" "3168")))

In Scheme, a list of pairs is called an association list or alist.
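Here is a quick sketch of how the pieces of such a list can be taken apart with car and cdr. The two-entry list sample-directory is defined here only so that the example stands alone; its entries are the two pairs mentioned earlier.

```scheme
;; sample-directory is a hypothetical two-entry alist for this sketch.
(define sample-directory
  (list (cons "Henry Walker" "4208")
        (cons "John Stone" "3181")))

(car sample-directory)         ; => ("Henry Walker" . "4208"), the first entry
(car (car sample-directory))   ; => "Henry Walker", the name in that entry
(cdr (car sample-directory))   ; => "4208", the number in that entry
(cdr sample-directory)         ; => (("John Stone" . "3181")), the remaining entries
```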

As the telephone-directory example illustrates, a particularly common application of association lists involves looking for a desired name, the first component of some pair, and retrieving the second component of that pair. For this reason, the first component of each pair (its car) is often called a key, and the second component (its cdr) is called the associated datum or value. For example, in the above illustration, "Martin Minelli", "Arnold Adelberg", and "Janet Gibson" are some of the keys, and the telephone numbers are the associated data. An association list is thus a simple way to implement a small database.
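Because every element of an association list is a pair, the keys and the associated data can be collected separately by mapping car or cdr over the list. The sketch below assumes a small list named sample-alist, introduced only for this illustration, and assumes that you are already familiar with map.

```scheme
;; sample-alist is a hypothetical three-entry alist for this sketch.
(define sample-alist
  (list (cons "Martin Minelli" "3007")
        (cons "Arnold Adelberg" "4201")
        (cons "Janet Gibson" "3168")))

(map car sample-alist)   ; => ("Martin Minelli" "Arnold Adelberg" "Janet Gibson")
(map cdr sample-alist)   ; => ("3007" "4201" "3168")
```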

Since such applications are very common, Scheme provides procedures to retrieve from an association list the pair containing a specified key. The most frequently used procedure of this kind is assoc. Given a key and an association list, assoc returns the first pair in the list that has the given key as its car. If the key does not occur in the association list, then assoc returns #f. For example, the value of (assoc "Mark Schneider" science-chairs-directory) is ("Mark Schneider" . "3018"), while the value of (assoc "Laurel Smith" science-chairs-directory) is #f.

To find the telephone number corresponding to a given name, we could apply the cdr procedure to the result of assoc:

(define look-up-telephone-number
  (lambda (name)
    (if (assoc name science-chairs-directory)
        (cdr (assoc name science-chairs-directory))
        'unlisted)))

The value of the call (look-up-telephone-number "Mark Schneider") is "3018" and the value of (look-up-telephone-number "Laurel Smith") is the symbol unlisted.
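The definition above calls assoc twice when the name is present. One possible variant, assuming let-expressions are available to you at this point in the course, names the result of assoc so that the list is searched only once. The procedure name look-up-once and the shortened directory short-directory are invented for this sketch.

```scheme
;; short-directory is a hypothetical stand-in for science-chairs-directory.
(define short-directory
  (list (cons "Mark Schneider" "3018")
        (cons "Janet Gibson" "3168")))

;; Look up name, searching the list only once.
(define look-up-once
  (lambda (name)
    (let ((entry (assoc name short-directory)))   ; entry is a pair or #f
      (if entry
          (cdr entry)
          'unlisted))))

(look-up-once "Mark Schneider")   ; => "3018"
(look-up-once "Laurel Smith")     ; => unlisted
```

This works because every value other than #f counts as true in an if-expression, so the pair returned by assoc can serve directly as the test.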


Exercise 1

Define an association list birth-dates that associates the surnames of recent presidents of the United States (as strings) with their dates of birth (again, as strings).

Note: The value of birth-dates is not a procedure, so it is not necessary to use a lambda-expression in this exercise. Look at the definition of science-chairs-directory for an example of the form that your definition of birth-dates should take.

Here's a table containing information for your association list:

President     Date of birth
Clinton       August 19, 1946
Bush          June 12, 1924
Reagan        February 6, 1911
Carter        October 1, 1924
Ford          July 14, 1913
Nixon         January 9, 1913
Johnson       August 27, 1908
Kennedy       May 29, 1917
Eisenhower    October 14, 1890

Exercise 2

Use the assoc procedure to search the birth-dates association list for someone who is on the list and for someone who is not on the list.


Exercise 3

Redefine birth-dates so that it includes two entries with the same key, for two people who have the same surname -- say, John Adams (born October 30, 1735) and John Quincy Adams (born July 11, 1767). What happens if you try to apply assoc to retrieve these entries, using the common key "Adams"?

If you find the results disappointing, define and test a procedure similar to assoc, except that it returns a list of all the pairs with the given key.


Exercise 4

What happens if you search by date instead of by person? (For example, you might try (assoc "October 1, 1924" birth-dates).)

If you find the results disappointing, define and test a procedure reverse-lookup that takes two arguments, an association list alist and an associated datum val, and returns a pair from alist that has val as its second component, or #f if there is no such pair.


The assoc procedure is actually one of three related built-in procedures in Scheme; the other two are assq and assv. Each of these procedures scans an association list for a key. They differ only in the test used for determining when a key is found:

assq compares keys with eq?
assv compares keys with eqv?
assoc compares keys with equal?

(See the earlier lab "Procedure definitions" to refresh your memory of these predicates.)
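The following sketch shows each of the three procedures with the kind of key for which it is usually chosen: symbols for assq, numbers for assv, and strings for assoc. The three small lists are invented for this illustration.

```scheme
;; Hypothetical alists, one for each kind of key.
(define color-codes (list (cons 'red 1) (cons 'green 2)))   ; symbol keys
(define square-roots (list (cons 4 2) (cons 9 3)))          ; exact integer keys
(define extensions (list (cons "John Stone" "3181")))       ; string keys

(assq 'green color-codes)         ; => (green . 2)
(assv 9 square-roots)             ; => (9 . 3)
(assoc "John Stone" extensions)   ; => ("John Stone" . "3181")
```

Searching extensions with assq or assv instead is unreliable: two strings containing the same characters are not necessarily the same object, so the search may return #f even though the name is present.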


Exercise 5

Define and test a procedure that takes two arguments, the first a positive integer sought and the second an association list als in which all of the keys are natural numbers, and returns the first pair in als in which the car is evenly divisible by sought.

(In other words, this procedure works like assoc, except that you can recover a pair from the association list if you know any divisor of its key, without having to know the key itself.)


This document is available on the World Wide Web as

http://www.cs.grinnell.edu/~stone/courses/scheme/association-lists.xhtml

created February 11, 2000
last revised March 17, 2000

Henry Walker (walker@cs.grinnell.edu) and John David Stone (stone@cs.grinnell.edu)