CSC151.02 2010S Functional Problem Solving : Readings

Association Lists


Summary: We consider association lists, a common technique for storing information, as well as the assoc procedure, which is used for searching through those lists.

Introduction

Consider a simple problem that has been handled behind the scenes in MediaScript: Converting from color names, represented as strings, to the color values we are so accustomed to. While one approach would be to write a procedure with a very long cond expression, that could take a long time to write. As importantly, such a solution is not very adaptable.

Another approach is to represent the information as if it were a phone book or a dictionary, with a long list of entries. Each entry would have a color name (e.g., "white"), the red, green, and blue values (e.g., 255, 255, and 255), and perhaps even some additional information, such as the names of common palettes to which the color belongs. You might come up with something like the following:

Color     R   G   B   Additional Information
--------- --- --- --- -----------------------------
red       255   0   0 primary,rainbow,reds,web-safe
green       0 255   0 primary,rainbow,web-safe
yellow    255 255   0 secondary,rainbow
blood-red 102   0   0 reds,web-safe
bronze    140 120  83 metallic

When computer scientists represent information in this way, they tend to refer to the information as a table. Often, we call a group of tables a database. Once information is grouped into a table, we can write procedures to select information, such as all of the rainbow colors.

Representing Database Tables

A great deal of research has gone into ways to represent database tables efficiently, so that the queries we make to select elements or create new tables are quick. These structures are beyond the scope of this course. For now, we will choose a simple and understandable representation that builds upon the core Scheme structures.

In particular, to to represent each of these individual entries, we can use a list, such as ("red" 255 0 0 "primary" "rainbow" "reds" "web-safe") or ("bronze" 140 120 83 "metallic"). In each case, the name of the color is the car of the entry, the red, green, and blue components are the cadr, caddr, and cadddr of the entry, respectively. An entire color description, then, would be a list of such entries. (We're leaving out the other attributes, for now, because they may complicate things.)

;;; Value:
;;;   named-colors
;;; Type:
;;;   List of lists.
;;;   Each sublist is of length at least four and contains a name,
;;;     red, green, and blue components, and a sequence of attributes.
;;;   Each component is an integer between 0 and 255, inclusive.
;;; Contents:
;;;   A list of common colors, their components, and some attributes.
(define named-colors
  (list (list "black" 0 0 0)
        (list "blah grey" 128 128 128)
        (list "blood red" 102 0 0)
        (list "blue" 0 0 255)
        (list "green" 0 255 0)
        (list "indigo" 102 0 255)
        (list "medium grey" 128 128 128)
        (list "orange" 255 119 0)
        (list "red" 255 0 0)
        (list "violet" 79 47 79)
        (list "white" 255 255 255)
        (list "violet red" 204 50 153)
        (list "yellow" 255 255 0)))

While this is a simple representation, it is frequently used by Scheme programmers who want to describe tables. There's even a name for this arrangement: In Scheme, a list of pairs or lists in which we use the value of the car to identify an entry is called an association list or alist.

As the color directory example suggests, a common application of association lists involves looking for a desired name or first component of an entry and then returning another part of the entry (or something computed from other parts of the entry). For example, we are likely to search the color directory for color names and then build a new color based on the three components. Because the first entry has a special role, we often refer to it as the key of the entry. We refer to the rest of the entry as the associated value. In the color list above, the color names are the keys (e.g., "black", "blood red", and "blue") and the RGB components and attributes are the associated values.

assoc, Scheme's Built-in Lookup Procedure

Since applications in which we need to extract data from tables are very common, Scheme provides procedures to retrieve from an association list the entry containing a specified key. The most frequently used procedure of this kind is assoc. Given a key and association list, assoc returns the first entry with the given key. If the key does not occur in the association list, then assoc returns #f. For example,

> (assoc "white" named-colors)
("white" 255 255 255)
> (assoc "scarlet" named-colors)
#f

Extracting Information

Once we've used assoc to find an entry, what do we do next? Most typically, we then do something with the rest of the entry. To continue our color table example, we might want to convert the three components to an RGB color.

So, what do we do? We use assoc to look up an entry with the given color name. If we find the entry, we then extract the red, green, and blue components and turn them into an RGB color. If we don't find the entry, we return some default value (say rgb-transparent). In code, we write

;;; Procedure:
;;;   lookup-color-by-name
;;; Parameters:
;;;   cname, a color name
;;;   ctable, a list of color entries
;;; Purpose:
;;;   Looks up the color in the table.
;;; Produces:
;;;   color, an RGB color
;;; Preconditions:
;;;   Each entry in ctable must be a list.
;;;   Element 0 of each entry must be a string which represents a color name.
;;;   Elements 1, 2, and 3 of each entry must be integers which represent
;;;    the red, green, and blue components of each color, respectively.
;;; Postconditions:
;;;   If an entry for the name appears somewhere in the table, color is
;;;     the corresponding RGB color (computed from the components).
;;;   If multiple entries with the same name appear in the table, color
;;;     is the computed RGB color for one of them.
;;;   If no matching entries appear, color is rgb-transparent.
;;;   Does not affect the table.
(define lookup-color-by-name
  (lambda (cname ctable)
    (if (assoc cname ctable)
        (rgb-new (list-ref (assoc cname ctable) 1)
                 (list-ref (assoc cname ctable) 2)
                 (list-ref (assoc cname ctable) 3))
        rgb-transparent)))

For example,

> (lookup-color-by-name "white" named-colors)
16777215
> (lookup-color-by-name "scarlet" named-colors)
-1
> (equal? rgb-transparent (lookup-color-by-name "white" named-colors))
#f
> (equal? rgb-transparent (lookup-color-by-name "scarlet" named-colors))
#t

Note that the result depends on the directory. For example,

> (lookup-color-by-name "white" null)
-1

Some of you may recall asking why if might take a value other than #t or #f as a parameter. This procedure is one example of why.

Looking Up a Color, Revisited

One problem with the lookup-color-by-name procedure given above is that it calls assoc four times with identical parameters, once to determine whether the color name is in the list and once to determine each of the three components. Rather than calling assoc four times, we might call it once, name that result, and use it each time.

(define lookup-color-by-name
  (lambda (cname ctable)
    (let ((assoc-result (assoc cname ctable)))
      (if assoc-result
          (rgb-new (list-ref assoc-result 1)
                   (list-ref assoc-result 2)
                   (list-ref assoc-result 3))
          rgb-transparent))))

Building More Complex Entries

If you recall our initial plans for the color table, we wanted to store more than just a color name and its components. We also thought it would be useful to store attributes of the color, such as whether or not it is Web safe. Hence, we might update each entry to include that additional information, which we might represent as strings.

;;; Value:
;;;   named-colors
;;; Type:
;;;   List of lists.
;;;   Each sublist is of length at least four and contains a name,
;;;     red, green, and blue components, and a sequence of attributes.
;;;   Each component is an integer between 0 and 255, inclusive.
;;;   Everything else is a string.
;;; Contents:
;;;   A list of common colors, their components, and some attributes.
(define named-colors
  (list (list "black"         0   0   0  "bw" "web-safe")
        (list "blah grey"   153 153 153  "bw" "web-safe")
        (list "blood red"   102   0   0  "reds" "web-safe")
        (list "blue"          0   0 255  "primary" "rainbow" "web-safe")
        (list "green"         0 255   0  "primary" "rainbow")
        (list "indigo"      102   0 255  "rainbow" "web-safe")
        (list "medium grey" 128 128 128  "bw")
        (list "off white"   250 250 250  "bw")
        (list "orange"      255 119   0  "rainbow")
        (list "pale red"    255 240 240  "reds")
        (list "red"         255   0   0  "primary" "rainbow" "web-safe" "reds")
        (list "violet"       79  47  79  "rainbow")
        (list "white"       255 255 255  "bw" "web-safe")
        (list "violet red"  204  50 153  "reds")
        (list "yellow"      255 255   0  "rainbow" "secondary" "web-safe")))

You should note a few things about this revised list. First, we've left the red, green, and blue components in exactly the same place, so that lookup-color-by-name still works correctly. Second, we've made the association list easier to read by lining up corresponding values, as in a table. We're taking advantage of the fact that Scheme ignores extra spaces.

Implementing assoc

As is the case with many of the built-in Scheme procedures, assoc is relatively easy to implement ourselves. In essence, assoc is much like other algorithms we've written to look for a value in a list. That is, assoc recursively steps through the list until it finds a match or runs out of elements.

;;; Procedure:
;;;   assoc
;;; Parameters:
;;;   key, a Scheme value
;;;   alist, an association list
;;; Purpose:
;;;   Find an entry with key key in alist.
;;; Produces:
;;;   entry, a Scheme value
;;; Preconditions:
;;;   No additional
;;; Postconditions:
;;;   If there is an index, i, such that
;;;     (equal? key (car (list-ref alist i)))
;;;   then entry is the first such entry
;;;   Otherwise, entry is false (#f)
(define assoc
  (lambda (key alist)
    (cond
       ; If there are no entries left in the association list,
       ; there are no entries with the given key.
       ((null? alist) 
        #f)
       ; If the key we're looking for is the key of the first
       ; entry, then use that entry.
       ((equal? key (car (car alist))) 
        (car alist))
       ; Otherwise, look in the rest of the association list.
       (else 
        (assoc key (cdr alist))))))

Using Other Keys

The assoc procedure works fine if the key is the first element of a data item. But what if it's the second (or third, or fourth, or ...). For example, what if we want a color whose red component is 255 or 102? Then we can't rely on assoc, because it only looks at the first element of each list. Instead, we need to write our own procedure, using the same structure as above.

For example, to find a color with a particular red component, we might write

;;; Procedure:
;;;   lookup-color-by-red
;;; Parameters:
;;;   red, an integer between 0 and 255, inclusive.
;;;   ctable, a list of color entries.
;;; Purpose:
;;;   Find a color name in the table which has the specified
;;;   red component.
;;; Produces:
;;;   cname, a string (or the value #f)
;;; Preconditions:
;;;   Each entry in ctable must be a list.
;;;   Element 0 of each entry must be a string which represents a color name.
;;;   Elements 1, 2, and 3 of each entry must be integers which represent
;;;    the red, green, and blue components of each color, respectively.
;;; Postconditions:
;;;   If an entry with the same red value appears somewhere in the table,
;;;     cname is the name of that entry.
;;;   If multiple entries with the same red value appear in the table, 
;;;     cname is the name of one such value.
;;;   If no matching entries appear, cname is #f.
;;;   Does not affect the table.
(define lookup-color-by-red
  (lambda (red ctable)
    (cond
      ((null? ctable)
       #f)
      ((equal? red (cadr (car ctable)))
       (car (car ctable)))
      (else
       (lookup-color-by-red red (cdr ctable))))))

It might even make sense to generalize this procedure, so that we can look up by any of the three components.

;;; Procedure:
;;;   lookup-color-by-component
;;; Parameters:
;;;   component, an integer between 0 and 255, inclusive
;;;   position, an integer between 1 and 3, inclusive
;;;   ctable, a list of color entries.
;;; Purpose:
;;;   Find a color name in the table which has the specified
;;;   component (1 for red, 2 for green, 3 for blue).
;;; Produces:
;;;   cname, a string (or the value #f)
;;; Preconditions:
;;;   Each entry in ctable must be a list.
;;;   Element 0 of each entry must be a string which represents a color name.
;;;   Elements 1, 2, and 3 of each entry must be integers which represent
;;;    the red, green, and blue components of each color, respectively.
;;; Postconditions:
;;;   If position is 1 and an entry with the same red value as component
;;;     appears somewhere in the table, cname is the name of one such entry.
;;;   If position is 2 and an entry with the same green value as component
;;;     appears somewhere in the table, cname is the name of one such entry.
;;;   If position is 3 and an entry with the same blue value as component
;;;     appears somewhere in the table, cname is the name of one such entry.
;;;   If no matching entries appear, cname is #f.
;;;   Does not affect the table.
(define lookup-color-by-component
  (lambda (component position ctable)
    (cond
      ((null? ctable)
       #f)
      ((equal? component (list-ref (car ctable) position))
       (car (car ctable)))
      (else
       (lookup-color-by-component component position (cdr ctable))))))

We can now define lookup-by-red (and even lookup-by-green and lookup-by-blue) in terms of lookup-by-component.

(define lookup-by-red
  (lambda (red ctable)
    (lookup-by-component red 1 ctable)))
(define lookup-by-green
  (lambda (green ctable)
    (lookup-by-component green 2 ctable)))
(define lookup-by-blue
  (lambda (blue ctable)
    (lookup-by-component blue 3 ctable)))

What if we want to return all the colors with a particular component, or permit colors where the component is “close enough” to the desired component? We'll explore that issue in lab.

Creative Commons License

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-10 Janet Davis, Matthew Kluber, Samuel A. Rebelsky, and Jerod Weinman. (Selected materials copyright by John David Stone and Henry Walker and used by permission.)

This material is based upon work partially supported by the National Science Foundation under Grant No. CCLI-0633090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.