Summary: In this lab, you will explore some of the visualizations available through DrRacket’s plot package, using our data sets.
a. Do the traditional lab preparation. That is,
- Start DrRacket.
- Check for update the
- Require the
b. Also require the plot package with (require plot).
c. Load the list of cities arranged by zip codes.
(define zips (read-csv-file "/home/username/Desktop/us-zip-codes.csv"))
d. If you haven’t done so, save a copy of the Project Gutenberg version of Jane Eyre on your desktop.
e. Add the following undocumented procedures to your definitions pane.
(define zip-ends-with (lambda (city three-char-suffix) (string=? (substring (car city) 2) three-char-suffix)))
(define zip-starts-with (lambda (city three-char-prefix) (string=? (substring (car city) 0 3) three-char-prefix)))
f. Create four different small subsets of the zips data using the following definitions.
(define zips1 (filter (section zip-ends-with <> "021") zips))
(define zips2 (filter (section zip-ends-with <> "606") zips))
(define zips3 (filter (section zip-starts-with <> "021") zips))
(define zips4 (filter (section zip-starts-with <> "606") zips))
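The following is a minimal sketch of what the section expressions above do, assuming that section fills the <> slot with the argument it later receives. It uses plain Racket rather than the course library, and the three entries in sample-cities are made up for illustration; only their shape (zip code string first, then latitude and longitude) is taken from the procedures in this lab.

```racket
#lang racket

;; Made-up entries, assumed to be shaped like rows of zips:
;; a zip code string first, then latitude and longitude.
(define sample-cities
  (list (list "00112" 41.5 -92.5 "Sample A")
        (list "02021" 42.0 -71.0 "Sample B")
        (list "60606" 41.9 -87.6 "Sample C")))

;; Same definition as in the preparation steps.
(define zip-ends-with
  (lambda (city three-char-suffix)
    (string=? (substring (car city) 2) three-char-suffix)))

;; Assumed equivalent of (section zip-ends-with <> "021"):
;; a one-parameter procedure with the suffix already fixed.
(define ends-with-021?
  (lambda (city)
    (zip-ends-with city "021")))

;; Keep only the entries whose zip code ends with "021".
(define sample-021 (filter ends-with-021? sample-cities))
```

Here sample-021 keeps only the "02021" entry, since the characters of that zip code from position 2 onward are "021".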
g. Add the following undocumented procedure to your definitions pane.
(define useful-entry? (lambda (entry) (and (real? (cadr entry)) (real? (caddr entry)))))
h. Explain to yourself why useful-entry? is likely to be useful.
Exercise 1: Plotting cities
a. Using filter, write an expression that selects only the elements of zips1 that contain a latitude and longitude.
> (define valid1 (filter ... zips1))
b. Using map1, extract only the latitude and longitude from that list. (You may want to write a separate helper that extracts a latitude and longitude from a single entry.)
> (define lat-long-1 (map1 ... valid1))
c. Using points, display the points.
> (plot (points ...))
d. Repeat those steps with zips2.
Since latitude and longitude are angles, rather than x and y coordinates, this approach is imperfect. But it will suffice for our experiments.
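The filter-then-map pattern in this exercise can be sketched on a tiny made-up list, as shown here. This is only an illustration under stated assumptions: sample-entries and the helper name entry->lat-long are hypothetical, the way missing coordinates appear in the real file (empty strings here) is a guess, and the sketch uses Racket's built-in map where the lab uses map1.

```racket
#lang racket

;; Made-up entries: zip code, latitude, longitude, name.
;; The second entry stands in for a row with missing coordinates.
(define sample-entries
  (list (list "00001" 41.5 -92.5 "Sample A")
        (list "00002" "" "" "Sample B")
        (list "00003" 38.0 -90.0 "Sample C")))

;; Same definition as in the preparation steps.
(define useful-entry?
  (lambda (entry)
    (and (real? (cadr entry)) (real? (caddr entry)))))

;; Hypothetical helper: pull the latitude and longitude out of one entry.
(define entry->lat-long
  (lambda (entry)
    (list (cadr entry) (caddr entry))))

;; First discard entries without numeric coordinates, then extract the pairs.
(define sample-lat-long
  (map entry->lat-long (filter useful-entry? sample-entries)))

;; In DrRacket, after (require plot), the pairs could be drawn with
;;   (plot (points sample-lat-long))
```

Filtering before mapping matters: points expects numbers, so an entry with non-numeric coordinates has to be removed before the pairs are built.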
Exercise 2: Plotting cities, revisited
a. Write an expression or series of expressions that plots the first two sets of points, using one color for the valid entries in zips1 and another for the valid entries in zips2.
b. Do you expect to see something similar or different for the entries in zips3 and zips4?
c. Check your answer experimentally. Then discuss with your partner any differences you see.
Exercise 3: Plotting cities, re-revisited
a. Write an expression or expressions to plot the cities in zips1 so that those north of 39.72 are one color and those south of 39.72 are another color. For example, those north of 39.72 might be blue and those south of 39.72 might be gray.
b. Write an expression or expressions to plot the cities in zips1 and zips3 using four colors: one for zips1 north of 39.72, one for zips1 south of 39.72, one for zips3 north of 39.72, and one for zips3 south of 39.72.
Exercise 4: Detour: Exploring colors
Here’s a simple expression to plot some points.
> (plot (list (points (list (list 0 0) (list 10 10) (list 3 5) (list 1 4))
                      #:fill-color "red" #:sym 'fullcircle6)
              (points (list (list 5 10) (list 6 9) (list 8 7))
                      #:fill-color "black" #:sym 'fullcircle6)
              (points (list (list 1 1) (list 2 3) (list 3 5))
                      #:fill-color "blue" #:sym 'fullcircle6)))
In addition to color names, DrRacket lets you use RGB triplets: lists of three integers, as in #:fill-color (list 200 10 180).
Experiment with a few triplets to find five or so colors you find useful as a set.
Exercise 5: Categorical data
In a recent lab, you wrote a procedure something like the following.
(define categorize
  (lambda (city)
    (cond
      [(not (useful-entry? city)) "Unknown"]
      [(> (cadr city) 39.72) "North"]
      [(< (cadr city) 39.72) "South"]
      [else "Other"])))
a. Using categorize, create a summary of zips1. Here’s one possible output.
> (.... zips1)
'(("North" 27) ("Unknown" 1) ("South" 31))
b. Using discrete-histogram, make a histogram of these tallies.
c. Repeat your work for
d. Repeat your work for
e. Given those results, how representative do you feel your sample data are?
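One way to get from a list of category labels to the list-of-pairs shape shown in part a is sketched here. It is not necessarily the approach the course intends: tally-labels is a hypothetical name, the labels are made up, and the sketch uses plain Racket hash tables rather than any course-library procedure.

```racket
#lang racket

;; Hypothetical helper: count how often each label appears and
;; return a list of (label count) pairs, sorted by label.
(define tally-labels
  (lambda (labels)
    (let ([counts (for/fold ([table (hash)])
                            ([label labels])
                    ;; Add one to the label's count, starting from 0.
                    (hash-update table label add1 0))])
      (sort (hash-map counts list) string<? #:key car))))

;; Made-up labels, standing in for (map categorize zips1).
(define sample-labels
  (list "North" "South" "North" "Unknown" "South" "North"))

(define sample-tallies (tally-labels sample-labels))

;; In DrRacket, after (require plot), the pairs could be drawn with
;;   (plot (discrete-histogram sample-tallies))
```

For these labels, sample-tallies is '(("North" 3) ("South" 2) ("Unknown" 1)). Sorting by label keeps the bars in a predictable order, since hash tables do not promise one.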
Exercise 6: Tallying different types
a. Write a procedure, tally-alphabetic, that, given a list of characters, determines how many are alphabetic.
> (tally-alphabetic (list #\a #\b #\3 #\d))
3
> (tally-alphabetic (string->list "a and b3 & q4"))
6
Hint: One approach is to filter the alphabetic characters and then find out how long the list is.
char-alphabetic? is a built-in Scheme procedure.
b. Write a procedure, tally-digits, that, given a list of characters, determines how many are digits.
> (tally-digits (list #\a #\b #\3 #\d))
1
> (tally-digits (string->list "a and b3 & q4"))
2
char-numeric? is a built-in Scheme procedure.
c. Write a procedure, tally-whitespace, that, given a list of characters, determines how many are whitespace.
> (tally-whitespace (string->list "a and b3 & q4"))
4
char-whitespace? is a built-in Scheme procedure.
d. Write a procedure, tally-other, that, given a list of characters, determines how many are neither alphabetic, nor digits, nor whitespace.
> (tally-other (string->list "a and b3 & q4"))
1
e. Write a procedure, char-tallies, that, given a string, produces a list of four numbers corresponding to the four numbers above.
> (char-tallies "a and b3 & q4")
'(6 2 4 1)
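The filter-then-count approach from the hint in part a can be sketched with a different predicate, so that the exercise itself stays open. The procedure tally-uppercase is a hypothetical example, not one of the procedures the lab asks for; char-upper-case? is a standard Racket predicate.

```racket
#lang racket

;; Hypothetical example: count the uppercase characters in a list
;; by keeping only those characters and measuring the result.
(define tally-uppercase
  (lambda (chars)
    (length (filter char-upper-case? chars))))

;; The uppercase characters here are J, E, C, and B.
(define sample-count
  (tally-uppercase (string->list "Jane Eyre, by Charlotte Bronte")))
```

The same shape should work for any one-parameter predicate on characters, which is why the built-in predicates mentioned in this exercise are convenient.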
Exercise 7: Visualizing tallies
a. Write a procedure, explore-strings, that takes a list of strings as input and produces a stacked histogram of the distribution of characters in the strings using stacked-histogram.
(define explore-strings
  (lambda (strings)
    (plot (stacked-histogram
           (map1 (lambda (str) (cons "" ...))
                 strings)))))
b. Run explore-strings on a few sample inputs.
> (explore-strings
   (list "Now is the time for all good men to come to the aid of their country."
         "A 1 and a 2 and a 3 and ...."
         "'Twas brillig and the slithy toves; did gyre and gimble in the wabe."))
c. Run explore-strings on lines 100-110 of Jane Eyre.
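It may help to look at the shape of data that stacked-histogram accepts before filling in the template. In the sketch here, each bar is a two-element list: a label, then a list of segment heights. The tallies are made up, and the names tallies-1, tallies-2, bar-1, bar-2, and sample-bars are hypothetical.

```racket
#lang racket

;; Made-up tallies, in the (alphabetic digits whitespace other)
;; order that char-tallies produces.
(define tallies-1 (list 6 2 4 1))
(define tallies-2 (list 10 0 3 2))

;; One bar is a label followed by the list of segment heights.
;; Consing "" onto a one-element list gives an unlabeled bar.
(define bar-1 (cons "" (list tallies-1)))

;; The same shape, built with list and given a label.
(define bar-2 (list "second" tallies-2))

(define sample-bars (list bar-1 bar-2))

;; In DrRacket, after (require plot), the bars could be drawn with
;;   (plot (stacked-histogram sample-bars))
```

Here bar-1 is '("" (6 2 4 1)): the tallies stay together as a single inner list rather than being spliced in after the label.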
Exercise 8: Exploring strings, revisited
Arrange for the histogram you created in the previous exercise to have an appropriate legend, title, and other labels.
For those with extra time
Extra 1: Other groupings
Create two lists, zips1 and zips2, by selecting the entries whose last three digits of zip code match. Create two other lists, zips3 and zips4, in which you select the entries whose first three digits match. Use “021” and “606” as the digits to match.
a. What do you expect to happen when we plot the four sets of data?
b. Check your answer experimentally.
Extra 2: Side-by-side histograms
Skim the DrRacket documentation on histograms.
Using the ideas contained therein, show the north-south histograms for zips1 through zips4 in one diagram that makes it easier for the reader to understand how they relate.
Extra 3: Side-by-side histograms
a. What do you expect to happen if we add zips to the solution above?
b. Check your answer experimentally.
c. You should observe that the large list of zips so dominates that the others become almost invisible. How might you solve this problem?
d. Discuss your answer with a teacher or mentor.
e. Implement your solution (or the one your teacher or mentor suggests).