CSC 151 Grinnell College Fall, 2005
 
Fundamentals of Computer Science I
 

Supplemental Problems


Supplemental Problems extend the range of problems considered in the course and help sharpen problem-solving skills. Problems numbered 5 or higher may be turned in for extra credit.

Format for Supplemental Problems

In turning in supplemental problems for the course, please follow these directions:

  1. The first three lines of any Scheme program should be comments containing your name, your mailbox number, and an identification of the assignment being solved. For example:

    
        ;;; Henry M. Walker
        ;;; Box:  Science Office
        ;;; Supplemental Problem 1
    

    Also, comments are needed for every procedure, providing English statements in each of the following areas:

    
        ;;; Given
        ;;; Result
        ;;; Precondition(s)
        ;;; Postcondition(s)
    
  2. Prepare, test, and debug your program thoroughly.

  3. Once the program is complete to your satisfaction, print a copy of your Scheme definitions and tests.

  4. Run your program within DrScheme, and print the results from the interactions window.

  5. It is essential that your output correspond with your definitions file! When you turn in your work, you are certifying that these two printouts belong together. Thus, if specified output could not result from the submitted definitions file, the discrepancy might be interpreted as grounds for investigation of academic dishonesty, as outlined in the Student Handbook.

    Annotate both the printout of your test cases from the definitions file and the output from the interactions window, so it is clear what output comes from which tests.

  6. Either write on your printout or include a separate statement that argues why your program is correct, based upon the evidence from your test runs.

Very Important Reminder:

The course syllabus states, "...Since a primary goal of the course is to enable students to develop their own programs, collaboration is not allowed on homework assignments, supplemental problems, or tests. In addition, students should note the department's policy regarding the role of user-consultants for Computer Science 151."

Students with any questions about any of the supplemental problems should talk to their instructor prior to beginning that problem. Following faculty legislation, the CSC 151 instructors will have to turn over to the Academic Standing Committee any evidence of collaboration found on any supplemental problem.

Supplemental Problem 1:

  1. Write and test a Scheme procedure adjacent-elements that takes any non-empty list ls and returns a list of two-element lists, each two-element list consisting of two adjacent elements of ls.
    
    (adjacent-elements '(a b c d e)) ===> ((a b) (b c) (c d) (d e))
    (adjacent-elements '(5 12 12 0)) ===> ((5 12) (12 12) (12 0))
    (adjacent-elements '(first second)) ===> ((first second))
    (adjacent-elements '(only)) ===> ()
    

Supplemental Problem 2:

A generalized numeric list is either a number or a list, each of whose elements is itself a generalized numeric list. Here are some examples:


   12                              number
   (7 12 -5)                       list of elements; each element is a number
   (7 (12 -5) 2 (((8))))           list of generalized numeric lists
   (2 ( 3 ( 4 (5 (6) 7) 8) 9) 10)  list of generalized numeric lists
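
As an illustration only, the recursive definition above might be turned into a predicate along the following lines. The names generalized-numeric-list? and all-generalized? are made up for this sketch, and treating the empty list as a generalized numeric list is an assumption that the definition does not settle.

```scheme
;;; Given:  any Scheme value val
;;; Result: #t if val is a generalized numeric list, #f otherwise
;;; Precondition(s):  none
;;; Postcondition(s): val is unchanged
(define generalized-numeric-list?
  (lambda (val)
    (or (number? val)
        (all-generalized? val))))

;;; Given:  any Scheme value ls
;;; Result: #t if ls is a proper list whose elements are all
;;;         generalized numeric lists, #f otherwise
;;; Assumption: the empty list is accepted
(define all-generalized?
  (lambda (ls)
    (cond ((null? ls) #t)
          ((pair? ls)
           (and (generalized-numeric-list? (car ls))
                (all-generalized? (cdr ls))))
          (else #f))))

(generalized-numeric-list? 12)                      ; #t
(generalized-numeric-list? '(7 (12 -5) 2 (((8)))))  ; #t
(generalized-numeric-list? '(7 a))                  ; #f
```

The two procedures mirror the two halves of the definition: one handles "a number or a list", the other checks each element of the list in turn.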

Write a procedure add3 that takes a generalized numeric list as parameter and adds three to every number. The structure of the overall list should remain the same - only the numbers should increase by 3. For example, the following shows add3 applied to the above examples:


   > (add3 12)
   15
   > (add3 '(7 12 -5))
   (10 15 -2)
   > (add3 '(7 (12 -5) 2 (((8)))))
   (10 (15 -2) 5 (((11))))
   > (add3 '(2 ( 3 ( 4 (5 (6) 7) 8) 9) 10))
   (5 (6 (7 (8 (9) 10) 11) 12) 13)

Supplemental Problem 3: Grading Passwords

Since many modern computer systems use passwords as a means to provide protection and security for users, a major issue can be the identification of appropriate passwords. The main point should be to choose passwords that are not easily guessed, but which the user has a chance of remembering. For example, passwords related to birthdays, anniversaries, family names, or common words are all easily guessed and should be avoided.

Some common guidelines suggest that a password should contain at least 6 characters and include characters from at least three of the following categories:

Other guidelines indicate that elements of passwords should be pronounceable. One simple measure of this guideline suggests that any group of letters in a password should contain both vowels and consonants.

In Scheme, some procedures are already built-in to test some of these conditions. Specifically, the Scheme standard identifies the procedures char-alphabetic?, char-numeric?, char-whitespace?, char-upper-case?, and char-lower-case?. See the section on characters in the Scheme standard for more details.

  1. The first part of this supplemental problem is to write additional procedures for other categories of characters. (Some of these procedures were part of the lab on characters and may be reused.)



    Note: While you are free to use built-in Scheme predicates for these procedures, you may not use type conversion procedures. Thus, you are not allowed to convert characters to integers or other data types.

  2. The second part of this problem is to write a tail-recursive higher-order procedure that identifies the pattern of checking whether a string contains a character meeting a particular test. Specifically, write a higher-order procedure
    
       (contains? pred?)
    

    which returns a function of one parameter, a string. When given a string str, the resulting function applies the predicate pred? to each character in str and returns 1 if some character satisfies the predicate and 0 otherwise. Here are several examples:

    
    (define contains-upper  (contains? char-upper-case?))
    (define contains-lower  (contains? char-lower-case?))
    (define contains-number (contains? char-numeric?))
    (define contains-punc   (contains? punctuation?))
    
    (contains-upper  "Walker") ===> 1
    (contains-lower  "Walker") ===> 1
    (contains-number "Walker") ===> 0
    (contains-punc   "Walker") ===> 0
    


    Note: To meet the specifications below, a procedure using contains? must return 1 or 0, not #t, #f, or another number.

    Also note: If you can, this procedure should handle strings as strings, rather than converting a string to another data type such as a list.

  3. The third part of this problem is to write a procedure password-grade which takes a string as parameter and which gives a grade to that password according to the following grading scale:

According to this scale, "Walker" would receive a "B" grade, with points for string length, vowel, consonant, upper-case letter, and lower-case letter.

After writing and debugging your procedures, print them out and also include the results of several test cases. Then, in considering the correctness of your code, write out (in a paragraph) what cases each procedure might encounter, identify test cases to cover each of those cases, indicate which tests in the output file correspond to the circumstances you identified in your commentary, and comment on the extent to which your code handles those cases correctly.

Supplemental Problem 4: A Faculty Directory

File /home/walker/public_html/courses/151.fa05/math-cs-faculty-98.ss contains a listing of the faculty of the Department of Mathematics and Computer Science for Fall 1998. (The 1998-1999 directory is used in this problem, because two people - Nathaniel Borenstein and Pamela Ferguson - shared an office that semester; and two people - Emily Moore and Thomas Moore - had two different offices, but the same last name.)

In this directory, the entry for an individual is a list of the form:


   (firstname lastname title e-mail phone office)

These entries then are placed on a list, so the full directory is a list of the above entries. Within a Scheme program on MathLAN, this directory may be loaded with the command:


   (load "/home/walker/public_html/courses/151.fa05/math-cs-faculty-98.ss")

After loading this file, the symbol fac-directory is defined as the directory.

  1. Write a procedure search-name that has these parameters:

    search-name performs a linear search of the faculty directory. If the procedure is called with just a last name, then the procedure prints (using display or write) all faculty who have the given last name. If the procedure is called with both a last name and a first name, then the procedure prints the entry of the given faculty member. (In either case, nothing is printed if no entries in the directory match the name(s) given.)

  2. Place the entries of fac-directory into a vector fac-vector. Then write a procedure email-sort that sorts the entries of fac-vector, so that they are ordered by e-mail address.

  3. Write a procedure search-email that has one parameter, an e-mail address, and performs a binary search to find and print the entry of the directory with the given e-mail address. If no e-mail address is found, the procedure should print "not found".

Note: In this problem, you are free to use (with proper citation) any code (e.g., sort or search procedure(s)) found in the course's labs. Since the tasks here require printing, some modifications may be needed, but this problem does not require extensive rewriting.


Extra Credit Problems

Any of the following problems may be done for extra credit. As noted in the course syllabus, however, a student's overall problems' average may not exceed 120%.

Students are reminded that collaboration is NOT allowed on the following problems, as stated in the course syllabus and reiterated at the start of these Supplemental Problems.

Supplemental Problem 5: Processing Dates

A common processing task involves analyzing dates, such as January 8, 2003. In particular, one often must determine if a date is the last one in a year or the last one in a month. In a related task, one may need to determine the number of days from the beginning of the year to a given date, including January 1 and the given date in the count.

For this problem, dates will have three parts:
month day year,
where the day and year are integers and the month is a symbol (e.g., january, february, etc.). Write the following procedures:

Notes:

  1. January, March, May, July, August, October, and December have 31 days.

  2. April, June, September, and November have 30 days.

  3. February has 28 days in non-leap years, but 29 days in leap years. A leap year occurs in years when the year is divisible by 4, except that century years are not leap years unless they are divisible by 400. Thus, the years 1999 and 1900 are not leap years, while 1996 and 2000 are leap years.
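
The leap-year rule in note 3 can be expressed directly as a predicate. The following is a minimal sketch; the name leap-year? is chosen here for illustration and is not one of the procedures required by the problem.

```scheme
;;; Given:  an integer year
;;; Result: #t if year is a leap year, #f otherwise
;;; Precondition(s):  year is a positive integer
;;; Postcondition(s): none
(define leap-year?
  (lambda (year)
    (or (= (remainder year 400) 0)             ; century years divisible by 400
        (and (= (remainder year 4) 0)          ; other years divisible by 4 ...
             (not (= (remainder year 100) 0))))))  ; ... but not century years

(leap-year? 1999)  ; #f
(leap-year? 1900)  ; #f
(leap-year? 1996)  ; #t
(leap-year? 2000)  ; #t
```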

Examples: These procedures should produce the following results:


(year-end? 'march 31 2004)      ===> #f
(year-end? 'december 30 2004)   ===> #f
(year-end? 'december 31 2004)   ===> #t
(month-end? 'january 8 2004)    ===> #f
(month-end? 'january 31 2004)   ===> #t
(month-end? 'february 28 2003)  ===> #t
(month-end? 'february 28 2004)  ===> #f
(month-end? 'february 28 2000)  ===> #f
(day-of-year 'january 31 2004)  ===> 31
(day-of-year 'february 1 2004)  ===> 32
(day-of-year 'february 28 2003) ===> 59
(day-of-year 'march 1 2003)     ===> 60
(day-of-year 'march 1 2004)     ===> 61

Supplemental Problem 6: A Gambling Simulation

In a private game, a gambler sets the following personal limits. The gambler starts the evening with $20; he bets $2 on each game and stops when he either runs out of money or has a total of $50. To be more specific, for a specific game, the gambler bets $2. If the gambler loses the bet, then $2 is deducted from his account. If the gambler wins the bet, then the gambler wins a payoff amount, and the gambler's new balance is increased by that payoff — the $2 is not deducted. For example, if the payoff is $5 and if the gambler starts the game with $20, then the gambler's new balance would be $18 if the gambler loses a bet and $25 if the gambler wins the bet.

The following problems will allow you to investigate the likelihood of the gambler winning by simulating games each evening over a period of 1000 nights of gambling. To accomplish this simulation, you are to proceed in three steps:

  1. Write a procedure game which simulates a single bet. Thus, game should have parameters for the gambler's purse amount before the bet, the payoff from a $2 bet, and the probability of the gambler winning the single bet. Then game should return the amount the gambler has after that single bet, using a random number generator and the probability to determine if the gambler won. For example,

    
       (game  37  5  0.7)
    

    should represent the result of a bet, when the gambler starts with $37, when the return on a $2 bet is $5, and when the gambler has a 70% chance of winning the bet. With these values, game should return 35 if the gambler loses the bet and 42 if the gambler wins.

    Note that one way to simulate a bet with a winning probability of 0.7 (70%) is to compare (random 100), which returns an integer between 0 and 99, with 70:

    
       (cond ((< (random 100) 70) ;winning part
                  )
             (else ;losing part
                  )
        )
    
  2. Modify game to obtain a procedure evening as follows:

    For example, the call

    
       (evening  5  0.7)
    

    should simulate an evening of gambling, where the balance begins at $20, the payoff amount is $5, and the probability of winning a game is 0.7. The result of this call will be the conclusion "won" or "lost".

  3. Using evening, write a procedure simulate-evenings which takes the payoff from a $2 bet, the probability of winning a single bet, and a number of evenings to be played as parameters and returns the number of evenings won. For example, if the gambler consistently plays (fair) games which have a $2 payoff and a 50% chance of winning, and if the gambler plays 1000 evenings of gambling, then an appropriate procedure call would be:

    
       (simulate-evenings  2  0.5  1000)
    

    Similarly, if the gambler consistently plays (fair) games which have a $4 payoff and a 1/3 chance of winning, and if the gambler plays 500 evenings of gambling, then an appropriate procedure call would be:

    
       (simulate-evenings  4  1/3  500)
    

    Hint: Since the number of evenings to be played is the new element distinguishing simulate-evenings from evening, you might focus upon that parameter as the basis for processing in simulate-evenings. An appropriate call to the evening procedure already will handle other details of betting for an evening.

  4. After you have written your procedures, be sure to test them in several cases. What test cases and results can help you determine that your code might be correct?

Supplemental Problem 7: Selecting Reviewers for a Paper

For many professional conferences, authors submit papers on their research. These papers are then sent to reviewers for comments, and the best papers are selected for presentation. A similar process is used for determining papers to appear in journals. This problem considers how reviewers might be selected.

For a computer-related conference, organizers maintain a database of reviewers, together with a list of the subject areas these reviewers feel competent to judge. One possible structure for this database is a list of lists. File /home/walker/151s/labs/reviewer-directory.ss defines Scheme variable directory that contains a fictitious version of such a database. The first part of this list of lists is:


(define directory
 '(("Terry Clark" "Networks" "Distributed Systems" "Distributed Systems") 
   ("Carol Walker" "Cryptography" "Software Design" "Multimedia"
         "Algorithms") 
   ("John McClelland" "Software Design" "Distributed Systems" "Networks"
         "Ethical/Social Issues" "Theory of Computation" "Algorithms") 
   ("Lisa Dale" "Networks" "Distributed Systems" "Theory of Computation" 
          "Ethical/Social Issues") 
   ("Arnold Freeman" "Operating Systems" "Databases" "Networks" 
          "Distributed Systems" "Architecture")  
   ("Terry Barnes" "Networks" "Artificial Intelligence" "Cryptography"
         "Architecture") 
...
  )
)

This entire list of lists may be loaded within a Scheme program with the statement:


(load "/home/walker/151s/labs/reviewer-directory.ss")

When a paper is submitted to the conference, the author specifies several related subject areas. In order to facilitate reviewing, the conference organizers wish to find reviewers whose expertise covers as many of the author-designated subject areas as possible.

New Alternatives (added 3/30/04)

Write either procedure find-all-reviewers or find-reviewers, as described below.
(Previously, this problem included only Option B; in this revision, you have another option.)

Option A

Define a Scheme procedure find-all-reviewers with the following properties

To be on the resulting list of reviewers, an individual may have many interests beyond those indicated for the paper -- as long as the paper's topics are all covered.

Option B

Define a Scheme procedure find-reviewers with the following properties

For example, consider the procedure call


(find-reviewers 5 "Networks" "Databases" "Artificial Intelligence") 

Results of this call should be as follows:

Note:

While this problem specifies the "best" reviewers for a paper, a similar approach might be applied to a roommate-matching program in which potential roommates list various traits and the program finds one or more roommates with the most traits in common. Dating services might use a similar algorithm as well.

Supplemental Problem 8: Multiplication of Three-Digit Integers

Write a procedure show-multiplication that reads two three-digit integers from the keyboard and then prints their product in the following format:


Enter first number:  749
Enter second number: 381


        749
    x   381
    –––––––
        749
      5992
     2247
    –––––––
     285369

Use the following cases as part of your testing:


(show-multiplication 749 381)
(show-multiplication 381 749)
(show-multiplication 599 100)
(show-multiplication 120 102)
(show-multiplication 102 120)

Supplemental Problem 9: Information on the 1997-1998 Iowa Senate

File /home/walker/151s/labs/ia-senate contains information about the members of the 1997-1998 Iowa Senate. After a title line and a blank line, a typical line has the following form:


Angelo          Jeff        44      Creston           IA 50801
Kramer          Mary        37      West Des Moines   IA 50265
Lundby          Mary        26      Marion            IA 52302-0563

Thus, a typical line gives the last name, the first name, the district number, the town of residence, the state (always IA), and the town's zip code. The information in these lines is arranged in columns.

Design and write a Scheme program that reads in data from this file and creates two output files, senators-by-city and senators-by-zip-code, in the current working directory. The senators-by-city file should contain the same data as the source file, in the same format (including capitalization), but with the lines arranged alphabetically by city (column 4). The other file, senators-by-zip-code, should contain a list of all senators in the following format


Jeff Angelo
Creston, IA 50801

A blank line should appear after each senator and city address. In this format, the name appears on a first line (first name, then last), and the city, a comma, the state, and the zip code are on the next line, separated by single spaces in the format shown. Note that a variation of this format (with a street address, if available) might be used for a mailing label.

Supplemental Problem 10: List Endings

Write and test a Scheme procedure endings that takes any non-empty list ls and returns a list of all of the non-empty ending lists of ls.


(endings '(a b c d e)) ===> ((a b c d e) (b c d e) (c d e) (d e) (e))
(endings '(5 12 12 0)) ===> ((5 12 12 0) (12 12 0) (12 0) (0))
(endings '(first second)) ===> ((first second) (second))
(endings '(only)) ===> ((only))

Supplemental Problem 11: Fibonacci Numbers

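In mathematics, the Fibonacci sequence is a sequence of positive integers defined as follows: the first two elements are both 1, and each later element is the sum of the two elements immediately before it.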

Thus, the first 10 elements of the Fibonacci sequence are: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55

Write and test a tail-recursive Scheme procedure Fibonacci which takes a parameter n and returns the nth element in the Fibonacci sequence. Also, write and test a procedure Fib-seq which takes a parameter n and returns a list containing the first n elements of the Fibonacci sequence.


(Fibonacci 10)  ===> 55
(Fib-seq 10)    ===> (55 34 21 13 8 5 3 2 1 1)

Supplemental Problem 12: Determination of the Following Date

Programs commonly must determine what date comes after a given one. Write a procedure next-date which returns the date which follows the specified one (e.g., April 1, 1999 follows March 31, 1999). The date returned should be formatted as a list.

If next-date is given an invalid date as a parameter, it should return an error message rather than a date.

The following examples illustrate how next-date should work:


(next-date 'january 8 1999)     ===> (january 9 1999)
(next-date 'february 28 1999)   ===> (march 1 1999)
(next-date 'february 28 2000)   ===> (february 29 2000)
(next-date 'february 29 1999)   ===> "invalid date"
(next-date 'december 31 1999)   ===> (january 1 2000)
(next-date 'henry 31 2000)      ===> "invalid date"

Supplemental Problem 13: Unusual Canceling

The fraction 64/16 has the unusual property that its reduced value of 4 may be obtained by "canceling" the 6 in the numerator with that in the denominator. Write a program to find the other fractions whose numerators and denominators are two-digit numbers and whose values remain unchanged after "canceling."

Of course, some fractions trivially have this property. For example, when numerator and denominator are multiples of 10, such as 20/30, one can always "cancel" the zeroes. Similarly, cancellation is always possible when the numerator and denominator are equal, as in 22/22. Your program should omit these obvious cases.
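
One detail that may help: Scheme's exact rational arithmetic lets two fractions be compared without any rounding. The sketch below checks only the 64/16 example from the problem statement; it does not search for other fractions. The helper names tens and ones are made up for this illustration.

```scheme
;;; Given:  a two-digit integer n
;;; Result: the tens digit (tens) or ones digit (ones) of n
(define tens (lambda (n) (quotient n 10)))
(define ones (lambda (n) (remainder n 10)))

;;; In 64/16 the tens digit of the numerator matches the ones digit
;;; of the denominator, so those are the digits being "canceled".
(= (tens 64) (ones 16))                  ; #t : both digits are 6

;;; After canceling, 4/1 remains; with exact arithmetic the two
;;; fractions can be compared with = directly.
(= (/ 64 16) (/ (ones 64) (tens 16)))    ; #t : 64/16 and 4/1 are both 4
```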

Supplemental Problem 14: Roman Numerals

Write a procedure that reads an integer between 1 and 1000 from the keyboard and prints the equivalent number in Roman numerals.

Supplemental Problem 15: Anagrams

Sometimes one can simplify a problem by removing the parts that don't matter, and then looking at what's left.

For instance if you wanted to figure out if two collections of "stuff" were the same, you might remove matching items from each collection until you see if there are items left over. If you have leftover items, the collections were different, and if both collections become empty at the same time, they are identical.
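
The matching idea just described can be sketched for lists as follows. This is an illustration of the technique only, under the assumption that the collections are given as Scheme lists; the names remove-first and same-items? are invented here, and adapting the idea to strings is left as part of the problem.

```scheme
;;; Given:  an item and a list ls
;;; Result: a list like ls, but without the first occurrence of item
;;; Postcondition(s): if item is not in ls, the result equals ls
(define remove-first
  (lambda (item ls)
    (cond ((null? ls) '())
          ((equal? item (car ls)) (cdr ls))
          (else (cons (car ls) (remove-first item (cdr ls)))))))

;;; Given:  two lists ls1 and ls2
;;; Result: #t if the lists hold the same items the same number of
;;;         times (in any order), #f otherwise
(define same-items?
  (lambda (ls1 ls2)
    (cond ((null? ls1) (null? ls2))     ; both must run out together
          ((member (car ls1) ls2)       ; found a match: remove the pair
           (same-items? (cdr ls1) (remove-first (car ls1) ls2)))
          (else #f))))                  ; leftover item with no match

(same-items? '(a b b c) '(b c a b))  ; #t
(same-items? '(a b) '(a b b))        ; #f
```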

Use this technique to write a program which will determine whether or not two strings are anagrams of each other.

Test it by deciding whether or not "one plus twelve" is an anagram of "eleven plus two".

Supplemental Problem 16: Game Simulation Exercise

This exercise involves the completion of one of the following problems:

  1. Racquetball: Racquetball is a game played by two players on an indoor, enclosed court. Scoring proceeds as follows:

    A player can only score points while she has the serve. A player loses the serve when she loses a volley, but no points are scored on the change of serve. Play continues until either the score is 11-0, which is a shut-out, or one player scores 21 points. (The rules do not require a player to win by two points.)

    Write a program that reads the probability of Player A winning a volley and then simulates the playing of 500 games with Player A having the first serve on each game. Record the number of wins (including shut-outs) for each player and the percentage of wins. Also record the number of shut-outs for each player.

  2. Volleyball: Volleyball is a game played on a court by two teams, separated by a net. Scoring proceeds much the same way as in racquetball (as explained above). In particular, scoring starts at 0-0. A team can only score points while it serves. A team loses the serve when it loses a volley, but no points are scored on the change of serve. Play continues until one team scores 15 points, and a team must win by at least two points (if the score is 15-14, play must continue until one team leads by 2 points). There is no special rule for ending a game due to a shut-out.

Write a procedure that has as parameter the probability of Team A winning a volley and then simulates the playing of 500 games with Team A having the first serve on each game. The procedure should print the number of wins for each team and the percentage of wins. The procedure also should print the number of shut-outs for each team.

Hint: Since the flow of activity in this problem is a bit complex, you might write an initial outline describing the overall flow of work for the simulation and/or within a game. Then write main procedures which follow this overall structure and which call other procedures to handle specific details. For example, an overall simulation procedure for the entire 500 games might call an individual game procedure to find the result of each game. The overall simulation procedure then would only have to tabulate results — the individual game procedure would handle details of a game.

Supplemental Problem 17: Reading From A File

Gemstones are attractive forms of rock crystal, commonly used for decoration and in jewelry. Gemstones also have interesting mineral properties. Gemstones may be classified in a variety of ways, including chemical composition, crystal structure, color, specific gravity, refractive index, and hardness:

  1. Chemical Composition: While some gemstones are primarily composed of atoms of one element (e.g., diamonds are mostly carbon, with coloring coming from traces of other elements), other gemstones are made up of atoms of several elements (e.g., mica molecules include oxygen, hydrogen, silicon, aluminum, iron, and/or many others). On-line sources of information include general references (e.g., Common Mineral Groups) and references to specific minerals (e.g., micas).

  2. Color may be classified informally (e.g., red, yellow, etc.) or more formally by viewing thin slices of mineral crystals through the microscope, using polarized light (see, for example, Minerals under the Microscope).

  3. Specific Gravity is a measure of the density of a mineral. More precisely, specific gravity is the ratio of the weight of the mineral to the weight of an equal volume of water. More details are available from various on-line sources (see, for example, John Betts Fine Minerals for specific gravity).

  4. Refractive Index provides a measure of how much light bends within a crystal. The higher the refractive index, the more bending and the more brilliant a crystal is likely to appear. For more information, see various on-line sources, such as Refractive Index.

  5. Crystal Structure: Crystals typically have one of several standard shapes or structures, including cubic, tetragonal, orthorhombic, hexagonal, monoclinic, and triclinic. While the details of such structures are beyond the scope of this problem, the World Wide Web contains many useful references, including crystal forms (at the macro-level) and the (atomic-level) representation of structures prepared as part of lecture series by S. J. Heyes.

  6. Hardness often is measured on the (nonlinear) Mohs Scale, which associates a hardness number to each mineral, from 1 (softest) to 10 (hardest):

    1. Talc
    2. Gypsum
    3. Calcite
    4. Fluorite
    5. Apatite
    6. Orthoclase
    7. Quartz
    8. Topaz
    9. Corundum
    10. Diamond

    As a comparison, a fingernail has hardness 2.5, glass has hardness 5.5, and a steel file has hardness 6.5. Minerals of the same hardness should not scratch each other, but a mineral of one hardness will scratch minerals with a lower hardness number.
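
For illustration, the scratch rule just stated can be written as a small Scheme sketch. The names mohs-scale, hardness, and scratches? are made up here, and storing the scale as a list of (mineral number) entries is only one possible representation.

```scheme
;;; The ten reference minerals of the Mohs Scale, as listed above
(define mohs-scale
  '((talc 1) (gypsum 2) (calcite 3) (fluorite 4) (apatite 5)
    (orthoclase 6) (quartz 7) (topaz 8) (corundum 9) (diamond 10)))

;;; Given:  a symbol naming one of the reference minerals
;;; Result: that mineral's hardness number
;;; Precondition(s): mineral appears in mohs-scale
(define hardness
  (lambda (mineral)
    (cadr (assq mineral mohs-scale))))

;;; Given:  two reference minerals
;;; Result: #t if the first will scratch the second, #f otherwise
(define scratches?
  (lambda (mineral1 mineral2)
    (> (hardness mineral1) (hardness mineral2))))

(scratches? 'quartz 'calcite)  ; #t
(scratches? 'talc 'gypsum)     ; #f
(scratches? 'topaz 'topaz)     ; #f : same hardness, no scratch
```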

File /home/walker/151s/labs/gems.txt contains information on several gemstones, including color, hardness, specific gravity, and refractive index. Within the file, each line contains information about a specific gemstone.

Here are a couple of sample lines, and a character 'ruler' to show how wide the fields are:

          11111111112222222222333333333344444444445555555555666666666677777
012345678901234567890123456789012345678901234567890123456789012345678901234

                Zircon        RED           7.5         4.50         1.95
                 Topaz     YELLOW             8         3.53         1.62

To clarify, the names of the gemstones come first in a line and are right-justified in a column. The colors come next, followed by hardness (on a scale 1 to 10), then specific gravity, and finally refractive index (generally between 1.3 and 2.5).
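
When fields sit in fixed columns like this, one common approach is to cut each field out with substring and then discard the padding. The sketch below assumes column boundaries read off the ruler and sample lines above (name in columns 0-21, color in 22-32, hardness in 33-46); these boundaries should be checked against the actual file. The names trim-spaces and sample-line are invented for the illustration, and the sample line is rebuilt with make-string so that the spacing is explicit.

```scheme
;;; Given:  a string str
;;; Result: a copy of str without leading or trailing spaces
(define trim-spaces
  (lambda (str)
    (let loop ((start 0) (end (string-length str)))
      (cond ((and (< start end)
                  (char=? (string-ref str start) #\space))
             (loop (+ start 1) end))
            ((and (< start end)
                  (char=? (string-ref str (- end 1)) #\space))
             (loop start (- end 1)))
            (else (substring str start end))))))

;;; The Zircon line, with the assumed field widths 22, 11, 14, 13, 13
(define sample-line
  (string-append (make-string 16 #\space) "Zircon"
                 (make-string 8 #\space)  "RED"
                 (make-string 11 #\space) "7.5"
                 (make-string 9 #\space)  "4.50"
                 (make-string 9 #\space)  "1.95"))

(trim-spaces (substring sample-line 0 22))                    ; "Zircon"
(trim-spaces (substring sample-line 22 33))                   ; "RED"
(string->number (trim-spaces (substring sample-line 33 47)))  ; 7.5
```

Converting the hardness field with string->number makes it possible to compare it numerically with the requested minimum hardness.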

Write a program that will let you extract the names of gemstones of a certain color and at least a particular hardness.

If this program is invoked as:

(find-by-color-and-hardness 8 "RED" "/home/walker/151s/labs/gems.txt")

it should return:

               Diamond        RED            10         3.52         2.42
        Topaz (yellow)        RED             8         3.53         1.63
                Spinel        RED             8         3.60         1.72
        Spinel (synth)        RED             8         3.63         1.73
           Chrysoberyl        RED           8.5         3.71         1.75
              Corundum        RED             9         3.99         1.77
               Painite        RED             8         4.01         1.80
        Cubic Zirconia        RED             8         5.75         2.15

Extra credit: Display only the name, color and hardness fields, and left justify the name, instead of right justifying it:

Diamond             RED            10
Topaz (yellow)      RED            8
Spinel              RED            8
Spinel (synth)      RED            8
Chrysoberyl         RED            8.5
Corundum            RED            9
Painite             RED            8
Cubic Zirconia      RED            8

Supplemental Problem 18: City Data

The file /home/walker/151p/labs/lab26.dat contains several items of information about large American cities. More specifically, in /home/walker/151p/labs/lab26.dat , each entry consists of the name of the city (line 1), the county or counties (line 2) and the state (line 3) in which it is situated, the year in which it was incorporated (line 4), its population as determined by the census of 1980 (line 5), its area in square kilometers (line 6), an estimate of the number of telephones in the city (line 7), and the number of radio stations (line 8) and television stations (line 9) serving the city. Thus a typical entry reads as follows:


Albuquerque
Bernalillo
New Mexico
1891
331767
247
323935
14
5

A blank line follows each entry, including the last.

Write a procedure which has a filename as parameter and which answers the following questions about the cities represented in the data file.

  1. Which of those cities has the highest population density (population divided by area)?

  2. Which of these cities have over one million telephones?

  3. Which city has the lowest per capita number of radio and television stations (together)?

The answers to each of these questions should be printed neatly and clearly by the procedure.

Supplemental Problem 19: File Analysis

Write a procedure file-analysis that takes the name of a file as its argument, opens the file, reads through it to determine the number of words in each sentence, displays the total number of words and sentences, and computes the average number of words per sentence. The results should be printed in a table (at standard output), such as shown below:


     This program counts words and sentences in file "comp.text".

     Sentence:  1    Words: 29
     Sentence:  2    Words: 41
     Sentence:  3    Words: 16
     Sentence:  4    Words: 22
     Sentence:  5    Words: 44
     Sentence:  6    Words: 14
     Sentence:  7    Words: 32

     File "comp.text" contains 198 words in 7 sentences
     for an average of 28.3 words per sentence.

In this program, you should count a word as any contiguous sequence of letters, and apostrophes should be ignored. Thus, "word", "sentence", "O'Henry", "government's", and "friends'" should each be considered as one word.

Also in the program, you should think of a sentence as any sequence of words that ends with a period, exclamation point, or question mark.
Exception: A period after a single capital letter (e.g., an initial) or embedded within digits (e.g., a real number) should not be counted as being the end of a sentence.
White space, digits, and other punctuation should be ignored.

Supplemental Problem 20: Files for the Placement Program

The expert systems lab describes a program which provides a tentative placement of incoming college students for their first mathematics and computer science courses.

When this system is used each year to make recommendations to incoming students, the Registrar's Office sends the Department of Mathematics and Computer Science a file of student transcript information. A fictional file of this type is available in /home/walker/261/labs/student.raw-data.

Since the expert system is written in LISP, which is list-oriented, the placement program takes input from a file containing data for each student as a separate list. For example, a typical entry might be:

("Person I M A" ( (ACT 28) (SemOfEng 8)(SemOfMath 8)
         (Grades 2.75)(SemOfPCalc 1)(PCalcGrades 3.00)
         (SemOfCS 2)(CSGrades 4.00) (TSemUnits 36) ) 
     (Campus-Box |20-01| ) (Adviser "My Friend") )

The full file for the above fictional students is available in ~walker/261/labs/student.data. Here, the person's name is the first entry in the list. The second list component is a list of attributes. The person's mailbox and advisor, when known, are the remaining elements of the main list.

Within the fictional student file, it is worthwhile to note several characteristics which add complexity to this placement problem:

Such variations are common in applications involving expert systems.

Note that the list of attributes is a list of pairs, where each entry (e.g., (ACT 28)) gives the type of information first (e.g., ACT), followed by the data. Such a list of pairs is called an association list, and such lists are a particularly common mechanism for storing data within expert systems. The last part of the lab on pairs describes one common Scheme procedure for processing data within an association list.
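
As a small illustration, the standard Scheme procedure assq searches an association list for the entry whose first element matches a given symbol. Whether assq is the procedure the lab has in mind is an assumption; the names attributes and lookup below are invented for this sketch, and the data is taken from the sample entry above.

```scheme
;;; Part of the attribute list from the sample entry
(define attributes
  '((ACT 28) (SemOfEng 8) (SemOfMath 8) (Grades 2.75)))

;;; Given:  a symbol key and an association list alist
;;; Result: the data stored with key, or #f if key does not appear
;;; Precondition(s): each entry of alist has the form (symbol data)
(define lookup
  (lambda (key alist)
    (let ((entry (assq key alist)))
      (if entry
          (cadr entry)
          #f))))

(assq 'ACT attributes)       ; (ACT 28) : the whole entry
(lookup 'Grades attributes)  ; 2.75
(lookup 'SAT attributes)     ; #f : no such entry
```

Returning #f for a missing key is one way to cope with entries that lack some attributes.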

This example of files illustrates a common circumstance in computing, where data from one application (e.g., the Registrar's file) comes in one format, while data for another application (e.g., the placement program) requires a second format.

Problem: Write a program which reads a data file in the Registrar's format and produces a corresponding file in the list-oriented format.


This document is available on the World Wide Web as

http://www.cs.grinnell.edu/~walker/courses/151.fa05/suppl-prob.shtml

created 6 February 1997
last revised 26 October 2005
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.