# Binary Search

Summary: In this laboratory, we explore different issues related to searching.

## Exercise 0: Preparation

a. Copy the `binary-search` procedure and the two lists of objects from the end of this lab into your definitions window.

b. Read through the definition of `binary-search`, and make sure that you understand the role of `get-key` in that definition.
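To make the roles of `get-key` and `may-precede?` concrete, here is a small sketch. It assumes that each object is a list whose first element is the name and whose element at position 5 is the width (an inference from how `objects-by-width` is ordered). The names `sample-object`, `get-name`, and `get-width` are hypothetical and used only for illustration.

```scheme
; One object, copied from the lists at the end of this lab.
(define sample-object (list "Heather" "rectangle" "white" 100 140 35 50))

; Hypothetical key extractors: the name is the first element,
; and the width is assumed to be the element at position 5.
(define get-name car)
(define get-width (lambda (obj) (list-ref obj 5)))

(get-name sample-object)       ; "Heather"
(get-width sample-object)      ; 35

; A candidate may-precede? for names: alphabetical order on strings.
(string<=? "Heather" "Ira")    ; #t
(string<=? "Ira" "Heather")    ; #f
```

Whichever extractor you pass as `get-key` determines which field the search compares, so the vector must be sorted by that same field.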

## Exercises

### Exercise 1: Observing Binary Search

a. Verify that binary search can correctly find the entry for "Heather" in `objects-by-name`.

b. Verify that binary search can correctly find the entry for an object of your choice in `objects-by-name`.

c. Verify that binary search can correctly find the first entry in `objects-by-name`. You will need to supply the name associated with that entry.

d. Verify that binary search can correctly find the last entry in `objects-by-name`. You will need to supply the name associated with that entry.

e. Verify that binary search terminates and returns -1 for something that would fall in the middle of the vector and is not there. That is, pick a name that starts with M or N and that does not appear in the vector.

f. Verify that binary search terminates and returns -1 for something that comes before the first entry in `objects-by-name`. You will need to pick a name that alphabetically precedes `"Amy"`.

g. Verify that binary search terminates and returns -1 for something that comes after the last entry. You will need to pick a name that alphabetically follows `"Zed"`.
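As a hedged illustration of what such a check might look like, the sketch below calls `binary-search` on a three-element vector. The definition of `binary-search` is repeated from the end of this lab (with comments trimmed) only so that the sketch runs on its own. The vector `sample-objects` is hypothetical, and the choice of `car` and `string<=?` assumes that names are the keys and are compared alphabetically.

```scheme
; Copy of binary-search from "Some Useful Code", comments trimmed.
(define binary-search
  (lambda (vec get-key may-precede? key)
    (let search-portion ((lower-bound 0)
                         (upper-bound (- (vector-length vec) 1)))
      (if (> lower-bound upper-bound)
          -1
          (let* ((midpoint (quotient (+ lower-bound upper-bound) 2))
                 (middle-key (get-key (vector-ref vec midpoint)))
                 (left? (may-precede? key middle-key))
                 (right? (may-precede? middle-key key)))
            (cond
              ((and left? right?) midpoint)
              (left? (search-portion lower-bound (- midpoint 1)))
              (else (search-portion (+ midpoint 1) upper-bound))))))))

; A hypothetical three-element vector, sorted by name.
(define sample-objects
  (vector
   (list "Amy" "ellipse" "blue" 90 50 25 5)
   (list "Heather" "rectangle" "white" 100 140 35 50)
   (list "Zed" "rectangle" "grey" 90 60 25 5)))

(binary-search sample-objects car string<=? "Heather")  ; 1
(binary-search sample-objects car string<=? "Mike")     ; -1
```

Note that the result is a position in the vector, not the object itself. To see the entry, apply `vector-ref` to the vector and the returned position.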

### Exercise 2: Counting Recursive Calls

It is often useful when exploring a recursive algorithm to observe the steps the algorithm performs. In Scheme, we can sometimes observe steps in recursive calls by inserting code to display the parameters of the procedure at each recursive call.
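The technique can be sketched on a simpler recursive procedure before you apply it to `binary-search`. The procedure `halvings` below is hypothetical and not part of the lab code. It repeatedly halves a number, printing the parameter at each call to its kernel and counting the recursive calls.

```scheme
; Hypothetical example: count how many times n can be halved
; (with integer division) before it reaches 1.
(define halvings
  (lambda (n)
    (let kernel ((remaining n)
                 (steps 0))
      ; Show the parameter each time the kernel is called.
      (display "remaining: ")
      (display remaining)
      (newline)
      (if (<= remaining 1)
          steps
          (kernel (quotient remaining 2) (+ steps 1))))))

; Prints 26, 13, 6, 3, and 1 on separate lines, then returns 4.
(halvings 26)
```

The calls to `display` and `newline` come first in the body of the kernel, so they run on every call, including the last one.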

a. Add calls to `display` and `newline` to the definition of `binary-search`, so that it prints out the values of `lower-bound` and `upper-bound` each time the kernel procedure is called.

b. Redo steps a-g of Exercise 1 and report the number of steps each search took.

c. Optional: You might also use `define$` and `analyze` to do the counting for you.

### Exercise 3: Duplicate Keys

a. What do you expect binary search to do if there are entries with duplicate keys?

b. Add two more entries with a key of `"Otto"` and two more entries with a key of `"Amy"`.

c. Which of the three do you expect binary search to return if you search for Otto?

e. Which of the three do you expect binary search to return if you search for Amy?

g. What does your experience in this exercise suggest about what binary search will do with duplicate keys?

### Exercise 4: Searching by Width

As you may have observed, `objects-by-width` contains the same twenty-six objects as `objects-by-name`, but with the objects organized by width rather than by name.

a. Write an expression to find an object with a width of 45.

b. Write an expression to find an object whose width is 40.

c. What, if anything, can you say about what binary search does when searching for a key that appears more than once in the vector?

### Exercise 5: Binary Search, Revisited

It is sometimes useful to learn not just that something is not in the vector, but where it would fall if it were in the vector. Revise `binary-search` (both the code and the documentation) so that it returns a "half value" if the key being searched for belongs between two neighboring keys. For example, if the key being searched for is larger than the key at position 5 and smaller than the key at position 6, you should return 5.5. Similarly, if the key being searched for is smaller than the key at position 0, you should return -1/2. If the key being searched for is larger than the largest key, return `(- (vector-length vec) 0.5)`.
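One way to think about the half value: when the portion being searched becomes empty, the lower bound has just passed the upper bound, so the missing key belongs immediately before the position held by the lower bound. The sketch below shows that idea on a plain vector of numbers rather than on the lab's objects. The procedure `position-or-gap` and the vector `evens` are hypothetical, and the sketch deliberately omits `get-key` and `may-precede?`, so adapting it to `binary-search` is still up to you.

```scheme
; Hypothetical sketch: search a sorted vector of numbers and return
; either the position of key or a "half value" marking the gap
; where key would belong.
(define position-or-gap
  (lambda (vec key)
    (let kernel ((lower 0)
                 (upper (- (vector-length vec) 1)))
      (if (> lower upper)
          ; Empty portion: key belongs just before position lower.
          (- lower 0.5)
          (let* ((mid (quotient (+ lower upper) 2))
                 (mid-val (vector-ref vec mid)))
            (cond
              ((= key mid-val) mid)
              ((< key mid-val) (kernel lower (- mid 1)))
              (else (kernel (+ mid 1) upper))))))))

(define evens (vector 0 2 4 6 8))

(position-or-gap evens 4)    ; 2
(position-or-gap evens 5)    ; 2.5
(position-or-gap evens -3)   ; -0.5
(position-or-gap evens 9)    ; 4.5
```

This sketch returns inexact numbers such as -0.5. Scheme treats the exact rational -1/2 as numerically equal to -0.5, so either form can be compared with `=`.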

## For Those With Extra Time

### Extra 1: Counting Widths

As you may recall from Exercise 4, when searching for objects by width, we got only one of a variety of objects with a particular width. In some cases, it's useful to find not just some object of a particular width, but how many objects have that width, or have a width of that size or less.

a. One technique for finding out how many objects have a certain width or less is to step through the vector until we find the first object wider than the desired width. Using this technique, write a procedure, `(no-wider-than objects width)`, that, given a vector of objects sorted by width, finds the number of objects that are no wider than `width`.
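A minimal sketch of the stepping technique follows. It is one possible approach, not the required solution. It assumes that the width of an object is the element at position 5 of its list, and the names `object-width` and `sample-by-width` are hypothetical.

```scheme
; Assumption: the width is the element at position 5 of each object.
(define object-width (lambda (obj) (list-ref obj 5)))

; Step through the vector until we reach the end or find the first
; object wider than width. The position where we stop is the count.
(define no-wider-than
  (lambda (objects width)
    (let kernel ((pos 0))
      (if (or (= pos (vector-length objects))
              (> (object-width (vector-ref objects pos)) width))
          pos
          (kernel (+ pos 1))))))

; A hypothetical five-element vector, sorted by width (5 10 10 25 45).
(define sample-by-width
  (vector
   (list "Charlotte" "rectangle" "blue" 0 40 5 45)
   (list "Devon" "rectangle" "yellow" 80 0 10 5)
   (list "Otto" "rectangle" "red" 100 40 10 20)
   (list "Amy" "ellipse" "blue" 90 50 25 5)
   (list "Ned" "rectangle" "yellow" 0 50 45 15)))

(no-wider-than sample-by-width 10)   ; 3
(no-wider-than sample-by-width 4)    ; 0
(no-wider-than sample-by-width 50)   ; 5
```

Because positions start at 0, the position of the first wider object equals the number of objects that come before it.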

b. A more efficient way to find the position is to use a variant of binary search. Once again, our goal is to find the first object wider than the desired width. However, this time we use binary search to find that object. That is, you look in the middle. If the middle object is wider than the desired width, recurse on the left half, but also include the middle element since that may be the one we're looking for. If the middle object is not wider than the desired width, recurse on the right half. Write a new version of `no-wider-than` that uses this technique.

Note: For part b, you cannot call the `binary-search` procedure directly. Rather, you will need to use it as a template for your code. That is, copy the procedure and then make modifications as appropriate.
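The binary-search variant described in part b might be sketched as follows. This is one possible shape under stated assumptions, not the definitive solution: the width is assumed to be the element at position 5, the upper bound starts at the length of the vector (one past the last position) so that "no object is wider" has a natural answer, and `object-width` and `sample-by-width` are hypothetical names.

```scheme
; Assumption: the width is the element at position 5 of each object.
(define object-width (lambda (obj) (list-ref obj 5)))

; Find the position of the first object wider than width.
; Throughout, every object before lower is no wider than width, and
; every object at or after upper is wider than width.
(define no-wider-than
  (lambda (objects width)
    (let kernel ((lower 0)
                 (upper (vector-length objects)))
      (if (= lower upper)
          lower
          (let ((mid (quotient (+ lower upper) 2)))
            (if (> (object-width (vector-ref objects mid)) width)
                ; The middle object is wider, so it may be the first
                ; wider object: keep it in the left portion.
                (kernel lower mid)
                ; Otherwise the first wider object is to the right.
                (kernel (+ mid 1) upper)))))))

; A hypothetical five-element vector, sorted by width (5 10 10 25 45).
(define sample-by-width
  (vector
   (list "Charlotte" "rectangle" "blue" 0 40 5 45)
   (list "Devon" "rectangle" "yellow" 80 0 10 5)
   (list "Otto" "rectangle" "red" 100 40 10 20)
   (list "Amy" "ellipse" "blue" 90 50 25 5)
   (list "Ned" "rectangle" "yellow" 0 50 45 15)))

(no-wider-than sample-by-width 10)   ; 3
(no-wider-than sample-by-width 30)   ; 4
```

Since `mid` is always less than `upper` while the portion is nonempty, each recursive call shrinks the portion, so the search terminates.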

### Extra 2: Finding the First Element with a Particular Key

As you've observed, when a key is repeated, binary search picks one value with that key, but not necessarily the first value with that key. We might want to write a variant, `new-binary-search`, that uses the ideas of binary search to find the first element in the vector that contains the given key.

a. One strategy for implementing `new-binary-search` is to find some value with the given key (which binary search already does) and then to step left in the vector until you find the first value with that key. Implement `new-binary-search` using that strategy.

b. Of course, we use the binary search technique so that we don't have to step through elements one-by-one. Rewrite `new-binary-search` so that it continues to “divide and conquer” in its attempt to find the first element with that key.
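A hedged sketch of the divide-and-conquer idea: instead of stopping as soon as a matching key turns up, keep narrowing toward the first position whose key does not come strictly before the target, and only then check whether that position actually holds the target key. The code below is one possible shape, not the definitive solution. The upper bound starting one past the last position and the vector `sample-by-width` are assumptions made for illustration.

```scheme
(define new-binary-search
  (lambda (vec get-key may-precede? key)
    (let kernel ((lower 0)
                 (upper (vector-length vec)))
      (if (= lower upper)
          ; lower is now the first position whose key may follow key
          ; (or the length of the vector if there is no such position).
          ; It is a match only if its key may also precede key.
          (if (and (< lower (vector-length vec))
                   (may-precede? (get-key (vector-ref vec lower)) key))
              lower
              -1)
          (let ((mid (quotient (+ lower upper) 2)))
            (if (may-precede? key (get-key (vector-ref vec mid)))
                ; The middle key is at or after key: the first match,
                ; if any, is at mid or to its left.
                (kernel lower mid)
                ; The middle key is strictly before key: look right.
                (kernel (+ mid 1) upper)))))))

; A hypothetical vector sorted by width (5 10 10 10 25), where the
; width is assumed to be the element at position 5.
(define sample-by-width
  (vector
   (list "Charlotte" "rectangle" "blue" 0 40 5 45)
   (list "Devon" "rectangle" "yellow" 80 0 10 5)
   (list "Otto" "rectangle" "red" 100 40 10 20)
   (list "Ted" "rectangle" "black" 20 0 10 20)
   (list "Amy" "ellipse" "blue" 90 50 25 5)))

(define get-width (lambda (obj) (list-ref obj 5)))

(new-binary-search sample-by-width get-width <= 10)   ; 1
(new-binary-search sample-by-width get-width <= 20)   ; -1
```

The sketch relies on the same preconditions as `binary-search`: the vector is sorted, and two keys are equal exactly when each may precede the other.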

## Some Useful Code

```
;;; Procedure:
;;;   binary-search
;;; Parameters:
;;;   vec, a vector to search
;;;   get-key, a procedure of one parameter that, given a data item,
;;;     returns the key of a data item
;;;   may-precede?, a binary predicate that tells us whether or not
;;;     one key may precede another
;;;   key, a key we're looking for
;;; Purpose:
;;;   Search vec for a value whose key matches key.
;;; Produces:
;;;   match, a number.
;;; Preconditions:
;;;   The vector is "sorted".  That is,
;;;     (may-precede? (get-key (vector-ref vec i))
;;;                   (get-key (vector-ref vec (+ i 1))))
;;;     holds for all reasonable i.
;;;   The get-key procedure can be applied to all values in the vector.
;;;   The may-precede? procedure can be applied to all pairs of keys
;;;     in the vector (and to the supplied key).
;;;   The may-precede? procedure is transitive.  That is, if
;;;     (may-precede? a b) and (may-precede? b c) then it must
;;;     be that (may-precede? a c).
;;;   If two values are equal, then each may precede the other.
;;;   Similarly, if two values may each precede the other, then
;;;     the two values are equal.
;;; Postconditions:
;;;   If vec contains no element whose key matches key, match is -1.
;;;   If vec contains an element whose key equals key, match is the
;;;     index of one such value.  That is, key is
;;;       (get-key (vector-ref vec match))
(define binary-search
  (lambda (vec get-key may-precede? key)
    ; Search a portion of the vector from lower-bound to upper-bound
    (let search-portion ((lower-bound 0)
                         (upper-bound (- (vector-length vec) 1)))
      ; If the portion is empty
      (if (> lower-bound upper-bound)
          ; Indicate the value cannot be found
          -1
          ; Otherwise, identify the middle point, the element at that
          ; point and the key of that element.
          (let* ((midpoint (quotient (+ lower-bound upper-bound) 2))
                 (middle-element (vector-ref vec midpoint))
                 (middle-key (get-key middle-element))
                 (left? (may-precede? key middle-key))
                 (right? (may-precede? middle-key key)))
            (cond
              ; If the middle key equals the value, we use the middle value.
              ((and left? right?)
               midpoint)
              ; If the middle key is too large, look in the left half
              ; of the region.
              (left?
               (search-portion lower-bound (- midpoint 1)))
              ; Otherwise, the middle key must be too small, so look
              ; in the right half of the region.
              (else
               (search-portion (+ midpoint 1) upper-bound))))))))

(define objects-by-name
  (vector
   (list "Amy" "ellipse" "blue" 90 50 25 5)
   (list "Bob" "ellipse" "indigo" 80 40 35 30)
   (list "Charlotte" "rectangle" "blue" 0 40 5 45)
   (list "Danielle" "rectangle" "red" 0 140 35 15)
   (list "Devon" "rectangle" "yellow" 80 0 10 5)
   (list "Erin" "ellipse" "orange" 60 50 10 15)
   (list "Fred" "ellipse" "black" 0 110 30 30)
   (list "Greg" "ellipse" "orange" 110 10 35 50)
   (list "Heather" "rectangle" "white" 100 140 35 50)
   (list "Ira" "ellipse" "red" 100 100 5 50)
   (list "Janet" "ellipse" "black" 60 70 5 20)
   (list "Karla" "ellipse" "yellow" 20 110 25 10)
   (list "Leo" "rectangle" "yellow" 60 40 30 50)
   (list "Maria" "ellipse" "blue" 30 10 5 50)
   (list "Ned" "rectangle" "yellow" 0 50 45 15)
   (list "Otto" "rectangle" "red" 100 40 10 20)
   (list "Paula" "ellipse" "orange" 100 20 50 25)
   (list "Quentin" "ellipse" "black" 40 130 35 50)
   (list "Rebecca" "rectangle" "green" 110 70 25 35)
   (list "Sam" "ellipse" "white" 20 120 35 40)
   (list "Ted" "rectangle" "black" 20 0 10 20)
   (list "Urkle" "rectangle" "indigo" 40 110 10 5)
   (list "Violet" "rectangle" "violet" 80 80 50 20)
   (list "Xerxes" "rectangle" "blue" 60 130 25 35)
   (list "Yvonne" "ellipse" "white" 40 110 50 40)
   (list "Zed" "rectangle" "grey" 90 60 25 5)))

(define objects-by-width
  (vector
   (list "Charlotte" "rectangle" "blue" 0 40 5 45)
   (list "Ira" "ellipse" "red" 100 100 5 50)
   (list "Janet" "ellipse" "black" 60 70 5 20)
   (list "Maria" "ellipse" "blue" 30 10 5 50)
   (list "Devon" "rectangle" "yellow" 80 0 10 5)
   (list "Erin" "ellipse" "orange" 60 50 10 15)
   (list "Otto" "rectangle" "red" 100 40 10 20)
   (list "Ted" "rectangle" "black" 20 0 10 20)
   (list "Urkle" "rectangle" "indigo" 40 110 10 5)
   (list "Amy" "ellipse" "blue" 90 50 25 5)
   (list "Karla" "ellipse" "yellow" 20 110 25 10)
   (list "Rebecca" "rectangle" "green" 110 70 25 35)
   (list "Xerxes" "rectangle" "blue" 60 130 25 35)
   (list "Zed" "rectangle" "grey" 90 60 25 5)
   (list "Fred" "ellipse" "black" 0 110 30 30)
   (list "Leo" "rectangle" "yellow" 60 40 30 50)
   (list "Bob" "ellipse" "indigo" 80 40 35 30)
   (list "Danielle" "rectangle" "red" 0 140 35 15)
   (list "Greg" "ellipse" "orange" 110 10 35 50)
   (list "Heather" "rectangle" "white" 100 140 35 50)
   (list "Quentin" "ellipse" "black" 40 130 35 50)
   (list "Sam" "ellipse" "white" 20 120 35 40)
   (list "Ned" "rectangle" "yellow" 0 50 45 15)
   (list "Paula" "ellipse" "orange" 100 20 50 25)
   (list "Violet" "rectangle" "violet" 80 80 50 20)
   (list "Yvonne" "ellipse" "white" 40 110 50 40)))
```

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Janet Davis, Matthew Kluber, and Samuel A. Rebelsky. (Selected materials copyright by John David Stone and Henry Walker and used by permission.)

This material is based upon work partially supported by the National Science Foundation under Grant No. CCLI-0633090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit `http://creativecommons.org/licenses/by-nc/2.5/` or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.