CSC151 2009F, Class 44: Binary Search
Admin:
* I'm a bit disappointed with the increasing number of people missing
class.
* Missing when you're sick is acceptable.
* Missing for other reasons is not so acceptable.
* But whatever reason makes you miss, you have a responsibility to
notify me.
* I have been trying to acknowledge the stressful part of the semester
by cutting work.
* There is no reading for Friday.
* Use the extra time to work on your projects.
* Project proposals due today.
* Sketches
* Reminder: Projects are due next Tuesday.
* EC for
* Thursday's convocation.
* Thursday's CS Extra on Bioinformatics (Thursday at 4:15, 3821).
* Friday's CS Table on "Computational Thinking" (noon, PDR)
* Readings available in class today.
* Friday night's swim meet.
* As You Like It
* Chamber Ensemble Saturday at 4pm
* Amazing Percussion plus Flute ensemble Friday night
Overview:
* The problem of searching.
* Analyzing algorithmic efficiency.
* Making searching more efficient using divide-and-conquer.
Recent topic in class: Association Lists
* Basic idea: Given a list of values, search for a particular value
* Requires a particular arrangement of your data
* List of values (vs. a vector or tree)
* Each value must be a list or pair
* The part of the value used for searching is the car
* We can extend: Instead of looking at the car, we can look at any
element
* Yay recursion!
* We have a straightforward and seemingly correct solution to the
problem of searching.
* Question: Can we make searching more efficient?
* Basic recursion-based searching in lists:
* Requires you to look at each element, in turn, until you find it
or run out of elements.
* "On average", about N/2 elements (N = (length lst)) to look at
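The one-by-one search described above can be sketched as follows. This is a minimal sketch, not necessarily the procedure from the reading: the name search-list, its get-key parameter, and the sample entries are all hypothetical, chosen to show searching on any element of an entry rather than only the car.

```scheme
;;; Hypothetical helper, for illustration only.
;;; Search lst for the first entry whose key (as extracted by get-key)
;;; is equal to key.  Returns that entry, or #f if no entry matches.
(define search-list
  (lambda (key lst get-key)
    (cond
      ;; Ran out of elements: the key is not there.
      ((null? lst) #f)
      ;; The first entry matches: we are done.
      ((equal? key (get-key (car lst))) (car lst))
      ;; Otherwise, look at the remaining entries in turn.
      (else (search-list key (cdr lst) get-key)))))

;;; Made-up sample data: (last-name first-name extension)
(define sample-directory
  (list (list "Okafor" "Ada" "x3003")
        (list "Chen" "Bo" "x2002")
        (list "Zhang" "Cy" "x4004")))
```

With this sketch, (search-list "Chen" sample-directory car) searches by last name, while (search-list "x4004" sample-directory caddr) searches by the third element. In the worst case, either call looks at every entry.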
* Can we make this better (as we do in searching phone books)?
* "Find the first letter of the last name"
* Then look one by one
* Can we generalize?
* Look in the middle
* Magically, the thing you're looking for is in the middle
* The thing in the middle comes after the thing you're looking for:
Throw away the second half and start all over (recurse!)
* The thing in the middle comes before the thing you're looking for:
Throw away the first half and start all over (recurse!)
* Is this really any better than "look at each thing in turn"?
* Yes, we're throwing away massive amounts of stuff
* Suppose we were dealing with the NYC phone directory
8 million
4 million
2 million
1 million
500 K
250 K
125 K
64 K
32 K
16 K
8 K
4 K
2 K
1 K
500
250
125
63
32
16
8
4
2
1
* About log_2(N) steps
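The halving argument above can be checked with a short procedure. This is only a sketch: halvings is a hypothetical name, and it uses integer division, so it rounds down where the hand-computed column sometimes rounds up.

```scheme
;;; Hypothetical helper, for illustration only.
;;; Count how many times n can be halved (using integer division)
;;; before only one element remains.
(define halvings
  (lambda (n)
    (if (<= n 1)
        0
        (+ 1 (halvings (quotient n 2))))))
```

For the directory of 8 million entries, (halvings 8000000) is 22, compared with about 4 million entries examined on average when looking at each one in turn.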
Requirements:
* The thing we're searching must already be sorted by the key
* It needs to be easy to find the middle element
* And to throw away half
* Won't work for lists: Can't find the middle element quickly
* But it's quick to find the middle element of a vector
* How do we throw away half a vector quickly?
* We just keep track of the positions of the portion of the vector
of interest
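The difference between lists and vectors can be sketched with two small procedures. Both names are hypothetical and exist only for illustration; the point is that the vector version never copies or discards anything, it only changes the two positions that bound the range of interest.

```scheme
;;; Hypothetical helpers, for illustration only.

;;; The middle of a list: list-ref must step through half of the
;;; elements (and length through all of them) to get there.
(define list-middle
  (lambda (lst)
    (list-ref lst (quotient (length lst) 2))))

;;; The middle of positions lower..upper of a vector: one bit of
;;; arithmetic and one vector-ref, however large the range is.
(define range-middle
  (lambda (vec lower upper)
    (vector-ref vec (quotient (+ lower upper) 2))))
```

Given (define v (vector 10 20 30 40 50 60 70)), the call (range-middle v 0 6) looks at 40. "Throwing away" the second half is just (range-middle v 0 2), and throwing away the first half is just (range-middle v 4 6).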
Suppose we're writing binary search
(define binary-search
  (lambda (value-we-are-looking-for
           the-vector-we-are-looking-through
           a-procedure-that-gives-back-the-key-of-an-entry
           a-way-to-compare-keys)
    ;; The kernel tracks the range of positions still of interest.
    (let kernel ((starting-position-of-range-of-interest 0)
                 (ending-position-of-range-of-interest
                  (- (vector-length the-vector-we-are-looking-through) 1)))
      ...
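One way the idea might be carried through is sketched below. It is not the version developed in class: the shorter parameter names, the name binary-search-sketch, the choice to return the matching entry (or #f when there is none), and the assumption that the comparison is a strict "comes before" test such as string<? are all assumptions made for this illustration.

```scheme
;;; A sketch only; names and the return convention are assumptions.
;;; vec must already be sorted by key, in the order given by less-than?.
(define binary-search-sketch
  (lambda (key vec get-key less-than?)
    (let kernel ((lower 0)
                 (upper (- (vector-length vec) 1)))
      (if (> lower upper)
          ;; The range of interest is empty: the key is not present.
          #f
          (let* ((middle (quotient (+ lower upper) 2))
                 (entry (vector-ref vec middle))
                 (middle-key (get-key entry)))
            (cond
              ;; The middle comes after the key: throw away the second half.
              ((less-than? key middle-key)
               (kernel lower (- middle 1)))
              ;; The middle comes before the key: throw away the first half.
              ((less-than? middle-key key)
               (kernel (+ middle 1) upper))
              ;; Neither comes before the other: we found it.
              (else entry)))))))

;;; Made-up sample data, already sorted by last name.
(define sorted-directory
  (vector (cons "Alvarez" "x1001")
          (cons "Chen" "x2002")
          (cons "Okafor" "x3003")
          (cons "Zhang" "x4004")))
```

Here (binary-search-sketch "Okafor" sorted-directory car string<?) finds the Okafor entry after looking at two entries, and searching for "Baker" gives #f once the range of interest becomes empty.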