The Tao of Computing:
A Down-to-Earth Approach to Computer Fluency
 
by Henry M. Walker Jones and Bartlett Publishers
 

Laboratory Exercise on Run-Time Experiments for Searching Algorithms

Elements of Efficiency

Although computers are often considered to work very quickly, only some algorithms proceed rapidly; others take much longer. This laboratory exercise provides an intuitive framework for the consideration of algorithm effectiveness, including the amount of time and memory required for an algorithm. When algorithms are applied to successively larger data sets, some algorithms may scale up nicely while others require more time than may be feasible. Also, when several algorithms are available to solve a problem, it is natural to wonder whether one solution is better than another. Altogether, we might use many criteria to evaluate such solutions, including the time required for execution and the amount of memory used.

For this laboratory exercise, we focus on algorithm execution time.

Algorithm Execution Time

In determining algorithm execution time, we may proceed in several ways:

  * Run the algorithm on a specific machine and measure how long it takes.
  * Analyze the individual instructions the algorithm executes.
  * Analyze, at a high level, the types of activities the algorithm performs.

Each of these approaches has advantages, but each also has drawbacks. Execution times on a specific machine normally depend upon details of the machine and on the specific data used. Timings may vary from data set to data set and from machine to machine, so experiments from one machine and one data set may not be very helpful in general.

The analysis of instructions may take into account the nature of the data -- for example, one might consider what happens in a worst case. Also, such analysis commonly is based on the size of the data being processed -- the number of items or how large or small the data are. This is sometimes called a microanalysis of program execution. Once again, however, the specific instructions may vary from machine to machine, and detailed conclusions from one machine may not apply to another.

A high-level analysis may identify types of activities performed, without considering exact timings of instructions. This is sometimes called a macroanalysis of program execution. This can give a helpful overall assessment of an algorithm, based on the size of the data. However, such an analysis cannot show fine variations among algorithms or machines.

For many purposes, it turns out that a high-level analysis provides adequate information to compare algorithms. For the most part, we follow that approach here.
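One way to make such a high-level analysis concrete is to count comparisons rather than measure clock time. The sketch below illustrates this idea for the two searches discussed later in this lab; the function names are our own, not taken from any program mentioned here.

```python
def linear_search_comparisons(items, target):
    """Return the number of comparisons a linear search makes."""
    count = 0
    for value in items:
        count += 1
        if value == target:
            break
    return count

def binary_search_comparisons(items, target):
    """Return the number of comparisons a binary search makes on a sorted list."""
    count = 0
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        count += 1
        if items[mid] == target:
            break
        elif items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return count

data = list(range(0, 20000, 2))   # 10,000 even integers: 0, 2, ..., 19998
print(linear_search_comparisons(data, 19998))   # examines every item
print(binary_search_comparisons(data, 19998))   # only a handful of probes
```

Counts like these do not depend on the machine or the programming language, which is exactly why a macroanalysis transfers from one computer to another.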

Linear and Binary Searches

In class, we have discussed both the linear search and the binary search.

  1. Write a paragraph or two describing how a linear search works.

  2. Write a paragraph or two describing how a binary search works.

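For reference, the two searches can be sketched in a few lines of Python. This is only an illustrative sketch, assuming a sorted list for the binary search; the function names are ours, not those of any course program.

```python
def linear_search(items, target):
    """Examine items one by one; return an index, or -1 if absent."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1

def binary_search(items, target):
    """Repeatedly halve a sorted list; return an index, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

data = [0, 2, 4, 6, 8, 10]
print(linear_search(data, 6))    # prints 3
print(binary_search(data, 6))    # prints 3
print(binary_search(data, 7))    # prints -1 (odd values are absent)
```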
In class, we have discussed how efficient each of these algorithms might be. Now we take a more concrete, experimental approach.

Experiments Regarding Search

We will search for random items in a sequence of even integers:

0, 2, 4, 6, ..., maximum

Technically, the structure holding such a collection of data is called an array. With the array data just described, a search should be successful if we are looking for an even, nonnegative integer that is not too large. The search should fail if the desired integer is negative, odd, or too large. In our experiment, we will pick 20 integers at random as candidates for the search: the first 10 are chosen so that the search will succeed, and the last 10 so that the search will fail.
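The data setup just described might be sketched as follows. This is a hypothetical illustration, not the actual experiment code; the variable names are ours. Choosing odd values guarantees the second group of searches will fail, since the array holds only even integers.

```python
import random

maximum = 9998
data = list(range(0, maximum + 1, 2))   # even integers: 0, 2, 4, ..., maximum

# 10 values drawn from the array, so searches for them will succeed ...
present = [random.choice(data) for _ in range(10)]

# ... and 10 odd values, which can never appear in the array, so
# searches for them will fail.
absent = [random.randrange(1, maximum, 2) for _ in range(10)]
```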

The program searchTest performs both a linear search and a binary search for randomly selected items, and records the time required for this work. Due to the speed of the computer and the limitations of the clocking mechanism available, we repeat each experiment 100,000 times. This magnifies the times, so we can easily see differences.
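The timing idea behind such an experiment can be sketched as below. This is not the actual searchTest code, only an assumed outline: a single search finishes too quickly to measure reliably, so we repeat it many times and time the whole batch. (The sketch uses 10,000 repetitions rather than 100,000 simply to keep the demonstration quick.)

```python
import time

def linear_search(items, target):
    """Examine items one by one; return True if target is present."""
    for value in items:
        if value == target:
            return True
    return False

REPEATS = 10_000                      # searchTest uses 100,000
data = list(range(0, 2000, 2))        # 1000 even integers
target = 1998                         # last element: a slow case for linear search

start = time.perf_counter()
for _ in range(REPEATS):
    linear_search(data, target)
elapsed = time.perf_counter() - start
print(f"{REPEATS} linear searches took {elapsed:.3f} seconds")
```

The elapsed time for the batch is easy to measure even when one search alone would register as nearly zero.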

The mechanics of running these search experiments are as follows:

In this part of the lab, you are to gather experimental data regarding the search times for arrays of various sizes for both the linear and the binary search. Since the searchTest program performs 20 trials, you will need to combine the results in some way. It is suggested that you average the results for the 10 trials that succeed and compute a second average for the 10 trials that fail.

  1. Run searchTest for a variety of array sizes between 1000 and 50,000.

    Use a spreadsheet to record the times required for each trial and for each searching method. Use separate columns to record the array size, the search time for a trial of the linear search, and the search time for the binary search. Maintain separate statistics for trials in which the item was found and for trials in which it was not found.

  2. Use the spreadsheet to tabulate the average times for each algorithm for each array size -- with one time for when an item is found and a second time for when the item is not found. Organize this work in a separate part of the spreadsheet:

    1. One simple table should show sizes and times for successful searches with the linear search.
    2. One table should show sizes and times for unsuccessful searches with the linear search.
    3. One table should show sizes and times for successful searches with the binary search.
    4. One table should show sizes and times for unsuccessful searches with the binary search.
  3. Use the spreadsheet to plot the sizes and times for the four tables as separate graphs. The horizontal axis on each graph should indicate the size of the array, and the vertical axis should indicate time. You should conduct sufficient experiments so that a fairly consistent pattern emerges.

  4. Describe (in words) the nature of the graphs you have observed.

Other Experiments for Searching

Your experimental results, of course, relied on a particular program running on a specific machine. Actual numbers are likely to vary from one computer and program to another. However, we still would anticipate the same general patterns -- even if the numbers differ.

The following table gives experimental measurements for the average time required for a linear search for several search trials on another machine with another program.

  Array Size   Average Time If Value Found   Average Time If Value Not Found
  1000         620                           1248
  2000         1260                          2490
  4000         2540                          4960

  1. Estimate the time for an average linear search of arrays of size 1500, 3000, 8000, and 16000. Briefly justify your answers.

Continuing this experiment, the following table gives experimental measurements for the average time required for a binary search for several search trials.

  Array Size   Average Time If Value Found   Average Time If Value Not Found
  1000         33                            33
  2000         37                            37
  4000         41                            41

  1. Estimate the time for an average binary search of arrays of size 1500, 3000, 8000, and 16000. Briefly justify your answers.

Work To Be Turned In



This laboratory exercise coordinates with Chapter 6 of Walker, Henry M., The Tao of Computing: A Down-to-earth Approach to Computer Fluency, Jones and Bartlett, March 2004.


This document is available on the World Wide Web as
    http://www.cs.grinnell.edu/~walker/fluency-book/labs/run-time-searching.shtml

created 31 December 2003
last revised 27 February 2006
