[Instructions] [Search] [Current] [Syllabus] [Links] [Handouts] [Outlines] [Labs] [More Labs] [Assignments] [Quizzes] [Examples] [Book] [Tutorial] [API]

In this laboratory session, you will investigate a number of related
algorithms. Initially, you will consider some algorithms used to
find the smaller elements of an array. You will then consider ways
to enhance these algorithms in order to *sort* arrays (place
the elements in order).

The goals of this laboratory session are to:

- investigate a number of key algorithms, particularly important sorting algorithms;
- consider the effects of the constants in running times;
- consider the effects of more significant improvements to algorithm design;
- detour into the generation of random sequences; and
- further your skills at analyzing algorithms.

Your instructor will tell you which of the proposed experiments you are to perform.

**Prerequisite skills:**

- Arrays
- Loops
- Recursion
- Timing

**Required files:**

If you've done any reading about algorithm analysis, you've learned that
computer scientists tend to analyze algorithms in terms of an upper
bound on their expected or worst-case running times, and that they
express those running times in terms of an unknown constant times a
function of the size of the input. Such running times are written
O(*the function*) and pronounced "big O of *the
function*" or "order *the function*".

For example, if we were deleting the smallest element of an array, we might say that it takes O(n) time, where n is the number of elements in the array. Why? There may be cases in which it takes less. For example, if we know the smallest element is at the end of the array, then we can probably delete it in one step. But how do we determine that it's the smallest element? Usually by comparing it to all the other elements. In addition, if we don't want to leave gaps in the array, we may need to shift all the elements left one space. If we end up deleting the leftmost element, that's another n "steps".

When we say that a method requires O(f(n)) steps, we mean that there is
some constant, `c`, such that the number of steps for an input
of size n is never more than (but sometimes less than) c*f(n), no matter
what we count as steps. (The choice of c may depend on our definition
of "step".) Different methods with the same big-O running time may have
very different constants. At the same time, choice of a different
algorithm with a "smaller" function can have a much bigger impact on
the actual running time of an algorithm, even when the smaller function
has a larger constant.

In the following discussion and subsequent experiments, we will investigate these issues in more depth.

Suppose you were asked to find the smallest element of a sequence. You might do this by assuming that the first element is the smallest and then stepping through the remaining elements, updating your estimate of the smallest whenever you found a smaller element. In pseudocode,

    guess = the first element of the sequence
    for each remaining element of the sequence, e
      if (e < guess) then
        guess = e;
      end if
    end for

Using arrays in Java, we might express this as

    /**
     * Compute the smallest element in the sequence.
     */
    public int smallest() {
      // Our guess as to the smallest element
      int guess = this.elements[0];
      // A counter variable
      int i;
      // Look through all subsequent elements
      for (i = 1; i < this.elements.length; ++i) {
        // If the element is smaller than our guess, then
        // update the guess
        if (this.elements[i] < guess) {
          guess = this.elements[i];
        } // if
      } // for
      // That's it, we're done
      return guess;
    } // smallest()

As a variation, we might write an `indexOfSmallest` method that
returns the index of the smallest element in a subsequence. Why
would we want such a method? As you've seen, whenever you write a
method for a sequence, it is helpful to write a similar method for
a subsequence. Why return an index rather than the actual value?
Because it will be helpful for the subsequent experiments.

Generalizing `smallest` in this way, we might write

    /**
     * Compute the index of the smallest element in the subsequence
     * given by lower bound lb and upper bound ub.
     */
    public int indexOfSmallest(int lb, int ub) {
      // Make sure the upper bound and lower bound are reasonable.
      if (lb < 0) { lb = 0; }
      if (ub >= this.elements.length) { ub = this.elements.length - 1; }
      // Our guess as to the index of the smallest element
      int guess = lb;
      // A counter variable
      int i;
      // Look through all subsequent elements
      for (i = lb + 1; i <= ub; ++i) {
        // If the element is smaller than our guess, then
        // update the guess
        if (this.elements[i] < this.elements[guess]) {
          guess = i;
        } // if
      } // for
      // That's it, we're done
      return guess;
    } // indexOfSmallest()

In experiment L3.1 you will investigate these two methods in a little more depth. You will also begin to examine the three classes you will use in the remaining experiments.

Suppose you were instead asked to find the two smallest entries in a sequence. One question you might ask would be "How should I return two values?" So that we need not concern ourselves with that question, let us instead try to move the two smallest entries to the first two positions of the array.

One approach would be to look through the sequence to find the smallest
entry and move it to the front of the sequence, then look through all but
the first element of the modified sequence for the next smallest element.
Using the `indexOfSmallest` method described above, we might
phrase this as

    /**
     * Put the two smallest elements of the sequence at the beginning
     * of the sequence. The sequence must have at least two elements.
     */
    public void twoSmallest() {
      // Swap the initial element with the smallest
      swap(0, indexOfSmallest(0, this.length()-1));
      // Swap the next element with the smallest remaining
      swap(1, indexOfSmallest(1, this.length()-1));
    } // twoSmallest()

As you might guess, `swap(i,j)` swaps the elements at positions
i and j in the sequence.
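The lab's `SortableIntSeq` supplies its own `swap`. The sketch below is only a guess at how such a method might look, assuming the sequence is stored in an `int[]` field named `elements` as in the earlier listings. The class name `SwapSketch` is hypothetical and is not one of the lab files.

```java
/** A minimal stand-in for a sequence class (hypothetical, for illustration). */
class SwapSketch {
  // Assumed storage: an int array, as in the smallest() listing.
  int[] elements;

  SwapSketch(int[] elements) {
    this.elements = elements;
  }

  /** Swap the elements at positions i and j. */
  void swap(int i, int j) {
    int tmp = this.elements[i];
    this.elements[i] = this.elements[j];
    this.elements[j] = tmp;
  } // swap(int, int)
} // class SwapSketch
```

Note that swapping a position with itself leaves the sequence unchanged, so `twoSmallest` still works when the smallest element is already at the front.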

Now, how might we put the five smallest elements in a sequence of 50 elements at the front of that sequence? One approach would be to comb through the sequence to find the smallest entry and move it to the front of the sequence. Next, you could comb through the 49 entries following this newly positioned entry to find the next smallest entry and move it to the position following the smallest entry. By repeating this process three more times, each time finding the smallest entry remaining in the sequence and placing it just behind the entry found in the previous pass, you will have placed the five smallest entries at the beginning of the sequence in increasing order of size.

When turning this narrative into code, it is appropriate to use a loop (since the five pieces are quite similar). For example,

    /**
     * Put the five smallest elements of the array at the beginning of
     * the array (naive method). The sequence should have at least
     * five elements.
     */
    public void fiveSmallest() {
      int i;
      // For each index i from 0 to 4,
      for (i = 0; i < 5; ++i) {
        // Swap the smallest element in [i .. last] with the ith element.
        swap(i, indexOfSmallest(i, this.length()-1));
      } // for
    } // fiveSmallest()

What we have accomplished is a partial sorting of the sequence by *selecting*
the smallest entries. Thus, we call this algorithm the
*partial selection sort.*

Our task now is to analyze the efficiency of this approach. This we do in terms of the number of times two entries in the sequence are compared. To find and position the smallest entry in the sequence requires 49 comparisons, to process the next smallest entry requires 48, and so on. Thus, the total number of comparisons to find the five smallest entries is

49 + 48 + 47 + 46 + 45 = 235

In general, applying this selection method to find the k smallest entries in a sequence of n entries requires

$$\frac{1}{2}\left(2nk - k^2 - k\right)$$

comparisons. (Can you derive this formula?) Thus, to find the 10 smallest entries in a sequence of 10,000 entries requires 99,945 comparisons. We might also say that this is an O(n*k) algorithm.
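One way to check the comparison counts quoted above is to count them directly. The following is a minimal sketch, not the lab's code: it works on a plain `int[]` rather than a `SortableIntSeq`, and the class and method names (`PartialSelectionCount`, `kSmallest`, `formula`) are made up for illustration.

```java
/** Counts comparisons in partial selection sort (illustrative sketch). */
class PartialSelectionCount {
  /** Move the k smallest elements of a to the front; return comparisons made. */
  static long kSmallest(int[] a, int k) {
    long comparisons = 0;
    for (int i = 0; i < k; ++i) {
      // Find the index of the smallest element in positions i .. last.
      int guess = i;
      for (int j = i + 1; j < a.length; ++j) {
        ++comparisons;
        if (a[j] < a[guess]) {
          guess = j;
        }
      }
      // Swap it into position i.
      int tmp = a[i];
      a[i] = a[guess];
      a[guess] = tmp;
    }
    return comparisons;
  }

  /** The closed form from the text: (1/2)(2nk - k^2 - k). */
  static long formula(long n, long k) {
    return (2 * n * k - k * k - k) / 2;
  }

  public static void main(String[] args) {
    int[] a = new int[50];
    for (int i = 0; i < a.length; ++i) {
      a[i] = 50 - i;  // the count does not depend on the contents
    }
    // Both values are 235 for n = 50 and k = 5.
    System.out.println(kSmallest(a, 5) + " " + formula(50, 5));
  }
}
```

Because the inner loop always runs to the end of the array, the count depends only on n and k, never on the order of the elements.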

Can we do better? Recall that when we found the smallest element in a sequence, we began with a guess of the smallest and then refined that guess by looking at the remaining elements. We can do the same thing to find the five smallest elements. Initially, we'll assume that the first five elements are the five smallest elements. Sort the first five entries in the sequence by any method. Then consider the sixth entry. Compare it to the fifth entry in the sequence, which is now the largest of the first five entries. If the sixth entry is larger, pass over it because it is not one of the five smallest entries. If, however, the sixth entry is smaller than the fifth, compare it with the fourth, third, and so on, inserting it among the first five entries so that the first five entries in the list remain the smallest entries found so far in increasing order. Repeat this process for the entries in positions 7, 8, ..., 50.

Note that this process creates a partially sorted list by inserting the
smallest entries into the beginning of the list. Thus, we call this
approach the *partial insertion sort*. To see why the partial insertion
sort is superior to the partial selection sort, let us compare the two
approaches when searching for the 10 smallest entries within a list of
1,000 entries. We suppose that our partial insertion sort has reached
the halfway point. The 10 smallest items in the first 500 have been
found and we are about to consider the entry in position 501. If the
original list was randomly scrambled, it is unlikely that this entry
will be less than the tenth entry, and only one comparison is required
to discover this. The same is true for all the entries in positions 501
through 1000. However, if one of these entries does belong among the top
10, then this will be discovered with one comparison, and at most nine
more comparisons will be required to position it properly. This is much
more efficient than our partial selection sort, in which each of the
last 500 entries is involved in 10 comparisons.
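The partial insertion sort described above can be sketched as follows. This is one possible reading of the description, written against a plain `int[]`; it is not the `newFiveSmallest` method from `SortableIntSeq`, and the names `PartialInsertionCount` and `kSmallest` are hypothetical.

```java
/** Partial insertion sort with a comparison count (illustrative sketch). */
class PartialInsertionCount {
  /**
   * Rearrange a so that its k smallest elements occupy positions
   * 0 .. k-1 in increasing order. Returns the number of comparisons
   * between elements. Assumes 1 <= k <= a.length.
   */
  static long kSmallest(int[] a, int k) {
    long comparisons = 0;
    for (int i = 1; i < a.length; ++i) {
      int j;
      if (i < k) {
        // Still filling the first k slots: ordinary insertion.
        j = i;
      } else {
        // One comparison against the largest of the current k smallest.
        ++comparisons;
        if (a[i] >= a[k - 1]) {
          continue;  // not among the k smallest so far
        }
        // Exchange it with that largest entry, then insert from there.
        int tmp = a[i];
        a[i] = a[k - 1];
        a[k - 1] = tmp;
        j = k - 1;
      }
      // Slide a[j] left until the prefix is in order again.
      while (j > 0) {
        ++comparisons;
        if (a[j - 1] <= a[j]) {
          break;
        }
        int tmp = a[j];
        a[j] = a[j - 1];
        a[j - 1] = tmp;
        --j;
      }
    }
    return comparisons;
  }
}
```

For an already increasing list of 1,000 entries with k = 10, this sketch makes 999 comparisons, against the 9,945 given by the formula for partial selection sort.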

In experiment L3.2, you will investigate these two algorithms.

For our simple experiments, it is useful to be able to generate "random"
sequences of numbers. What do we mean by "random"? Typically, that
each sequence of a particular size is equally likely or that at each point
in the sequence, each number is equally likely as the next element. How can we generate
such sequences? Fortunately, Java provides a standard utility class,
`java.util.Random`. This class includes a `nextInt`
method that gives the next "random" number in the sequence. In truth, this
number is not random, in that it is generated by an algorithm. However, it is
close enough to random for our purposes.

Hence, to fill the array `elements` with a random sequence of 100
integers, we might write

    import java.util.Random;
    ...
    int i;
    Random generator = new Random();
    elements = new int[100];
    for (i = 0; i < 100; ++i) {
      elements[i] = generator.nextInt();
    } // for

However, when we're comparing two algorithms, it is helpful to have the same
input to both algorithms. Fortunately, Java's random number generator can
take a *seed* that uniquely determines the random sequence. You can think
of a seed as being a number for the sequence. If you use the same seed, you
end up with the same sequence. For example, to get the "first" sequence,
you would write

    Random generator = new Random(1);
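To see that a seed really does pin down the sequence, one might fill two arrays from generators built with the same seed and compare them. The sketch below assumes nothing about `SortableIntSeq`; the class `SeededFill` and its `randomInts` method are hypothetical helpers.

```java
import java.util.Arrays;
import java.util.Random;

/** Fills arrays from a seeded generator (illustrative sketch). */
class SeededFill {
  /** Return n pseudo-random ints drawn from a generator with the given seed. */
  static int[] randomInts(int n, long seed) {
    Random generator = new Random(seed);
    int[] elements = new int[n];
    for (int i = 0; i < n; ++i) {
      elements[i] = generator.nextInt();
    }
    return elements;
  }

  public static void main(String[] args) {
    int[] first = randomInts(5, 1);
    int[] again = randomInts(5, 1);
    // Same seed, same sequence: prints "true".
    System.out.println(Arrays.equals(first, again));
  }
}
```

Each algorithm under comparison can then be given its own copy of the same input.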

Note that random sequences are not always the best test cases for your algorithms. For example, when testing a sorting algorithm, you should also test sequences of varying lengths, sequences which contain all the same value, presorted sequences, and "backwards" sorted sequences (in which the numbers are organized largest to smallest). Nonetheless, random sequences still serve many purposes, and are often a good starting point.

In experiment L3.3 you will investigate random number generators.

By requesting the partial selection sort to find the n smallest entries in a list of n entries, we obtain an algorithm, known as the selection sort, for sorting an entire list. This algorithm first finds the smallest of the n entries of the list, requiring n - 1 comparisons, and places that entry at the top of the list. Next, the algorithm finds the smallest entry among the remaining n - 1 entries, requiring n - 2 comparisons, and moves it to the second position in the list. This process repeats until all the entries are in order. The entire process requires

$$(n - 1) + (n - 2) + \cdots + 2 + 1 = (n - 1)\frac{n}{2}$$

or

$$\frac{1}{2}\left(n^2 - n\right)$$

comparisons between list entries when sorting a list of length n. In
big-O notation, the running time is $O(n^2)$.

A similar analysis shows that insertion sort requires an average of

$$\frac{1}{4}\left(n^2 - n\right)$$

comparisons to sort a list of n entries. Again, in big-O notation,
the running time is $O(n^2)$.
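The spread between insertion sort's best and worst cases is easy to observe by counting comparisons. The following is a minimal sketch on a plain `int[]`. The lab's own `insertionSort` may be written differently and may count "steps" differently, and the class name `InsertionSortCount` is hypothetical.

```java
/** Insertion sort that reports its comparison count (illustrative sketch). */
class InsertionSortCount {
  /** Sort a into increasing order; return the number of comparisons. */
  static long sort(int[] a) {
    long comparisons = 0;
    for (int i = 1; i < a.length; ++i) {
      int value = a[i];
      int j = i;
      // Shift larger entries right until value's position is found.
      while (j > 0) {
        ++comparisons;
        if (a[j - 1] <= value) {
          break;
        }
        a[j] = a[j - 1];
        --j;
      }
      a[j] = value;
    }
    return comparisons;
  }
}
```

On an increasing list of 100 entries this sketch makes 99 comparisons. On a decreasing list of 100 entries it makes 4,950, roughly twice the average given above.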

In experiment L3.4, you will consider insertion sort. In experiment L3.5, you will consider selection sort. In experiment L3.6, you will compare the two.

In the 1960s, C. A. R. Hoare, a pioneer in the field of computer science,
discovered the Quicksort algorithm. In the average case, the
number of comparisons performed by this algorithm when sorting a list of
n entries is O(n*lg(n)). However, in the worst case, Quicksort is also
$O(n^2)$.
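To make the worst case concrete, here is a sketch of one simple Quicksort variant that always uses the first element of the subsequence as the pivot. It is not necessarily the `quickSort` in `SortableIntSeq`, nor Hoare's original partitioning scheme, and the class name `QuicksortCount` is hypothetical.

```java
/** A first-element-pivot Quicksort that counts comparisons (illustrative sketch). */
class QuicksortCount {
  /** Sort a into increasing order; return the number of comparisons. */
  static long sort(int[] a) {
    return sort(a, 0, a.length - 1);
  }

  /** Sort positions lb .. ub of a; return the number of comparisons. */
  static long sort(int[] a, int lb, int ub) {
    if (lb >= ub) {
      return 0;  // zero or one element: nothing to do
    }
    long comparisons = 0;
    int pivot = a[lb];
    // Invariant: a[lb+1 .. last] holds the elements smaller than the pivot.
    int last = lb;
    for (int i = lb + 1; i <= ub; ++i) {
      ++comparisons;
      if (a[i] < pivot) {
        ++last;
        swap(a, last, i);
      }
    }
    // Put the pivot between the smaller and the not-smaller elements.
    swap(a, lb, last);
    return comparisons + sort(a, lb, last - 1) + sort(a, last + 1, ub);
  }

  private static void swap(int[] a, int i, int j) {
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
}
```

With this choice of pivot, an already increasing list is a worst case: every partition leaves one side empty, so a list of 100 entries costs 99 + 98 + ... + 1 = 4,950 comparisons.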

You will investigate the running time of Quicksort in experiments L3.7 and L3.8.

Name: ________________

ID:_______________

**Required files:**

**Step 1.**
Make copies of `Counter.java`, `SortableIntSeq.java`,
and `SortTester.java`. Compile all three and execute
`SortTester`. Find the smallest element in a list of size 50.
Describe what `SortTester` does (or can do).

**Step 2.**
One problem with `SortTester` and `SortableIntSeq` is
that they do not provide an easy way to count the steps in an algorithm. How
should we do that? Preferably with a `Counter` object. Read the
code for that class and explain what it does.

**Step 3.**
Build a new version of the `smallest` method from `SortableIntSeq`
that takes a `Counter` as a parameter and uses that counter to count the
steps it executes. Recompile `SortableIntSeq` and correct any errors.
Summarize your changes.
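As a rough picture of what counting with an object can look like, consider the sketch below. It does not reproduce `Counter.java`: the class `CounterSketch` and its `increment` and `get` methods are assumptions made for illustration, and the real `Counter` may offer a different interface. Likewise, `CountedSmallest` works on a plain `int[]` rather than on a `SortableIntSeq`.

```java
/** A hypothetical stand-in for a step counter; the lab's Counter may differ. */
class CounterSketch {
  private int count = 0;

  /** Record one more step. */
  void increment() {
    ++this.count;
  }

  /** Report the number of steps recorded so far. */
  int get() {
    return this.count;
  }
}

/** Finds the smallest element while counting loop-body executions. */
class CountedSmallest {
  static int smallest(int[] elements, CounterSketch steps) {
    int guess = elements[0];
    for (int i = 1; i < elements.length; ++i) {
      // Here a "step" means one execution of the loop body.
      steps.increment();
      if (elements[i] < guess) {
        guess = elements[i];
      }
    }
    return guess;
  }
}
```

Where the call to `increment` is placed determines what is being counted, so it is worth stating your definition of "step" when you report results.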

**Step 4.**
Extend `SortTester` so that it uses a `Counter` to count
the steps in `SortableIntSeq`'s `smallest` method.
Recompile `SortTester` and correct any errors.
Summarize your changes.

**Step 5.**
Execute `SortTester` and record the number of steps required to
find the smallest element in lists of size 10, 20, 100, and 1000.

- 10:
- 20:
- 100:
- 1000:

After recording your results, you may want to look at our notes on this step.

**Required files:**

**Step 1.**
Make copies of `Counter.java`, `SortableIntSeq.java`,
and `SortTester.java`. Compile all three and execute
`SortTester`. Find the five smallest elements in a list of size 50.
Record the results.

**Step 2.**
Update `SortableIntSeq` so that `fiveSmallest`,
`newFiveSmallest`, and any methods they use take
`Counter`s as parameters and count their steps. Update
`SortTester` to call those methods with a `Counter`
and print out the number of steps executed. Recompile both files and
correct any errors. Summarize your changes.

**Step 3.**
Use your modified `SortTester` to fill in the following table.

Steps to find the smallest five elements in a list of size n, using naive partial selection sort and the better partial insertion sort.

| n | steps (naive) | steps (improved) |
|---|---|---|
| 500 | | |
| 1000 | | |
| 2000 | | |

**Step 4.**
Update `SortableIntSeq` and `SortTester` to look for the
seven smallest elements, rather than the five smallest elements. Fill in the table.

Steps to find the smallest seven elements in a list of size n, using naive partial selection sort and the better partial insertion sort.

| n | steps (naive) | steps (improved) |
|---|---|---|
| 500 | | |
| 1000 | | |
| 2000 | | |

**Step 5.**
Add `kSmallest` and `newKSmallest` methods to
`SortableIntSeq`. These will behave like `fiveSmallest`
and `newFiveSmallest`, except that they
take `k` (the number of small elements to find) as a parameter.
Recompile `SortableIntSeq` and correct any errors. Summarize your changes.

**Step 6.**
Update `SortTester` so that it reads in the number of elements to
find (in the cases in which we want k small elements). Recompile
`SortTester` and correct any errors. Summarize your changes.

**Step 7.**
Using your augmented `SortTester`, record the number of steps
for each of the following.

Steps to find the smallest k elements in a list of size n, using naive partial selection sort and the better partial insertion sort.

| k | n | steps (naive) | steps (improved) |
|---|---|---|---|
| 5 | 500 | | |
| 10 | 1000 | | |
| 15 | 2000 | | |

**Step 8.**
Repeat step 7 for the following table. In these cases you are selecting the smallest 10 percent of the list, while in step 7 you selected about the smallest 1 percent.

Steps to find the smallest k elements in a list of size n, using naive partial selection sort and the better partial insertion sort.

| k | n | steps (naive) | steps (improved) |
|---|---|---|---|
| 25 | 250 | | |
| 50 | 500 | | |
| 100 | 1000 | | |
| 200 | 2000 | | |

**Step 9.**
Do those numbers match those theorized in the discussion? Why or why not?

**Step 10.**
What do you conclude about the advantages of one partial sort
over the other?

**Required files:**

**Step 1.**
Make copies of `SortableIntSeq.java` and `SortTester.java`.
Compile the two files. Using `SortTester`, make five lists of ten
random numbers. Record those lists.

**Step 2.**
Update `SortTester` to take a *seed* as an input. Pass
that seed to the appropriate method of `SortableIntSeq`.
Recompile the files and correct any errors. Using
`SortTester`, make three lists of ten random numbers, using the
same seed each time (do not use zero as your seed). Record your results.

**Step 3.**
At times, you will want to use presorted sequences instead of random sequences.
Read the code for `SortableIntSeq` and determine which methods can
be used to generate presorted sequences. What command might you use to
create the sequence [1,3,5,7,9,...,301]? What command might you use to
create the sequence [301,299,297,...,5,3,1]?

**Required files:**

**Step 1.**
Make copies of `Counter.java`, `SortableIntSeq.java`,
and `SortTester.java`. Compile all three and execute
`SortTester`. Using `insertionSort`, sort a list
of ten numbers. Did it work correctly? Record the original list and the
sorted list.

**Step 2.**
Update `SortTester` and `SortableIntSeq` to count
the number of steps in insertion sort. Recompile the files and summarize
your changes.

**Step 3.**
Use insertion sort to sort ten randomly generated lists of 100 elements. Record
the number of steps in each case.

**Step 4.**
Is the number of steps always the same? Why or why not? After answering this
question you may want to read our notes on this step.

**Step 5.**
Use insertion sort to sort the lists

- [1,2,3,...,99,100]
- [2,4,6,...,198,200]
- [101,102,103,...,199,200]

Record the number of steps in each case.

**Step 6.**
Is the number of steps always the same? Why or why not? After answering this
question you may want to read our notes on this step.

**Step 7.**
Use insertion sort to sort the lists

- [100,99,98,...,1]
- [200,198,196,...,4,2]
- [200,199,198,...,101]

Record the number of steps in each case.

**Step 8.**
Is the number of steps always the same? Why or why not? After answering this
question you may want to read our notes on this step.

**Step 9.**
Reflecting on your experiments, which types of lists is insertion sort best
at sorting? Worst at sorting? Why?
Is its running time on random lists closer to
the best time or worst?

**Required files:**

**Step 1.**
Make copies of `Counter.java`, `SortableIntSeq.java`,
and `SortTester.java`. Compile all three and execute
`SortTester`. Using `selectionSort`, sort a list
of ten numbers. Did it work correctly? Record the original list and the
sorted list.

**Step 2.**
If you read the code in `SortableIntSeq`, you will see that
`selectionSort` is not defined. Fill in the body appropriately,
recompile `SortTester`, test the new `selectionSort`,
and correct any errors. Enter the definition of
`selectionSort` here. Note that you may want to look at the
definition of `fiveSmallest` as you define `selectionSort`.

**Step 3.**
Update `SortTester` and `SortableIntSeq` to count
the number of steps in selection sort. Recompile the files, correct any
errors, and summarize your changes.

**Step 4.**
Use selection sort to sort ten randomly generated lists of 100 elements. Record
the number of steps in each case.

**Step 5.**
Use selection sort to sort the lists

- [1,2,3,...,99,100]
- [2,4,6,...,198,200]
- [101,102,103,...,199,200]

Record the number of steps in each case.

**Step 6.**
Use selection sort to sort the lists

- [100,99,98,...,1]
- [200,198,196,...,4,2]
- [200,199,198,...,101]

Record the number of steps in each case.

**Step 7.**
Is the number of steps always the same? Why or why not?

**Step 8.**
Reflecting on your experiments, which types of lists is selection sort best
at sorting? Worst at sorting? Why?
Is its running time on random lists closer to
the best time or worst?

**Required files:**

**Step 1.**
Using the modified versions of `SortableIntSeq` and
`SortTester`, fill in the following table. For each sequence
length, try three different random sequences. Make sure that the two
sorting mechanisms are run on the same random sequences.

Running time of insertion sort and selection sort on different length random sequences, with three tests per sequence length.

| Sequence length | Insertion sort steps (Test 1) | Insertion sort steps (Test 2) | Insertion sort steps (Test 3) | Selection sort steps (Test 1) | Selection sort steps (Test 2) | Selection sort steps (Test 3) |
|---|---|---|---|---|---|---|
| 100 | | | | | | |
| 200 | | | | | | |
| 400 | | | | | | |
| 800 | | | | | | |
| 2000 | | | | | | |

**Step 2.**
Using the modified `SortTester` and `SortableIntSeq`,
fill in the following table, using sequences of the form [1,2,3,...,n].

Running time of insertion sort and selection sort on different length increasing sequences, with one test per sequence length.

| Sequence length | Steps (insertion sort) | Steps (selection sort) |
|---|---|---|
| 100 | | |
| 200 | | |
| 400 | | |
| 800 | | |
| 2000 | | |

**Step 3.**
Using the modified `SortTester` and `SortableIntSeq`,
fill in the following table, using sequences of the form [n,n-1,n-2,...,3,2,1].

Running time of insertion sort and selection sort on different length decreasing sequences, with one test per sequence length.

| Sequence length | Steps (insertion sort) | Steps (selection sort) |
|---|---|---|
| 100 | | |
| 200 | | |
| 400 | | |
| 800 | | |
| 2000 | | |

**Step 4.**
What do you observe from the tables above? Explain your findings.

**Required files:**

**Step 1.**
Make copies of `Counter.java`, `SortableIntSeq.java`,
and `SortTester.java`. Compile all three and execute
`SortTester`. Using `quickSort`, sort a list
of ten numbers. Did it work correctly? Record the original list and the
sorted list.

**Step 2.**
Augment the classes to count the number of steps in Quicksort. Recompile
the files and correct any errors. Summarize your changes.

**Step 3.**
Run insertion sort and Quicksort on a few lists of different sizes,
recording the number of steps.

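Running time of insertion sort and Quicksort on different length random sequences, with three tests per sequence length.

| Sequence length | Insertion sort steps (Test 1) | Insertion sort steps (Test 2) | Insertion sort steps (Test 3) | Quicksort steps (Test 1) | Quicksort steps (Test 2) | Quicksort steps (Test 3) |
|---|---|---|---|---|---|---|
| 100 | | | | | | |
| 200 | | | | | | |
| 400 | | | | | | |
| 800 | | | | | | |
| 2000 | | | | | | |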

**Step 4.**
Plot the results from the previous table.

**Step 5.**
Summarize your findings.

**Required files:**

**Step 1.**
Using the modified versions of `SortableIntSeq` and
`SortTester`, fill in the following table. For each sequence
length, try three different random sequences, one increasing
sequence, and one decreasing sequence.

Running time of Quicksort on different types and lengths of sequences.

| Sequence length | Steps (Rand1) | Steps (Rand2) | Steps (Rand3) | Steps (Inc.) | Steps (Decr.) |
|---|---|---|---|---|---|
| 100 | | | | | |
| 200 | | | | | |
| 400 | | | | | |
| 800 | | | | | |
| 2000 | | | | | |

**Step 2.**
What do these results suggest?

a. Develop a `Person` class in which each object contains
information about a person, including last name, telephone number,
city, state, and zip.

b. Write a program that sorts sequences of `Person` objects.
You may want to use the `compareTo(String other)`
method from the `String` class, which returns a negative number
if the current string is less than the other string.

c. Allow the user to select a "tie-breaking" field by which to distinguish between records that have the same value. For example, you might wish to have last-name ties sorted by first name within each group. Incorporate this tie-breaking method into your program.

d. Note any efficiency issues that arise while implementing these various sorting routines.
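For part b, the behavior of `String.compareTo` can be checked with a few calls. The helper below is a small sketch; the class `CompareToDemo` and the method `comesBefore` are hypothetical names, and the sample strings are arbitrary.

```java
/** Shows how String.compareTo orders strings (illustrative sketch). */
class CompareToDemo {
  /** Is a strictly less than b in String's lexicographic order? */
  static boolean comesBefore(String a, String b) {
    return a.compareTo(b) < 0;
  }

  public static void main(String[] args) {
    System.out.println(comesBefore("Hoare", "Rebelsky"));  // true
    System.out.println(comesBefore("Smith", "Smith"));     // false: equal strings
    // The comparison is by character code, so uppercase letters
    // precede lowercase ones.
    System.out.println(comesBefore("adams", "Baker"));     // false
  }
}
```

A result of zero from `compareTo` signals a tie, which is the case a tie-breaking field (part c) has to resolve.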

a. Develop a `PlayingCard` class.

b. Develop a `Deck` class, for decks of playing cards.

c. Create a `shuffle` method that shuffles a deck of playing cards.

You might shuffle a deck by randomly selecting cards to swap, and doing that some appropriate number of times. Remember that you can use absolute value and the modulus operator to translate a number to a particular range.

You might also shuffle a deck by assigning a random number to each card and then sorting by those numbers.
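The swap-based idea can be sketched as below. Two choices here go beyond the text and should be read as assumptions: the cards are represented as plain `int` values rather than `PlayingCard` objects, and the swaps follow the Fisher-Yates pattern (each position, from the back, is swapped with a randomly chosen position at or before it) rather than an arbitrary number of random swaps. The sketch also uses `Random.nextInt(int bound)` to pick an index in range instead of absolute value and modulus.

```java
import java.util.Random;

/** Swap-based shuffle of an int array (illustrative sketch). */
class ShuffleSketch {
  /** Shuffle cards in place using the given generator. */
  static void shuffle(int[] cards, Random generator) {
    for (int i = cards.length - 1; i > 0; --i) {
      // Pick a position j with 0 <= j <= i.
      int j = generator.nextInt(i + 1);
      int tmp = cards[i];
      cards[i] = cards[j];
      cards[j] = tmp;
    }
  }

  public static void main(String[] args) {
    int[] deck = new int[52];
    for (int i = 0; i < deck.length; ++i) {
      deck[i] = i;
    }
    // A fixed seed makes the shuffle repeatable between runs.
    shuffle(deck, new Random(1));
    System.out.println(java.util.Arrays.toString(deck));
  }
}
```

Passing the generator in as a parameter lets a test program reuse one seed, so that different sorting methods can be compared on identical shuffles.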

a. Add a `sort` method to the `Deck` class
that sorts the cards
in a deck into ascending order. Design your method to report the number of
comparisons performed during the sorting process.

b. Using the methods `shuffle` and `sort`, write a
program that reports statistics
(such as the number of comparisons per sort) over numerous shuffles of the deck.

c. How difficult would it be to change your sorting algorithm to, say, descending order, a different suit arrangement, or by sorting the deck into groups of similarly valued card groups?

Write a program implementing the sieve of Eratosthenes for finding the prime numbers between 1 and n. Apply your solution to various values of n. How does the time required by the program increase as n increases? Explain your findings.

**Experiment L3.1, Step 5.**
If you only count the number of times the body in the loop is executed, it
is likely that the number of steps in `smallest` is one less than
the number of elements in the sequence.

**Experiment L3.4, Step 4.**
Since the insertions we have to do may differ from sequence to sequence,
it is likely that the running times will be different.

**Experiment L3.4, Step 6.**
While the lists are different, they are ordered the same. This means that
the number of swaps should be the same.

**Experiment L3.4, Step 8.**
While the lists are different, they are ordered the same. This means that
the number of swaps should be the same.

**Disclaimer** Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.

This page may be found at http://www.math.grin.edu/~rebelsky/Courses/CS152/99S/Labs/sorting.html

Source text last modified Tue Mar 2 09:25:21 1999.

This page generated on Tue Mar 2 11:18:01 1999 by SiteWeaver.

Contact our webmaster at rebelsky@math.grin.edu