Class 26: More Efficient Sorting Algorithms

Back to Sorting Algorithms. On to Project Discussion.

Held Friday, October 8, 1999

Overview

Today we continue our discussion of sorting by visiting some more efficient sorting algorithms that use divide and conquer as their underlying design strategy.

Notes

• Exam 2 is now ready. Please bring questions to class on Monday.
• Are there any final questions on assignment 3?
• Unless I hear serious objections, I'm dropping the extra credit from the assignment.
• We may have parents visiting today.
• Please email me your classes by 9 a.m. on Monday so that I can put them online in time for our discussion.
• For many of the sorting algorithms we're discussing, you will see three different presentations. I have intentionally presented these different versions because I want you to focus on the underlying principles and I think you do that better if you see multiple implementations. Here are the variations.
• The presentation in the book (typically, as a static method that sorts its parameter)
• The presentation in the outline (typically, as a method in a subclass of `Array`)
• The presentation in class (typically, at a higher level)
• Yes, I am intentionally presenting the algorithms somewhat differently in class from how they appear in the book.

Contents

Handouts

Summary

• Assigned:
• Exam 2 (due 10 a.m. Friday, October 15, 1999)

Sorting with Divide and Conquer

• In the previous class, we identified a number of interesting sorting algorithms, each of which took O(n^2) time.
• Can we do better? Well, sometimes using divide and conquer helps speed up algorithms (in our experience, from O(n) to O(log_2(n))).
• We'll look at two different ways of ``splitting'' the array.

Merge Sort

• We can develop a number of sorting techniques based on the divide and conquer technique. One of the most straightforward is merge sort.
• In merge sort, you split the list, array, collection or ... into two parts, sort each part, and then merge them together.
• Unlike the previous algorithms, merge sort requires extra space for the sorted arrays or subarrays.
• We'll write this as an out-of-place routine which returns a new, sorted array, rather than sorting the existing array.
• In approximate Java code,
```
import SimpleOutput;

/**
* A collection of techniques for sorting an input array.
*
* @author Samuel A. Rebelsky
* @version 1.0 of October 1999
*/
public class MergeSorter
{
// +--------+--------------------------------------------------
// | Fields |
// +--------+

/** The current indent level.  Used when logging steps. */
String indent = "";

// +----------------+------------------------------------------
// | Public Methods |
// +----------------+

/**
* Sort an array, creating a new sorted version of the array.
* If the SimpleOutput object is non-null, prints a simple log
* of what's happening.
* Pre: The elements in the array can be compared to each other.
* Pre: There is sufficient memory to complete the creation of the
*   new array (and the other steps of the algorithm).
* Post: Returns a sorted version of the array (where sorted is
*   defined carefully elsewhere).
* Post: Does not affect the original array.
*/
public Object[] sort(Object[] stuff,
Comparator compare,
SimpleOutput observer)
throws IncomparableException
{
return mergeSort(stuff, 0, stuff.length-1, compare, observer);
} // sort(Object[])

// +----------------+------------------------------------------
// | Helper Methods |
// +----------------+

/**
* Sort part of an array, creating a new sorted version of the
* part of the array.
* Pre: The elements in the array can be compared to each other.
* Pre: There is sufficient memory to complete the creation of the
*   new array (and the other steps of the algorithm).
* Post: Returns a sorted version of the array (where sorted is
*   defined carefully elsewhere).
* Post: Does not affect the original array.
*/
protected Object[] mergeSort(Object[] stuff,
int lb, int ub,
Comparator compare,
SimpleOutput observer)
throws IncomparableException
{
Object[] sorted;		// The sorted version
int middle;			// Index of middle element
// Print some basic information.
if (observer != null) {
observer.print(indent + "Sorting: ");
printSubArray(stuff, lb, ub, observer);
indent = indent + "  ";
}
// Base case: subarray of size 0 or 1.  Make a fresh copy so that
// it's safe to modify (and is the appropriate size).
if (ub <= lb) {
sorted = copySubArray(stuff, lb, ub);
} // base case
// Recursive case: split and merge
else {
// Find the middle of the subarray.
middle = (lb + ub) / 2;
// Sort the two halves.
Object[] left = mergeSort(stuff, lb, middle, compare, observer);
Object[] right = mergeSort(stuff, middle+1, ub, compare, observer);
sorted = merge(left, right, compare);
} // recursive case
// Print information, if appropriate
if (observer != null) {
indent = indent.substring(2);
observer.print(indent + "Sorted: ");
printSubArray(sorted, 0, sorted.length-1, observer);
}
// That's it.
return sorted;
} // mergeSort(Object[], int, int, Comparator)

/**
* Merge two sorted arrays into a new single sorted array.
* Pre: Both arrays are sorted.
* Pre: Elements in both arrays may be compared to each other.
* Pre: There is sufficient memory to allocate the new array.
* Post: The returned array is sorted, and contains all the
*   elements of the two arrays (no more, no less).
* Post: The two arguments are not changed.
*/
public Object[] merge(Object[] left, Object[] right, Comparator compare)
throws IncomparableException
{
// Create a new array of the appropriate size.
Object[] result = new Object[left.length + right.length];
// Create indices into the three arrays.
int leftIndex=0;	// Index into left array.
int rightIndex=0;	// Index into right array.
int index=0;		// Index into result array.
// As long as both arrays have elements, copy the smaller first element.
while ((leftIndex < left.length) && (rightIndex < right.length)) {
if (compare.lessThan(left[leftIndex],right[rightIndex])) {
result[index++] = left[leftIndex++];
} // first element in left array is smaller
else {
result[index++] = right[rightIndex++];
} // first element in right array is smaller or equal
} // while both arrays have elements
// Copy any remaining parts of each array.
while (leftIndex < left.length) {
result[index++] = left[leftIndex++];
} // while the left array has elements
while (rightIndex < right.length) {
result[index++] = right[rightIndex++];
} // while the right array has elements
// That's it
return result;
} // merge

/**
* Copy a subarray (so that we can return it without affecting it).
* Pre: 0 <= lb <= ub < stuff.length
* Post: Does not affect stuff.
* Post: Returns a new array containing only stuff[lb] .. stuff[ub].
*/
protected Object[] copySubArray(Object[] stuff, int lb, int ub) {
// Create the new array.
Object[] result = new Object[ub-lb+1];
for (int i = lb; i <= ub; i++) {
result[i-lb] = stuff[i];
}
return result;
} // copySubArray

/**
* Print a subarray.
* Pre: 0 <= lb <= ub < stuff.length
* Post: Does not affect stuff.
*/
protected void printSubArray(Object[] stuff,
int lb, int ub,
SimpleOutput out) {
// Print all but the last element followed by a comma
for (int i = lb; i < ub; ++i) {
out.print(stuff[i].toString() + ",");
}
// Print the last element
out.println(stuff[ub]);
} // printSubArray
} // MergeSorter

```
• Here's the corresponding test class.
```
import MergeSorter;
import SimpleOutput;
import StringComparator;

/**
* A simple test of merge sort.
*
* @author Samuel A. Rebelsky
* @version 1.0 of September 1999
*/
public class TestMergeSorter {
public static void main(String[] args)
throws Exception
{
SimpleOutput out = new SimpleOutput();
MergeSorter sorter = new MergeSorter();
Object[] sorted = sorter.sort(args, new StringComparator(), out);
for (int i = 0; i < sorted.length; ++i) {
out.println(i + ": " + sorted[i]);
} // for
} // main(String[])
} // class TestMergeSorter

```

Running Time

• What is the running time?
• We can use recurrence relations:
• Let f(n) be the running time of merge sort on input of size n.
• f(1) = 1
• f(n) = n + 2*f(n/2)
• Let's run this for a few steps
• f(n)
• = n + 2*f(n/2)
• = n + 2*(n/2 + 2*f(n/4))
• = n + n + 4*f(n/4)
• = n + n + 4*(n/4 + 2*f(n/8))
• = n + n + n + 8*f(n/8)
• Generalizing,
• f(n) = k*n + 2^k*f(n/2^k)
• When k = log_2(n), the recursive term bottoms out at f(1) = 1, so
• f(n) = n*log_2(n) + n*1
• Therefore, f(n) is in O(n*log_2(n)).
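As a sanity check, we can evaluate the recurrence directly and compare it against the closed form n*log_2(n) + n for powers of two. (This helper class is purely illustrative and not part of the course code.)
```java
/**
 * A quick check of the merge sort recurrence
 * f(1) = 1, f(n) = n + 2*f(n/2).
 * (Hypothetical helper, not part of the course code.)
 */
public class RecurrenceCheck {
  /** Evaluate the recurrence, assuming n is a power of two. */
  public static long f(long n) {
    if (n <= 1) return 1;
    return n + 2 * f(n / 2);
  }

  public static void main(String[] args) {
    // For n = 2^k, the closed form is n*k + n, i.e., n*log_2(n) + n.
    for (long n = 1, k = 0; n <= 1024; n *= 2, k++) {
      long closed = n * k + n;
      System.out.println("n=" + n + "  f(n)=" + f(n)
          + "  n*log2(n)+n=" + closed);
    }
  }
}
```
For every power of two the two columns agree, which is exactly the claim that merge sort does O(n*log_2(n)) work.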

Running Time, Revisited

• We can also use a somewhat nontraditional analysis technique.
• Assume that we're dealing with n = 2^x for some x.
• Consider all the sorts of `Array`s of size k as the same "level".
• There are n/k `Array`s of size k.
• Because we divide in half each time, there are log_2(n) levels.
• Going from one level to the next, the merges do O(n) total work.
• So, the running time is O(n*log_2(n)).

A Problem

• Unfortunately, merge sort requires significantly more memory than the other sorting routines (you can spend some time trying to come up with an ``in place'' merge sort, but you are quite likely to fail).

Other Versions

• We've written merge sort so that it does not affect the original array.
• How might we write it so that it creates a sorted array?
• Will that save space?
• How might we write it like the previous sorting methods (which were for subclasses of our `Array` class)?

Quicksort

• Is it possible to write an O(n*log_2(n)) sorting algorithm that is based on comparing and swapping, but doesn't require significant extra space?
• Yes, if you're willing to rely on probabilities.
• In the Quicksort algorithm, you split (partition) the array to be sorted into two pieces, those smaller than or equal to the pivot and those greater than the pivot. You don't include the pivot in either piece (so that the recursive case is ``smaller'').
• You can also split into three parts: those smaller than some middle element, those equal to some middle element, and those larger than some middle element.
• With a little work, you can do this partitioning in place, so that there is no overhead (and so that ``gluing'' is basically a free operation).
• (It is not necessarily okay to partition the array into two parts: those less than or equal to the pivot and those greater than or equal to the pivot. If all the elements are equal, one part could be the whole array and the recursion would never terminate.)
```/**
* Sort an array using Quicksort.
* Pre: All elements in the array can be compared to each other.
* Post: The array is sorted (using the standard meaning).
*/
public void quickSort(Object[] stuff, Comparator compare)
throws IncomparableException
{
quickSort(stuff, 0, stuff.length-1, compare);
} // quickSort(Object[], Comparator)

/**
* Sort part of an array using Quicksort.
* Pre: All elements in the subarray can be compared to each other.
* Pre: 0 <= lb and ub < stuff.length
* Post: The subarray is sorted (using the standard meaning).
*/
public void quickSort(Object[] stuff, int lb, int ub, Comparator compare)
throws IncomparableException
{
int mid;		// The position of the pivot
// Base case: subarrays of size zero or one are sorted.
if (lb >= ub) return;
// Pick a pivot and put it at the front of the subarray.
putPivotAtFront(stuff, lb, ub);
// Determine the position of the pivot, while rearranging the subarray.
mid = partition(stuff, lb, ub, compare);
// Recurse on the nonempty subarrays on either side of the pivot.
if (lb <= mid-1) quickSort(stuff, lb, mid-1, compare);
if (mid+1 <= ub) quickSort(stuff, mid+1, ub, compare);
} // quickSort(Object[], int, int, Comparator)

/**
* Split the subarray given by [lb .. ub] into ``smaller'' and
* ``larger'' elements, where smaller and larger are defined by
* their relationship to a pivot.  Return the index of the pivot
* between those elements.  Uses the first element of the subarray
* as the pivot.
*/
public int partition(Object[] stuff, int lb, int ub, Comparator compare)
throws IncomparableException
{
// STUB.  Can you figure it out?
return lb;
} // partition(Object[], int, int, Comparator)
```
• What is the running time of Quicksort? It depends on how well we partition. If we partition into two equal halves, then we can say
• Partitioning a vector of length n takes O(n) steps.
• The time to partition those two partitions into four parts is also O(n).
• If each partition is perfect (splits it exactly in half), we can stop the process after O(log_2(n)) levels.
• This gives a running time of O(n*log_2(n)).
• On average, we don't quite do half, but it's close enough that it doesn't make a significant difference.
• However, a bad choice of pivots can give significantly worse running time. If we always chose the largest element as the pivot, this algorithm would be equivalent to selection sort, and would take O(n^2) time.
• How do we partition? Typically, using something like the following strategy:
```Set pivot to the first element of the subarray
Set left to the start of the subarray
Set right to the end of the subarray
Move left and right toward each other,
swapping their contents when you observe that both sides are
"wrong" (something on the left is larger than the pivot and
something on the right is smaller than or equal to the pivot)
```
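To make that strategy concrete, here is one hedged sketch of the partitioning loop. For simplicity it works on an `int` array rather than `Object`s and a `Comparator`, and the class and method names are only illustrative; the version you write for the course code will differ.
```java
/** A sketch of in-place partitioning on an int array (illustrative only). */
public class PartitionSketch {
  /**
   * Partition stuff[lb..ub] around stuff[lb], which serves as the pivot.
   * Returns the final index of the pivot; everything to its left is
   * less than or equal to the pivot, everything to its right is greater.
   */
  public static int partition(int[] stuff, int lb, int ub) {
    int pivot = stuff[lb];   // The pivot sits at the front.
    int left = lb + 1;       // Scans rightward over small elements.
    int right = ub;          // Scans leftward over large elements.
    while (left <= right) {
      if (stuff[left] <= pivot) {
        left++;              // Already on the correct side.
      } else if (stuff[right] > pivot) {
        right--;             // Already on the correct side.
      } else {
        // Both are on the wrong side: swap them.
        int tmp = stuff[left];
        stuff[left++] = stuff[right];
        stuff[right--] = tmp;
      }
    }
    // right now indexes the last element <= pivot; put the pivot there.
    stuff[lb] = stuff[right];
    stuff[right] = pivot;
    return right;
  }

  public static void main(String[] args) {
    int[] a = { 5, 3, 8, 1, 9, 2 };
    int mid = partition(a, 0, a.length - 1);
    System.out.println("pivot index: " + mid);  // The pivot (5) ends up at index 3.
  }
}
```
Note that the swap of the pivot into position at the end is what lets Quicksort exclude the pivot from both recursive calls.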

Variations

• How might you change Quicksort so that the pivot need not be an element of the array?
• How might you rewrite Quicksort iteratively?

Better Sorting Techniques

• Are there better sorting techniques (ones that take less than O(n*log_2(n)) time)? Not if you limit yourself to the basic operations of comparing and swapping.
• But if you're willing to use extra space and know something about the original data, then you can do better.
• In bucket sort, you create separate ``buckets'' for kinds of elements, put each element into the appropriate bucket, sort each bucket, and then take them out again.
• If you can arrange things so that each bucket contains only a few elements (say no more than four), then the main cost is putting in to buckets and taking out of buckets.
• Usually we choose simple criteria for the buckets, such as the first (or nth) letter in a string.
• Running time can be linear in the number of elements (as long as you can guarantee a bound on the number of items in any bucket)!
• In radix sort, we sort using a binary representation of the things we're sorting. We'll do an example later.
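To illustrate the bucketing idea (this is not the version we'll develop in class), here is a sketch that buckets strings by their first letter, assuming lowercase input, and then sorts each small bucket. The class name is only illustrative.
```java
import java.util.ArrayList;
import java.util.Collections;

/** A sketch of bucket sort for lowercase strings (illustrative only). */
public class BucketSortSketch {
  public static String[] sort(String[] strings) {
    // One bucket per starting letter, 'a' through 'z'.
    ArrayList<ArrayList<String>> buckets = new ArrayList<ArrayList<String>>();
    for (int i = 0; i < 26; i++) {
      buckets.add(new ArrayList<String>());
    }
    // Put each string into the bucket for its first letter.
    for (String s : strings) {
      buckets.get(s.charAt(0) - 'a').add(s);
    }
    // Sort each (ideally small) bucket, then take the elements
    // out again, bucket by bucket, in order.
    String[] result = new String[strings.length];
    int index = 0;
    for (ArrayList<String> bucket : buckets) {
      Collections.sort(bucket);
      for (String s : bucket) {
        result[index++] = s;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String[] sorted =
        sort(new String[] { "banana", "apple", "cherry", "avocado" });
    for (String s : sorted) {
      System.out.println(s);
    }
  }
}
```
If the buckets stay small, the dominant cost is the two linear passes (into the buckets and back out), which is how bucket sort beats the comparison-based bound.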

History

Tuesday, 10 August 1999

• Created as a blank outline.

Wednesday, 6 October 1999

• Filled in some details based on outline 21 of CS152 99S.
• Reformatted.
• Corrected and tested code for merge sort. Added reporting.

Thursday, 7 October 1999

Friday, 8 October 1999

• Updated introductory notes.
• Added recurrence relation for merge sort.


Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.