# Class 21: More Efficient Sorting Algorithms

Back to Sorting Algorithms. On to Sorting Lab.

Held Monday, March 1

## Summary

• Merge sort
• Quick sort
• Other sorting techniques

## Notes

• It was a more confusing weekend than I'd thought, so I'll probably return the exams tomorrow (definitely not today).
• There's a chance I'll have my 7-month-old in class tomorrow (just an advance warning: like many of you, he's been sick and it's hard to predict how he'll be).

## Sorting with divide-and-conquer

• In the previous class, we identified a number of interesting sorting algorithms (okay, two of 'em), both of which took O(n*n) time.
• Can we do better? Well, sometimes using divide-and-conquer helps speed up algorithms (in our experience, from O(n) to O(log_2 n)).
• We'll look at two different ways of "splitting" the array.

### Merge Sort

• We can develop a number of sorting techniques based on the divide-and-conquer technique. One of the most straightforward is merge sort.
• In merge sort, you split the list, array, collection, or ... into two parts, sort each part, and then merge them together.
• Unlike the previous algorithms, merge sort requires extra space for the sorted arrays or subarrays.
• We'll write this as a non-in-place routine (keeping the original array "as is").
• In approximate Java code,
```java
/**
 * Sort an array, creating a new sorted version of the array.
 * Pre: The elements in the array can be compared to each other.
 * Pre: There is sufficient memory to complete the creation of the
 *   new array (and the other steps of the algorithm).
 * Post: Returns a sorted version of the array (where sorted is
 *   defined carefully elsewhere).
 * Post: Does not affect the original array.
 */
public Comparable[] mergeSort(Comparable[] A) {
  return mergeSort(A, 0, A.length - 1);
} // mergeSort(Comparable[])

/**
 * Sort part of an array, creating a new sorted version of that
 * part of the array.
 * Pre: The elements in the array can be compared to each other.
 * Pre: There is sufficient memory to complete the creation of the
 *   new array (and the other steps of the algorithm).
 * Post: Returns a sorted version of the subarray (where sorted is
 *   defined carefully elsewhere).
 * Post: Does not affect the original array.
 */
public Comparable[] mergeSort(Comparable[] A, int lb, int ub) {
  // Base case: subarray of size 0 or 1.  Make a fresh copy so that
  // it's safe to modify (and is the appropriate size).
  if (ub <= lb) {
    return copySubArray(A, lb, ub);
  } // base case
  // Recursive case: split, sort each half, and merge.
  else {
    // Find the middle of the subarray.
    int middle = (lb + ub) / 2;
    // Sort the two halves.
    Comparable[] left = mergeSort(A, lb, middle);
    Comparable[] right = mergeSort(A, middle + 1, ub);
    return merge(left, right);
  } // recursive case
} // mergeSort(Comparable[], int, int)

/**
 * Copy elements [lb..ub] of an array into a new array.
 * (Helper; returns an empty array when ub < lb.)
 */
public Comparable[] copySubArray(Comparable[] A, int lb, int ub) {
  int len = Math.max(0, ub - lb + 1);
  Comparable[] copy = new Comparable[len];
  for (int i = 0; i < len; i++) {
    copy[i] = A[lb + i];
  }
  return copy;
} // copySubArray(Comparable[], int, int)

/**
 * Merge two sorted arrays into a new single sorted array.
 * Pre: Both arrays are sorted.
 * Pre: Elements in both arrays may be compared to each other.
 * Pre: There is sufficient memory to allocate the new array.
 * Post: The returned array is sorted, and contains all the
 *   elements of the two arrays (no more, no less).
 * Post: The two arguments are not changed.
 */
public Comparable[] merge(Comparable[] left, Comparable[] right) {
  // Create a new array of the appropriate size.
  Comparable[] result = new Comparable[left.length + right.length];
  // Create indices into the three arrays.
  int left_index = 0;   // Index into left array.
  int right_index = 0;  // Index into right array.
  int index = 0;        // Index into result array.
  // As long as both arrays have elements, copy the smaller one.
  // (Taking from the left on ties keeps the sort stable.)
  while ((left_index < left.length) && (right_index < right.length)) {
    if (left[left_index].compareTo(right[right_index]) <= 0) {
      result[index++] = left[left_index++];
    } // first element in left subarray is smaller (or equal)
    else {
      result[index++] = right[right_index++];
    } // first element in right subarray is smaller
  } // while both arrays have elements
  // Copy any remaining part of each array.
  while (left_index < left.length) {
    result[index++] = left[left_index++];
  } // while the left array has elements
  while (right_index < right.length) {
    result[index++] = right[right_index++];
  } // while the right array has elements
  // That's it.
  return result;
} // merge(Comparable[], Comparable[])
```
• What is the running time? We can use a somewhat clever analysis technique.
  • Assume that we're dealing with n = 2^x for some x.
  • Consider all the sorts of subarrays of size k as the same "level".
  • There are n/k subarrays of size k.
  • Because we divide in half each time, there are log_2(n) levels.
  • Going from one level to the next, we do O(n) total work to merge.
  • So, the running time is O(n*log_2(n)).
• Those of you who like recurrence relations can observe that if f(n) is the running time for merge-sorting an array of n elements, then f(n) = 2*f(n/2) + n.
  • The 2*f(n/2) is for the two recursive calls.
  • The n is for the merge.
  • One function that meets this recurrence (the only one that I know of) is c*n*log_2(n).
• Unfortunately, merge sort requires significantly more memory than do the other sorting routines (you can spend some time trying to come up with an "in place" merge sort, but you are quite likely to fail).
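For those following the recurrence, a quick sanity check: substituting f(n) = c*n*log_2(n) into the right-hand side gives

```latex
f(n) = 2 f(n/2) + n
     = 2 \cdot c \cdot \frac{n}{2} \cdot \log_2(n/2) + n
     = c n (\log_2 n - 1) + n
     = c n \log_2 n + (1 - c) n
```

so taking c = 1 gives f(n) = n*log_2(n) exactly (with f(1) = 0), which confirms the O(n*log_2(n)) bound.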

### Quicksort

• Is it possible to write an O(n*log_2(n)) sorting algorithm that is based on comparing and swapping, but doesn't require significantly more space?
• Yes, if you're willing to rely on probabilities.
• In the Quicksort algorithm, you split (partition) the array to be sorted into two pieces: those smaller than or equal to the pivot and those greater than the pivot. You don't include the pivot in either piece (so that each recursive case is "smaller").
• You can also split into three parts: those smaller than some middle element, those equal to that element, and those larger than it.
• With a little work, you can do this partitioning in place, so that there is no space overhead (and so that "gluing" the pieces back together is basically a free operation).
• (It is not necessarily okay to partition the array into two parts: those less than or equal to the pivot, and those greater than or equal to the pivot. When many elements equal the pivot, one part can be as large as the whole array, and the recursion need not terminate.)
```java
/**
 * Sort an array using Quicksort.
 * Pre: All elements in the array can be compared to each other.
 * Post: The array is sorted (using the standard meaning).
 */
public void quickSort(Comparable[] A) {
  quickSort(A, 0, A.length - 1);
} // quickSort(Comparable[])

/**
 * Sort part of an array using Quicksort.
 * Pre: All elements in the subarray can be compared to each other.
 * Pre: 0 <= lb and ub < A.length
 * Post: The subarray [lb..ub] is sorted (using the standard meaning).
 */
public void quickSort(Comparable[] A, int lb, int ub) {
  // Base case: subarrays of size zero or one are sorted.
  if (lb >= ub) return;
  // Pick a pivot and put it at the front of the subarray.
  putPivotAtFront(A, lb, ub);
  // Determine the position of the pivot, while rearranging the array.
  int mid = partition(A, lb, ub);
  // Recurse on the two pieces (the pivot is already in place).
  quickSort(A, lb, mid - 1);
  quickSort(A, mid + 1, ub);
} // quickSort(Comparable[], int, int)

/**
 * Move a pivot to the front of the subarray.  (One simple choice:
 * use the middle element, which guards against already-sorted
 * input.  A random element works well, too.)
 */
public void putPivotAtFront(Comparable[] A, int lb, int ub) {
  swap(A, lb, (lb + ub) / 2);
} // putPivotAtFront(Comparable[], int, int)

/**
 * Swap two elements of an array.
 */
public void swap(Comparable[] A, int i, int j) {
  Comparable tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
} // swap(Comparable[], int, int)

/**
 * Split the subarray given by [lb..ub] into "smaller" and
 * "larger" elements, where smaller and larger are defined by
 * their relationship to a pivot.  Return the index of the pivot
 * between those elements.  Uses the first element of the subarray
 * as the pivot.
 */
public int partition(Comparable[] A, int lb, int ub) {
  // Use the first element of the subarray as the pivot value.
  Comparable pivotval = A[lb];
  int l = lb; // Elements [lb..l] are all <= pivotval
  int r = ub; // Elements [r+1..ub] are all > pivotval
  // Keep going until we run out of elements to put in the correct place.
  while (l < r) {
    // At this point, we know that
    //   (1) l < r
    //   (2) elements [lb..l] are all <= pivotval
    //   (3) elements [r+1..ub] are all > pivotval

    // Skip over any large elements in the right half.
    while ((r > l) && (pivotval.compareTo(A[r]) < 0)) {
      --r;
    }

    // At this point, we know that
    //    (1) l <= r (we stop moving r left when we hit l or run out
    //        of large elements)
    //    (2) elements [lb..l] are all <= pivotval (we haven't moved l)
    //    (3) elements [r+1..ub] are all > pivotval (by the loop above)
    //    (4) element r is <= pivotval (we either stopped moving when
    //        we hit such an element or (a) r = l and (b) l indexes such
    //        an element)

    // Skip over any small elements in the left half.
    while ((A[l].compareTo(pivotval) <= 0) && (l < r)) {
      ++l;
    }

    // At this point, we know that
    //    (1) l <= r (we stop moving r left when we hit l or
    //        possibly sooner; we stop moving l right when we hit
    //        r or possibly sooner)
    //    (2) elements [lb..l-1] are all <= pivotval (by the loop above)
    //    (3) elements [r+1..ub] are all > pivotval (we haven't moved r)
    //    (4) element r is <= pivotval (we either stopped moving when
    //        we hit such an element or r = l, and l indexes such an
    //        element)
    //    (5) if l < r then element l is > pivotval (by the loop above)
    //    (6) if l = r then element l is <= pivotval

    // Do we have a large element on the left and a small element
    // on the right?  If so, exchange them.
    if (A[l].compareTo(A[r]) > 0) {
      swap(A, l, r);
    }
  } // while

  // At this point, we know that
  //    (1) elements [lb..l] are all <= pivotval
  //    (2) elements [l+1..ub] are all > pivotval

  // Put the pivot in the middle.  Note that at this point, element l
  // is <= pivotval, so this is a safe swap.
  swap(A, lb, l);

  // And we're done.
  return l;
} // partition(Comparable[], int, int)
```
• I'll admit that I normally get this slightly wrong, which is why there are all those "verification notes" in the middle. You may also want to check them.
• What is the running time of Quicksort? It depends on how well we partition. If we partition into two equal halves, then we can say:
  • Partitioning a subarray of length n takes O(n) steps.
  • The time to partition those two partitions into four parts is also O(n).
  • If each partition is perfect (splits its subarray exactly in half), we can stop the process after O(log_2(n)) levels.
  • This gives a running time of O(n*log_2(n)).
• On average, we don't split exactly in half, but it's close enough that it doesn't make a significant difference.
• However, a bad choice of pivots can give significantly worse running time. If we always chose the largest element as the pivot, this algorithm would be equivalent to selection sort, and would take O(n*n) time.
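To see the effect of pivot choice concretely, here is a small self-contained sketch (separate from the code above; a plain `int` version that always uses the first element as the pivot, with names like `PivotDemo` invented for this example). It counts comparisons on already-sorted versus shuffled input:

```java
import java.util.Random;

public class PivotDemo {
    static long comparisons = 0;

    // Quicksort on ints, always using the first element as the pivot.
    static void quickSort(int[] a, int lb, int ub) {
        if (lb >= ub) return;            // base case: size 0 or 1
        int pivot = a[lb];
        int l = lb + 1;
        int r = ub;
        while (l <= r) {
            // Skip small elements on the left, large ones on the right.
            while (l <= r && le(a[l], pivot)) l++;
            while (l <= r && gt(a[r], pivot)) r--;
            if (l < r) { int t = a[l]; a[l] = a[r]; a[r] = t; }
        }
        // Now a[lb+1..r] <= pivot and a[r+1..ub] > pivot, so r is
        // the pivot's final position.
        int t = a[lb]; a[lb] = a[r]; a[r] = t;
        quickSort(a, lb, r - 1);
        quickSort(a, r + 1, ub);
    }

    // Comparisons routed through helpers so we can count them.
    static boolean le(int x, int y) { comparisons++; return x <= y; }
    static boolean gt(int x, int y) { comparisons++; return x > y; }

    static long countFor(int[] a) {
        comparisons = 0;
        quickSort(a, 0, a.length - 1);
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 200;
        int[] sorted = new int[n];
        int[] shuffled = new int[n];
        for (int i = 0; i < n; i++) { sorted[i] = i; shuffled[i] = i; }
        Random rand = new Random(42);        // fixed seed for reproducibility
        for (int i = n - 1; i > 0; i--) {    // Fisher-Yates shuffle
            int j = rand.nextInt(i + 1);
            int t = shuffled[i]; shuffled[i] = shuffled[j]; shuffled[j] = t;
        }
        long worst = countFor(sorted);       // pivot is always the minimum
        long typical = countFor(shuffled);   // pivots are "random"
        System.out.println("sorted input:   " + worst + " comparisons");
        System.out.println("shuffled input: " + typical + " comparisons");
    }
}
```

On the sorted input the first-element pivot is always the minimum, so each partition peels off one element and the count grows roughly as n*n/2; on the shuffled input it stays near n*log_2(n).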

## Better Sorting Techniques

• Are there better sorting techniques (ones that take less than O(n*log_2(n)) time)? Not if you limit yourself to the basic operations of comparing and swapping.
• But if you're willing to use extra space and know something about the original data, then you can do better.
• In bucket sort, you create separate "buckets" for kinds of elements, put each element into the appropriate bucket, sort each bucket, and then take the elements out again, bucket by bucket.
  • If you can arrange things so that each bucket contains only a few elements (say, no more than four), then the main cost is putting elements into buckets and taking them out again.
  • Usually we choose simple criteria for the buckets, such as the first (or nth) letter in a string.
  • The time spent per element can be a constant (as long as you can bound the number of items in any bucket), giving a linear running time overall!
• In radix sort, we sort using a binary representation of the things we're sorting. We'll do an example later.
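As a sketch of the bucket idea (the class name and word list here are invented for illustration): bucket lowercase words by their first letter, sort each small bucket, and read the buckets out in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BucketSortDemo {
    /**
     * Sort lowercase words by bucketing on the first letter,
     * sorting each (hopefully small) bucket, and then reading
     * the buckets out in order.
     */
    public static List<String> bucketSort(List<String> words) {
        // One bucket per letter 'a'..'z'.
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < 26; i++) {
            buckets.add(new ArrayList<>());
        }
        // Put each word into the bucket for its first letter.
        for (String word : words) {
            buckets.get(word.charAt(0) - 'a').add(word);
        }
        // Sort each bucket (cheap if every bucket is small) and
        // concatenate the buckets in order.
        List<String> result = new ArrayList<>();
        for (List<String> bucket : buckets) {
            Collections.sort(bucket);
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words =
            Arrays.asList("banana", "apple", "cherry", "avocado", "blueberry");
        System.out.println(bucketSort(words));
        // prints [apple, avocado, banana, blueberry, cherry]
    }
}
```

If the first letters spread the words evenly, each bucket holds only a handful of elements, so the per-bucket sorts cost a constant each and the whole thing runs in time linear in the number of words.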

## History

• Created Monday, January 11, 1999.
• Added short summary on Friday, January 22, 1999.
• Filled in the details on Monday, March 1, 1999. Based on (and modified from) outline 25 of CS152 98S.
• Fixed a few errors on Saturday, March 13, 1999 (thanks Colin!).


Disclaimer: Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.