# Class 20: Sorting Algorithms

Back to A Stamp Problem. On to Discussion of Exam 1.

Held Friday, February 26

Summary

• The problem of sorting
• In-place vs. out-of-place sorting
• Selection sort
• Insertion sort
• Reading: Java Plus Data Structures, Chapter 11 (which won't be ready until late in the semester).
• Exam 1 due.
• Assignment 6 (Algorithm Analysis) distributed. Due Friday, March 5.

Notes

• Yes, there's a new assignment.
• The first exam is due now.
• I still haven't mastered putting stuff from the SmartBoard online. Today we'll go back to the regular board.
• The Math/CS SEPC is hosting a study break next Monday at 7:30 p.m. All students in Math/CS courses are welcome.

## Algorithm Design Techniques

• You've now seen a number of important algorithm design techniques. Keep these in mind as you encounter new problems, as you will likely need to use them as you design new algorithms.

### Divide and Conquer

• This is the technique we used for both binary search and the efficient exponentiation algorithm.
• You divide the problem size in half at each step.
• In the case of binary search, we threw away the other half.
• In the case of efficient exponentiation, there was no second half.
• In some cases, we'll solve both halves and then combine the solutions.
• In many cases, divide and conquer turns a linear factor into a logarithmic factor.
• (And no, it doesn't work for every problem.)
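The divide-and-conquer shape of binary search can be sketched as follows. This is a minimal standalone version over a sorted `int` array (the class and method names here are mine, not from the reading):

```java
public class BinarySearchSketch {
  // Search for target in the sorted subarray a[lo..hi].
  // Each step halves the range, so the search takes O(log n) steps.
  static int binarySearch(int[] a, int target, int lo, int hi) {
    if (lo > hi) return -1;            // empty range: not found
    int mid = lo + (hi - lo) / 2;      // split the problem in half
    if (a[mid] == target) return mid;
    if (a[mid] < target)               // throw away the left half
      return binarySearch(a, target, mid + 1, hi);
    else                               // throw away the right half
      return binarySearch(a, target, lo, mid - 1);
  }

  public static void main(String[] args) {
    int[] a = { 2, 3, 5, 8, 13, 21 };
    System.out.println(binarySearch(a, 13, 0, a.length - 1)); // prints 4
    System.out.println(binarySearch(a, 7, 0, a.length - 1));  // prints -1
  }
}
```

Note how each recursive call works on half the range and discards the other half, which is exactly why a linear scan's O(n) factor becomes O(log n).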

### Greediness

• This is the failed technique we used as our first solution to the stamps problem.
• Nonetheless, it is applicable to a number of other problems.
• Whenever possible, you choose the largest (or smallest) thing.
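A sketch of the greedy strategy applied to making postage: always take the largest stamp that still fits. The denominations below are my own illustrative choices; the second example shows why we called this a "failed" technique for the stamps problem:

```java
import java.util.Arrays;

public class GreedyStamps {
  // Greedily make `amount` in postage: repeatedly take as many of the
  // largest denomination as fit.  Returns the stamp count, or -1 if
  // the greedy choice gets stuck.
  static int greedyCount(int[] denoms, int amount) {
    int[] d = denoms.clone();
    Arrays.sort(d);                    // ascending order
    int count = 0;
    for (int i = d.length - 1; i >= 0 && amount > 0; i--) {
      count += amount / d[i];          // take as many of d[i] as fit
      amount %= d[i];
    }
    return amount == 0 ? count : -1;
  }

  public static void main(String[] args) {
    // With 1/5/10, greedy is optimal: 17 = 10 + 5 + 1 + 1 (4 stamps).
    System.out.println(greedyCount(new int[]{1, 5, 10}, 17));
    // With 1/3/4, greedy picks 6 = 4 + 1 + 1 (3 stamps),
    // but the optimal answer is 3 + 3 (2 stamps).
    System.out.println(greedyCount(new int[]{1, 3, 4}, 6));
  }
}
```

The greedy answer is always fast to compute; the catch is that for some denomination sets it is not optimal, which is what pushed us toward dynamic programming.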

### Dynamic Programming

• This is the technique we used to solve the stamps problem.
• It can also be used for Fibonacci (and we'll do so if we have time).
• To solve a problem, you solve many subproblems.
• To improve efficiency, you put the solved subproblems in a table.
• Dynamic programming can turn an exponential algorithm into a polynomial algorithm (even into a linear algorithm!).
• Here's what Aho, Hopcroft, and Ullman (three great computer scientists who have written one of the standard works on algorithms) say about dynamic programming.
> In essence, dynamic programming calculates the solution to all subproblems. The computation proceeds from the small subproblems to the larger subproblems, storing the answers in a table. The advantage of the method lies in the fact that once a subproblem is solved, the answer is stored and never recalculated.
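The Fibonacci computation mentioned above makes a nice small illustration of that quote. Here is a sketch that proceeds from the small subproblems to the larger ones, storing each answer in a table so nothing is recalculated (the naive recursive version, by contrast, takes exponential time):

```java
public class FibTable {
  // Compute fib(n) bottom-up: solve the small subproblems first,
  // storing each answer in a table so it is never recalculated.
  // This takes O(n) steps instead of the exponential time of
  // the naive recursive definition.
  static long fib(int n) {
    long[] table = new long[Math.max(n + 1, 2)];
    table[0] = 0;
    table[1] = 1;
    for (int i = 2; i <= n; i++) {
      table[i] = table[i - 1] + table[i - 2]; // reuse stored answers
    }
    return table[n];
  }

  public static void main(String[] args) {
    System.out.println(fib(10)); // prints 55
  }
}
```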

## An Introduction to Sorting

• Typically, computer scientists look at collections of problems and attempt to find appropriate generalizations of these problems (or their subproblems).
• By solving the generalized problem, you solve a number of related problems.
• One problem that seems to crop up a lot is that of sorting. Given a list, array, vector, sequence, or file of comparable elements, put the elements in order.
• *In order* means that each element is no bigger than the next element. (You can also sort in decreasing order, in which case each element is no smaller than the next element.)
• You also need to ensure that every element in the original list appears in the sorted list.
• In evaluating sorting methods, we should concern ourselves with both the running time and the amount of extra storage (beyond the original vector) that is required.
• In-place sorting is a special subclass of sorting algorithms in which the original object is modified, and little, if any, extra storage is used.
• For large enough data sets, not all of the elements can be stored in memory. Often, variant algorithms must be used in order to get more efficient operation.
• You may learn about such sorting algorithms in CS302.
• Most often, in-memory sorting is accomplished by repeatedly swapping elements. However, this is not the only way in which sorting can be done.

### Examples

• It's often best to ground sorting algorithms in practical experience.
• I'll try to bring in some things to sort (perhaps CDs) and we'll talk about ways to do it.

## Common Sorting Algorithms

### Selection sort

• Selection sort is among the simpler and more natural methods for sorting.
• In this sorting algorithm, you segment the array into two subparts, a sorted part and an unsorted part. You repeatedly find the largest of the unsorted elements, and put that at the beginning of the sorted part. This continues until there are no unsorted elements.
• Here's my version of selection sort. It is a method of an array-like object that holds a group of information, so the `elementAt` references are to "the current object".
```
/**
 * Sort all the elements in the array.
 * Pre: the elements in the array are comparable using
 *      a lessEqual method.
 * Post: elementAt(i) <= elementAt(i+1) for all 0 <= i < size()-1.
 * Post: no element is added to or removed from the array.
 * Post: no element outside the array is affected.
 */
public void selectionSort() {
  selectionSort(0, size()-1);
} // selectionSort()

/**
 * Sort all the elements in the subarray between lb and ub.
 * Pre: 0 <= lb <= ub < size()
 * Pre: the elements in the array are comparable using
 *      a lessEqual method.
 * Post: elementAt(lb) <= elementAt(lb+1) <= ... <= elementAt(ub).
 * Post: no element is added to or removed from the subarray.
 * Post: no element outside the subarray is affected.
 */
public void selectionSort(int lb, int ub) {
  // Variables
  int index;	// Index of the largest element in subrange
  // Base case: one element, so it's sorted.  (Don't need to check
  // empty subarray because of preconditions.)
  if (lb == ub) return;
  // Find the index of the largest element in the subrange
  index = indexOfLargest(lb, ub);
  // Swap that element and the last element
  swap(index, ub);
  // Sort the rest of the subarray (if there is any)
  // Note that we don't have to compare ub-1 to lb, since
  //   the preconditions and the base case take care of it.
  selectionSort(lb, ub-1);
} // selectionSort(int,int)

/**
 * Find the index of the largest element in a subarray.
 * Pre: 0 <= lb <= ub < size()
 * Pre: the elements in the array are comparable using
 *      a lessEqual method.
 * Post: returns I s.t. for all i, lb <= i <= ub,
 *       elementAt(I) >= elementAt(i)
 */
public int indexOfLargest(int lb, int ub) {
  // Variables
  int guess;	// Current guess as to index of largest
  // Make an initial guess
  guess = lb;
  // Repeatedly improve the guess until we've looked at
  // all the elements
  for (int i = lb+1; i <= ub; ++i) {
    if (elementAt(guess).lessEqual(elementAt(i))) {
      guess = i;
    } // if
  } // for
  // That's it
  return guess;
} // indexOfLargest(int,int)
```
• What's the running time of this algorithm? To sort a vector of n elements, we find the largest element in O(n) steps and then recurse on the remaining n-1 elements. The first recursive call takes O(n-1) steps plus its own recursion, and so on. The total is n + (n-1) + ... + 1 = n(n+1)/2 steps, which makes selection sort an O(n^2) algorithm.
• What's the extra memory required by this algorithm (ignoring the extra memory for recursive calls)? It's more or less O(1), since we only allocate a few extra variables and no extra vectors.
• How much extra memory is required for recursive method calls? This is a tail-recursive algorithm, so a compiler that eliminates tail calls needs no extra space. (Java compilers typically do not eliminate tail calls, so in practice the recursion may use O(n) stack space.)

#### An Iterative Version

• We can also rewrite the selection sort method iteratively.
• In the iterative version, we repeatedly put the "correct" element at position i.
```
/**
 * Sort all the elements in the array using iterative selection sort.
 * Pre: the elements in the array are comparable using
 *      a lessEqual method.
 * Post: elementAt(i) <= elementAt(i+1) for all 0 <= i < size()-1.
 * Post: no element is added to or removed from the array.
 * Post: no element outside the array is affected.
 */
public void selectionSort() {
  // Starting at the top of the array and working your way down
  for (int i = size()-1; i > 0; --i) {
    // Put the largest remaining element at the current position.
    swap(indexOfLargest(0, i), i);
  } // for
} // selectionSort()
```
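The same iterative algorithm, rewritten over a plain `int` array so you can run it directly (the class name and use of `int` rather than `Comparable` are my simplifications), also makes the storage claim concrete: the only extra space is a few local variables.

```java
public class SelectionSortDemo {
  // In-place iterative selection sort: repeatedly move the largest
  // element of the unsorted region a[0..ub] to position ub.
  static void selectionSort(int[] a) {
    for (int ub = a.length - 1; ub > 0; ub--) {
      int largest = 0;                 // index of largest in a[0..ub]
      for (int i = 1; i <= ub; i++) {
        if (a[i] >= a[largest]) largest = i;
      }
      int tmp = a[largest];            // swap it into its final position
      a[largest] = a[ub];
      a[ub] = tmp;
    }
  }

  public static void main(String[] args) {
    int[] a = { 5, 1, 4, 2, 3 };
    selectionSort(a);
    for (int x : a) System.out.print(x + " "); // prints 1 2 3 4 5
    System.out.println();
  }
}
```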

### Insertion Sort

• Another simple sorting technique is insertion sort.
• Insertion sort operates by segmenting the list into unsorted and sorted portions, and repeatedly removing the first element from the unsorted portion and inserting it into the correct place in the sorted portion.
• This may be likened to the way typical card players sort their hands.
• In approximate code (assuming that we're writing this as part of a class that provides methods for getting indexed elements):
```
/**
 * Sort all the elements in the array.
 * Pre: the elements in the array are comparable using
 *      a lessEqual method.
 * Post: elementAt(i) <= elementAt(i+1) for all 0 <= i < size()-1.
 * Post: elements are neither added to nor removed.
 */
public void insertionSort()
  throws Exception
{
  // An object that we're about to insert.
  Comparable tmp;
  // The correct place for that element in the sorted subarray.
  int place;
  // Initially, we know that the first element is "sorted" (all
  // one-element lists are sorted), so we step through the elements
  // starting with the second element (index 1).
  for (int i = 1; i < size(); ++i) {
    // Grab the element.
    tmp = elementAt(i);
    // Remove it from its current position.
    removeElementAt(i);
    // Find its place in the sorted portion.
    place = findPlace(0, i-1, tmp);
    // Put the element there.
    insertElementAt(place, tmp);
  } // for(i)
} // insertionSort()
```
• What's the running time? There are O(n) insertions and O(n) calls to `findPlace()` (which finds the proper place in the sorted portion to insert the element). Each insertion requires O(n) steps, and each place determination takes O(log_2(n)) steps (as long as we can use binary search), so the running time is O(n*(n + log_2(n))), which is O(n^2).
• What's the extra storage? It should be constant.
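The notes rely on a `findPlace()` helper that isn't shown. Here is a hedged sketch of what it might look like, written as a standalone binary search over a plain `int` array rather than as a method of the author's class (the name `findPlace` matches the notes; everything else is my assumption):

```java
public class FindPlaceSketch {
  // Binary-search the sorted region a[lb..ub] for the position where
  // `value` can be inserted without breaking the order.  Equal
  // elements are skipped over, so the result is the index just past
  // any elements <= value.  Takes O(log n) steps.
  static int findPlace(int[] a, int lb, int ub, int value) {
    while (lb <= ub) {
      int mid = lb + (ub - lb) / 2;
      if (a[mid] <= value) {
        lb = mid + 1;   // value belongs to the right of mid
      } else {
        ub = mid - 1;   // value belongs at or to the left of mid
      }
    }
    return lb;          // first index whose element exceeds value
  }

  public static void main(String[] args) {
    int[] sorted = { 2, 4, 4, 8 };
    System.out.println(findPlace(sorted, 0, 3, 5)); // prints 3
    System.out.println(findPlace(sorted, 0, 3, 1)); // prints 0
  }
}
```

Note that finding the place is only O(log_2(n)); it's the shifting done by the insertion itself that keeps insertion sort at O(n^2) overall.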

#### Recursive Insertion Sort

• As you might guess, we can also express this recursively.
```
/**
 * Sort a subarray.
 * Pre: 0 <= lb <= ub < size()
 * Pre: the elements in the array are comparable using
 *      a lessEqual method.
 * Post: elementAt(lb) <= elementAt(lb+1) <= ... <= elementAt(ub)
 * Post: elements are neither added to nor removed from the array.
 */
public void insertionSort(int lb, int ub)
  throws Exception
{
  // Variables
  Comparable tmp;	// The object that we'll be inserting
  // Base case: one element, so it's sorted
  if (lb == ub) return;
  // Remember the element at the end, and then remove it
  tmp = elementAt(ub);
  removeElementAt(ub);
  // Sort all but that element
  insertionSort(lb, ub-1);
  // Insert the element at the proper place.  This may throw an
  // IncomparableException.
  insert(findPlace(lb, ub-1, tmp), tmp);
} // insertionSort(int,int)
```

History

• Created Monday, January 11, 1999.
• Added short summary on Friday, January 22, 1999.
• Added the notes on Thursday, February 25, 1999.
• Added the section on algorithm design techniques on Friday, February 26, 1999 (taken from the previous outline). Also added the stuff on sorting, which was taken from outline 24 of CS152 98S and then modified. Added the iterative selection sort.


Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.

This page may be found at http://www.math.grin.edu/~rebelsky/Courses/CS152/99S/Outlines/outline.20.html

Source text last modified Fri Feb 26 09:47:43 1999.

This page generated on Fri Feb 26 09:49:44 1999 by SiteWeaver.

Contact our webmaster at rebelsky@math.grin.edu