# Class 25: Sorting Algorithms

Back to Binary Search. On to More Efficient Sorting Algorithms.

Held Thursday, October 7, 1999

Overview

Today, we'll visit the issue of sorting: turning a collection of elements into a collection in which smaller elements precede larger elements. Our focus will primarily be on sorting arrays.

Notes

• Are there any questions on assignment 3?
• You'll find that these sorting methods look somewhat different from those in the book. That's because we're working in a different context.
• In the book, the methods sort external arrays which are passed as parameters.
• In these notes, the methods sort arrays that are part of the same class.

Summary

• Reading: Java Plus Data Structures, Chapter 12
• The problem of sorting
• Three basic sorting methods:
• Selection sort
• Bubble sort
• Insertion sort
• Selecting a sorting method

## An Introduction to Sorting

• Typically, computer scientists look at collections of problems and attempt to find appropriate generalizations of these problems (or their subproblems).
• By solving the generalized problem, you solve a number of related problems.
• One problem that seems to crop up a lot is that of sorting. Given a list, array, vector, sequence, or file of comparable elements, put the elements in order.
• Here, ``in order'' means that each element is no bigger than the next element. (You can also sort in decreasing order, in which case each element is no smaller than the next element.)
• You also need to ensure that every element of the original list appears in the sorted list, and that no elements are added.
• In evaluating sorting methods, we should concern ourselves with both the running time and the amount of extra storage (beyond the original vector) that is required.
• In-place sorting is a special class of sorting algorithms in which the original object is modified and little, if any, extra storage is used.
• For large enough data sets, not all of the elements can be stored in memory. In that case, variant algorithms that work with data kept in files must be used in order to get efficient operation.
• You may learn about such sorting algorithms in CSC302.
• Most often, in-memory sorting is accomplished by repeatedly swapping elements. However, this is not the only way in which sorting can be done.
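The two conditions above (the elements end up in order, and no elements are gained or lost) can be checked mechanically. Here is a minimal sketch, assuming integer elements. The names `SortCheck`, `isSorted`, and `sameElements` are made up for this illustration and are not part of the course code.

```java
import java.util.Arrays;

/** Hypothetical helper for checking the two conditions of a correct sort. */
class SortCheck {
    /** True if each element is no bigger than the next one. */
    static boolean isSorted(int[] values) {
        for (int i = 0; i + 1 < values.length; ++i) {
            if (values[i] > values[i + 1]) {
                return false;
            }
        }
        return true;
    }

    /** True if result holds exactly the elements of original, in any order. */
    static boolean sameElements(int[] original, int[] result) {
        int[] a = original.clone();
        int[] b = result.clone();
        Arrays.sort(a);   // library sort, used here only as a reference
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        int[] before = {3, 1, 2, 1};
        int[] after = {1, 1, 2, 3};
        System.out.println(isSorted(after));             // true
        System.out.println(sameElements(before, after)); // true
        System.out.println(isSorted(before));            // false
    }
}
```

A check like this is handy for testing any of the sorting methods discussed below.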

### Examples

• It's often best to ground sorting algorithms in practical experience.
• I'll try to bring in some things to sort (perhaps CDs, although I recently sorted mine and am reluctant to unsort them) and we'll talk about ways to do it.

### A Sortable Class

• One of the difficulties of object-oriented design is deciding what belongs where.
• Are the sorting methods methods of some kind of collection class? (This seems reasonable: ``You there! Sort yourself!'')
• Or are they applied to some kind of collection? (This also seems reasonable: ``Carl and Carla, please sort this stuff.'')
• We'll choose the first and build objects that know how to sort themselves. It should be a natural step to the second.
• Here's a class which we might extend by adding a sort method.
```
/**
* A collection of values which can be indexed by
* integers from 0 to size()-1.  Basically, a wrapper
* class for Java's built-in arrays.
*
* @author Samuel A. Rebelsky
* @version 1.0 of September 1999
*/
public class Array {
// +--------+--------------------------------------------------
// | Fields |
// +--------+

/** The elements of the array. */
public Object[] objects;

// +--------------+--------------------------------------------
// | Constructors |
// +--------------+

/**
* Build a new array which holds up to n elements.
* Initially, each element is null.
*/
public Array(int n) {
objects = new Object[n];
} // Array(int)

/**
* Build a new array which holds the specified
* set of elements.
*/
public Array(Object[] elements) {
objects = new Object[elements.length];
for (int i = 0; i < elements.length; ++i) {
objects[i] = elements[i];
}
} // Array(Object[])

// +-----------+-----------------------------------------------
// | Accessors |
// +-----------+

/**
* Get the ith element of the array.
*
* @exception ArrayIndexOutOfBoundsException
*   For the obvious reasons.
*/
public Object get(int i) {
return objects[i];
} // get(int)

/**
* Get the number of elements in the array.
*/
public int size() {
return objects.length;
} // size()

// +-----------+-----------------------------------------------
// | Modifiers |
// +-----------+

/**
* Set the ith element of the array.
*
* @exception ArrayIndexOutOfBoundsException
*   For the obvious reasons.
*/
public void set(int i, Object value) {
objects[i] = value;
} // set(int, Object)

/**
* Swap the ith and jth elements of the array.
* Included because it's commonly used during sorting.
*/
public void swap(int i, int j) {
Object temp = objects[i];
objects[i] = objects[j];
objects[j] = temp;
} // swap(int,int)

} // class Array

```
• Here's the corresponding interface.
```
// Comparator and IncomparableException are assumed to be in the same
// (default) package, so no import statements are needed.

/**
* Things that you can tell to sort themselves.
*/
public interface Sortable {
/**
* Sort the contents of the thing using a comparator
* to compare elements.
* Pre: The elements can be compared.
* Post: The elements are sorted.  Each element is no greater
*   than the next element.
* Post: No elements are added or deleted.
*
* @exception IncomparableException
*   if there are pairs of elements that cannot be compared.
*/
public void sort(Comparator compare)
throws IncomparableException;
} // interface Sortable

```
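These notes use `Comparator`, `IncomparableException`, and `StringComparator` without showing them. The following is only a guess at what minimal versions might look like. It assumes nothing beyond what the code in these notes relies on: a `lessThan(Object,Object)` method that may throw `IncomparableException`, and a `StringComparator` with a no-argument constructor. The actual course classes may well differ.

```java
/** Hypothetical: thrown when two values cannot be compared. */
class IncomparableException extends Exception {
    IncomparableException(String message) {
        super(message);
    }
}

/** Hypothetical stand-in for the course's Comparator; only lessThan is assumed. */
interface Comparator {
    /** Returns true if left comes strictly before right. */
    boolean lessThan(Object left, Object right) throws IncomparableException;
}

/** Hypothetical comparator that orders strings by String.compareTo. */
class StringComparator implements Comparator {
    public boolean lessThan(Object left, Object right)
            throws IncomparableException {
        if (!(left instanceof String) || !(right instanceof String)) {
            throw new IncomparableException("Can only compare strings");
        }
        return ((String) left).compareTo((String) right) < 0;
    }
}
```

Passing the comparison in as an object means the same sorting code can order strings, numbers, or anything else for which someone writes a comparator.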

## Common Sorting Algorithms

• Because sorting is such an important task, computer scientists (and normal people, too) have developed a number of techniques that are commonly used for sorting.

### Selection Sort

• Selection sort is among the simpler and more natural methods for sorting.
• In this sorting algorithm, you segment the array into two subparts, a sorted part and an unsorted part. You repeatedly find the largest of the unsorted elements, and put that at the beginning of the sorted part. This continues until there are no unsorted elements.
• Here's my recursive version of selection sort. It is a method of an array-like object that holds a group of information, so the `get` references are to ``the current object''.
```
// Array and Sortable are assumed to be in the same (default) package,
// so no import statements are needed.

/**
* Arrays that you can sort using selection sort.
*
* @author Samuel A. Rebelsky
* @version 1.0 of October 1999
*/
public class SelectionSortable
extends Array
implements Sortable
{

// +--------------+--------------------------------------------
// | Constructors |
// +--------------+

/**
* Build a new array which holds up to n elements.
* Initially, each element is null.
*/
public SelectionSortable(int n) {
super(n);
} // SelectionSortable(int)

/**
* Build a new array which holds the specified
* set of elements.
*/
public SelectionSortable(Object[] elements) {
super(elements);
} // SelectionSortable(Object[])

// +-----------------------+-----------------------------------
// | Methods from Sortable |
// +-----------------------+

/**
* Sort all the elements in the array using selection sort.
* Pre: the elements in the array are comparable using
*      the given comparator.
* Post: get(i) <= get(i+1) for all 0 <= i < size()-1.
* Post: no element is added to or removed from the array.
*/
public void sort(Comparator compare)
throws IncomparableException
{
// An empty or one-element array is already sorted.  The helper
// requires lb <= ub, which fails for an empty array, so check here.
if (size() > 1) {
selectionSort(0,size()-1,compare);
} // if
} // sort(Comparator)

// +----------------------+------------------------------------
// | Local Helper Methods |
// +----------------------+

/**
* Sort all the elements in the subarray between lb and ub.
* Pre: 0 <= lb <= ub < size()
* Pre: the elements in the array are comparable using
*      the given comparator.
* Post: get(lb) <= get(lb+1) <= ... <= get(ub).
* Post: no element is added to or removed from the subarray.
* Post: no element outside the subarray is affected.
*/
protected void selectionSort(int lb, int ub, Comparator compare)
throws IncomparableException
{
// Variables
int index;	// Index of the largest element in subrange
// Base case: one element, so it's sorted.  (Don't need to check
// empty subarray because of preconditions.)
if (lb == ub) return;
// Find the index of the largest element in the subrange
index = indexOfLargest(lb,ub,compare);
// Swap that element and the last element
swap(index, ub);
// Sort the rest of the subarray (if there is any)
// Note that we don't have to compare ub-1 to lb, since
//   the preconditions and the base case take care of it.
selectionSort(lb,ub-1,compare);
} // selectionSort

/**
* Find the index of the largest element in a subarray.
* Pre: 0 <= lb <= ub < size()
* Pre: the elements in the array are comparable using
*      the given comparator.
* Post: returns I s.t. for all i, lb <= i <= ub,
*       get(I) >= get(i)
*/
protected int indexOfLargest(int lb, int ub, Comparator compare)
throws IncomparableException
{
// Variables
int guess;	// Current guess as to index of largest
// Make initial guesses
guess = lb;
// Repeatedly improve our guesses until we've looked at
// all the elements
for(int i = lb+1; i <= ub; ++i) {
if (compare.lessThan(get(guess),get(i))) {
guess = i;
} // if
} // for
// That's it
return guess;
} // indexOfLargest
} // SelectionSortable

```
• Here's a class that permits us to test it.
```
// SelectionSortable, SimpleOutput, and StringComparator are assumed to be
// in the same (default) package, so no import statements are needed.

/**
* A simple test of selection sort.
*
* @author Samuel A. Rebelsky
* @version 1.0 of September 1999
*/
public class TestSS {
public static void main(String[] args)
throws Exception
{
SimpleOutput out = new SimpleOutput();
SelectionSortable stuff = new SelectionSortable(args);
stuff.sort(new StringComparator());
for (int i = 0; i < stuff.size(); ++i) {
out.println(i + ": " + stuff.get(i));
} // for
} // main(String[])
} // class TestSS

```
• What's the running time of this algorithm? To sort an array of n elements, we find the largest element in O(n) steps and then recurse on the remaining n-1 elements. The first recursive call takes O(n-1) steps plus its own recursion, and so on. Altogether that is roughly n + (n-1) + ... + 1 = n(n+1)/2 steps, which makes this an O(n^2) algorithm.
• What's the extra memory required by this algorithm (ignoring the extra memory for recursive calls)? It's more or less O(1), since we only allocate a few extra variables and no extra vectors.
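• How much extra memory is required for recursive method calls? This is a tail-recursive algorithm, so a compiler that eliminates tail calls needs none. Java implementations typically do not eliminate tail calls, though, so in practice each recursive call adds a stack frame, for O(n) frames in all.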
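To make the quadratic running time concrete, we can count comparisons. This is a small sketch, assuming integer elements and an iterative form of selection sort. The class `SelectionSortCount` and its method are invented for this illustration.

```java
/** Hypothetical sketch: selection sort on ints that counts comparisons. */
class SelectionSortCount {
    /** Sorts values in place and returns the number of comparisons made. */
    static long sortAndCount(int[] values) {
        long comparisons = 0;
        // Work from the top of the array down.
        for (int ub = values.length - 1; ub > 0; --ub) {
            // Find the index of the largest element in values[0..ub].
            int largest = 0;
            for (int i = 1; i <= ub; ++i) {
                ++comparisons;
                if (values[largest] < values[i]) {
                    largest = i;
                }
            }
            // Swap it into position ub.
            int temp = values[largest];
            values[largest] = values[ub];
            values[ub] = temp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] values = {5, 4, 3, 2, 1};
        System.out.println(sortAndCount(values));  // 10, which is 5*4/2
    }
}
```

The count is always n(n-1)/2, no matter how the input is arranged, because selection sort looks at every unsorted element on every round.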

#### Iterative Selection Sort

• We can also rewrite the selection sort method iteratively.
• In the iterative version, we repeatedly put the ``correct'' element at position i.
```
// SelectionSortable is assumed to be in the same (default) package,
// so no import statement is needed.

/**
* Arrays that you can sort using iterative selection sort.
*
* @author Samuel A. Rebelsky
* @version 1.0 of October 1999
*/
public class NewSelectionSortable
extends SelectionSortable
implements Sortable
{

// +--------------+--------------------------------------------
// | Constructors |
// +--------------+

/**
* Build a new array which holds up to n elements.
* Initially, each element is null.
*/
public NewSelectionSortable(int n) {
super(n);
} // NewSelectionSortable(int)

/**
* Build a new array which holds the specified
* set of elements.
*/
public NewSelectionSortable(Object[] elements) {
super(elements);
} // NewSelectionSortable(Object[])

// +-------------------+---------------------------------------
// | Overridden Methods|
// +-------------------+

/**
* Sort all the elements in the array using iterative selection sort.
* Pre: the elements in the array are comparable using
*      the given comparator.
* Post: get(i) <= get(i+1) for all 0 <= i < size()-1.
* Post: no element is added to or removed from the array.
*/
public void sort(Comparator compare)
throws IncomparableException
{
// Starting at the top of the array and working your way down
// the array
for (int i = size()-1; i > 0; --i) {
// Put the largest element at the current position.
swap(indexOfLargest(0,i,compare), i);
} // for
} // sort()

} // NewSelectionSortable

```

### Bubble Sort

• Bubble sort is a lot like selection sort, except that instead of finding the largest element and moving it to the end, you repeatedly swap adjacent elements that are out of order, thereby ``bubbling'' the largest value to the end.
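The notes give no code for bubble sort, so here is one way it might look. This is a sketch on a plain `int` array rather than on the `Array` class above, and the class name `BubbleSortSketch` is made up.

```java
/** Hypothetical sketch of bubble sort on a plain int array. */
class BubbleSortSketch {
    /** Sorts values in place into increasing order. */
    static void bubbleSort(int[] values) {
        // After each round, the largest element of values[0..ub]
        // has bubbled up to position ub.
        for (int ub = values.length - 1; ub > 0; --ub) {
            for (int i = 0; i < ub; ++i) {
                // Swap neighbors that are out of order.
                if (values[i] > values[i + 1]) {
                    int temp = values[i];
                    values[i] = values[i + 1];
                    values[i + 1] = temp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] values = {4, 2, 5, 1, 3};
        bubbleSort(values);
        System.out.println(java.util.Arrays.toString(values));  // [1, 2, 3, 4, 5]
    }
}
```

Note that the outer loop has the same shape as in iterative selection sort. Only the way the largest element reaches position `ub` differs.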

### Insertion Sort

• Another simple sorting technique is insertion sort.
• Insertion sort operates by segmenting the list into unsorted and sorted portions, and repeatedly removing the first element from the unsorted portion and inserting it into the correct place in the sorted portion.
• This may be likened to the way typical card players sort their hands.
• In approximate code (assuming that we're writing this as part of a class that provides methods for getting indexed elements):
```
// Array and Sortable are assumed to be in the same (default) package,
// so no import statements are needed.

/**
* Arrays that you can sort using insertion sort.
*
* @author Samuel A. Rebelsky
* @version 1.0 of October 1999
*/
public class InsertionSortable

extends Array
implements Sortable
{

// +--------------+--------------------------------------------
// | Constructors |
// +--------------+

/**
* Build a new array which holds up to n elements.
* Initially, each element is null.
*/
public InsertionSortable(int n) {
super(n);
} // InsertionSortable(int)

/**
* Build a new array which holds the specified
* set of elements.
*/
public InsertionSortable(Object[] elements) {
super(elements);
} // InsertionSortable(Object[])

// +-----------------------+-----------------------------------
// | Methods from Sortable |
// +-----------------------+

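/**
* Sort the array using insertion sort.
*/
public void sort(Comparator compare)
throws IncomparableException
{
// An empty array has nothing to sort.
if (size() > 0) {
insertionSort(0,size()-1,compare);
} // if
} // sort(Comparator)

/**
* Sort the subarray between lb and ub.
* Pre: 0 <= lb <= ub < size()
* Pre: the elements in the array are comparable using the
*   given comparator.
* Post: get(lb) <= get(lb+1) <= ... <= get(ub)
* Post: elements are neither added nor removed.
*/
protected void insertionSort(int lb, int ub, Comparator compare)
throws IncomparableException
{
// An object that we're about to insert.
Object tmp;
// The correct place for that element in the sorted subarray.
int place;
// Initially, we know that the first element is "sorted" (all
// one element lists are sorted), so we step through the elements
// starting with the second element.
for (int i = lb+1; i <= ub; ++i) {
// Grab the element.
tmp = get(i);
// Find its place among the sorted elements lb .. i-1.
place = findPlace(tmp,lb,i-1,compare);
// Put the element there, shifting elements place .. i-1
// one position to the right.
insertElementAt(place,i,tmp);
} // for(i)
} // insertionSort(int,int,Comparator)

/**
* Find the correct position for an element.  (One possible
* implementation, using binary search.)
* Pre: Elements start through end are sorted.
* Post: Returns a p such that elements start through p-1 are
*   less than or equal to the element and elements p
*   through end are greater than the element.
*/
protected int findPlace(Object element, int start, int end,
Comparator compare)
throws IncomparableException
{
int lo = start;	// Everything before lo is <= element
int hi = end+1;	// Everything from hi on is > element
while (lo < hi) {
int mid = lo + (hi-lo)/2;
if (compare.lessThan(element,get(mid))) {
hi = mid;
} // if
else {
lo = mid+1;
} // else
} // while
return lo;
} // findPlace(Object,int,int,Comparator)

/**
* Insert an element at position pos, shifting the elements at
* positions pos through hole-1 one position to the right.
* (One possible implementation.)
* Pre: pos <= hole, and the value at position hole may be overwritten.
*/
protected void insertElementAt(int pos, int hole, Object element) {
for (int j = hole; j > pos; --j) {
set(j,get(j-1));
} // for(j)
set(pos,element);
} // insertElementAt(int,int,Object)
} // InsertionSortable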

```
• What's the running time? There are O(n) insertions and O(n) calls to `findPlace()` (which finds the proper place in the array to insert the element). Each insertion requires O(n) steps, and each place determination takes O(log_2(n)) steps (as long as we can use binary search), so the running time is O(n*(n+log_2(n))), which is O(n^2).
• What's the extra storage? It should be constant.
• How might we code this recursively?
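One possible answer to the last question, as a sketch only: sort everything but the last element recursively, then insert the last element into the sorted prefix. This version works on a plain `int` array and finds the place with a simple linear scan rather than binary search. The class name `RecursiveInsertionSort` is made up.

```java
/** Hypothetical sketch of a recursive insertion sort on a plain int array. */
class RecursiveInsertionSort {
    /** Sorts the whole array in place. */
    static void insertionSort(int[] values) {
        insertionSort(values, values.length - 1);
    }

    /** Sorts values[0..ub] in place. */
    static void insertionSort(int[] values, int ub) {
        // Base case: zero or one element is already sorted.
        if (ub <= 0) {
            return;
        }
        // Sort everything but the last element.
        insertionSort(values, ub - 1);
        // Insert the last element into the sorted prefix, shifting
        // larger elements one position to the right as we go.
        int tmp = values[ub];
        int place = ub;
        while (place > 0 && values[place - 1] > tmp) {
            values[place] = values[place - 1];
            --place;
        }
        values[place] = tmp;
    }

    public static void main(String[] args) {
        int[] values = {3, 5, 1, 4, 2};
        insertionSort(values);
        System.out.println(java.util.Arrays.toString(values));  // [1, 2, 3, 4, 5]
    }
}
```

Unlike the recursive selection sort, the recursive call here is not the last thing the method does, so this version is not tail recursive and uses O(n) stack frames.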

## Choosing a Sorting Method

• All three of these are O(n^2). Which should we choose?
• In this case, it turns out that the constants do make a difference. Typically selection sort and insertion sort run much more quickly than does bubble sort.
• Would we ever want to use bubble sort? Yes.
• Sometimes we can only swap neighboring elements. Consider a situation in which you can only store two objects in memory, and the rest in a file. Here's how we might do one round of bubbling up.
```
Open the input file
Open a temporary file for the "more sorted" version
Let largestSoFar = the first element in the input file
While (elements remain in the input file)
  Let nextElement = the next element in the input file
  If nextElement < largestSoFar then
    Write nextElement to the temporary file
  Else
    Write largestSoFar to the temporary file
    Set largestSoFar to nextElement
// We've read the whole file (N elements), but only written
// N-1 elements.  Write the last one.
Write largestSoFar to the temporary file
Close the input file
Close the temporary file
Replace the input file with the temporary file
```
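The round of bubbling above might be sketched in Java as follows. To keep the example self-contained, an `Iterator` stands in for the input file and a `List` stands in for the temporary file. Those stand-ins, and the names `BubblePass`, `bubbleOnce`, and `sort`, are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: bubbling with only two elements held at a time. */
class BubblePass {
    /** Does one round of bubbling, reading the input strictly in order. */
    static List<Integer> bubbleOnce(Iterator<Integer> input) {
        List<Integer> output = new ArrayList<>();  // the "temporary file"
        if (!input.hasNext()) {
            return output;  // empty input: nothing to do
        }
        int largestSoFar = input.next();
        while (input.hasNext()) {
            int nextElement = input.next();
            if (nextElement < largestSoFar) {
                output.add(nextElement);
            } else {
                output.add(largestSoFar);
                largestSoFar = nextElement;
            }
        }
        // We've read N elements but written only N-1.  Write the last one.
        output.add(largestSoFar);
        return output;
    }

    /** Sorts by doing N-1 rounds, each one replacing the "file". */
    static List<Integer> sort(List<Integer> data) {
        List<Integer> current = new ArrayList<>(data);
        for (int round = 1; round < data.size(); ++round) {
            current = bubbleOnce(current.iterator());
        }
        return current;
    }
}
```

For example, one round on the sequence 2, 3, 1 writes 2, 1, 3. Each round only ever holds `largestSoFar` and `nextElement`, which is the point of the technique.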
• Are there other times we might want to use bubble sort? It turns out that bubble sort is nice on some parallel computers. You can swap N/2 pairs of adjacent elements in one step.
• Round 1: all the cells numbered 2*i swap with 2*i+1 (if out of order)
• Round 2: all the cells numbered 2*i swap with 2*i-1 (if out of order)
• We may try acting out this last sorting routine.
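The two alternating rounds above can be simulated on an ordinary (sequential) computer. In this sketch the inner loop stands in for the swaps that a parallel machine would do all at once, and N rounds are used, which is enough for this alternating scheme on N elements. The class name `OddEvenSort` is made up.

```java
/** Hypothetical sequential simulation of the parallel bubbling scheme. */
class OddEvenSort {
    /** Sorts values in place into increasing order. */
    static void sort(int[] values) {
        int n = values.length;
        for (int round = 0; round < n; ++round) {
            // Even-numbered rounds pair cells (0,1), (2,3), ...
            // Odd-numbered rounds pair cells (1,2), (3,4), ...
            int start = round % 2;
            // No two pairs in a round share a cell, so a parallel
            // machine could do all of these swaps in a single step.
            for (int i = start; i + 1 < n; i += 2) {
                if (values[i] > values[i + 1]) {
                    int temp = values[i];
                    values[i] = values[i + 1];
                    values[i + 1] = temp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] values = {5, 4, 3, 2, 1};
        sort(values);
        System.out.println(java.util.Arrays.toString(values));  // [1, 2, 3, 4, 5]
    }
}
```

With one processor per pair, each round takes constant time, so the whole sort takes O(N) parallel steps even though it still does O(N^2) comparisons in total.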

## History

Tuesday, 10 August 1999

• Created as a blank outline.

Tuesday, 5 October 1999

• Filled in the body, based on outline 20 of CSC152 99S.
• Reformatted.
• Updated the code with more explanations and examples.
• Added short section on bubble sort.
• Added section on choosing a sorting method.


Disclaimer: Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.