# Outline of Class 45: Priority Queues and Heaps

Held: Wednesday, April 22, 1998

• Reminder: our next exam is scheduled for Tuesday, April 28. I will not give another assignment until after that exam, and the next assignment will be our final assignment. I've now completed a review sheet for that exam.
• I encourage those of you who haven't read the Kane and Krukowski reports to do so. Learn about what people are recommending for Grinnell.
• On Friday, May 1, the department will be hosting a picnic at Merrill park. Sign up to have fun with the majors, nonmajors, faculty, and families. It's gotta be more fun than the MathLAN, right?
• Start on Bailey, Chapter 13 (Dictionaries) for Friday's class.
• I've been asked to remind you that today is Earth Day. Hug a tree, clean a park, or do whatever else you deem appropriate. I'm working on recycling my office.

## Priority Queues

• Recall that in our discussion of linear structures, we suggested that there could be a number of policies for determining which element is removed. These included FIFO (first in, first out) and LIFO (last in, first out).
• In a priority queue, the element removed is the least (or, perhaps, greatest) element in the structure.
• As in our design of other structures, we need to consider the efficiency of the various operations.
• It may be a goal of the data structure designer to make `removeLeast()` as efficient as possible.
• It may be a goal of the data structure designer to make `insert()` as efficient as possible.
• It may be a goal to balance the costs of the various methods,
• or ...
• The design choices may be illustrated by an attempt to create a vector-based `PriorityQueue` class.
• We could ensure that the elements in the vector are always in order (smallest to largest), so that `removeLeast()` removes the first element.
• `insert()` is then an O(n) operation as we may have to shift all the elements to insert an element in the "correct" place.
• `removeLeast()` is also an O(n) operation, as we need to shift everything left after the removal.
• We could ensure that the elements in the vector are always in order (largest to smallest), so that `removeLeast()` removes the last element.
• `insert()` is still an O(n) operation as we may still have to shift.
• `removeLeast()` becomes an O(1) operation.
• We could leave the elements in the vector unordered, and run a `min()` routine to find the smallest element.
• `insert()` is either O(1) or O(n), depending on whether or not we have to grow the vector.
• `removeLeast()` is now an O(n) operation.
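The second trade-off above (keep the vector in descending order, so `removeLeast()` is O(1)) can be sketched concretely. This is a minimal illustration using an `ArrayList` in place of a vector; the class and method names are mine, not Bailey's:

```java
import java.util.ArrayList;

// Sketch of the "largest to smallest" policy: the vector is kept in
// descending order, so the least element is always at the end.
public class SortedVectorPriorityQueue {
    private final ArrayList<Integer> data = new ArrayList<>();

    // O(n): scan for the insertion point; add(pos, ...) shifts the rest.
    public void insert(int value) {
        int pos = 0;
        while (pos < data.size() && data.get(pos) > value) {
            pos++;
        }
        data.add(pos, value);
    }

    // O(1): the smallest element is the last one, so nothing shifts.
    public int removeLeast() {
        return data.remove(data.size() - 1);
    }

    public static void main(String[] args) {
        SortedVectorPriorityQueue pq = new SortedVectorPriorityQueue();
        pq.insert(5);
        pq.insert(2);
        pq.insert(7);
        System.out.println(pq.removeLeast()); // 2
        System.out.println(pq.removeLeast()); // 5
    }
}
```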

## An array-based implementation of binary trees

• If we restrict ourselves to binary trees (particularly complete binary trees), it is relatively easy to implement trees with arrays.
• How?
• Assume we have a complete binary tree in that every interior (nonleaf) node has exactly two children.
• Number the nodes in the tree in a left-to-right, level-by-level (breadth-first) traversal.
• This numbering gives you the positions in the array for each element.
• (If you don't want to build complete trees and are willing to waste space, you can store a special value to represent "nothing presently at this position".)
• As Bailey suggests in Section 11.4.1, this provides a very convenient way of figuring out where children belong.
• The root of the tree is in location 0.
• The left child of an element stored at location i can be found in location 2*i+1.
• The right child of an element stored at location i can be found in location 2*i+2 (also representable as 2*(i+1)).
• The parent of an element stored at location i can be found at location `floor`((i-1)/2).
• Can we prove all this?
• The root is obviously at position 0.
• We may be able to prove the child property by induction.
• We may need to induct on both the level of the tree and the position within that level.
• It may help to have an additional property that we'll also prove using induction. The first element at level i is at position (2^i)-1.
• The root is on level 0 and in position 0. 2^0-1 is 0.
• Assuming this property holds for all i between 0 and k, we need to prove it for k+1.
• The first element on level k+1 appears immediately after the last element on level k (by traversal order).
• The first element on level k appears at position (2^k)-1 (induction hypothesis).
• There are 2^k elements on level k (because it's a complete tree; this might also be proved by induction).
• The last element on level k is at position (2^k)-1+2^k-1 (by the previous results).
• The first element on level k+1 is at position (2^k)-1+2^k-1+1 (by previous results)
• This can be simplified to 2^(k+1)-1.
• Using this result, we can prove the child property based on induction over position within level.
• The initial (0th) element on level k is at position 2^k-1. Its left child is the initial element on level k+1, which is at position 2^(k+1)-1; indeed, 2*(2^k-1)+1 = 2^(k+1)-1. Its right child is the next element, at position 2^(k+1), which matches the 2*(i+1) form given above.
• The inductive part is trivial.
• We can prove the parent property based on the child property.
• Nodes in odd positions (of the form 2x+1) are left children. Their parents are at position x.
• Nodes in nonzero even positions (of the form 2x) are right children. Their parents are at position x-1.
• The floor((i-1)/2) expression unifies these two cases.
• These properties make it simple to move the cursor around the tree and to get values. However, they do make it more difficult to do some operations. For example, `setSubtree` might require modifying a large number of cells (since we've decided that it deletes the old subtree).
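The index arithmetic above is easy to check in code. Here is a minimal sketch (the method names are mine; note that Java's integer division gives exactly the needed floor for nonnegative operands):

```java
// Array-based binary tree index formulas, with the root at location 0.
public class TreeIndices {
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }   // == 2 * (i + 1)
    static int parent(int i) { return (i - 1) / 2; } // floor((i-1)/2) for i >= 1

    public static void main(String[] args) {
        // The root's children are at 1 and 2, and both point back to 0.
        System.out.println(left(0) + " " + right(0));    // 1 2
        System.out.println(parent(1) + " " + parent(2)); // 0 0
        // The first element of level k is at position 2^k - 1.
        for (int k = 0; k < 4; k++) {
            System.out.println("level " + k + " starts at " + ((1 << k) - 1));
        }
    }
}
```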

## Heaps

• Heaps are a particular form of binary tree designed to provide quick access to the smallest element in the tree.
• Heaps are yet another structure with multiple definitions; I'll use one slightly different from Bailey's.
• A heap is
• a binary tree,
• that is nearly complete in that
• at most one node has one child (the rest have zero or two)
• the nodes on the last level are at the left-hand-side of the level
• and that has the heap property: the value stored in each node is smaller than or equal to the values stored below it.
• Unlike many other data structures we've considered, heaps focus more on implementation than interface.
• (Bailey doesn't require the completeness property, but others do.)
• Here are some heaps of varying sizes
```
2       2     2     2         2         2
       / \    |    / \       / \       / \
      3   7   3   3   7     3   7     3   7
     / \  |                /         / \
    9   7 8               9         9   7
```
• Here are some non-heaps. Can you tell why?
```
    2          2
   / \        / \
  3   7      9   7
 /   / \    / \
9   8   8  9   7
```
• What good are heaps? They make it very easy to find the smallest element in a group, which is something we've looked for in the past.
• How do we modify heaps? Through insertion and deletion.
• How do we do insertion while maintaining the two key properties (near completeness and heap order)?
• It's clear where the heap expands ... it always expands at the end of the lowest level (if that level is full, the new element begins the next level).
• Putting the new element there may violate heap order, so we then need to rearrange the tree. The process of rearranging is often called percolating.
• Percolating is fairly simple: The present node is compared to its parent.
• If the present node is smaller (violating the heap property), we swap the two and continue up the tree.
• Otherwise, we're done.
• When we do the swap, the subtree that contains the old parent is clearly in heap order (the old parent was an ancestor to all the nodes in that subtree, and therefore smaller). The present node is clearly smaller than both of its new subtrees (it's smaller than the old parent, and the old parent was smaller than everything else below it).
• Eventually, we stop (either because we no longer violate heap property or because we reach the root).
• Here's an example, based on inserting the values 5, 6, 4, 4, 7, 2
• Initial tree:
```
5
```
• Insert 6, no need to swap.
```
  5
 /
6
```
• Insert 4, swap with root.
```
  5         4
 / \   to  / \
6   4     6   5
```
• Insert 4, swap once
```
    4            4
   / \          / \
  6   5   to   4   5
 /            /
4            6
```
• Insert 7, no need to swap.
```
    4
   / \
  4   5
 / \
6   7
```
• Insert 2, need to percolate up to root.
```
    4            4            2
   / \          / \          / \
  4   5  to   4   2  to   4   4
 / \  |       / \  |       / \  |
6   7 2      6   7 5      6   7 5
```
• How much time does this take? Well, the depth of a complete binary tree with n nodes is O(log_2(n)), and the algorithm may require swapping up from leaf to root, so the running time is also O(log_2(n)).
• Can we also do deletion and still maintain the desired properties? Certainly.
• After deleting the root, we move the rightmost leaf to the root. This maintains completeness.
• It may, however, violate the heap property, so we must percolate down.
• Percolating an element down is slightly more difficult, since there are two possible subtrees to move into. As you might guess, you swap with the smaller of the two children and then continue within that subtree.
• In some sense, deletion reverses the process of insertion (delete last element in the heap vs. insert last element in heap; percolate down vs. percolate up).
• Here's a sample case of removal of least element.
```
    2            ?            5            3            3
   / \          / \          / \          / \          / \
  3   4  to   3   4  to   3   4  to   5   4  to   4   4
 / \  |       / \  |       / \          / \          / \
6   4 5      6   4 5      6   4        6   4        6   5
```
• What's the running time? O(log_2(n)) again.
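The two percolation operations above can be sketched as a single array-based min-heap, using the index formulas from the previous section (children of i at 2*i+1 and 2*i+2, parent at (i-1)/2). This is a minimal illustration, not Bailey's implementation; the names are mine:

```java
import java.util.ArrayList;
import java.util.Collections;

// A small array-based min-heap with percolate-up insertion and
// percolate-down removal of the least element.
public class MinHeap {
    private final ArrayList<Integer> heap = new ArrayList<>();

    public void insert(int value) {
        heap.add(value);                   // expand at the end of the lowest level
        int i = heap.size() - 1;
        while (i > 0) {                    // percolate up toward the root
            int parent = (i - 1) / 2;
            if (heap.get(i) >= heap.get(parent)) break; // heap order restored
            Collections.swap(heap, i, parent);
            i = parent;
        }
    }

    public int removeLeast() {
        int least = heap.get(0);
        int last = heap.remove(heap.size() - 1); // take the rightmost leaf
        if (!heap.isEmpty()) {
            heap.set(0, last);                   // move it to the root
            int i = 0;
            while (2 * i + 1 < heap.size()) {    // percolate down
                int child = 2 * i + 1;
                if (child + 1 < heap.size()
                        && heap.get(child + 1) < heap.get(child))
                    child++;                     // pick the smaller child
                if (heap.get(i) <= heap.get(child)) break;
                Collections.swap(heap, i, child);
                i = child;
            }
        }
        return least;
    }

    public static void main(String[] args) {
        // The insertion example from above: 5, 6, 4, 4, 7, 2.
        MinHeap h = new MinHeap();
        for (int v : new int[] {5, 6, 4, 4, 7, 2}) h.insert(v);
        System.out.println(h.heap);          // [2, 4, 4, 6, 7, 5]
        System.out.println(h.removeLeast()); // 2
        System.out.println(h.heap);          // percolated back into heap order
    }
}
```

Running the insertion example yields the array form of the final tree drawn above: 2 at the root, then 4, 4, then 6, 7, 5.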

### Heap Sort

• We can use the heap structure to provide a fairly simple and quick sorting algorithm. To sort a set of n elements,
• insert them into a heap, one-by-one.
• remove them from the heap in order.
• What's the running time?
• There are n insertions. Each takes O(log_2(n)) time by our prior analysis.
• There are n "delete least" operations. Each takes O(log_2(n)) by our prior analysis.
• Hence, we've developed yet another O(n*log_2(n)) sorting algorithm.
• It is not an in-place sorting algorithm, and does require O(n) extra space for the heap.
• Most people implement heap sort with array-based trees. Some even define heap sort completely in terms of the array operations, and forget the origins.
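The algorithm above is a few lines of code: n insertions followed by n removals of the least element, O(n*log_2(n)) in total. In this sketch, `java.util.PriorityQueue` (itself an array-based binary heap) stands in for a hand-built heap:

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Heap sort via a separate heap: not in-place, O(n) extra space.
public class HeapSortDemo {
    static int[] heapSort(int[] values) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int v : values) heap.add(v);      // n insertions, O(log n) each
        int[] sorted = new int[values.length]; // the O(n) extra space
        for (int i = 0; i < sorted.length; i++)
            sorted[i] = heap.remove();         // n removals of the least
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            heapSort(new int[] {5, 6, 4, 4, 7, 2}))); // [2, 4, 4, 5, 6, 7]
    }
}
```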

