
**Held** Thursday, November 4, 1999

**Overview**

Today we will consider *heaps*, one of the more efficient
implementations of priority queues. Heaps are interesting in that
they represent the application of the ``divide and conquer''
technique to data structures.

**Notes**

- I won't be here next Monday. Professor Walker will be teaching class instead.
- Hence, you get an extra two days on assignment 4.

**Summary**

- A ``divide and conquer'' implementation of priority queues
- Heaps
- Basic tree terminology
- Heaps and Arrays
- Heapsort

- We've been discussing linear structures, structures that support five basic operations:

      /**
       * An interface for linear objects.  Designed to support both
       * static and dynamic implementation.
       *
       * @author Samuel A. Rebelsky
       * @version 1.1 of November 1999
       */
      public interface Linear {
        /**
         * Add an element.
         * Pre: There is space for another element.
         * Pre: The element to be added is not null.
         * Post: The element is added to the linear collection.
         * Post: The size of the collection increases by 1.
         */
        public void add(Object elt);

        /**
         * Delete and return an element.  Which element is removed is determined
         * by the ``deletion strategy'' of the particular linear collection.
         * Pre: The collection is not empty.
         * Post: Returns an element in the collection.
         * Post: Deletes an element from the collection.
         * Post: The size of the collection decreases by 1.
         */
        public Object get();

        /**
         * Determine the next object we will get when we call get.
         * Pre: The collection is not empty.
         * Post: Returns the ``first'' element in the collection.
         * Post: Does not otherwise affect the collection.
         */
        public Object peek();

        /**
         * Is the collection full?  (That is, it has no space remaining.)
         */
        public boolean isFull();

        /**
         * Is the collection empty?  (That is, it has no elements.)
         */
        public boolean isEmpty();
      } // interface Linear

- Priority queues are linear structures in which
  - elements have associated priorities
  - the element that `get` and `peek` return is the highest-priority element

- At the end of the prior class, we decided on an interesting way to represent priority queues.
- We used ``divide and conquer'' to design this data structure. That is, we broke up the parts of the data structure to build substructures.
- We might express that technique with the following Java class:

      public class PriorityQueue {
        // +--------+--------------------------------------------------
        // | Fields |
        // +--------+

        /** The highest-priority element of the priority queue. */
        private Object top;

        /**
         * Half of the remaining elements.  Set to null if there are
         * no other elements.
         */
        private PriorityQueue left;

        /**
         * The other half of the remaining elements.  Set to null
         * if there are no other elements.
         */
        private PriorityQueue right;

        /**
         * The comparator used to determine priorities.
         */
        private Comparator prioritize;

        // ...
      } // PriorityQueue

- We noted that this looks something like a tree.
- We developed a method for deleting the highest-priority element:
- Remove that element. This creates a hole at the root of the tree.
- Put the higher-priority of the two subtree roots in that hole, creating a hole in that subtree.
- Recurse on the subtree.
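
The deletion steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the names `TreePQSketch`, `Node`, and `removeRoot` are hypothetical, elements are plain `int` values, and a smaller number is taken to mean a higher priority (the class shown earlier uses `Object` and a comparator instead).

```java
import java.util.NoSuchElementException;

/** A hypothetical, simplified version of the tree-shaped priority queue. */
final class TreePQSketch {
    /** One piece of the structure: a top element plus two sub-queues. */
    static final class Node {
        int top;
        Node left;
        Node right;

        Node(int top, Node left, Node right) {
            this.top = top;
            this.left = left;
            this.right = right;
        }
    }

    private Node root;

    TreePQSketch(Node root) {
        this.root = root;
    }

    boolean isEmpty() {
        return root == null;
    }

    /** Remove and return the highest-priority (here: smallest) element. */
    int get() {
        if (root == null) {
            throw new NoSuchElementException("priority queue is empty");
        }
        int result = root.top;
        root = removeRoot(root);
        return result;
    }

    /**
     * Fill the hole at the top of this subtree with the higher-priority
     * root of its two subtrees, then recurse into the subtree that gave
     * up its root.  Returns the subtree that remains.
     */
    private static Node removeRoot(Node node) {
        if (node.left == null) {
            return node.right;   // nothing on the left: the right side moves up
        }
        if (node.right == null) {
            return node.left;    // nothing on the right: the left side moves up
        }
        if (node.left.top <= node.right.top) {
            node.top = node.left.top;
            node.left = removeRoot(node.left);
        } else {
            node.top = node.right.top;
            node.right = removeRoot(node.right);
        }
        return node;
    }
}
```

Note that nothing in this sketch keeps the two subtrees the same size: which side shrinks depends entirely on the values.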

- That also gave us a method for adding elements:
- Find a hole at the bottom of the tree.
- Put the element there.
- If it has a higher priority than the value above it, swap them (again and again and again).

- We noted that if the tree were balanced, we could do add and delete in $O(\log_2 n)$ steps.
- But we weren't sure that the tree would stay balanced. You could, for example, always find that the smaller element is in the left subtree during deletion.
- We were left with the question of how we ensure that the tree stays balanced.
- We'll address that question today.

- The structures we just described are similar to a traditional implementation of priority queues known as the *heap*.
- Heaps are a particular form of binary tree designed to provide quick access to the highest-priority element in the tree.
- Unlike our structures, heaps must be balanced (it's part of the definition).
- A heap is
  - a *binary tree*,
  - that is *nearly complete* in that
    - every level above the last is full
    - at most one node has one child (the rest have zero or two)
    - the nodes on the last level are at the left-hand-side of the level
  - and that has the *heap property*: the value stored in each node is of higher priority than the values stored below it.

- Heaps are a kind of tree. Hence, it is important that we consider some basic tree terminology.
- The *root* is the top or beginning of the tree.
- A *node* is a part of the tree. (While this has the same name as the nodes we often use to implement trees and lists, you should think of it as independent of implementation.)
- A node may have one or more *children*.
- Each node other than the root has a *parent*.
- Nodes without children are called *leaves*.
- Nodes with children are called *interior nodes*.
- The *level* of a node is the number of steps from root to that node.
  - The root is at level 0.
  - The direct children of the root are at level 1.
  - The children of those nodes are at level 2.
  - ...
- The *depth* of a tree is the largest level of any node in the tree.
- The *size* of a tree is the number of nodes in the tree.
- In a *binary tree*, no node has more than two children.
  - The children are typically designated as *left* and *right*.
- In a *complete* tree, every level is full (all the interior nodes have the maximum number of children).
- We'll return to trees in the weeks to come. For today, we'll stick with the simple heaps we've just defined.

- An essential aspect of heaps is the heap property.
- We can talk about a global heap property (that the value stored at one node in the tree is of higher priority than anything stored below it).
- We can also speak about a local heap property (that the value stored at one node is of higher priority than the two values stored directly below it).
- If the local heap property holds everywhere in the tree, then the global heap property holds everywhere in the tree.
- Consider any path down from a node: by the local property, each step can only lead to a value of lower priority (in our examples, a larger number), so everything below the node has lower priority than the node.
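
To make the local/global distinction concrete, here is a minimal sketch that checks each property on a small tree. The names `HeapNode` and `HeapPropertyCheck` are hypothetical, values are `int`s with smaller meaning higher priority, and ties between parent and child are allowed (an assumption). The sketch checks heap order only, not near-completeness.

```java
/** A hypothetical binary-tree node holding an int value. */
final class HeapNode {
    final int value;
    final HeapNode left;
    final HeapNode right;

    HeapNode(int value, HeapNode left, HeapNode right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

final class HeapPropertyCheck {
    /** Local property everywhere: no child is smaller than its parent. */
    static boolean hasLocalHeapProperty(HeapNode node) {
        if (node == null) {
            return true;
        }
        if (node.left != null && node.left.value < node.value) {
            return false;
        }
        if (node.right != null && node.right.value < node.value) {
            return false;
        }
        return hasLocalHeapProperty(node.left) && hasLocalHeapProperty(node.right);
    }

    /**
     * Global property everywhere: no descendant is smaller than its ancestor.
     * Checked the slow, direct way so it can be compared with the local check.
     */
    static boolean hasGlobalHeapProperty(HeapNode node) {
        if (node == null) {
            return true;
        }
        return allAtLeast(node.left, node.value)
            && allAtLeast(node.right, node.value)
            && hasGlobalHeapProperty(node.left)
            && hasGlobalHeapProperty(node.right);
    }

    /** Is every value in this subtree at least the given bound? */
    private static boolean allAtLeast(HeapNode node, int bound) {
        if (node == null) {
            return true;
        }
        return node.value >= bound
            && allAtLeast(node.left, bound)
            && allAtLeast(node.right, bound);
    }
}
```

On any tree the two checks should agree, which is the claim made above; the local check simply visits each node once.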

- Here are some heaps of varying sizes
2 2 2 2 2 2 3 / \ / / \ / \ / \ / \ 3 7 3 3 7 3 7 3 7 3 3 / \ | / / \ 9 7 8 9 9 7

- Here are some non-heaps. Can you tell why?
2 2 2 2 2 / \ / \ / \ /|\ | 3 7 9 7 7 3 3 3 3 3 / / \ / \ / \ / \ 9 8 8 9 7 9 7 4 7

- How do we modify heaps? Through insertion and deletion.
- How do we do insertion while maintaining the two key properties?
- We'll maintain one property (near-completeness) and then look to restore the other (heap order).
- It turns out that we add and delete in slightly different ways than we suggested above if we want to maintain near-completeness.

- When we add an element to the heap, we just need to put the element at
the end of the lowest level.
- If that level is full, we put it at the left end of the next level.

- After removing the smallest element from the heap, we have a ``hole'' at the top of the heap. We put the last element in the last level in that hole (which may sound like an odd idea, but at least it maintains the near-completeness property).
- Both of these operations may violate the heap order (in fact, the second one is almost guaranteed to violate heap order), so we need to rearrange the tree.
- This rearrangement is often called *percolating*. When we put an element at the bottom, we percolate *up*. When we put an element at the top, we percolate *down*.

- Recall that to add an element to a heap, we
- put the element at the end of the lowest level and
- percolate up.

- Percolating up is fairly simple
- At each step, we compare the percolated node to its parent.
- If the percolated node is smaller (violating the heap property), we swap the two and continue up the tree.
- Otherwise, we're done, since the heap property is no longer violated.

- When we do the swap,
- the subtree that contains the old parent is clearly in heap order (the old parent was an ancestor to all the nodes in that subtree, and therefore smaller) and
- the present node is clearly smaller than both of its new subtrees (it's smaller than the old parent, and the old parent was smaller than everything else below it).

- Eventually, we stop (either because we no longer violate heap property or because we reach the root).
- Here's an example, based on inserting the values 5, 6, 4, 4, 7, 2.
  - Initial tree:

        5

  - Insert 6, no need to swap.

          5
         /
        6

  - Insert 4, swap with root.

          5          4
         / \   to   / \
        6   4      6   5

  - Insert 4, swap once.

            4            4
           / \          / \
          6   5   to   4   5
         /            /
        4            6

  - Insert 7, no need to swap.

            4
           / \
          4   5
         / \
        6   7

  - Insert 2, need to percolate up to root.

            4              4              2
           / \            / \            / \
          4   5    to    4   2    to    4   4
         / \  |         / \  |         / \  |
        6   7 2        6   7 5        6   7 5

- How much time does this take? Well, the depth of a complete binary tree with *n* nodes is $O(\log_2 n)$, and the algorithm may require swapping up from leaf to root, so the running time is also $O(\log_2 n)$.
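
The insertion procedure above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `MinHeapInsertSketch` and its methods are hypothetical, values are `int`s with smaller meaning higher priority, and the tree is stored level by level in a list, so that the parent of position `i` sits at position `(i - 1) / 2` (the array layout discussed later in this outline).

```java
import java.util.ArrayList;
import java.util.List;

/** A hypothetical min-heap that supports only insertion. */
final class MinHeapInsertSketch {
    /** The nodes of the tree, level by level, left to right. */
    private final List<Integer> items = new ArrayList<>();

    /** Put the value at the end of the lowest level, then percolate up. */
    void add(int value) {
        items.add(value);
        percolateUp(items.size() - 1);
    }

    /** Swap the element at position i with its parent while it is smaller. */
    private void percolateUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (items.get(i) >= items.get(parent)) {
                break;               // heap order holds again; we're done
            }
            int tmp = items.get(i);
            items.set(i, items.get(parent));
            items.set(parent, tmp);
            i = parent;              // continue up the tree
        }
    }

    /** A copy of the current level-by-level contents. */
    List<Integer> snapshot() {
        return new ArrayList<>(items);
    }
}
```

Inserting 5, 6, 4, 4, 7, 2 in that order leaves the list as 2, 4, 4, 6, 7, 5, which is the final tree of the example read level by level.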

- Recall that to delete the smallest element of the heap we need to
- remove and remember the root;
- put the last element of the last level at the root;
- percolate down;
- return the old root.

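- Percolating an element down is slightly more difficult than percolating up, since there are two possible subtrees to move to. As you might guess, you must swap with the smaller of the two children (that is, the root of the subtree whose root is smaller) and then continue within that subtree. You stop when the element is no larger than either of its children, or when it has become a leaf.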
- Why don't we worry about increasing the size of the wrong subtree? (As we did in building binary search trees.) Because we're not changing the size of the subtrees. We're swapping elements, but not adding them.

- In some sense, deletion reverses the process of insertion (delete last element in the heap vs. insert last element in heap; percolate down vs. percolate up).
- Here's a sample case of removal of least element.

          2              ?              5              3              3
         / \            / \            / \            / \            / \
        3   4    to    3   4    to    3   4    to    5   4    to    4   4
       / \  |         / \  |         / \            / \            / \
      6   4 5        6   4 5        6   4          6   4          6   5


- What's the running time? $O(\log_2 n)$ again.
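
Deletion with percolate-down might look roughly like this. As before, this is a sketch under assumptions: `MinHeapDeleteSketch`, `removeMin`, and `percolateDown` are hypothetical names, values are `int`s with smaller meaning higher priority, and the heap is stored level by level in a list, with the children of position `i` at positions `2i+1` and `2i+2`.

```java
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

final class MinHeapDeleteSketch {
    /** Remove and return the smallest element of a heap stored in a list. */
    static int removeMin(List<Integer> heap) {
        if (heap.isEmpty()) {
            throw new NoSuchElementException("heap is empty");
        }
        int oldRoot = heap.get(0);                 // remove and remember the root
        int last = heap.remove(heap.size() - 1);   // last element of the last level
        if (!heap.isEmpty()) {
            heap.set(0, last);                     // put it in the hole at the root
            percolateDown(heap, 0);
        }
        return oldRoot;
    }

    /** Swap the element at position i with its smaller child while needed. */
    static void percolateDown(List<Integer> heap, int i) {
        int size = heap.size();
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int smallest = i;
            if (left < size && heap.get(left) < heap.get(smallest)) {
                smallest = left;
            }
            if (right < size && heap.get(right) < heap.get(smallest)) {
                smallest = right;
            }
            if (smallest == i) {
                return;              // no child is smaller; heap order holds
            }
            Collections.swap(heap, i, smallest);
            i = smallest;            // continue within that subtree
        }
    }
}
```

Applied to the sample heap 2, 3, 4, 6, 4, 5 (read level by level), `removeMin` returns 2 and leaves 3, 4, 4, 6, 5, matching the last tree in the example.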

- When considering lists and other structures, we found ways to implement
the structures with both arrays and nodes.
- It seems likely that we can implement heaps with special nodes (as we did at the beginning of class).
- Can we also implement heaps with arrays?

- It turns out to be relatively easy to implement binary heaps and other binary trees, particularly complete binary trees, with arrays.
- How?
- Assume we have a complete binary tree in that every interior (nonleaf) node has exactly two children.
- Number the nodes, starting at the top and working across each level. (The root is node 0, its left child is node 1, the root's right child is node 2, node 1's left child is node 3, node 1's right child is node 4, ...).

             0
           /   \
          1     2
         / \   / \
        3   4 5   6
       / \
      7   8


- This numbering gives you the positions in the array for each element.
- (If you don't want to build complete trees and are willing to waste space, you can store a special value to represent ``nothing at this position''.)

- This provides a very convenient way of figuring out where children belong.
- The root of the tree is in location 0.
- The left child of an element stored at location *i* can be found in location $2i+1$.
- The right child of an element stored at location *i* can be found in location $2i+2$ (also representable as $2(i+1)$).
- The parent of an element stored at location *i* can be found at location $\lfloor (i-1)/2 \rfloor$.
- Can we prove all this? Yes, but that's an exercise for another day.
- These properties make it simple to move the cursor around the tree and to get values.
- Note that we have an interesting ``double indirection'' here.
- We've decided to implement priority queues with this divide-and-conquer structure.
- We've decided to implement this divide-and-conquer structure with arrays.

- That is, we've given an implementation of an implementation of a data structure.
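
The index arithmetic for the array layout can be written as three small helpers. The class and method names here are hypothetical, chosen only for illustration.

```java
/** Hypothetical helpers for moving around a heap stored in an array. */
final class HeapIndexSketch {
    /** Position of the left child of the element at position i. */
    static int leftChild(int i) {
        return 2 * i + 1;
    }

    /** Position of the right child of the element at position i. */
    static int rightChild(int i) {
        return 2 * i + 2;
    }

    /**
     * Position of the parent of the element at position i, for i >= 1.
     * Java's integer division rounds toward zero, which equals the floor
     * here because i - 1 is never negative.
     */
    static int parent(int i) {
        return (i - 1) / 2;
    }
}
```

For the numbered tree above, `leftChild(3)` and `rightChild(3)` give 7 and 8, and `parent(7)` and `parent(8)` both give 3.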

- We can use the heap structure to provide a fairly simple and quick sorting algorithm. To sort a set of *n* elements,
  - insert them into a heap, one-by-one.
  - remove them from the heap in order.

- What's the running time?
  - There are *n* insertions. Each takes $O(\log_2 n)$ time by our prior analysis.
  - There are *n* ``delete least'' operations. Each takes $O(\log_2 n)$ time by our prior analysis.
- Hence, we've developed yet another $O(n \log_2 n)$ sorting algorithm.
- Can we do this in place (provided, of course, that our original information was in an array)? You'll need to think about it.
- Most people implement heap sort with array-based heaps. Some even define heap sort completely in terms of the array operations, and forget the origins.
- Here's the core of heap sort: a loop that percolates up each element of the array in turn, so that the array becomes a heap.

      for (int i = 1; i < stuff.length; ++i) {
        percolateUp(stuff[i]);
      } // for
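
One possible way to finish the sort in place is sketched below. It rests on assumptions that go beyond the outline: the names `HeapSortSketch`, `percolateUp`, and `percolateDown` are hypothetical, the helpers take array positions rather than values, and the heap is a *max*-heap (largest element at the root), so that each removed element can be parked at the end of the array and the result comes out in ascending order.

```java
/** A hypothetical in-place heap sort for int arrays. */
final class HeapSortSketch {
    static void heapSort(int[] stuff) {
        // Phase 1: grow a max-heap in stuff[0..i], one element at a time.
        for (int i = 1; i < stuff.length; ++i) {
            percolateUp(stuff, i);
        }
        // Phase 2: repeatedly move the largest element to the end of the
        // unsorted region, then restore heap order in what remains.
        for (int end = stuff.length - 1; end > 0; --end) {
            swap(stuff, 0, end);
            percolateDown(stuff, 0, end);   // the heap is now stuff[0..end-1]
        }
    }

    /** Swap the element at position i with its parent while it is larger. */
    private static void percolateUp(int[] a, int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (a[i] <= a[parent]) {
                return;
            }
            swap(a, i, parent);
            i = parent;
        }
    }

    /** Swap the element at position i with its larger child while needed. */
    private static void percolateDown(int[] a, int i, int size) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < size && a[left] > a[largest]) {
                largest = left;
            }
            if (right < size && a[right] > a[largest]) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            swap(a, i, largest);
            i = largest;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Both phases do at most *n* percolations of $O(\log_2 n)$ steps each, which is consistent with the $O(n \log_2 n)$ bound above, and no second array is needed.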

Tuesday, 10 August 1999

- Created as a blank outline.

Wednesday, 3 November 1999

- Filled in the details, many of which were taken from outline 46 of CS152 99S
- Significantly rearranged that material.
- Added introduction, based on different topics covered this semester.



**Disclaimer** Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.

This page may be found at http://www.math.grin.edu/~rebelsky/Courses/CS152/99F/Outlines/outline.37.html

Source text last modified Wed Nov 3 16:58:12 1999.

This page generated on Thu Nov 4 20:58:40 1999 by Siteweaver.

Contact our webmaster at rebelsky@grinnell.edu