Fundamentals of Computer Science II (CSC-152 97F)
Outline of Class 44: Heaps
- As most of you know, there is a lot of student discussion about
student representatives to the presidential search board. In case
you hadn't heard, there is an election from 8-4 today in the
campus post office. You can find the candidate slates by fingering
jones1 or on a web page at
- I had allocated a lot of my weekend time to grading. However, I also
participated in a half-day faculty discussion group on minority faculty,
which I felt was important. My 223 homeworks took significantly longer
to grade than I expected, so I still haven't graded your stuff (and
don't know when I will be able to do so).
- I've gotten at least one request to make the tree assignment your last
assignment. It is likely to be your last programming assignment,
although I may still assign one or two more written assignments
that should take significantly less time.
- In conjunction with that decision, we may spend a little less time on
coding details in the days to come.
- I'm working on a paper that discusses issues in the transition from
Scheme to Java. If you have any comments, I'd appreciate them (even
if you made the transition from another language to Java). Here
are some things that I've noted:
- The use of a module system can be confusing.
- The transition to an object-oriented model is difficult, particularly
without a good set of examples (and, possibly, a text).
- The transition from implicitly to explicitly returning values from
a function is difficult.
- Exception handling is a much different way of looking at error
checking, one that even faculty (me) and book writers (Bailey)
have difficulty with.
- The presence of a documentation-generator, like Javadoc, affects
the way we document our code.
- The transition from recursion to iteration was not too difficult.
- The understanding of variables and references was not too difficult.
- Pieces of sample Java code (and the Java compiler) require you to
know more about the language than is often necessary. E.g.,
references to abstract classes and protection levels.
- It wasn't possible to begin talking about data structure implementation
until after a few weeks discussing some of these issues.
- If we restrict ourselves to binary trees, it is relatively easy
to implement trees with arrays.
- Assume we have a complete binary tree: every interior (nonleaf)
node has exactly two children, and all of the leaves are on the same level.
- Number the nodes in the tree in left-to-right, breadth-first (level) order.
- This numbering gives you the positions in the array for each element.
- (You can store a special value to represent "nothing presently at this
position".)
- As Bailey suggests in Section 12.4.1, this provides a very convenient
way of figuring out where children belong.
- The root of the tree is in location 0.
- The left child of an element stored at location i can
be found in location 2*i+1.
- The right child of an element stored at location i can
be found in location 2*i+2 (also representable as 2*(i+1)).
- The parent of an element stored at location i (for i > 0) can be
found at location floor((i-1)/2).
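These three index calculations are easy to capture in code; here is a minimal sketch (the class and method names are mine, chosen for illustration):

```java
// Index arithmetic for a binary tree stored in an array,
// with the root at position 0.
public class ArrayTreeIndex {
    public static int left(int i)   { return 2 * i + 1; }
    public static int right(int i)  { return 2 * i + 2; }
    // Java's int division truncates, which gives floor((i-1)/2)
    // for the nonnegative i we use here.
    public static int parent(int i) { return (i - 1) / 2; }

    public static void main(String[] args) {
        // The root's children are at 1 and 2; both point back to 0.
        System.out.println(left(0) + " " + right(0));     // 1 2
        System.out.println(parent(1) + " " + parent(2));  // 0 0
    }
}
```

Note that one parent method suffices for both left and right children, which is exactly the "unifying expression" discussed below.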
- Can we prove all this?
- The root is obviously at position 0.
- We may be able to prove the child property by induction.
- We may need to induce on both the level of the tree and the position
within the level.
- It may help to have an additional property that we'll also
prove using induction. The first element at level i is
at position (2^i)-1.
- The root is on level 0 and in position 0. 2^0-1 is 0.
- Assuming this property holds for all i between 0
and k, we need to prove it for k+1.
- The first element on level k+1 appears immediately
after the last element on level k (by traversal order).
- The first element on level k appears at position
(2^k)-1 (induction hypotheses).
- There are 2^k elements on level k (because
it's a complete tree; this might also be proved by induction).
- The last element on level k is at position
(2^k)-1+2^k-1 (by previous results)
- The first element on level k+1 is at position
(2^k)-1+2^k-1+1 (by previous results)
- This can be simplified to 2^(k+1)-1.
- Using this result, we can prove the child property based on
induction over position within level.
- The initial (0th) element on level k is at position
2^k-1. Its left child is the initial element on
level k+1 which is at position 2^(k+1)-1.
2*(2^k-1)+1 = 2^(k+1)-1.
Its right child is the next element, which is therefore at
position 2^(k+1), which is the second form given above.
- The inductive part is straightforward: if the element at position p
on level k has its left child at 2p+1, then the next element (at
position p+1) has its children immediately after, at 2p+3 = 2(p+1)+1
and 2(p+1)+2.
- We can prove the parent property based on the child property.
- Nodes in odd positions (of the form 2x+1) are left children.
Their parents are at position x.
- Nodes in nonzero even positions (of the form 2x, for x >= 1) are right
children. Their parents are at position x-1.
- The expression floor((i-1)/2) unifies these two cases: for i = 2x+1
it gives x, and for i = 2x it gives x-1.
- These properties make it simple to move the cursor around the tree
and to get values. However, they do make it more difficult to
do some operations. For example, replacing a subtree would
require modifying a large number of cells (since we've decided that
it deletes the old subtree).
- Heaps are a particular form of binary tree designed to provide
quick access to the smallest element in the tree.
- Heaps are yet another structure with multiple definitions;
I'll use one slightly different from Bailey's.
- A heap is
- a binary tree,
- that is nearly complete in that
- at most one node has one child (the rest have zero or two)
- the nodes on the last level are at the left-hand side of the
tree,
- and that has the heap property: the value stored in
each node is smaller than or equal to the values stored below it.
- Unlike many other data structures we've considered, heaps focus
more on implementation than interface.
- (Bailey doesn't require the completeness property, but others do.)
- Here are some heaps of varying sizes
2      2       2       2         2         2
      / \      |      / \       / \       / \
     3   7     3     3   7     3   7     3   7
              / \    |        /         / \
             9   7   8       9         9   7
- Here are some non-heaps. Can you tell why?
    2            2
   / \          / \
  3   7        9   7
 /   / \      / \
9   8   8    9   7
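In the array representation, near-completeness comes for free (the elements simply occupy positions 0 through n-1), so testing whether an array is a heap reduces to checking the order property. A sketch, with an illustrative class name of my choosing:

```java
// Check the heap-order property for an array-based heap:
// every element must be greater than or equal to its parent.
public class HeapCheck {
    public static boolean isHeap(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[(i - 1) / 2] > a[i]) {
                return false;  // a parent is larger than its child
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isHeap(new int[] {2, 3, 7, 9, 7}));  // true
        System.out.println(isHeap(new int[] {2, 9, 7, 9, 7}));  // false: 9 has a child 7
    }
}
```

The second call mirrors the second non-heap above: a node (9) is larger than one of the values below it.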
- What good are heaps? They make it very easy to find the smallest element
in a group, which is something we've looked for in the past.
- How do we modify heaps? Through insertion and deletion.
- How do we do insertion while maintaining the two key properties
(near completeness and heap order)?
- It's clear where the heap expands ... it always expands at the end
of the lowest level (if that level is full, the new element is added at
the beginning of the next level).
- Putting the new element there may violate heap order, so we then need
to rearrange the tree. The process of rearranging is often called
percolating (up).
- Percolating is fairly simple: The present node is compared to its parent.
- If the present node is smaller (violating the heap property), we swap
the two and continue up the tree.
- Otherwise, we're done.
- When we do the swap, the subtree that contains the old parent is clearly
in heap order (the old parent was an ancestor to all the nodes in that
subtree, and therefore smaller). The present node is clearly smaller
than both of its new subtrees (it's smaller than the old parent, and
the old parent was smaller than everything else below it).
- Eventually, we stop (either because we no longer violate the heap
property or because we reach the root).
- Here's an example, based on inserting the values 5, 6, 4, 4, 7, 2
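That insertion sequence can be traced in code. Here is a minimal sketch of insertion with percolate-up (the class name, fixed capacity, and field names are mine, not Bailey's):

```java
import java.util.Arrays;

// Insertion with percolate-up in an array-based heap (sketch).
public class HeapInsert {
    int[] data = new int[16];  // fixed capacity, enough for this demo
    int size = 0;

    void add(int value) {
        data[size] = value;    // place at the end of the lowest level
        int i = size++;
        // Percolate up: swap with the parent while smaller than it.
        while (i > 0 && data[(i - 1) / 2] > data[i]) {
            int p = (i - 1) / 2;
            int tmp = data[p]; data[p] = data[i]; data[i] = tmp;
            i = p;
        }
    }

    public static void main(String[] args) {
        HeapInsert h = new HeapInsert();
        for (int v : new int[] {5, 6, 4, 4, 7, 2}) {
            h.add(v);
        }
        // The smallest value, 2, ends up at the root (position 0).
        System.out.println(Arrays.toString(Arrays.copyOf(h.data, h.size)));
        // [2, 4, 4, 6, 7, 5]
    }
}
```

Tracing by hand: 2 is inserted at position 5, swaps with the 5 at position 2, then swaps with the 4 at the root.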
- How much time does this take? Well, the depth of a complete binary
tree with n nodes is O(log_2(n)), and the algorithm may require swapping
up from leaf to root, so the running time is also O(log_2(n)).
- Can we also do deletion and still maintain the desired properties?
- After deleting the root, we move the rightmost leaf to the root.
This maintains completeness.
- It may, however, violate the heap property, so we must percolate
the new root down.
- Percolating an element down is slightly more difficult, since there
are two possible subtrees to move to. As you might guess, you must
swap with the root of the smaller subtree and then continue within
that subtree.
- In some sense, deletion reverses the process of insertion (delete last
element in the heap vs. insert last element in heap; percolate down vs.
percolate up).
- Here's a sample case of removal of least element.
    2          ?          5          3          3
   / \        / \        / \        / \        / \
  3   4 to   3   4 to   3   4 to   5   4 to   4   4
 / \  |     / \  |     / \        / \        / \
6   4 5    6   4 5    6   4      6   4      6   5
- What's the running time? O(log_2(n)) again.
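The removal example above can also be sketched in code: move the last leaf to the root, then percolate it down toward the smaller child. (Again, the class name and fields are mine, chosen for illustration.)

```java
import java.util.Arrays;

// Deletion of the least element with percolate-down (sketch).
public class HeapRemove {
    int[] data;
    int size;

    HeapRemove(int[] initial) {  // assumes `initial` is already a heap
        data = initial.clone();
        size = initial.length;
    }

    int removeMin() {
        int min = data[0];
        data[0] = data[--size];        // move the last leaf to the root
        int i = 0;
        while (2 * i + 1 < size) {     // while a left child exists
            int child = 2 * i + 1;
            if (child + 1 < size && data[child + 1] < data[child]) {
                child++;               // pick the smaller of the two children
            }
            if (data[i] <= data[child]) break;  // heap order restored
            int tmp = data[i]; data[i] = data[child]; data[child] = tmp;
            i = child;
        }
        return min;
    }

    public static void main(String[] args) {
        // The heap from the removal example above: 2,(3,4),(6,4,5).
        HeapRemove h = new HeapRemove(new int[] {2, 3, 4, 6, 4, 5});
        System.out.println(h.removeMin());                           // 2
        System.out.println(Arrays.toString(Arrays.copyOf(h.data, h.size)));
        // [3, 4, 4, 6, 5]
    }
}
```

The final array matches the last tree in the diagram: 3 at the root, 4 and 4 below it, then 6 and 5.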
- We can use the heap structure to provide a fairly simple and quick
sorting algorithm. To sort a set of n elements,
- insert them into a heap, one-by-one.
- remove them from the heap in order.
- What's the running time?
- There are n insertions. Each takes O(log_2(n)) time by our
prior analysis.
- There are n "delete least" operations. Each takes O(log_2(n)) time
by our prior analysis.
- Hence, we've developed yet another O(n*log_2(n)) sorting algorithm.
- It is not an in-place sorting algorithm, and does require O(n)
extra space for the heap.
- Most people implement heap sort with array-based trees. Some even
define heap sort completely in terms of the array operations, and
forget the origins.
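The insert-everything-then-remove-everything description of heap sort can be sketched using a library heap. (I'm using java.util.PriorityQueue, which is not part of the course's toolkit and did not exist in the JDK of this era; treat it purely as an illustration of the idea.)

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Heap sort as described above: insert everything into a heap,
// then repeatedly remove the least element.  O(n*log_2(n)) time,
// with O(n) extra space for the heap.
public class HeapSortDemo {
    public static int[] heapSort(int[] values) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int v : values) {
            heap.add(v);                // n insertions, O(log_2(n)) each
        }
        int[] sorted = new int[values.length];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = heap.remove();  // n delete-least ops, O(log_2(n)) each
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(heapSort(new int[] {5, 6, 4, 4, 7, 2})));
        // [2, 4, 4, 5, 6, 7]
    }
}
```

An array-based in-place variant would instead heapify the input array and shrink the heap region from the back, which is the form most people mean by "heap sort".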
Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.
Source text last modified Mon Nov 24 09:00:36 1997.