• Java 2 Platform Standard Edition 5.0 API specification
• OpenJDK source code repository for library classes
• Source code from Data structures and problem-solving using Java, third edition
As Weiss explains in chapter 18, the height of a tree is one greater than
the height of its highest subtree, or zero if it has no subtrees. It is
straightforward to compute the height of a binary search tree by adopting
the convention that the height of null is -1 (base case) and using
recursion on any non-null binary search tree to find the heights of
its subtrees, taking the larger, and adding one.
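The recursion just described can be sketched as follows. This is a minimal illustration, not the textbook's code: the `Node` class and the `TreeHeight` wrapper are hypothetical stand-ins, and the node class that accompanies the textbook may have different fields and names.

```java
class TreeHeight {
    // Hypothetical minimal node type, used only for this sketch.
    static class Node {
        int value;
        Node left;
        Node right;

        Node(int value) {
            this.value = value;
        }
    }

    // Height of the tree rooted at t, using the convention that
    // the height of null is -1.
    static int height(Node t) {
        if (t == null) {
            return -1; // base case
        }
        // Find the heights of both subtrees, take the larger, add one.
        return 1 + Math.max(height(t.left), height(t.right));
    }
}
```

With this convention a single node has height 0, since both of its subtrees are null.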
Different binary search trees constructed from the same data values can have different heights. For instance, the binary search trees in Figure 19.20 on page 646 of our textbook are all constructed from the same data (the integers 1, 2, and 3), but the height of tree (c) is 1, while all the other trees have height 2.
The data in this example can be arranged in six different orders, and the structure of the constructed binary search tree depends on the order in which the data are added. Four of the six orders result in trees of height 2, while the other two orders result in tree (c), which has height 1. If we average over all possible orders, the mean height of the resulting tree is (2 + 2 + 2 + 2 + 1 + 1)/6, or 5/3.
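For a data set this small, the count can be checked by brute force. The sketch below builds a tree from every ordering of the data and adds up the heights. The class name `AllOrders`, its nested `Node` type, and the helper methods are assumptions made for illustration; they are not the textbook's `BinarySearchTree` class.

```java
class AllOrders {
    // Hypothetical minimal node type, used only for this sketch.
    static class Node {
        int value;
        Node left;
        Node right;

        Node(int value) {
            this.value = value;
        }
    }

    // Ordinary binary-search-tree insertion (data values are distinct).
    static Node insert(Node t, int value) {
        if (t == null) {
            return new Node(value);
        }
        if (value < t.value) {
            t.left = insert(t.left, value);
        } else {
            t.right = insert(t.right, value);
        }
        return t;
    }

    static int height(Node t) {
        return t == null ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    // Sum of the heights of the trees built from every ordering of data
    // in which positions 0..k-1 are already fixed.
    static long sumOfHeights(int[] data, int k) {
        if (k == data.length) {
            Node root = null;
            for (int v : data) {
                root = insert(root, v);
            }
            return height(root);
        }
        long sum = 0;
        for (int i = k; i < data.length; i++) {
            swap(data, k, i);              // choose data[i] for position k
            sum += sumOfHeights(data, k + 1);
            swap(data, k, i);              // undo the choice
        }
        return sum;
    }

    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        long sum = sumOfHeights(new int[] {1, 2, 3}, 0);
        System.out.println(sum + "/6"); // prints 10/6, that is, 5/3
    }
}
```

Because the number of orderings grows as n!, this approach is practical only for very small data sets.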
As the data sets and trees get larger, it becomes more difficult to work out by hand what all the possible shapes of the binary search trees are and how many different orderings of the data might result in trees of each shape. One way to get some idea of how high the binary search trees will be, for data sets of a given size, is to use a random-number generator to construct random orderings of data sets of that size, actually build the binary search trees from them, and measure their height.
• A method that takes a number n as argument and returns an array of Integer, of size n, containing the integers from 0 to n - 1 in a random order (each wrapped as an Integer object).
• A method for the BinarySearchTree class that takes an array as its argument and constructs and returns a binary search tree containing the elements of the array. Add the elements in the order in which they occur in the array.
• A program that generates arrays of Integer values and computes and outputs the mean height of the binary search trees constructed from them.
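One possible shape for such an experiment is sketched below. It is only an illustration under stated assumptions: the class `RandomHeights`, its nested `Node` type, and the method names `randomOrder`, `build`, and `meanHeight` are all hypothetical, and the tree code is a stand-in rather than the textbook's `BinarySearchTree` class. The shuffle is a standard Fisher-Yates shuffle driven by `java.util.Random`.

```java
import java.util.Random;

class RandomHeights {
    // Hypothetical minimal node type, used only for this sketch.
    static class Node {
        Integer value;
        Node left;
        Node right;

        Node(Integer value) {
            this.value = value;
        }
    }

    // Returns the integers 0..n-1, each wrapped as an Integer,
    // in a random order (Fisher-Yates shuffle).
    static Integer[] randomOrder(int n, Random rng) {
        Integer[] a = new Integer[n];
        for (int i = 0; i < n; i++) {
            a[i] = i; // autoboxed to Integer
        }
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // uniform in 0..i
            Integer tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
        return a;
    }

    static Node insert(Node t, Integer value) {
        if (t == null) {
            return new Node(value);
        }
        if (value.compareTo(t.value) < 0) {
            t.left = insert(t.left, value);
        } else {
            t.right = insert(t.right, value);
        }
        return t;
    }

    // Adds the elements in the order in which they occur in the array.
    static Node build(Integer[] data) {
        Node root = null;
        for (Integer v : data) {
            root = insert(root, v);
        }
        return root;
    }

    static int height(Node t) {
        return t == null ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    // Estimated mean height over the given number of random orderings.
    static double meanHeight(int n, int trials, Random rng) {
        long total = 0;
        for (int i = 0; i < trials; i++) {
            total += height(build(randomOrder(n, rng)));
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        // The trial count here is an arbitrary choice for the sketch.
        System.out.println(meanHeight(3, 100000, new Random()));
    }
}
```

For n = 3 the estimate should settle near 5/3 as the number of trials grows, which gives a quick sanity check before trying larger data sets.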
It is also possible to calculate the mean directly, rather than estimating
it. The calculation is based on the observation that the first value added
to a binary search tree always becomes its root, so that the number of
elements in each subtree is completely determined by that choice. For each
possible choice r of the root, therefore, the mean of the heights of
binary search trees with root r is the result of adding 1 to the
mean of the heights of the left subtrees (containing elements smaller than
r) or the mean of the heights of the right subtrees (containing
elements larger than r), whichever is greater. But, if we're
considering all possible ways of arranging the data as equally likely, any
datum is equally likely to be added first and thus to become the root. So
we can figure the mean height for each possible choice of root and simply
average the results.
For instance, if there are four items in the data set, the root is equally likely to be the smallest, the second smallest, the second largest, or the largest datum. In the first and last of these cases, the completed binary search tree will have one subtree containing three items and one with none. The larger subtree, the one with three items, will have a mean height of 5/3, as we have seen, so the mean height of the overall tree in these cases will be 5/3 + 1, or 8/3. In the other two cases, the completed binary search tree will have one subtree containing two items and one with one. The larger subtree will have a mean height of 1 (indeed, it will always have a height of 1), so the overall tree will have a mean height of 2. Thus, combining the four cases, the mean height of a binary search tree containing four items will be (8/3 + 2 + 2 + 8/3)/4, or 7/3.
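The calculation described above can be sketched as a short loop that fills in the mean for each data-set size from the means for smaller sizes. The class name `MeanHeightRecurrence` and the method `means` are assumptions made for illustration, and the sketch uses `double` arithmetic rather than exact fractions. It follows the description literally, taking the larger of the two subtree means for each choice of root. That step is exact when one subtree is never shorter than the other, as in the four-item example; for larger data sets, where either subtree might turn out to be the taller one, it is worth comparing the result against a simulation.

```java
class MeanHeightRecurrence {
    // Returns an array m in which m[k] is the mean height computed for
    // k items, for every k from 0 to n.
    static double[] means(int n) {
        double[] m = new double[n + 1];
        m[0] = -1.0; // convention: the empty tree has height -1
        for (int k = 1; k <= n; k++) {
            double total = 0.0;
            // The r-th smallest datum is equally likely to be the root.
            for (int r = 1; r <= k; r++) {
                double left = m[r - 1];  // r - 1 smaller elements
                double right = m[k - r]; // k - r larger elements
                total += 1.0 + Math.max(left, right);
            }
            m[k] = total / k; // average over the k choices of root
        }
        return m;
    }

    public static void main(String[] args) {
        double[] m = means(4);
        System.out.println(m[3]); // approximately 5/3
        System.out.println(m[4]); // approximately 7/3
    }
}
```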
This assignment will be due on Tuesday, April 29.