# Class 45: Introduction to Trees

Back to Project Discussion, Continued. On to Heaps and Heap Sort.

Held Monday, April 26

Summary

• Representing arithmetic expressions
• And evaluating them
• Generalizing search trees and arithmetic expression trees: a Binary Tree ADT
• Generalizing binary trees: a Tree ADT

Handouts

Notes

• Don't forget that your "bare class" files for the project are due tomorrow.
• You should have stubs for all of the methods that you expect others to use. You should also have documentation (pre- and postconditions, etc.). Your classes should compile. (Use empty classes for the classes you need to rely on.)
• GUI folks need to have their buttons and fields in place, although they need not do anything.
• Give me printed copies and email me copies. I'll set up a common area in which we can share versions.
• We'll make photocopies and distribute them to everyone on Wednesday.
• We'll discuss things on Friday.
• The third exam will be ready on Friday. It will cover linear structures, dictionaries, and trees.

## Nonlinear Structures

• Until recently, all of the "ordered" structures we considered (lists, arrays, vectors, stacks, and queues) were linear structures.
• What makes a structure linear?
• Exactly one first element.
• Exactly one last element.
• Each element (except the first) has exactly one predecessor.
• Each element (except the last) has exactly one successor.
• Search trees had a somewhat different structure. How are search trees different?
• There is still exactly one first element.
• There are now many last elements.
• Each element (except the first) still has exactly one predecessor.
• Each element (except the last ones) may have many (okay, two) successors.
• We call search trees nonlinear structures.
• By careful consideration of such restrictions, we can develop a number of different interesting structures.

## An Introduction to Trees

• We can generalize search trees to develop a data structure commonly called trees.
• A tree is a collection of values,
• stored in nodes,
• with a designated root (first element)
• and a number of leaves ("last" elements).
• Every non-leaf node has one or more children (successors),
• and the leaves have zero children.
• Every nonroot node has exactly one parent (previous element),
• and the root has no parent.
• In binary trees, each node can have at most two children, often designated as left and right.
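• As a rough illustration of these properties, here is a minimal sketch of a general tree node. The class name `SimpleTreeNode` and its methods are hypothetical (they are not part of the design developed later in these notes); the sketch only shows one root with no parent, one parent for every other node, and leaves with no children.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical general tree node: at most one parent, any number of children. */
class SimpleTreeNode {
    final Object value;
    SimpleTreeNode parent;  // null only for the root
    final List<SimpleTreeNode> children = new ArrayList<>();

    SimpleTreeNode(Object value) {
        this.value = value;
    }

    /** Attach a new child holding the given value and return that child. */
    SimpleTreeNode addChild(Object childValue) {
        SimpleTreeNode child = new SimpleTreeNode(childValue);
        child.parent = this;
        children.add(child);
        return child;
    }

    /** The root is the only node without a parent. */
    boolean isRoot() {
        return parent == null;
    }

    /** A leaf is a node without children. */
    boolean isLeaf() {
        return children.isEmpty();
    }

    /** Count the leaves ("last" elements) at or below this node. */
    int countLeaves() {
        if (isLeaf()) {
            return 1;
        }
        int total = 0;
        for (SimpleTreeNode child : children) {
            total += child.countLeaves();
        }
        return total;
    }
}
```

• Unlike a linear structure, a tree built this way can have many leaves, which is why `countLeaves` has to visit every branch.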

### Applications of Trees

• There are many ways in which trees are used, including
• search trees, which are used to store information for retrieval;
• decision trees, which are used to describe decision processes;
• class hierarchies, which show the relationships between classes in a single-inheritance language like Java;
• representing arithmetic expressions, using operators for internal nodes and values for leaves; and even
• program trees, which show the structure of a program.
• You've already seen binary search trees:
• A node stores a value.
• All smaller values are in the left subtree.
• All larger values are in the right subtree.
• Decision trees can formalize or describe decision processes, such as twenty questions or the animals game.
```
              Does it fly?
         yes /            \ no
            /              \
    Is it a bird?      Does it swim?
    yes /    \ no           ...
       /      \
     ...    Is it nocturnal?
            yes /     \
               /      ...
             Bat
```
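• The decision tree above can be sketched in code roughly as follows. The names `DecisionNode`, `DecisionDemo`, and `decide` are hypothetical, and the parts of the diagram shown as "..." are kept as placeholder leaves rather than filled in.

```java
/** Hypothetical yes/no decision node: a question with two branches, or a final answer. */
class DecisionNode {
    final String text;       // a question at internal nodes, an answer at leaves
    final DecisionNode yes;  // branch taken on "yes" (null at leaves)
    final DecisionNode no;   // branch taken on "no" (null at leaves)

    /** Build a leaf holding an answer. */
    DecisionNode(String answer) {
        this(answer, null, null);
    }

    /** Build an internal node holding a question. */
    DecisionNode(String question, DecisionNode yes, DecisionNode no) {
        this.text = question;
        this.yes = yes;
        this.no = no;
    }

    boolean isLeaf() {
        return yes == null && no == null;
    }

    /** Follow the given answers from this node and return the text reached. */
    String decide(boolean... answers) {
        DecisionNode current = this;
        for (boolean answer : answers) {
            if (current.isLeaf()) {
                break;  // nothing further to ask
            }
            current = answer ? current.yes : current.no;
        }
        return current.text;
    }
}

class DecisionDemo {
    /** Build the tree from the diagram; "..." marks branches the diagram leaves out. */
    static DecisionNode animals() {
        DecisionNode elided = new DecisionNode("...");
        DecisionNode nocturnal =
            new DecisionNode("Is it nocturnal?", new DecisionNode("Bat"), elided);
        DecisionNode bird = new DecisionNode("Is it a bird?", elided, nocturnal);
        DecisionNode swim = new DecisionNode("Does it swim?", elided, elided);
        return new DecisionNode("Does it fly?", bird, swim);
    }
}
```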
• Arithmetic trees provide an unambiguous way to represent arithmetical expressions. For example, we might write 3+4*5-6 as
```
      +
     / \
    3   -
       / \
      *   6
     / \
    4   5
```
• These have a simple evaluation strategy: evaluate both subtrees and then apply the operator.
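• Here is a minimal sketch of that evaluation strategy applied to the tree drawn above for 3+4*5-6. The class `ExprNode` is hypothetical and stores operators as characters purely for brevity; it is separate from the cursor-based design developed later in these notes.

```java
/** Hypothetical binary expression node: an operator with two subtrees, or a number at a leaf. */
class ExprNode {
    final char op;         // '+', '-', '*', or '/'; unused at leaves
    final int value;       // only meaningful at leaves
    final ExprNode left;   // null at leaves
    final ExprNode right;  // null at leaves

    /** Build a leaf holding a number. */
    ExprNode(int value) {
        this.op = ' ';
        this.value = value;
        this.left = null;
        this.right = null;
    }

    /** Build an internal node holding an operator. */
    ExprNode(char op, ExprNode left, ExprNode right) {
        this.op = op;
        this.value = 0;
        this.left = left;
        this.right = right;
    }

    boolean isLeaf() {
        return left == null && right == null;
    }

    /** Evaluate both subtrees, then apply the operator. */
    int evaluate() {
        if (isLeaf()) {
            return value;
        }
        int leftVal = left.evaluate();
        int rightVal = right.evaluate();
        switch (op) {
            case '+': return leftVal + rightVal;
            case '-': return leftVal - rightVal;
            case '*': return leftVal * rightVal;
            case '/': return leftVal / rightVal;
            default:
                throw new IllegalStateException("Unknown operator: " + op);
        }
    }
}
```

• Building the tree as drawn and evaluating it gives 3 + ((4 * 5) - 6):

```java
class ExprDemo {
    public static void main(String[] args) {
        ExprNode product = new ExprNode('*', new ExprNode(4), new ExprNode(5));
        ExprNode difference = new ExprNode('-', product, new ExprNode(6));
        ExprNode sum = new ExprNode('+', new ExprNode(3), difference);
        System.out.println(sum.evaluate()); // prints 17
    }
}
```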

### Interfaces for Trees

• We now have a high-level understanding of trees, and should be ready to consider how we might describe the tree data structure for a client.
• One possibly confusing issue is the relationship between the nodes in a tree and the tree itself.
• Is there a difference?
• If so, should we still make the underlying node-based implementation clear?
• Are there other possible implementations?
• Some design issues may be complicated by our applications of trees. This is one of the cases in which it may be difficult to design the generic data structure before considering all the applications. Nonetheless, we will make an initial attempt.
• We have some initial design decisions to make:
• Do we distinguish internal nodes from leaves (other than by noting that leaves have no children)?
• Can we delete parts of a tree? What happens when we do?
• Do we return trees from various operations (e.g., `getChild`) or do we have cursors that traverse the tree?
• If we have cursors, are they part of our tree data structure, or separate?
• If we have cursors, can they move, or simply observe?
• Can you have an nth child, but not an (n-1)st child?
• Here are the design decisions I've made:
• We want to separate "tree" from "tree node".
• Just as we no longer use cdr (which returns lists), we don't want to return trees from operations.
• We don't formally distinguish internal nodes from leaves (although we do supply an `isLeaf` operation).
• We use cursors, but they are "separate" from the tree. (Every cursor has a tree it traverses; each tree may have zero or more cursors.) Cursors cannot move.
• Most of the cool stuff goes on in cursors, rather than in trees.
• What do we need to do in trees? Not much, initially. Perhaps
• Create them.
• Make a new cursor that uses them. (More a cursor function, but may also be useful in trees.)
• Determine their size and depth.
• What do we need to do with cursors?
• Create them (at the root of the tree, or on child nodes).
• Get the value at a cursor.
• Change the value at a cursor.
• Determine if a cursor is at a leaf.
• Delete a subtree below a cursor.
• Add a new value below a cursor.
• For example, we might want to write our "evaluate expression tree" function as:
```
  /**
   * Evaluate a binary "expression tree" at or below a cursor.
   */
  public int evaluate(Cursor expression) {
    // If we're at a leaf, it's a number.
    if (expression.isLeaf()) {
      return computeBaseValue(expression.getValue());
    }
    // If we're not at a leaf, it's an operator.
    else {
      Operator op = (Operator) expression.getValue();
      Cursor left = expression.getChild(0);
      Cursor right = expression.getChild(1);
      int leftVal = evaluate(left);
      int rightVal = evaluate(right);
      return apply(op, leftVal, rightVal);
    }
  } // evaluate(Cursor)
```
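• The `evaluate` function relies on an `Operator` type and on helpers `computeBaseValue` and `apply`, none of which are defined in these notes. One possible sketch follows. The enum constants, the holder class `ExpressionHelpers`, and the assumption that leaves store either an `Integer` or a numeric string are all guesses made for illustration.

```java
/** Hypothetical operator type; the notes do not say how Operator is defined. */
enum Operator { PLUS, MINUS, TIMES, DIVIDE }

class ExpressionHelpers {
    /**
     * Convert the value stored at a leaf into an int.
     * Assumes the leaf holds an Integer or something whose string form is a number.
     */
    static int computeBaseValue(Object value) {
        if (value instanceof Integer) {
            return (Integer) value;
        }
        return Integer.parseInt(value.toString().trim());
    }

    /** Apply a binary operator to two already-evaluated operands. */
    static int apply(Operator op, int left, int right) {
        switch (op) {
            case PLUS:   return left + right;
            case MINUS:  return left - right;
            case TIMES:  return left * right;
            case DIVIDE: return left / right;  // integer division
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }
}
```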
• Putting it together, we get something like
```
public interface Tree {
  // Java interfaces cannot declare constructors. An implementing
  // class should provide one that creates a new empty tree.
  /** Determine the depth of the tree. */
  public int depth();
  /** Determine the number of values in the tree. */
  public int size();
  /** Return a cursor for the root. */
  public Cursor root();
} // interface Tree

public interface Cursor {
  // Java interfaces cannot declare constructors. An implementing
  // class should provide one that creates a new cursor for a tree.
  /**
   * Get the cursor for a child.  The children are numbered
   * from 0 to #children-1.
   * Pre: The current subtree has that child.
   */
  public Cursor getChild(int childNum);
  /** Find out how many children are below the current cursor. */
  public int numChildren();
  /** Determine whether the nth child is defined. */
  public boolean hasChild(int childNum);
  // ...
} // interface Cursor
```
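• Given only these cursor operations, the size and depth of a tree can be computed recursively. The sketch below is one way to do it, under stated assumptions: the `Cursor` interface here is a trimmed copy containing only the three operations used, "depth" is taken to mean the number of nodes on the longest path from the root to a leaf (so a single node has depth 1), and `ListCursor` is a throwaway implementation that exists only so the example can run.

```java
import java.util.ArrayList;
import java.util.List;

/** Trimmed, hypothetical copy of the Cursor operations used in this sketch. */
interface Cursor {
    Cursor getChild(int childNum);
    int numChildren();
    boolean hasChild(int childNum);
}

class TreeMeasures {
    /** Count the values at or below the cursor. */
    static int size(Cursor c) {
        int total = 1;  // the value at the cursor itself
        for (int i = 0; i < c.numChildren(); i++) {
            if (c.hasChild(i)) {
                total += size(c.getChild(i));
            }
        }
        return total;
    }

    /** Count the nodes on the longest path from the cursor down to a leaf. */
    static int depth(Cursor c) {
        int deepest = 0;
        for (int i = 0; i < c.numChildren(); i++) {
            if (c.hasChild(i)) {
                deepest = Math.max(deepest, depth(c.getChild(i)));
            }
        }
        return 1 + deepest;
    }
}

/** Throwaway in-memory cursor, present only so the sketch can be run. */
class ListCursor implements Cursor {
    private final List<ListCursor> children = new ArrayList<>();

    /** Append a child (possibly null, to model a missing child) and return this cursor. */
    ListCursor add(ListCursor child) {
        children.add(child);
        return this;
    }

    public Cursor getChild(int childNum) {
        return children.get(childNum);
    }

    public int numChildren() {
        return children.size();
    }

    public boolean hasChild(int childNum) {
        return childNum >= 0 && childNum < children.size()
            && children.get(childNum) != null;
    }
}
```

• Checking `hasChild` before each `getChild` respects the precondition on `getChild` and allows for trees in which a child slot is empty.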

### Implementing Trees with Nodes

• The easiest (and most common) way to implement trees is with nodes, similar to list nodes.
• Here is a sample node class:
```
import java.util.Vector;

public class TreeNode {
  /** The contents of the current node. */
  protected Object contents;
  /** The children of the current node. */
  protected Vector<TreeNode> children;

  /** Create a new node with no children. */
  public TreeNode(Object value) {
    this.contents = value;
    this.children = new Vector<TreeNode>();
  } // TreeNode(Object)

  /** Set a child. */
  public void setChild(int childNum, TreeNode child) {
    // Make sure the vector is big enough (setSize pads with nulls).
    if (children.size() <= childNum) {
      children.setSize(childNum+1);
    }
    // Set the appropriate element.
    this.children.setElementAt(child, childNum);
  } // setChild(int, TreeNode)

  /** Check if it has a child. */
  public boolean hasChild(int childNum) {
    // The index must be in range and the slot must be filled.
    return (childNum >= 0) && (childNum < children.size()) &&
           (children.elementAt(childNum) != null);
  } // hasChild(int)

  /**
   * Get a child.
   * Pre: Has that child.
   */
  public TreeNode getChild(int childNum) {
    return children.elementAt(childNum);
  } // getChild

  // ...
} // class TreeNode
```
• Note that TreeNodes are quite similar to Cursors (at least in the operations they provide). It is still useful to treat them as separate kinds of things.
• Tomorrow, we'll see a different way to implement binary trees in which cursors will be indices into an array.
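• To make the relationship between nodes and cursors concrete, here is a minimal sketch of a cursor that wraps a single node. `NodeCursor` is a hypothetical name, and the `TreeNode` in this block is a compact stand-in for the class above that contains only what the cursor needs. Consistent with the earlier design decision, the cursor never moves: `getChild` returns a new cursor instead.

```java
import java.util.Vector;

/** Compact stand-in for the TreeNode class above (only what the cursor needs). */
class TreeNode {
    Object contents;
    Vector<TreeNode> children = new Vector<>();

    TreeNode(Object value) {
        this.contents = value;
    }

    void setChild(int childNum, TreeNode child) {
        if (children.size() <= childNum) {
            children.setSize(childNum + 1);  // pads with nulls
        }
        children.setElementAt(child, childNum);
    }

    boolean hasChild(int childNum) {
        return childNum >= 0 && childNum < children.size()
            && children.elementAt(childNum) != null;
    }

    TreeNode getChild(int childNum) {
        return children.elementAt(childNum);
    }
}

/** Hypothetical cursor that observes one node and hides the node from clients. */
class NodeCursor {
    private final TreeNode node;

    NodeCursor(TreeNode node) {
        this.node = node;
    }

    Object getValue() {
        return node.contents;
    }

    void setValue(Object value) {
        node.contents = value;
    }

    boolean hasChild(int childNum) {
        return node.hasChild(childNum);
    }

    /** Pre: hasChild(childNum). Returns a new cursor; this one stays put. */
    NodeCursor getChild(int childNum) {
        return new NodeCursor(node.getChild(childNum));
    }

    /** A cursor is at a leaf when none of the node's child slots is filled. */
    boolean isLeaf() {
        for (int i = 0; i < node.children.size(); i++) {
            if (node.hasChild(i)) {
                return false;
            }
        }
        return true;
    }
}
```

• Clients that hold only a `NodeCursor` never see a `TreeNode`, so the node-based representation could later be swapped for another one without changing client code.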

History

• Created Monday, January 11, 1999.
• Added short summary on Friday, January 22, 1999.
• Filled in the details on Monday, April 26, 1999.

Disclaimer: Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.