CSC 301.01, Class 15: Pause for breath
Overview
 Preliminaries
 Notes and news
 Upcoming work
 Extra credit
 Questions
 Insertion in 2-3 trees, revisited
 Deletion in 2-3 trees
 B-trees
 2-3-4 trees vs. red-black trees
News / Etc.
Upcoming work
 Exam 1 due TODAY at 5:00 p.m.
 If you attend the Google talk at 4:00 p.m., you may have until 7:00 p.m.
 Assignment 5 due next Wednesday at 10:30 p.m.
Extra credit (Academic)
 Google tech interview session, today, 4:00-5:30 p.m., 3821
 Convocation, Thursday, 11:00 a.m. in JRC 101
 CS Table, Tuesday: Tapia
Extra credit (Peer)
Extra Credit (Misc)
Other good things
 Women’s soccer today at 4:30 p.m.
 Volleyball today at 7:00 p.m.
Questions
 Should the cover sheet be separate?
 Yes.
 What goes on the cover sheet?
 Name
 Number
 Academic honesty statement 1
 Signature and date for AHS 1
 Academic honesty statement 2
 Signature and date for AHS 2
 What does a typical big-O proof look like?
 You identify c and n0 and show why, for all n > n0, f(n) <= c*g(n).
 Transitivity example (also in Skiena)
 Suppose f(n) is in O(g(n)) and g(n) is in O(h(n)). We want to show that f(n) is in O(h(n)).
 Because f(n) is in O(g(n)), there exist c1, n1 such that for all n > n1, f(n) <= c1*g(n). [Eqn1]
 Because g(n) is in O(h(n)), there exist c2, n2 such that for all n > n2, g(n) <= c2*h(n). [Eqn2]
 Intuition: We want to use transitivity of <=.
 For all n > n2, c1*g(n) <= c1*c2*h(n). [Eqn3] (Algebra: You can multiply both sides of an inequality by a positive number.)
 Intuition: We now have two inequalities, each of which holds for only some values of n. We want to identify the values for which both hold.
 Let n0 be max(n1,n2). Both Eqn1 and Eqn3 hold for all n > n0.
 For all n > n0, f(n) <= c1*c2*h(n). (Transitivity of <=)
 Let c be c1*c2.
 Therefore, f(n) is in O(h(n))
 How long is your solution to bstremove?
 40 lines. But I have three calls to node, each of which is four lines long.
 Did you use a separate helper for bstremove to find the appropriate replacement key/value?
 No. I just wrote a local helper. It was five lines long.
 For the hash function, should we really use 128? You said “large prime”. 128 is not prime as far as I know.
 I misspoke. alpha is the size of the alphabet.
 Should I worry about mod?
 If the computation is likely to overflow int or long, yes.
 And yes, it will overflow.
 Note that if your computation is greater than INT_MAX / 128, it’s time to mod.
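The overflow advice above can be sketched in C as follows. This is only an illustration under stated assumptions: the name hash_string and the parameter m (the table size, ideally a large prime) are hypothetical, not the assignment's required interface. The sketch applies mod after every character, which is stricter than waiting until the running value exceeds INT_MAX / 128 but gives the same final result.

```c
#include <assert.h>
#include <limits.h>

#define ALPHA 128 /* size of the alphabet */

/* Hash the NUL-terminated string s into the range [0, m).
   hash_string and m are hypothetical names for this sketch.
   Precondition: 0 < m <= INT_MAX / ALPHA, so that h * ALPHA + c
   cannot overflow an int whenever h < m and c < ALPHA. */
int hash_string(const char *s, int m) {
    assert(m > 0 && m <= INT_MAX / ALPHA);
    int h = 0;
    for (int i = 0; s[i] != '\0'; i++) {
        /* Reduce the character into [0, ALPHA). */
        int c = (unsigned char) s[i] % ALPHA;
        /* h < m here, so h * ALPHA + c <= ALPHA * m - 1 <= INT_MAX. */
        h = (h * ALPHA + c) % m;
    }
    return h;
}
```

For example, with m = 1000003 the string "ab" hashes to (97 * 128 + 98) mod 1000003, and longer strings stay within [0, m) no matter how many characters they contain.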
Insertion in 2-3 trees, revisited
Recall that our process for adding a value to a 2-3 tree is as follows:
 Follow the path to the appropriate position of the new value.
 You should end up on the fringe of the tree.
 If there’s room in the parent node, just add the value to the parent.
 If there’s not room in the parent node, pretend to expand the node to three values and four children, then split the node at the center value and propagate the center value upward.
Visually
+---+---+
| x | z |
+---+---+
 /  |  \
nil nil nil

    |
    v

+---+---+---+
| x | y | z |
+---+---+---+
 /  |   |  \
nil nil nil nil

    |
    v

     +---+
     | y |
     +---+
     /   \
+---+     +---+
| x |     | z |
+---+     +---+
 / \       / \
nil nil   nil nil
But what happens when we move the node up? We may have the same problem. Without loss of generality, let’s say that the node being propagated up is on the right.
+---+---+
| p | r |
+---+---+
 /  |  \
T1  T2  +---+
        | y |
        +---+
        /   \
       T3   T4
All of T1, T2, T3, and T4 are the same height. This isn’t quite a 2-3 tree because the y-rooted subtree is too tall. So we’ll move y up.
+---+---+---+
| p | r | y |
+---+---+---+
 /  |   |  \
T1  T2  T3  T4

    |
    v

     +---+
     | r |
     +---+
     /   \
+---+     +---+
| p |     | y |
+---+     +---+
 / \       / \
T1  T2   T3  T4
Note that because we end up with a lot of singleton nodes, subsequent additions are likely to be much less costly.
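The split step pictured above can be sketched in C. This is a minimal illustration, not a full insertion routine: the node layout and the names make_single and split_overfull are assumptions made for this sketch. It takes the three keys and four children of a "pretend" overfull node and returns a singleton node holding the center key, whose two children are singleton nodes holding the outer keys. The caller is then responsible for merging that returned node into the parent (and for freeing memory).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical 2-3 tree node: one or two keys, two or three children. */
typedef struct node {
    int nkeys;             /* 1 or 2 */
    char keys[2];
    struct node *kids[3];  /* all NULL at the fringe */
} node;

/* Allocate a singleton node holding key, with the given two children. */
static node *make_single(char key, node *left, node *right) {
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->nkeys = 1;
    n->keys[0] = key;
    n->keys[1] = 0;
    n->kids[0] = left;
    n->kids[1] = right;
    n->kids[2] = NULL;
    return n;
}

/* Split an overfull node, given as three sorted keys and four children.
   Returns a new node holding the center key; its children hold the
   smallest and largest keys. The result is one level taller than the
   subtrees in kids, so the center key still has to be propagated upward. */
node *split_overfull(const char keys[3], node *kids[4]) {
    node *left = make_single(keys[0], kids[0], kids[1]);
    node *right = make_single(keys[2], kids[2], kids[3]);
    return make_single(keys[1], left, right);
}
```

Calling split_overfull with keys x, y, z and four nil children yields the y-rooted subtree from the first picture; calling it with p, r, y and T1 through T4 yields the r-rooted subtree from the second.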
Exercise
Build a 2-3 tree by successively adding a, c, e, g, i, h, f, d, b.
 Observation: It is hard to predict the structure in advance; it is affected by the order in which we add values.
 Observation: We tend to get singleton nodes scattered throughout the tree, which suggests that we won’t propagate up all that much.
B-trees
2-3 trees are a specialization of “B-trees”: search trees in which you have blocks of ordered keys, with links to subtrees in between.
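To make "blocks of ordered keys, with links to subtrees in between" concrete, here is a rough C sketch of a node and a search. The names bnode, MAX_KEYS, and btree_contains are hypothetical, and the block size of 4 is chosen only to keep the example small; with MAX_KEYS set to 2 the same layout describes a 2-3 tree.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS 4 /* hypothetical block size, kept tiny for illustration */

/* Hypothetical B-tree node: a block of sorted keys with links in between. */
typedef struct bnode {
    int nkeys;                        /* number of keys in use */
    int keys[MAX_KEYS];               /* sorted in ascending order */
    struct bnode *kids[MAX_KEYS + 1]; /* kids[i] holds values between
                                         keys[i-1] and keys[i]; all NULL
                                         in a leaf */
} bnode;

/* Return 1 if key appears in the tree rooted at n, and 0 otherwise. */
int btree_contains(const bnode *n, int key) {
    while (n != NULL) {
        assert(n->nkeys >= 0 && n->nkeys <= MAX_KEYS);
        /* Find the first key in this block that is >= the target. */
        int i = 0;
        while (i < n->nkeys && key > n->keys[i]) {
            i++;
        }
        if (i < n->nkeys && key == n->keys[i]) {
            return 1;
        }
        /* Otherwise follow the link that sits just before keys[i]. */
        n = n->kids[i];
    }
    return 0;
}
```

Each pass through the outer loop touches exactly one node, so the number of nodes visited is bounded by the height of the tree.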
What happens when your search tree doesn’t fit into memory? We will need to read from disk. That’s awfully slow.
2-3-4 trees vs. red-black trees
Deletion in 2-3 trees
Spend about five minutes thinking about how we delete from 2-3 trees. Assume, as a starting point, that we only delete at leaves. It will also be helpful to think about the structure of the parents and the siblings.