Algorithms and OOD (CSC 207 2014F) : EBoards

CSC207.01 2014F, Class 46: Hash Tables, Continued


Overview

Preliminaries

Admin

Upcoming Work

Extra Credit

Academic

Peer Support

Administrative Questions

Questions on the Project

How do we have 18 games in 16 dates?

You don't. There are 18 dates (well, really 19).

Any hints?

The Wisconsin site says that they use Linear Programming.

Questions on Hash Tables

How can it be constant time when we have linked structures (or have to deal with linear probing)?

It's expected constant time. In some cases, it's a little bit more. "Hand wavy proof" - If the values distribute well and the table is big enough, the odds of any cell having more than five elements are small enough that we don't care. And O(5) is the same as O(1).

When we add the word "expected", we are no longer looking at the worst case. This is similar to Quicksort, which is O(n^2) in the worst case but O(n log n) in the expected case.

How do you decide how long the chains should be?

Just use a linked list ... whatever ends up happening, happens.

At some point, you have so many elements in the table that you rehash.

What's the *load factor*?

The ratio of the number of elements to the size of the table. After a certain load, we expand the size of the table. It's important ... if you have too high a load factor, you have lots of collisions, and that slows things down. If you have too low a load factor, you are wasting storage resources. You should analyze the particular problem domain to see where in the time/space tradeoff you are.
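
A small sketch of the load factor calculation may make the definition concrete. The class name, method names, and the 0.75 threshold are assumptions chosen for illustration (0.75 happens to be the default in java.util.HashMap); the right threshold depends on your problem domain.

```java
// Hypothetical helper for illustration only; not taken from the lab.
class LoadFactorDemo {
    /** Ratio of the number of stored elements to the size of the table. */
    static double loadFactor(int size, int capacity) {
        return (double) size / capacity;
    }

    /** True when the table has gone past the chosen maximum load. */
    static boolean shouldExpand(int size, int capacity, double maxLoad) {
        return loadFactor(size, capacity) > maxLoad;
    }

    public static void main(String[] args) {
        double maxLoad = 0.75; // assumed threshold
        System.out.println(loadFactor(12, 16));            // 0.75
        System.out.println(shouldExpand(12, 16, maxLoad)); // false
        System.out.println(shouldExpand(13, 16, maxLoad)); // true
    }
}
```

A lower threshold means shorter chains but more unused cells; a higher one saves space at the cost of more collisions.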

Tell me about collisions

Collisions happen when two values are supposed to go into the same spot. Sometimes that happens because they have the same hash value. Sometimes it happens because they have different hash values, but when we mod by the table size, they end up at the same location. Expanding helps with the second problem.

Why do we worry about collisions with bucketed/linked hash tables?

We don't want too many things in one bucket. A high load factor implies that some buckets will have a lot of things. Expanding helps.

Example

Hash values are 21 31 41 51 61 71 81. If our table has size 10 ... If our table has size 20 ... If our table has size 19 ...
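
One way to work through this example is to compute each hash value mod the table size. The sketch below does that; the class and method names are made up for illustration, and it assumes the table index is simply the hash value mod the table size.

```java
import java.util.Arrays;

// Hypothetical demo class; assumes index = hash mod table size.
class ModExample {
    /** Maps each hash value to a cell index in a table of the given size. */
    static int[] indices(int[] hashes, int tableSize) {
        int[] result = new int[hashes.length];
        for (int i = 0; i < hashes.length; i++) {
            // floorMod keeps the index non-negative even for negative hashes.
            result[i] = Math.floorMod(hashes[i], tableSize);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] hashes = {21, 31, 41, 51, 61, 71, 81};
        // Size 10: every value lands in cell 1.
        System.out.println(Arrays.toString(indices(hashes, 10)));
        // prints [1, 1, 1, 1, 1, 1, 1]
        // Size 20: values alternate between cells 1 and 11.
        System.out.println(Arrays.toString(indices(hashes, 20)));
        // prints [1, 11, 1, 11, 1, 11, 1]
        // Size 19: every value gets its own cell.
        System.out.println(Arrays.toString(indices(hashes, 19)));
        // prints [2, 12, 3, 13, 4, 14, 5]
    }
}
```

Doubling the table from 10 to 20 only partly helps here because the hash values share a pattern with the table size; the prime size 19 breaks that pattern.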

Could you give me an example of a load factor?

See the lab.

How do we expand the hash table?

See the lab. Conceptually: Build a new table and rehash all of the values from the old table.
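
The conceptual step (build a new table, rehash everything) might look roughly like the sketch below. This is not the lab's code: the class ChainedTable, the 0.75 maximum load, and the growth rule of 2 * capacity + 1 are all assumptions made for illustration.

```java
import java.util.LinkedList;

// Hypothetical chained hash table of strings; a sketch, not the lab's code.
class ChainedTable {
    private static final double MAX_LOAD = 0.75; // assumed threshold
    private LinkedList<String>[] buckets;
    private int size = 0;

    @SuppressWarnings("unchecked")
    ChainedTable(int capacity) {
        buckets = new LinkedList[capacity];
    }

    int size() { return size; }
    int capacity() { return buckets.length; }

    private static int index(String key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    boolean contains(String key) {
        LinkedList<String> chain = buckets[index(key, buckets.length)];
        return chain != null && chain.contains(key);
    }

    void add(String key) {
        if (contains(key)) return;
        // Expand first if this insertion would push us past the maximum load.
        if (size + 1 > MAX_LOAD * buckets.length) expand();
        place(key);
        size++;
    }

    /** Puts a key into the chain for its cell in the current table. */
    private void place(String key) {
        int i = index(key, buckets.length);
        if (buckets[i] == null) buckets[i] = new LinkedList<>();
        buckets[i].add(key);
    }

    /** Builds a bigger table and rehashes every value from the old one. */
    @SuppressWarnings("unchecked")
    private void expand() {
        LinkedList<String>[] old = buckets;
        buckets = new LinkedList[2 * old.length + 1];
        for (LinkedList<String> chain : old) {
            if (chain == null) continue;
            // Indices must be recomputed because the table size changed.
            for (String key : chain) place(key);
        }
    }

    public static void main(String[] args) {
        ChainedTable table = new ChainedTable(4);
        for (int i = 0; i < 10; i++) table.add("key" + i);
        System.out.println(table.size() + " elements, capacity " + table.capacity());
    }
}
```

Note that values cannot simply be copied cell by cell: a value's cell depends on the table size, so each one has to be placed again in the new table.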

Do we consider the cost of expansion in analysis?

It depends on the class. We don't think about it carefully in this class, but we do think about it in 301. Often, people consider the amortized effort (distributed over all of the values).
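
To get a feel for the amortized argument, the sketch below counts how many values get rehashed in total over n insertions. It assumes a simplified policy in which the table doubles whenever it is completely full; the class and method names are hypothetical.

```java
// Hypothetical counter; assumes the table doubles when it is completely full.
class AmortizedDemo {
    /** Total number of values moved by rehashing over n insertions. */
    static long rehashMoves(int n, int initialCapacity) {
        long moves = 0;
        int capacity = initialCapacity;
        for (int size = 0; size < n; size++) {
            if (size == capacity) { // table is full before this insertion
                moves += size;      // every existing value is rehashed
                capacity *= 2;
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        // 1 + 2 + 4 + ... + 512 = 1023 moves for 1000 insertions.
        System.out.println(rehashMoves(1000, 1)); // prints 1023
    }
}
```

Under this policy the total rehashing work stays below 2n, so spreading it over all n insertions adds only a constant amount per insertion.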

Lab