Held Wednesday, May 3, 2000
Overview
Today we move on to a new data structure, the graph. Graphs
are particularly useful in modeling a number of ``real''
problems.
Notes
- In case you've forgotten, here's what's due for the project:
- Working, integrated code
- A 3-5 page essay that discusses the design of your part (one per group)
- A one-page reflective essay about what it was like to work on a big,
distributed project
- Optional: distribute 100 units of extra credit to fellow students.
Contents
Summary
- Common graph problems
- The traveling salescritter problem
- Reachability: Can you get there from here?
- As you have (hopefully) noted, computer science is not (always) done
for its own sake. Often, we write computer programs to help us
with ``real world'' problems.
- Sometimes, we answer real-world questions by modeling them
with one or more data structures.
- For example, we might model customers moving through a store with
a series of queues (plus some methods for choosing the time
between entry and service in each queue).
- Many problems require nonlinear data structures with fewer
restrictions than trees impose.
- Suppose we are building a new campus network. Determine the
least-cost way to connect the buildings so that the network stays
active, even if we cut any one connection. (Alternately, given
the existing system, determine whether cutting any one connection
will disconnect parts of the network.)
- In the telephone system, find the least congested route between
two phones, given connections between switching stations.
- On the Web, determine if there is a way to get to one page from
another, just by following normal links.
- ``Six degrees of Kevin Bacon''.
- While driving, find the shortest path from one city to another.
- As a traveling salescritter who needs to visit a number of cities,
find the shortest path that includes all the cities.
- On the Internet, determine how many connections between
computers you can eliminate and still make sure that
every computer can reach every other computer.
(The Internet was originally funded by the military; they wanted
it robust even if we were attacked.)
- Determine an ordering of courses so that you always take
prerequisite courses first.
- We often draw pictures to help us think about these problems. I
may ask you to draw pictures.
- These, and many other problems, can be modeled by a data structure
known as the graph.
- Previously, we've used data structures to improve particular tasks.
Here, we use the data structure to motivate the algorithms.
- Graphs are data structures that contain labeled
nodes which are connected by edges.
- Sound familiar? We often think of lists and trees similarly (or
at least implementations of lists and trees).
- How do graphs differ from lists and trees?
- The nodes are labeled, so that we can refer to them by name.
- Each node may have multiple edges connected to it.
- In graphs, we don't always distinguish between successors and predecessors.
- Graphs don't necessarily have a unique start node.
- Graphs don't necessarily have designated end nodes.
- In a graph, a node may be its own ``descendant''.
- The graph may have independent subparts.
- Just as each list is a tree (although not a very balanced tree), each
tree is a graph.
- As with lists and trees, we can make the edges unidirectional or
bidirectional.
- If the edges are unidirectional, the graph is called a
directed graph.
- If the edges are bidirectional, the graph is called an
undirected graph.
- In some uses of graphs, we may associate a numeric weight with each
edge. Graphs with weights on edges are called weighted graphs.
Typically, a weight represents the cost to get from one node to another.
- Note that in some graphs it is possible to follow a sequence of edges and
return to the place you started. That path (sequence of edges) is called
a cycle. Graphs with cycles are called cyclic graphs.
- If there are no cycles in a graph, it is an acyclic graph.
- If you don't say whether or not a graph is cyclic, you are implying
that you will deal with either type.
- Note that directed acyclic graphs (DAGs) have one or more roots (nodes
with no incoming edges) and one or more leaves (nodes with no outgoing
edges).
- Because graphs can be disconnected, we often refer to connected
components of graphs. In a connected component, it is possible
to reach every node from every other node.
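One way to make these definitions concrete is a small adjacency-list sketch. The Python below is only an illustration, not the course's implementation: the class name Graph, its method names, and the building labels are all hypothetical. It shows labeled nodes, directed versus undirected edges, edge weights, reachability, and connected components (for undirected graphs only).

```python
class Graph:
    """A minimal adjacency-list graph (hypothetical sketch)."""

    def __init__(self, directed=False):
        self.directed = directed
        # Maps each node label to a dict of {neighbor label: edge weight}.
        self.adj = {}

    def add_node(self, label):
        # A node may exist with no edges at all.
        self.adj.setdefault(label, {})

    def add_edge(self, u, v, weight=1):
        self.add_node(u)
        self.add_node(v)
        self.adj[u][v] = weight
        if not self.directed:
            # An undirected edge can be followed in either direction.
            self.adj[v][u] = weight

    def reachable_from(self, start):
        """Return the set of nodes you can get to from start."""
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in self.adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen

    def connected_components(self):
        """Split an undirected graph into its independent subparts."""
        if self.directed:
            raise ValueError("this sketch only handles undirected graphs")
        components = []
        assigned = set()
        for node in self.adj:
            if node not in assigned:
                component = self.reachable_from(node)
                assigned |= component
                components.append(component)
        return components


# Hypothetical campus network: three wired buildings and one isolated one.
campus = Graph()
campus.add_edge("Library", "Science", weight=3)
campus.add_edge("Science", "Gym", weight=5)
campus.add_node("Observatory")
parts = campus.connected_components()  # two components
```

Storing neighbors in a dictionary keeps the weight next to the edge it belongs to. A matrix of weights would be another reasonable choice, especially for graphs in which most pairs of nodes are connected.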
- Many of you seem to have heard vague ``rumors'' about the
traveling salescritter problem (TSP), so we'll begin with
that problem.
- The problem is to find the shortest path through a graph
that visits every node in the graph.
- Just as a salescritter must visit every location in its territory,
so must this algorithm visit every node.
- As you might guess, this problem is typically considered
for weighted graphs (sometimes directed, sometimes undirected).
- Is there a solution? Certainly.
- List every path that visits all the cities.
- Find the shortest such path.
- Is this a good algorithm? No. It is O(n!). How bad
is that?
- 10! = 3,628,800
- 20! = 2,432,902,008,176,640,000
- If we could check and compare the length of one trillion paths
each second, checking 20! paths would take us about 2,432,902 seconds.
- That's about 40,548 minutes
- That's about 675 hours
- That's about 28 days, or roughly one month
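The brute-force approach and the cost estimate above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended solver: the function names and the four-city distance table are hypothetical, and the sketch assumes a complete graph (every pair of cities has a weight).

```python
from itertools import permutations
from math import factorial


def path_length(path, weight):
    """Add up the weights of the edges between consecutive cities."""
    return sum(weight[a][b] for a, b in zip(path, path[1:]))


def brute_force_shortest_path(weight):
    """Try every ordering of the cities and keep the shortest one."""
    cities = list(weight)
    return min(permutations(cities), key=lambda p: path_length(p, weight))


# Hypothetical symmetric distances between four cities.
weight = {
    "A": {"B": 1, "C": 4, "D": 3},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"A": 3, "B": 5, "C": 1},
}
best = brute_force_shortest_path(weight)
best_length = path_length(best, weight)

# The estimate from the notes: 20! paths at one trillion paths per second.
seconds = factorial(20) // 10**12
days = seconds / (60 * 60 * 24)
```

With four cities there are only 4! = 24 orderings to try, so this finishes instantly. The same loop over 20 cities would need the month of computation estimated above.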
- Surprisingly, no efficient algorithm is known. The best known exact
algorithms (e.g., dynamic programming in O(n^2 2^n) time) are far better
than O(n!), but they still take exponential time.
- However, it turns out that if you're willing to accept approximate
answers (e.g., a path no worse than two times as long as the best path),
then there are much faster solutions, at least when the weights obey
the triangle inequality.
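As one example of such an approximation, here is a sketch of the well-known spanning-tree technique: build a minimum spanning tree, then visit the nodes in the order a preorder walk of that tree first reaches them. It is a swapped-in illustration, not necessarily the method the notes have in mind. The function name approx_path and the representation of cities as (x, y) points are assumptions. The factor-of-two guarantee relies on the triangle inequality, which straight-line distances satisfy.

```python
from math import dist


def approx_path(points):
    """Return a visiting order at most twice as long as the best path.

    points maps each label to an (x, y) pair and must not be empty.
    """
    labels = list(points)
    start = labels[0]
    # Prim's algorithm: grow a minimum spanning tree outward from start.
    children = {label: [] for label in labels}
    # For each node not yet in the tree: (cheapest distance to tree, parent).
    best = {label: (dist(points[start], points[label]), start)
            for label in labels[1:]}
    while best:
        nearest = min(best, key=lambda label: best[label][0])
        _, parent = best.pop(nearest)
        children[parent].append(nearest)
        for label in best:
            d = dist(points[nearest], points[label])
            if d < best[label][0]:
                best[label] = (d, nearest)
    # Preorder walk: record each node the first time we reach it.
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(children[node]))
    return order


# Hypothetical city coordinates.
cities = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1), "E": (3, 0)}
order = approx_path(cities)
```

The reasoning behind the bound: the best path is itself a spanning tree, so it is at least as long as the minimum spanning tree, and the preorder walk never costs more than traversing every tree edge twice. This version takes O(n^2) time, compared with O(n!) for trying every ordering.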