Held Monday, May 8, 2000
Overview
Today we visit one of the more interesting graph algorithms,
Dijkstra's shortest path algorithm.
Notes
- Schedule for the rest of the week
- Today: Graphs
- Tuesday: More graphs
- Wednesday: Wrapup; Class evals
- Friday: Discussion of exam 3
- Attendance is required on Wednesday
Contents
Summary
- Summary of results of exam 3
- Special problems
- Comments on projects
- In the previous class, we noted that there are a number of
problems that are naturally modeled with graphs.
- What are some of the core graph problems? In no
particular order,
- Reachability. Can you get to B from A?
- Shortest path (min-cost path). Find the path from A to B with
the minimum cost (determined as some simple function of the edges
traversed in the path).
- Minimum spanning tree. Find the ``smallest'' subset of
the edges in which all the nodes are connected.
- Traveling salesman. Find the smallest cost path through all
the nodes.
- Visit all nodes. Traversal.
- Transitive closure. Determine all pairs of nodes that can reach
each other.
- Topological sort. Number the nodes in such a way that any node
has a smaller number than all of its successors.
- We'll consider algorithms for most of these problems in the next few days.
- There are also many variations of each of these. For example,
some versions of traveling salesperson require a cycle (returning
to the start), rather than a path.
- The shortest path algorithm finds the shortest path
from A to B in a graph. The shortest path from A to B is a path
from A to B whose cost (determined by some cost function) is
no greater than any other path from A to B.
- The algorithm generally applies to directed, weighted graphs.
- The cost function is most often the sum of the weights of the edges on
the path, but it might involve other combinations, such as
- the average of the weights
- the product of the weights
- some function of weights and number of edges
- ...
- Typically, the weights on the graph are nonnegative.
- If the graph includes negative weights and cycles, there may be no
shortest path, as there may be a cycle which decreases the cost
each time it is taken.
- Depending on the algorithm we develop, there may also be other restrictions
on weights, structure, or cost function to ensure that there is a clear
shortest path.
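The negative-cycle problem noted above can be seen in a few lines. This is only a sketch: the graph, its weights, and the helper name cost_with_laps are hypothetical and do not come from the lecture. Each extra trip around the cycle B -> C -> B lowers the cost of getting from A to D by 1, so no path is ever the shortest.

```python
# Hypothetical directed graph (an assumption for illustration).
# The cycle B -> C -> B has total weight 2 + (-3) = -1.
weights = {("A", "B"): 1, ("B", "C"): 2, ("C", "B"): -3, ("B", "D"): 4}

def cost_with_laps(laps):
    """Cost of A -> B, then `laps` trips around B -> C -> B, then B -> D."""
    path = ["A", "B"] + ["C", "B"] * laps + ["D"]
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

for laps in range(4):
    print(laps, cost_with_laps(laps))   # cost drops by 1 with every lap
```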
- If the graph is unweighted, a slight variant of
the breadth-first connectivity algorithm should work fine.
- Instead of enqueuing just labels, we enqueue labels and the
distance to that label.
- However, this won't work for weighted graphs (sometimes six edges
have less total weight than two).
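The breadth-first variant for unweighted graphs might look roughly like the following. It is a minimal sketch, assuming the graph is stored as a dictionary from each label to a list of neighbor labels; the function name bfs_distances and the example graph are made up for illustration.

```python
from collections import deque

def bfs_distances(graph, start):
    """Return {label: number of edges on a shortest path from start}."""
    dist = {start: 0}
    queue = deque([(start, 0)])          # enqueue (label, distance) pairs
    while queue:
        node, d = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in dist:     # the first visit uses the fewest edges
                dist[neighbor] = d + 1
                queue.append((neighbor, d + 1))
    return dist

# Hypothetical unweighted, directed graph (not from the lecture).
example = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(bfs_distances(example, "A"))
```

Nodes that cannot be reached from the start simply never appear in the result.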
- In general, shortest paths don't involve cycles. If we can guarantee
that this is the case (e.g., if all weights are positive), then
there is a very simple method for finding the shortest path.
- List all acyclic paths.
- Compute the cost of each path.
- Pick the smallest such cost.
- This guarantees that we get a correct answer. Why? Because it
explicitly matches the definition of shortest path.
- Are there disadvantages? Possibly.
- What is the running time of this algorithm?
- It is clearly proportional to the number of paths in the graph
times the cost of determining the cost of each path.
- We've already seen that the number of paths is O(n!)
- Note that for graph algorithms, we tend to use n for the
number of nodes in the graph and m for the number of edges.
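The "list all acyclic paths" method can be sketched as follows. This assumes the sum-of-weights cost function and a graph stored as nested dictionaries (node -> neighbor -> weight); the function names and the example graph are hypothetical. It is correct for the reason given above, but the number of paths it enumerates can grow factorially with n.

```python
def all_acyclic_paths(graph, start, goal, path=None):
    """Yield every path from start to goal that visits no node twice."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for neighbor in graph.get(start, {}):
        if neighbor not in path:         # skip anything that would form a cycle
            yield from all_acyclic_paths(graph, neighbor, goal, path)

def path_cost(graph, path):
    """Sum of the edge weights along the path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def brute_force_shortest(graph, start, goal):
    """Return (cost, path) for a cheapest acyclic path, or None if there is none."""
    paths = list(all_acyclic_paths(graph, start, goal))
    if not paths:
        return None
    best = min(paths, key=lambda p: path_cost(graph, p))
    return path_cost(graph, best), best

# Hypothetical directed, weighted graph (not from the lecture).
g = {"A": {"B": 1, "C": 5}, "B": {"C": 2, "D": 7}, "C": {"D": 1}, "D": {}}
print(brute_force_shortest(g, "A", "D"))   # (4, ['A', 'B', 'C', 'D'])
```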
- A computer scientist named Dijkstra proposed an interesting strategy
for finding the shortest path from A to B.
- He suggested that we can find the shortest path from A
to B by finding the shortest path from A to all nodes in the graph.
(Talk about overkill.)
- His algorithm is generally used with the ``sum of weights'' metric.
- It may also work with selected other metrics.
- His algorithm assumes that edge weights are nonnegative.
- We'll divide the graph into two parts:
- SP.
The nodes whose shortest path we know. For these nodes, we'll store
the distance to that node and the path to that node.
- Est.
The nodes whose shortest path we don't know. For these nodes, we'll
store the shortest known distance to that node (which may not be
the shortest) and the corresponding path.
- Initially, Est contains all the nodes. The estimated
distance from A to itself is 0, and the estimated distance from A to
every other node is some value greater than the cost of any path
in the graph (in effect, infinity).
- At each step, we'll move one node from Est to SP.
After doing so, we'll update the estimated costs of all of its neighbors.
- Which node do we move? Node s, the one with the smallest
estimated distance.
- Why is this safe? Because the only way to reduce that distance would
be to pick another node, t, from Est and have a path
from that node to s such that A to t to s
has a smaller distance than our current estimate of the distance from
A to s. But that's not possible, since the cost function
is non-decreasing, and we already know that the cost from A to t
is at least as big as the cost from A to s.
- Note that instead of storing the path, we can simply store the previous
node in the path.
- Dijkstra's algorithm is another example of a greedy algorithm:
at each step in the algorithm, we pick a ``best'' node.
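The SP/Est description above can be sketched in code. This is one possible rendering, not necessarily the form used in class: it assumes the sum-of-weights metric, nonnegative weights, and a graph stored as nested dictionaries in which every node appears as a key. It uses infinity as the initial estimate, finds the smallest estimate by scanning Est, and stores only the previous node rather than the whole path. The function name and example graph are hypothetical.

```python
import math

def dijkstra(graph, start):
    """graph: {node: {neighbor: nonnegative weight}}.
    Returns SP as {node: (distance, previous node or None)}."""
    est = {node: (math.inf, None) for node in graph}   # Est: best known so far
    est[start] = (0, None)
    sp = {}                                            # SP: final answers
    while est:
        s = min(est, key=lambda n: est[n][0])          # smallest estimate
        dist_s, prev_s = est.pop(s)                    # move s from Est to SP
        sp[s] = (dist_s, prev_s)
        for t, weight in graph[s].items():             # update s's neighbors
            if t in est and dist_s + weight < est[t][0]:
                est[t] = (dist_s + weight, s)
    return sp

# Hypothetical undirected graph, written with each edge in both directions.
g = {
    "A": {"B": 2, "C": 7},
    "B": {"A": 2, "C": 3, "D": 8},
    "C": {"A": 7, "B": 3, "D": 1},
    "D": {"B": 8, "C": 1},
}
print(dijkstra(g, "A"))
```

Scanning Est for the minimum keeps the code close to the description; a priority queue would find the smallest estimate faster on large graphs.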
- Here's a sample graph to consider. I've used an undirected graph because
the algorithm works equally well for undirected graphs and they're easier
to draw.
     G
     |
    3|
     | 1   4   1   9
     A---B---C---D---E
     |   |   |       |
    9|  2|  0|       |
     |   |   |       |
     +---F---+       |
         |     6     |
         +-----------+
- What is the shortest path from A to E? It may be hard to see
at first (not too hard, but nontrivial).
- Let's see what the algorithm tells us.
- Initially,
- SP = {
}
- Est = {
A:(0,empty)
B:(100,?)
C:(100,?)
D:(100,?)
E:(100,?)
F:(100,?)
G:(100,?)
}
- The smallest estimated distance in Est is the distance to A, so we
- Move A to SP
- Update the distances to all of its neighbors (B, F, and G which now
have actual paths and distances)
- We now have the following information
- SP = {
A:(0,empty)
}
- Est = {
B:(1,A)
G:(3,A)
F:(9,A)
C:(100,?)
D:(100,?)
E:(100,?)
}
- We move B to SP and update its neighbors. We now know a
path to C and a better path to F (A->B->F has cost 1 + 2 = 3).
- SP = {
A:(0,empty)
B:(1,A)
}
- Est = {
F:(3,B)
G:(3,A)
C:(5,B)
D:(100,?)
E:(100,?)
}
- We could then move F or G to SP. Let's say that we move
F. We now know a shorter path to C (A->B->F->C has cost 3, A->B->C
had cost 5) and a path to E.
- SP = {
A:(0,empty)
B:(1,A)
F:(3,B)
}
- Est = {
G:(3,A)
C:(3,F)
E:(9,F)
D:(100,?)
}
- We can then move G or C to SP. Let's say that we move G.
G's only neighbor, A, is already in SP, so there are no other changes.
- SP = {
A:(0,empty)
B:(1,A)
F:(3,B)
G:(3,A)
}
- Est = {
C:(3,F)
E:(9,F)
D:(100,?)
}
- Next we move C. We can then update the distance to D.
- SP = {
A:(0,empty)
B:(1,A)
F:(3,B)
G:(3,A)
C:(3,F)
}
- Est = {
D:(4,C)
E:(9,F)
}
- We then move D. Note that even though there is an edge from
D to E, it doesn't give us any better path (going through D would
cost 4 + 9 = 13, and we already have a path of cost 9), so we don't
change the entry for E.
- SP = {
A:(0,empty)
B:(1,A)
F:(3,B)
G:(3,A)
C:(3,F)
D:(4,C)
}
- Est = {
E:(9,F)
}
- We move E and we're done.
- SP = {
A:(0,empty)
B:(1,A)
F:(3,B)
G:(3,A)
C:(3,F)
D:(4,C)
E:(9,F)
}
- Est = {
}
- What's the shortest path from A to E? It has cost 9 and is the
reverse of (E-F-B-A), that is, A->B->F->E. How did I get that path?
I started at E and repeatedly read off the second element (the
previous node) of each entry in SP until I reached A.
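Reading the path off SP can be sketched as below. The table is the final SP from the walkthrough; the only changes are that Python's None stands in for "empty" and that the function name read_path is made up for illustration.

```python
# Final SP table from the walkthrough: node -> (distance, previous node).
# None stands in for the "empty" entry of A (an assumption of this sketch).
SP = {
    "A": (0, None),
    "B": (1, "A"),
    "F": (3, "B"),
    "G": (3, "A"),
    "C": (3, "F"),
    "D": (4, "C"),
    "E": (9, "F"),
}

def read_path(sp, target):
    """Follow previous-node links back from target, then reverse them."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = sp[node][1]               # second element: the previous node
    path.reverse()
    return path

print(read_path(SP, "E"), SP["E"][0])    # ['A', 'B', 'F', 'E'] 9
```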
Tuesday, 18 January 2000
- Created as a blank outline.
Monday, 8 May 2000
Tuesday, 9 May 2000
- Moved uncovered sections to the next outline.