15.053

15.053 Tuesday, April 2 • The Shortest Path Problem • Dijkstra’s Algorithm for Solving the Shortest Path Problem Handouts: Lecture Notes

The Minimum Cost Flow Problem A network with costs, capacities, supplies, and demands. Directed graph G = (N, A) with node set N and arc set A. • Capacity uij on arc (i,j) • Lower bound of 0 on arc (i,j) • Cost cij on arc (i,j) • Supply/demand bi for node i (positive indicates supply) Minimize the cost of sending flow s.t. (flow out of i) - (flow into i) = bi for each node i, and 0 ≤ xij ≤ uij for each arc (i,j).

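Formulation In general the LP formulation is given as

$$\text{Minimize} \quad \sum_{(i,j) \in A} c_{ij} x_{ij}$$

$$\text{subject to} \quad \sum_{j:(i,j) \in A} x_{ij} - \sum_{j:(j,i) \in A} x_{ji} = b_i \quad \text{for all } i \in N,$$

$$0 \le x_{ij} \le u_{ij} \quad \text{for all } (i,j) \in A.$$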

The Shortest Path Problem What is the shortest path from a source node (often denoted as s) to a sink node, (often denoted as t)? What is the shortest path from node 1 to node 6? Assumptions for this lecture: • There is a path from the source to all other nodes. • All arc lengths are non-negative

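Formulation as a linear program In general the LP formulation for the shortest path from a source, s, to a sink, t, is given as

$$\text{Minimize} \quad \sum_{(i,j) \in A} c_{ij} x_{ij}$$

$$\text{subject to} \quad \sum_{j:(i,j) \in A} x_{ij} - \sum_{j:(j,i) \in A} x_{ji} = \begin{cases} 1 & \text{if } i = s, \\ -1 & \text{if } i = t, \\ 0 & \text{otherwise,} \end{cases}$$

$$x_{ij} \ge 0 \quad \text{for all } (i,j) \in A.$$

This is the minimum cost flow problem with one unit of supply at s, one unit of demand at t, and no capacity limits.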

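Another Formulation The LP formulation for the shortest path from a source, s, to all other nodes is given as

$$\text{Minimize} \quad \sum_{(i,j) \in A} c_{ij} x_{ij}$$

$$\text{subject to} \quad \sum_{j:(i,j) \in A} x_{ij} - \sum_{j:(j,i) \in A} x_{ji} = \begin{cases} n - 1 & \text{if } i = s, \\ -1 & \text{otherwise,} \end{cases}$$

$$x_{ij} \ge 0 \quad \text{for all } (i,j) \in A,$$

where n is the number of nodes: the source sends one unit of flow to each of the other n - 1 nodes.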

Some Questions Concerning the Shortest Path Problem • Where does it arise in practice? • Direct applications • Indirect (and often subtle) applications • How does one solve the shortest path problem? • Dijkstra’s algorithm • How does one measure the performance of an algorithm? • CPU time measurements • Performance guarantees • How does one establish that a solution is really the shortest path? • Connection to LP duality

Possible sports scores • Flumbaya is an unusual water sport in which there are two types of scores possible. One can score a gymbol, which is worth 7 points, or one can score a quasher, which is worth 5 points. An announcer on TV states that a recent game was won by a score of 19 to 18. Is this possible?

More on Flumbaya There is no path from node 0 to node 18. A score of 18 is impossible.

More on Flumbaya Data: a gymbol is worth n1 points and a quasher is worth n2 points. Determine whether one can score q points. The network: G = (N, A), where N = {0, …, q}; for each node j = 0 to q - n1, (j, j+n1) ∈ A; for each node j = 0 to q - n2, (j, j+n2) ∈ A. Question: Is there a path in G from node 0 to node q? Fact: if n1 and n2 have no common integer divisor (other than 1 and -1), then the number of scores that cannot be obtained is (n1-1)(n2-1)/2. Extra credit for proving this fact. (Warning: it is difficult to prove.)
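The reachability question above can be checked with a few lines of Python. This is a minimal sketch, not part of the lecture: the function name `reachable_scores` is made up for illustration, and it relies on the fact that every arc goes from a smaller node to a larger one, so scanning nodes in increasing order visits each node after all of its predecessors.

```python
def reachable_scores(q, n1, n2):
    """Return the set of nodes in {0, ..., q} reachable from node 0
    when arcs are (j, j + n1) and (j, j + n2)."""
    reachable = {0}
    for j in range(q + 1):           # increasing order is a topological order
        if j in reachable:
            for step in (n1, n2):
                if j + step <= q:
                    reachable.add(j + step)
    return reachable


scores = reachable_scores(40, 7, 5)
print(18 in scores)                  # False: no path from node 0 to node 18
print(19 in scores)                  # True: 19 = 7 + 7 + 5
missing = [j for j in range(41) if j not in scores]
print(missing)                       # the scores up to 40 that cannot be obtained
```

For n1 = 7 and n2 = 5 the list `missing` has 12 entries, which agrees with (7-1)(5-1)/2 = 12.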

An indirect application: Finding optimal paragraph layouts TeX optimally decomposes paragraphs by selecting the breakpoints for each line optimally. It has a subroutine that computes the attractiveness F(i,j) of a line that begins at word i and ends at word j-1. How can one use F(i,j) to create a shortest path problem whose solution will solve the paragraph problem?

An indirect application: finding optimal paragraph layouts Each word corresponds to a node, and an arc (i,j) indicates that a line begins with word i and ends at word j-1. A path from the first word (“TeX”) to “end” corresponds to a paragraph layout. The value of the path is the “ugliness” of the layout. [Figure: the sample paragraph drawn as a network, with nodes at candidate line-start words such as “TeX”, “that”, “line”, “selecting”, “by”, “shortest”, “solve”, and “end”.]

On the paragraph example • n different yes-no decisions • Decision j: Yes means start a line at word j • No: don’t start a line at word j • The cost of each yes decision depends only on the subsequent yes decision • f(i,j) is the cost of starting a line at word i, assuming that word j begins the next line. • Create a shortest path problem with nodes 1, 2, … , n+1 where the cost of arc (i,j) is f(i,j). What is the shortest path from 1 to n+1?
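The construction above can be sketched in Python. The cost function used here is an assumption made for illustration only: a line's cost is the square of its unused width, the last line is free, and a line that does not fit has infinite cost. TeX's real measure is more elaborate. The names `best_breaks` and `width` are hypothetical. Because every arc (i,j) has i < j, processing nodes in increasing order solves the shortest path problem.

```python
import math


def best_breaks(words, width):
    """Shortest path from node 1 to node n+1, where arc (i, j) means
    'a line starts at word i and the next line starts at word j'."""
    n = len(words)

    def f(i, j):
        # Hypothetical ugliness of a line holding words i .. j-1 (1-based).
        length = sum(len(w) for w in words[i - 1:j - 1]) + (j - i - 1)
        if length > width:
            return math.inf          # the line does not fit
        return 0 if j == n + 1 else (width - length) ** 2

    d = [math.inf] * (n + 2)         # d[k] = cost of the best layout of words 1 .. k-1
    pred = [0] * (n + 2)
    d[1] = 0
    for i in range(1, n + 1):        # nodes in increasing (topological) order
        for j in range(i + 1, n + 2):
            if d[i] + f(i, j) < d[j]:
                d[j] = d[i] + f(i, j)
                pred[j] = i

    if d[n + 1] == math.inf:
        raise ValueError("some word is wider than the line")

    starts = []                      # trace back from node n+1 to node 1
    j = n + 1
    while j != 1:
        j = pred[j]
        starts.append(j)
    return d[n + 1], starts[::-1]


cost, starts = best_breaks(["aaa", "bb", "cc", "ddddd"], 6)
print(cost, starts)                  # 10 [1, 2, 4]
```

In the small example the greedy choice "aaa bb" on the first line leaves "cc" alone on a line at cost 16, so the shortest path instead starts lines at words 1, 2, and 4.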

An Application in Data Compression: Approximating Piecewise Linear Functions • INPUT: A piecewise linear function f given by n points a1 = (x1,y1), a2 = (x2,y2), ..., an = (xn,yn) • x1 ≤ x2 ≤ ... ≤ xn • Objective: approximate f with fewer points • c* is the “cost” per point included • cij = cost of approximating the function through points i, i+1, . . ., j-1 by a single line joining point i to point j (sum of errors, or errors squared).

Approximating Piecewise Linear Functions • Objective: approximate f with fewer points • c* is the “cost” per point included • c36 = |a4 - b4| + |a5 - b5| = sum of errors. (other metrics would also be OK.)

On approximating functions • n different yes-no decisions • Decision j: yes means select point j • No: don’t select point j • The cost of each yes decision depends only on the subsequent yes decision • cij is the cost of selecting point i followed by point j, and takes into account the cost of selecting i, and the costs of approximating points i+1, …, j-1. • Create a shortest path problem with nodes 1, … , n where the cost of arc (i,j) is cij. What is the shortest path from 1 to n?
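A minimal sketch of this shortest path model in Python follows. It assumes strictly increasing x-values (so the interpolation never divides by zero), takes cij = c* plus the sum of absolute vertical errors at points i+1, …, j-1, and uses the hypothetical names `approximate` and `c_star`. Other error metrics would fit the same structure.

```python
import math


def approximate(points, c_star):
    """Pick a subset of points (always including the first and last) that
    minimizes per-point cost plus approximation error. Nodes are 1 .. n."""
    n = len(points)

    def c(i, j):
        # Cost of selecting point i followed by point j (1-based indices).
        (xi, yi), (xj, yj) = points[i - 1], points[j - 1]
        error = 0.0
        for k in range(i + 1, j):
            xk, yk = points[k - 1]
            bk = yi + (yj - yi) * (xk - xi) / (xj - xi)   # height of the line at xk
            error += abs(yk - bk)
        return c_star + error

    d = [math.inf] * (n + 1)
    pred = [0] * (n + 1)
    d[1] = 0.0
    for i in range(1, n):            # arcs go from lower to higher index
        for j in range(i + 1, n + 1):
            if d[i] + c(i, j) < d[j]:
                d[j] = d[i] + c(i, j)
                pred[j] = i

    selected = [n]                   # trace back from node n to node 1
    while selected[-1] != 1:
        selected.append(pred[selected[-1]])
    return d[n], selected[::-1]


cost, selected = approximate([(0, 0), (1, 1), (2, 2), (3, 0)], 0.5)
print(cost, selected)                # 1.0 [1, 3, 4]
```

In the example, point 2 lies exactly on the line from point 1 to point 3, so dropping it saves c* at no error; dropping point 3 as well would cost more in error than it saves.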

Dijkstra’s Algorithm for the Shortest Path Problem Exercise with your partner: find the shortest path from node 1 to all other nodes by inspection. Keep track of distances using labels d(i), and of each node’s immediate predecessor, pred(i). d(1) = 0, pred(1) = 0; d(2) = 2, pred(2) = 1. Find the other distances, in order of increasing distance from node 1.

A Key Step in Shortest Path Algorithms • Let d( ) denote a vector of temporary distance labels. • d(j) is the length of some path from the origin node 1 to node j. • Procedure Update(i): for each arc (i,j) ∈ A(i), where A(i) is the set of arcs out of node i, do: if d(j) > d(i) + cij then d(j) := d(i) + cij and pred(j) := i. Example from the figure: up to this point, the best path from 1 to j has length 78. P(1,j) is a “path” from 1 to j of length 72, so Update(i) lowers d(j) from 78 to 72 and sets pred(j) := i.

Dijkstra’s Algorithm
Initialize distances. LIST = set of temporary nodes.
begin
    d(s) := 0 and pred(s) := 0;
    d(j) := ∞ for each j ∈ N - {s};
    LIST := {s};
    while LIST ≠ ∅ do
    begin
        let i be a node in LIST with d(i) = min {d(j) : j ∈ LIST};
        remove node i from LIST;
        Update(i);
        if d(j) decreases, place j in LIST
    end
end
Select the node i on LIST with minimum distance label, and then Update(i).
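The pseudocode above translates almost line for line into Python. The sketch below keeps LIST as a plain set and finds the minimum label by scanning it, as on the slide. The graph is a hypothetical six-node example chosen for illustration, not necessarily the one in the lecture figure, and the code assumes no node is named 0, since 0 serves as the "no predecessor" marker.

```python
import math


def dijkstra(arcs, s):
    """arcs maps each node i to a list of (j, c_ij) pairs with c_ij >= 0.
    Returns the distance labels d and predecessor labels pred."""
    nodes = set(arcs) | {j for out in arcs.values() for j, _ in out}
    d = {j: math.inf for j in nodes}
    pred = {j: 0 for j in nodes}     # 0 means "no predecessor"
    d[s] = 0
    LIST = {s}
    while LIST:
        i = min(LIST, key=lambda j: d[j])   # node with minimum distance label
        LIST.remove(i)
        for j, c in arcs.get(i, []):        # Update(i): scan the arcs out of i
            if d[j] > d[i] + c:
                d[j] = d[i] + c
                pred[j] = i
                LIST.add(j)
    return d, pred


def path_to(pred, j):
    """Trace predecessor labels back from node j to the source."""
    path = [j]
    while pred[path[-1]] != 0:
        path.append(pred[path[-1]])
    return path[::-1]


# Hypothetical example network: node -> [(neighbor, arc length), ...]
arcs = {1: [(2, 2), (3, 4)], 2: [(3, 1), (4, 4), (5, 2)],
        3: [(5, 3)], 4: [(6, 2)], 5: [(4, 3), (6, 2)], 6: []}
d, pred = dijkstra(arcs, 1)
print(d[6], path_to(pred, 6))        # 6 [1, 2, 5, 6]
```

Scanning LIST for the minimum is the step that dominates the running time; replacing the set with a priority queue such as Python's `heapq` is the usual way to speed it up.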

An Example Find the node i on LIST with minimum distance. Scan the arcs out of i, and update d( ), pred( ), and LIST. [Animated example; figure not reproduced.]

The Output from Dijkstra’s Algorithm Dijkstra’s algorithm provides a shortest path from node 1 to all other nodes. It provides a shortest path tree. To find the shortest path to node j, trace back from node j to the source using the predecessor labels pred( ).

Comments on Running Time • Dijkstra’s algorithm is efficient in its current form. The running time grows as n², where n is the number of nodes. • It can be made much more efficient. • In practice it runs in time linear in the number of arcs (or almost so).

The string solution and LP duality Let d(j) denote the distance to node j from the source. d(1) = 0; d(2) ≤ d(1) + 2; d(5) ≤ d(2) + 2; d(2) ≤ d(5) + 2; d(5) ≤ d(3) + 3; d(3) ≤ d(5) + 3; etc. Dual: Maximize d(t) - d(s) subject to d(s) = 0 and d(j) ≤ d(i) + cij for each arc (i,j) ∈ A.
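The dual constraints give a simple way to certify that a path is shortest: if labels d( ) satisfy d(s) = 0 and d(j) ≤ d(i) + cij on every arc, then every path from s to t has length at least d(t), so a path of length exactly d(t) must be optimal. The Python sketch below checks this on a hypothetical directed network with made-up arc lengths; the function names are illustrative.

```python
def is_dual_feasible(arcs, d, s):
    """Check d(s) = 0 and d(j) <= d(i) + c_ij for every arc (i, j)."""
    if d[s] != 0:
        return False
    return all(d[j] <= d[i] + c for i, out in arcs.items() for j, c in out)


def path_length(arcs, path):
    """Sum the arc lengths along a path given as a list of nodes."""
    total = 0
    for i, j in zip(path, path[1:]):
        total += dict(arcs[i])[j]    # look up c_ij among the arcs out of i
    return total


# Hypothetical network: node -> [(neighbor, arc length), ...]
arcs = {1: [(2, 2), (3, 4)], 2: [(3, 1), (4, 4), (5, 2)],
        3: [(5, 3)], 4: [(6, 2)], 5: [(4, 3), (6, 2)], 6: []}
d = {1: 0, 2: 2, 3: 3, 4: 6, 5: 4, 6: 6}    # candidate distance labels
path = [1, 2, 5, 6]                          # candidate shortest path to node 6

print(is_dual_feasible(arcs, d, 1))          # True
print(path_length(arcs, path) == d[6])       # True: the bound is attained
```

Labels that overestimate a distance fail the check: raising d(6) to 7 violates d(6) ≤ d(5) + 2.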

The string solution Imagine replacing each arc by a string of the same length. Thus arc (1,3) would be replaced by a string of length 4 inches joining node 1 to node 3. Now hold node 1 in one hand and node 6 in the other, and pull until the string is tight.

The string solution Does one get the shortest path from node 1 to node 6? If so, why? Note: In some sense we are maximizing the physical distance from node 1 to node 6.

Summary • Direct and indirect applications for the shortest path problem • Dijkstra’s algorithm finds the shortest path from node 1 to all other nodes in increasing order of distance from the source node. • The bottleneck operation is identifying the minimum distance label. One can speed this up and get a much more efficient algorithm. • The string solution optimizes the dual LP as well as the shortest path problem.

Some final comments • The shortest path problem shows up again and again in network optimization • There is an interesting connection with dynamic programming • There are other solution techniques as well. We’ll see one in a later lecture.
