Lecture 2: Dynamic Programming. Lecturer: 虞台文. Content: What is Dynamic Programming?, Matrix Chain-Products, Sequence Alignments, Knapsack Problem, All-Pairs Shortest Path Problem, Traveling Salesman Problem, Conclusion.
Lecture 2: Dynamic Programming Lecturer: 虞台文
Content
• What is Dynamic Programming?
• Matrix Chain-Products
• Sequence Alignments
• Knapsack Problem
• All-Pairs Shortest Path Problem
• Traveling Salesman Problem
• Conclusion
Lecture 2: Dynamic Programming What is Dynamic Programming?
What is Dynamic Programming?
• Dynamic Programming (DP) breaks the original problem into smaller sub-problems of the same kind.
• The optimal solution of a larger sub-problem is obtained from a recursive formula (a recurrence) that connects it to the optimal solutions of its smaller sub-problems.
• It is used when the solution to a problem can be viewed as the result of a sequence of decisions.
Properties for Problems Solved by DP
• Simple Subproblems
  • The original problem can be broken into smaller subproblems with the same structure
• Optimal Substructure of the Problems
  • The optimal solution to the problem must be a composition of optimal subproblem solutions (the principle of optimality)
• Subproblem Overlap
  • Optimal solutions to unrelated subproblems can contain subproblems in common
The Principle of Optimality
• The basic principle of dynamic programming
• Developed by Richard Bellman
• An optimal path has the property that whatever the initial conditions and control variables (choices) over some initial period, the control (or decision variables) chosen over the remaining period must be optimal for the remaining problem, with the state resulting from the early decisions taken to be the initial condition.
Example: Shortest Path Problem
[Figure: a graph of alternative routes from Start to Goal; the edge weights shown are 25, 10, 28, 5, 40, and 3.]
Recall: Greedy Method for Shortest Paths on a Multi-stage Graph
• Problem
  • Find a shortest path from v0 to v3
• Is the greedy solution optimal?
[Figure: multi-stage graph from v0 to v3 with the optimal path highlighted.]
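The question on this slide can be explored with a small sketch. The graph below is hypothetical (the slide's own figure and edge weights are not available in this transcript); the node names `a`, `b`, `c`, `d` and all weights are assumptions chosen so that the greedy choice and the optimal choice differ. The DP version applies the principle of optimality: the best path from a node is one edge plus the best path from the node it leads to.

```python
from functools import lru_cache

# Hypothetical multi-stage graph: v0 -> {a, b} -> {c, d} -> v3.
# The weights are illustrative only, not taken from the lecture's figure.
GRAPH = {
    "v0": {"a": 1, "b": 4},
    "a": {"c": 10, "d": 9},
    "b": {"c": 2, "d": 3},
    "c": {"v3": 1},
    "d": {"v3": 2},
    "v3": {},
}


def greedy_path(graph, start, goal):
    """Always follow the cheapest outgoing edge from the current node."""
    path, cost, node = [start], 0, start
    while node != goal:
        nxt = min(graph[node], key=graph[node].get)
        cost += graph[node][nxt]
        node = nxt
        path.append(node)
    return cost, path


def dp_path(graph, start, goal):
    """Shortest path by DP: best(node) = min over edges of weight + best(next)."""

    @lru_cache(maxsize=None)
    def best(node):
        if node == goal:
            return 0, (goal,)
        options = []
        for nxt, weight in graph[node].items():
            sub_cost, sub_path = best(nxt)
            options.append((weight + sub_cost, (node,) + sub_path))
        return min(options)

    cost, path = best(start)
    return cost, list(path)


print(greedy_path(GRAPH, "v0", "v3"))  # (12, ['v0', 'a', 'd', 'v3'])
print(dp_path(GRAPH, "v0", "v3"))      # (7, ['v0', 'b', 'c', 'v3'])
```

On this particular graph the greedy method commits to the cheap first edge and ends up with a longer path, while the DP solution accepts a more expensive first step to reach a cheaper remainder.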
Lecture 2: Dynamic Programming Matrix Chain-Products
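Matrix Multiplication
• C = A × B
• A is d × e and B is e × f
• Computing C takes O(def) time
[Figure: A (d × e) times B (e × f) gives C (d × f); entry C[i, j] is computed from row i of A and column j of B.]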
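As a rough illustration of where the O(def) bound comes from, the sketch below multiplies two matrices stored as lists of lists and counts the scalar multiplications. The function name `matmul_count` is an assumption made for this example, not something from the lecture.

```python
def matmul_count(A, B):
    """Multiply A (d x e) by B (e x f); return (C, number of scalar multiplications)."""
    d, e, f = len(A), len(B), len(B[0])
    if any(len(row) != e for row in A):
        raise ValueError("inner dimensions do not match")
    C = [[0] * f for _ in range(d)]
    mults = 0
    for i in range(d):          # d rows of C
        for j in range(f):      # f columns of C
            for k in range(e):  # e terms per entry
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults


A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
C, mults = matmul_count(A, B)
print(C, mults)  # [[4, 5], [10, 11]] 12
```

The three nested loops run d, f, and e times respectively, so the count is exactly d·e·f.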
Matrix Chain-Products
• Given a sequence of matrices, A1, A2, …, An, find the most efficient way to multiply them together.
• Facts:
  • A(BC) = (AB)C
  • Different parenthesizations may need different numbers of operations.
• Example: A: 10 × 30, B: 30 × 5, C: 5 × 60
  • (AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 ops
  • A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 ops
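The two operation counts in this example can be checked mechanically. The sketch below represents a parenthesization as nested tuples of matrix names; that representation and the name `chain_cost` are assumptions made for illustration.

```python
def chain_cost(expr, shapes):
    """Return ((rows, cols), scalar multiplications) for a parenthesized product.

    expr is either a matrix name or a pair (left, right) of sub-expressions.
    """
    if isinstance(expr, str):
        return shapes[expr], 0
    left, right = expr
    (r1, c1), cost_left = chain_cost(left, shapes)
    (r2, c2), cost_right = chain_cost(right, shapes)
    if c1 != r2:
        raise ValueError("inner dimensions do not match")
    # Multiplying an r1 x c1 matrix by a c1 x c2 matrix costs r1 * c1 * c2.
    return (r1, c2), cost_left + cost_right + r1 * c1 * c2


shapes = {"A": (10, 30), "B": (30, 5), "C": (5, 60)}
print(chain_cost((("A", "B"), "C"), shapes))  # ((10, 60), 4500)
print(chain_cost(("A", ("B", "C")), shapes))  # ((10, 60), 27000)
```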
Matrix Chain-Products
• Given a sequence of matrices, A1, A2, …, An, find the most efficient way to multiply them together.
• A Brute-force Approach:
  • Try all possible ways to parenthesize A = A1A2…An
  • Calculate the number of operations for each one
  • Pick the best one
• Time Complexity:
  • #parenthesizations = #binary trees with n leaves in which every internal node has two children
  • Exponential: O(4^n)
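To get a feel for how quickly the brute-force search space grows, the sketch below counts the parenthesizations of n matrices by splitting the chain at every possible position. The function name is an assumption; the counts it produces are the Catalan numbers.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parenthesizations(n):
    """Number of ways to fully parenthesize a chain of n matrices."""
    if n < 1:
        raise ValueError("need at least one matrix")
    if n == 1:
        return 1
    # The last multiplication splits the chain into the first k and the last n - k matrices.
    return sum(
        count_parenthesizations(k) * count_parenthesizations(n - k)
        for k in range(1, n)
    )


print([count_parenthesizations(n) for n in range(1, 6)])  # [1, 1, 2, 5, 14]
print(count_parenthesizations(10))                         # 4862
```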
A Greedy Approach
• Idea #1:
  • Repeatedly select the product that uses the most operations.
• Counter-example:
  • A: 10 × 5, B: 5 × 10, C: 10 × 5, and D: 5 × 10
  • Greedy idea #1 gives (AB)(CD), which takes 500+1000+500 = 2000 ops
  • A((BC)D) takes 500+250+250 = 1000 ops
Another Greedy Approach
• Idea #2:
  • Repeatedly select the product that uses the least operations.
• Counter-example:
  • A: 101 × 11, B: 11 × 9, C: 9 × 100, and D: 100 × 99
  • Greedy idea #2 gives A((BC)D), which takes 109989+9900+108900 = 228789 ops
  • (AB)(CD) takes 9999+89991+89100 = 189090 ops
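A minimal sketch of greedy idea #2 is given below, assuming D is 100 × 99 (the size implied by the operation counts on the slide) and assuming ties are broken by taking the leftmost cheapest product. The name `greedy_least_ops` and the `dims` encoding (matrix t has shape `dims[t]` × `dims[t+1]`) are choices made for this example.

```python
def greedy_least_ops(dims):
    """Repeatedly multiply the adjacent pair that is cheapest right now.

    dims encodes the chain: matrix t has shape dims[t] x dims[t + 1].
    Returns the total number of scalar multiplications used.
    """
    shapes = [(dims[t], dims[t + 1]) for t in range(len(dims) - 1)]
    total = 0
    while len(shapes) > 1:
        costs = [
            shapes[t][0] * shapes[t][1] * shapes[t + 1][1]
            for t in range(len(shapes) - 1)
        ]
        t = costs.index(min(costs))  # leftmost cheapest adjacent product
        total += costs[t]
        shapes[t:t + 2] = [(shapes[t][0], shapes[t + 1][1])]  # merge the pair
    return total


dims = [101, 11, 9, 100, 99]  # A: 101x11, B: 11x9, C: 9x100, D: 100x99
print(greedy_least_ops(dims))                     # 228789
print(101 * 11 * 9 + 9 * 100 * 99 + 101 * 9 * 99)  # 189090, the cost of (AB)(CD)
```

The greedy total matches the slide's figure for A((BC)D), and it is larger than the cost of (AB)(CD), so picking the locally cheapest product does not guarantee a globally cheapest chain.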
DP: Define Subproblem
• Original problem P1n: multiply A1 A2 … An
• Subproblem Pij (i ≤ j): multiply Ai Ai+1 … Aj
• Suppose
  • #operations for the optimal solution of Pij is Nij
  • #operations for the optimal solution of the original problem P1n is N1n
• What is the relation between Nij (Pij) and N1n (P1n)?
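DP: Principle of Optimality
• Let $A_i$ be a $d_i \times d_{i+1}$ matrix, and split $P_{ij}$ at some $k$ with $i \le k < j$:
  • $A_i \cdots A_k$ is a $d_i \times d_{k+1}$ matrix, with optimal cost $N_{i,k}$
  • $A_{k+1} \cdots A_j$ is a $d_{k+1} \times d_{j+1}$ matrix, with optimal cost $N_{k+1,j}$
• $N_{i,j} = \min_{i \le k < j} \{ N_{i,k} + N_{k+1,j} + d_i d_{k+1} d_{j+1} \}$, with $N_{i,i} = 0$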
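The principle of optimality translates almost directly into a memoized recursion: the best cost for a sub-chain is the cheapest way of splitting it into two optimally solved sub-chains plus the cost of the final multiplication. The sketch below is one way to write that, assuming 0-based indices and a `dims` list in which matrix t has shape `dims[t]` × `dims[t+1]`; the function name is hypothetical.

```python
from functools import lru_cache


def min_chain_ops(dims):
    """Minimum scalar multiplications for the chain encoded by dims (top-down DP)."""
    n = len(dims) - 1  # number of matrices
    if n < 1:
        raise ValueError("need at least one matrix")

    @lru_cache(maxsize=None)
    def N(i, j):
        # Optimal cost of multiplying matrices i..j (inclusive, 0-based).
        if i == j:
            return 0
        return min(
            N(i, k) + N(k + 1, j) + dims[i] * dims[k + 1] * dims[j + 1]
            for k in range(i, j)
        )

    return N(0, n - 1)


print(min_chain_ops([10, 30, 5, 60]))        # 4500
print(min_chain_ops([10, 5, 10, 5, 10]))     # 1000
print(min_chain_ops([101, 11, 9, 100, 99]))  # 189090
```

Memoization matters here: overlapping subproblems are solved once and then reused, which is the subproblem-overlap property in action.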
DP Implementation
[Figure: the n × n table Nij (rows i = 1…n, columns j = 1…n), filled in diagonal by diagonal starting from the main diagonal. Each entry Nij is computed from entries to its left in row i and below it in column j.]
DP for Matrix Chain-Products
Algorithm matrixChain(S):
  Input: sequence S of n matrices to be multiplied (Ai is di × di+1)
  Output: number of operations in an optimal parenthesization of S
  for i ← 1 to n                  // main diagonal terms are all zero
    N[i,i] ← 0
  for d ← 2 to n                  // process each diagonal in turn
    for i ← 1 to n−d+1            // go from top to bottom along the diagonal
      j ← i+d−1
      N[i,j] ← +∞
      for k ← i to j−1            // take the minimum over all split points
        N[i,j] ← min(N[i,j], N[i,k] + N[k+1,j] + di·dk+1·dj+1)
Time Complexity: O(n^3)
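A possible Python rendering of the matrixChain pseudocode is sketched below. It assumes 0-based indices and a `dims` list in which matrix t has shape `dims[t]` × `dims[t+1]`; those conventions and the name `matrix_chain` are choices made for this example rather than part of the lecture.

```python
def matrix_chain(dims):
    """Bottom-up DP: minimum scalar multiplications for the chain encoded by dims."""
    n = len(dims) - 1  # number of matrices
    if n < 1:
        raise ValueError("need at least one matrix")
    # N[i][j] = optimal cost for matrices i..j; the main diagonal stays 0.
    N = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # one diagonal per chain length
        for i in range(n - length + 1):     # top to bottom along the diagonal
            j = i + length - 1
            N[i][j] = min(
                N[i][k] + N[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)        # every split point
            )
    return N[0][n - 1]


print(matrix_chain([10, 30, 5, 60]))        # 4500
print(matrix_chain([101, 11, 9, 100, 99]))  # 189090
```

The three nested loops (diagonal, position on the diagonal, split point) each run at most n times, which is where the O(n^3) bound comes from.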
Exercises
• The matrixChain algorithm only computes the number of operations of an optimal parenthesization; it doesn't report the optimal parenthesization scheme. Please modify the algorithm so that it can do so.
• Give an example with 5 matrices to illustrate your idea using a table.
Lecture 2: Dynamic Programming Sequence Alignment
Question
• Given two strings X and Y, are they similar? What is their distance?
Example
X: applicable
Y: plausibly
How similar are they? Can you give them a score?
Example
X’: a p p l i c a - - - b l e
Y’: - p - l - - a u s i b l y
    I M I M I I M I I I M M X   (M = match, X = mismatch, I = indel)
Three cases:
• Matches (+1)
• Mismatches (−1)
• Insertions & deletions (indel) (−1)
The values depend on the application. They can be described using a so-called substitution matrix, to be discussed shortly.
Score = 5(+1) + 1(−1) + 7(−1) = −3
Is the alignment optimal?
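The score of this alignment can be recomputed column by column. The sketch below assumes the simple scheme used on the slide (match +1, mismatch −1, indel −1) and uses `-` as the gap symbol; the function name is an assumption.

```python
def alignment_score(x_aligned, y_aligned, match=1, mismatch=-1, indel=-1, gap="-"):
    """Score two already-aligned strings of equal length, column by column."""
    if len(x_aligned) != len(y_aligned):
        raise ValueError("aligned strings must have the same length")
    score = 0
    for a, b in zip(x_aligned, y_aligned):
        if a == gap and b == gap:
            raise ValueError("a column cannot contain two gaps")
        if a == gap or b == gap:
            score += indel
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


print(alignment_score("applica---ble", "-p-l--ausibly"))  # -3
```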
Sequence Alignment
• In bioinformatics, a sequence alignment is a way of arranging the primary sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
Global and Local Alignments
Global alignment:
L G P S S K Q T G K G S - S R I W D N
L N - I T K S A G K G A I M R L G D A
Local alignment:
- - - - - - - T G K G - - - - - - - -
- - - - - - - A G K G - - - - - - - -
Global and Local Alignments
• Global Alignment
  • Attempts to align the entire sequence
  • Most useful when the sequences in the query set are similar and of roughly equal size
  • Needleman–Wunsch algorithm (1970)
• Local Alignment
  • Attempts to align partial regions of sequences with a high level of similarity
  • Smith–Waterman algorithm (1981)
Needleman–Wunsch Algorithm
• Find the best global alignment of any two sequences under a given substitution matrix.
• Maximize a similarity score, to give maximum match
  • Maximum match = largest number of residues of one sequence that can be matched with another, allowing for all possible gaps
• Based on dynamic programming
• Involves an iterative matrix method of calculation
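The iterative matrix calculation can be sketched as follows. This is the commonly taught linear-gap form of global alignment, not necessarily the exact formulation used later in the lecture; the fixed match/mismatch/indel values stand in for a substitution matrix, and the function name is an assumption. Entry `F[i][j]` holds the best score for aligning the first i characters of x with the first j characters of y.

```python
def needleman_wunsch_score(x, y, match=1, mismatch=-1, indel=-1):
    """Best global alignment score of x and y under a simple linear-gap scheme."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = i * indel  # x[:i] aligned against gaps only
    for j in range(1, n + 1):
        F[0][j] = j * indel  # y[:j] aligned against gaps only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(
                F[i - 1][j - 1] + sub,  # align x[i-1] with y[j-1]
                F[i - 1][j] + indel,    # x[i-1] against a gap
                F[i][j - 1] + indel,    # y[j-1] against a gap
            )
    return F[m][n]


print(needleman_wunsch_score("applicable", "plausibly"))
```

Because the hand-made alignment of "applicable" and "plausibly" shown earlier scores −3, the value printed here can be no lower than −3; comparing the two numbers answers whether that alignment was optimal under this scoring scheme.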
Substitution Matrix
• In bioinformatics, a substitution matrix estimates the rate at which each possible residue in a sequence changes to each other residue over time.
• Substitution matrices are usually seen in the context of amino acid or DNA sequence alignment, where the similarity between sequences depends on the mutation rates as represented in the matrix.
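As a toy illustration of what a substitution matrix looks like in code, the sketch below builds a small DNA matrix as a dictionary keyed by residue pairs. The score values are made up for this example and do not come from any published matrix; the only biological fact used is that A and G are purines while C and T are pyrimidines, so A↔G and C↔T are transitions and the other changes are transversions. The function names are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def make_dna_matrix(match=1, transition=-1, transversion=-2):
    """Build an illustrative DNA substitution matrix (scores are hypothetical)."""
    matrix = {}
    for a in "ACGT":
        for b in "ACGT":
            if a == b:
                score = match
            elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
                score = transition    # purine<->purine or pyrimidine<->pyrimidine
            else:
                score = transversion  # purine<->pyrimidine
            matrix[(a, b)] = score
    return matrix


def ungapped_score(x, y, matrix):
    """Sum substitution scores over two equal-length sequences without gaps."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(matrix[(a, b)] for a, b in zip(x, y))


M = make_dna_matrix()
print(M[("A", "G")], M[("A", "C")])    # -1 -2
print(ungapped_score("ACGT", "ACGA", M))  # 1
```

In an alignment algorithm, a lookup such as `M[(a, b)]` would replace the fixed match and mismatch values.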