Sequence Alignment • Global Alignment: compare two sequences in their entirety; the gap penalty is assessed regardless of whether gaps are located internally within a sequence, or at the end of one or both sequences. The Needleman and Wunsch Algorithm • Local Alignment: find best matching subsequences within the two search sequences. The Smith-Waterman Algorithm.
Sequence Alignment • Semi-Global Alignment: terminal (end) gaps are treated differently from internal gaps. Terminal gaps are usually the result of incomplete data and do not have biological significance. Example: searching for the best alignment between a short sequence and an entire genome. A modification of the Needleman and Wunsch Algorithm.
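One possible reading of "different treatment of terminal gaps" is sketched below: terminal gaps cost nothing, internal gaps still cost GP. The function name `semi_global_score` and the default scores (match = 1, mismatch = 0, gap = -1) are assumptions made for illustration, and other semi-global variants free only some of the end gaps.

```python
def semi_global_score(x, y, match=1, mismatch=0, gap=-1):
    """Score of the best alignment in which terminal gaps are free (a sketch)."""
    m, n = len(x), len(y)
    # First row and first column stay 0, so leading gaps are not penalized.
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # x[i-1] aligned to y[j-1]
                          F[i - 1][j] + gap,    # x[i-1] aligned to a gap
                          F[i][j - 1] + gap)    # y[j-1] aligned to a gap
    # Trailing gaps are free: take the best cell in the last row or last column.
    return max(max(F[m]), max(row[n] for row in F))


print(semi_global_score("ACG", "TTACGTT"))  # 3: ACG matches inside the longer string
```

With ordinary global scoring the same pair would be charged for the four unmatched T characters at the ends of the longer string.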
Algorithm Design Techniques • Exhaustive Search (brute force): the algorithm examines every possible alternative to find one particular solution. • Dynamic Programming: the algorithm breaks the problem into smaller sub-problems and uses the solutions of the sub-problems to construct the solution of the larger problem.
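The contrast between the two techniques can be sketched on the alignment-scoring problem itself. Both functions below are hypothetical illustrations, with assumed default scores (match = 1, mismatch = 0, gap = -1): the first tries every alternative recursively, the second solves the same sub-problems but stores each answer so that it is computed only once.

```python
from functools import lru_cache


def best_score_exhaustive(x, y, match=1, mismatch=0, gap=-1):
    """Try every alignment; the number of recursive calls grows exponentially."""
    if not x:
        return gap * len(y)
    if not y:
        return gap * len(x)
    s = match if x[0] == y[0] else mismatch
    return max(
        s + best_score_exhaustive(x[1:], y[1:], match, mismatch, gap),
        gap + best_score_exhaustive(x[1:], y, match, mismatch, gap),
        gap + best_score_exhaustive(x, y[1:], match, mismatch, gap),
    )


def best_score_dynamic(x, y, match=1, mismatch=0, gap=-1):
    """Same recursion, but each sub-problem (i, j) is solved once and cached."""
    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == len(x):
            return gap * (len(y) - j)
        if j == len(y):
            return gap * (len(x) - i)
        s = match if x[i] == y[j] else mismatch
        return max(s + solve(i + 1, j + 1),
                   gap + solve(i + 1, j),
                   gap + solve(i, j + 1))

    return solve(0, 0)


print(best_score_exhaustive("AACT", "ACG"), best_score_dynamic("AACT", "ACG"))  # 1 1
```

The cached version solves at most (M + 1)(N + 1) sub-problems, while the exhaustive version revisits the same suffix pairs many times.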
Needleman and Wunsch Algorithm • Input: two strings X = x1…xM and Y = y1…yN and scoring rules: scoring matrix s and gap penalty GP • Output: an alignment of X and Y whose score, as defined by the scoring rules, is maximal among all possible alignments of X and Y
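Let F(i, j) = optimal score of aligning x1…xi and y1…yj • Initialization: F(0, 0) = 0, F(i, 0) = i·GP, F(0, j) = j·GP (i = 1…M, j = 1…N); with GP = -1 these are -i and -j • Main Iteration: for each i = 1…M and j = 1…N, F(i, j) = max { F(i-1, j-1) + s(xi, yj), F(i-1, j) + GP, F(i, j-1) + GP } • Termination: F(M, N) is the optimal score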
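The initialization and main iteration can be sketched as a table fill. The function name `nw_matrix` is hypothetical, and the scoring matrix s is simplified here to two assumed numbers (match = 1, mismatch = 0) with gap penalty GP = -1; a full scoring matrix would replace the `s = ...` line.

```python
def nw_matrix(x, y, match=1, mismatch=0, gap=-1):
    """Fill the Needleman-Wunsch table F, where F[i][j] scores x[:i] against y[:j]."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # Initialization: a prefix aligned entirely against gaps.
    for i in range(1, m + 1):
        F[i][0] = i * gap
    for j in range(1, n + 1):
        F[0][j] = j * gap
    # Main iteration: best of diagonal (pair), up (gap in y), left (gap in x).
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    return F


for row in nw_matrix("AACT", "ACG"):
    print(row)
# [0, -1, -2, -3]
# [-1, 1, 0, -1]
# [-2, 0, 1, 0]
# [-3, -1, 1, 1]
# [-4, -2, 0, 1]
```

The bottom-right entry, F(M, N) = 1, is the optimal global score for AACT and ACG under these scoring rules.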
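Finding the optimal alignment: • Every non-decreasing path from (0, 0) to (M, N) corresponds to a global alignment of the two sequences. • Trace back from (M, N) to (0, 0) to recover an optimal alignment; at each cell (i, j), one of three cases applies: • case 1: xi aligns to yj, when F(i, j) = F(i-1, j-1) + s(xi, yj) • case 2: xi aligns to a gap, when F(i, j) = F(i-1, j) + GP • case 3: yj aligns to a gap, when F(i, j) = F(i, j-1) + GP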
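A minimal sketch of the traceback, with the table fill repeated so the example runs on its own. The name `needleman_wunsch`, the simplified scores (match = 1, mismatch = 0, gap = -1) and the tie-breaking order (case 1, then case 2, then case 3) are assumptions; a different order can return a different alignment with the same optimal score.

```python
def needleman_wunsch(x, y, match=1, mismatch=0, gap=-1):
    """Return (score, aligned_x, aligned_y) for one optimal global alignment."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = i * gap
    for j in range(1, n + 1):
        F[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Traceback from (M, N) to (0, 0), building the alignment right to left.
    ax, ay = [], []
    i, j = m, n
    while i > 0 or j > 0:
        s = (match if x[i - 1] == y[j - 1] else mismatch) if i > 0 and j > 0 else None
        if s is not None and F[i][j] == F[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1])   # case 1: xi aligns to yj
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-")        # case 2: xi aligns to a gap
            i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1])        # case 3: yj aligns to a gap
            j -= 1
    return F[m][n], "".join(reversed(ax)), "".join(reversed(ay))


print(needleman_wunsch("AACT", "ACG"))  # (1, 'AACT', '-ACG')
```

Because each step compares the stored value with the three candidate terms, no separate pointer table is needed in this sketch.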
Global Alignment Example • Find the optimal global alignment of AACT and ACG. • Scoring rules: match = 1, mismatch = 0, gap penalty GP = -1 • Optimal Alignments:
Alignment 1, score = 1
    A A C T
      | |
    - A C G
Alignment 2, score = 1
    A A C T
    |   |
    A - C G
Smith-Waterman Algorithm • Input: strings X and Y and scoring rules: scoring matrix s and gap penalty GP. • Output: substrings of X and Y whose global alignment score, as defined by the scoring rules, is maximal among the global alignments of all pairs of substrings of X and Y.
Initialization: F(0, 0) = 0, F(i, 0) = 0, F(0, j) = 0 (i = 1…M, j = 1…N) • Main Iteration: for each i = 1…M and j = 1…N, F(i, j) = max { 0, F(i-1, j-1) + s(xi, yj), F(i-1, j) + GP, F(i, j-1) + GP } • The largest value of F(i, j) is the score of the best local alignment of X and Y • Traceback begins at the highest score in the matrix and continues until it reaches a cell with score 0.
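The local variant can be sketched in the same style. The name `smith_waterman` and the simplified scores (match = 1, mismatch = 0, gap = -1) are assumptions. When several cells share the highest score, this sketch starts from the first one found in row-by-row order, which is one arbitrary choice among equally good local alignments.

```python
def smith_waterman(x, y, match=1, mismatch=0, gap=-1):
    """Return (score, aligned_x, aligned_y) for one best local alignment."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]  # first row and column stay 0
    best, best_i, best_j = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(0,                    # start a new local alignment
                          F[i - 1][j - 1] + s,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
            if F[i][j] > best:                  # strict '>' keeps the first maximum
                best, best_i, best_j = F[i][j], i, j

    # Traceback from the highest-scoring cell until a cell with score 0.
    ax, ay = [], []
    i, j = best_i, best_j
    while i > 0 and j > 0 and F[i][j] > 0:
        s = match if x[i - 1] == y[j - 1] else mismatch
        if F[i][j] == F[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1])
            i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-")
            i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1])
            j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


print(smith_waterman("AACT", "ACG"))  # (2, 'AC', 'AC')
```

For AACT and ACG the score 2 also appears in the last cell of the table, because a mismatch costs 0 under these rules, so ACT aligned to ACG is an equally good local alignment.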
Local Alignment Example • Find the optimal local alignment of AACT and ACG. • Scoring rules: match = 1, mismatch = 0, gap penalty GP = -1 • Solution: Local Alignment, score = 2
    A C
    | |
    A C