LAGAN and MLAGAN
Brudno et al., Genome Res. 2003
Presented by Saurabh Sinha
Slides courtesy of Rich Leduc and Andra Ivan
LAGAN: Limited Area Global Alignment of Nucleotides
MLAGAN: Multi-LAGAN
LAGAN & MLAGAN
• Global pair-wise and multiple alignment of “finished” genomic sequences
• Sequences must be known to be orthologous
• Multiple alignments of genomic fragments
“LAGAN and MLAGAN assumes that one has already identified apparent orthologous regions between two species, and that there are no genomic rearrangements.”
Outline
• LAGAN: globally aligning two orthologous sequences
  • Finding local alignment seeds
  • Constructing a global map
  • Global alignment
• MLAGAN: not discussed today
LAGAN
• Dynamic programming is too inefficient for whole-genome alignment: quadratic time complexity
• Therefore, LAGAN is based on “anchors”:
  • Detect local similarities
  • Select and fix an ordered set of local similarities, called anchors
  • Align the regions between consecutive anchors
• Detecting local similarities based on Smith-Waterman?
  • No!
LAGAN: Steps in the global alignment of a pair of sequences
[Figure: boxes numbered 1 to 4]
LAGAN: 1. Find local similarities
Seeds:
• A (k,c) seed is a pair of k-mers, one from each sequence, with at most c differences between them
• Example: in GGTGCTTGTA and CAGATTATCT, (GCTTGT, GATTAT) is a (6,2) seed
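The seed definition above can be checked with a short brute-force sketch. The slides do not say how seeds are actually located, so the exhaustive double loop and the name `find_seeds` are assumptions made only for illustration; a real tool would index the k-mers rather than compare every pair.

```python
def find_seeds(a, b, k, c):
    """Return start positions (i, j) where a[i:i+k] and b[j:j+k]
    differ in at most c positions, i.e. all (k, c) seeds.

    Brute force for illustration only (hypothetical helper, not LAGAN's code).
    """
    seeds = []
    for i in range(len(a) - k + 1):
        for j in range(len(b) - k + 1):
            # Count mismatching positions between the two k-mers.
            mismatches = sum(x != y for x, y in zip(a[i:i + k], b[j:j + k]))
            if mismatches <= c:
                seeds.append((i, j))
    return seeds


seeds = find_seeds("GGTGCTTGTA", "CAGATTATCT", k=6, c=2)
# The slide's example pair (GCTTGT, GATTAT) starts at positions 3 and 2.
print((3, 2) in seeds)
```

The slide's example pair differs at two positions (C/A and G/A), which is why it qualifies as a (6,2) seed.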
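LAGAN: 1. Find local similarities (cont’d)
• Chaining seeds
  • Two seeds s1 and s2 can be chained if x <= d, y <= d, and |x-y| <= s, where x and y are the distances between s1 and s2 in the two sequences
  • A seed is chained to the single previous seed that creates the highest-scoring chain among all chains that end with this seed
[Figure: seeds s1 and s2, with the distance between them labeled x in one sequence and y in the other]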
LAGAN: 1. Find local similarities (cont’d)
• Score of a chain of seeds:
  • Match scores and mismatch penalties on each pair of characters within seeds
  • Gap penalty |x-y| for each pair of adjacent seeds in the chain
• Several chains are found by this method
• This completes step 1 of LAGAN (Box 1 in Figure)
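The chaining rule and chain score can be sketched as a small dynamic program over sorted seeds. Several details here are assumptions, not taken from the slides: x and y are measured from the end of one seed to the start of the next, overlapping seeds are not chained, match/mismatch values are +1/-1, and the names `seed_score` and `best_chain` are hypothetical. The quadratic scan over earlier seeds is for clarity only.

```python
def seed_score(a, b, i, j, k, match=1, mismatch=-1):
    """Score the k aligned character pairs of a seed (assumed +1/-1 scoring)."""
    return sum(match if a[i + t] == b[j + t] else mismatch for t in range(k))


def best_chain(a, b, seeds, k, d, s):
    """Return (score, chain) for the highest-scoring chain of seeds.

    seeds: list of (i, j) start positions of length-k seeds.
    Two seeds are chainable if the gaps x, y between them satisfy
    0 <= x <= d, 0 <= y <= d and |x - y| <= s (non-overlap is an assumption).
    """
    if not seeds:
        return 0, []
    seeds = sorted(seeds)
    score, prev = [], []
    for n, (i, j) in enumerate(seeds):
        base = seed_score(a, b, i, j, k)
        best, best_prev = base, None
        for m in range(n):  # every earlier seed is a candidate predecessor
            pi, pj = seeds[m]
            x = i - (pi + k)  # gap in the first sequence
            y = j - (pj + k)  # gap in the second sequence
            if 0 <= x <= d and 0 <= y <= d and abs(x - y) <= s:
                cand = score[m] + base - abs(x - y)  # gap penalty |x - y|
                if cand > best:
                    best, best_prev = cand, m
        score.append(best)
        prev.append(best_prev)
    # Trace back from the best-scoring chain end.
    end = max(range(len(seeds)), key=score.__getitem__)
    total, chain = score[end], []
    while end is not None:
        chain.append(seeds[end])
        end = prev[end]
    return total, chain[::-1]


a, b = "ACGTTTGGCA", "ACGTCCCGGCA"
print(best_chain(a, b, [(0, 0), (6, 7)], k=4, d=5, s=2))
```

In this toy input the two exact seeds score 4 each and the gaps are x = 2 and y = 3, so the chain pays a penalty of 1.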
LAGAN: 2. Construct a rough global map
• Each chain from the previous step is a local alignment; each has a score
• Chain local alignments such that the sum of their scores is maximized
• Sparse dynamic programming, O(n log n), finds the highest-scoring chain of local alignments (n = number of local alignments)
• This is the “rough global map”
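One common way to get an O(n log n) chain of local alignments is a sweep over the first sequence combined with a prefix-maximum Fenwick tree over the second. The sketch below uses that technique; it is not claimed to be LAGAN's own implementation. Assumptions: each local alignment is a tuple (start1, end1, start2, end2, score) with a positive score, one alignment may follow another only if it starts strictly after the other ends in both sequences, and `chain_alignments` is a hypothetical name.

```python
from bisect import bisect_left


def chain_alignments(alns):
    """Return (best total score, indices of the chained alignments).

    alns: list of (start1, end1, start2, end2, score), scores assumed positive.
    """
    if not alns:
        return 0, []
    ys = sorted({a[3] for a in alns})       # compressed end2 coordinates
    tree = [(0, -1)] * (len(ys) + 1)        # Fenwick tree of (score, index)

    def update(pos, val):                   # pos is 1-based
        while pos < len(tree):
            if val > tree[pos]:
                tree[pos] = val
            pos += pos & -pos

    def query(pos):                         # max over positions 1..pos
        best = (0, -1)
        while pos > 0:
            if tree[pos] > best:
                best = tree[pos]
            pos -= pos & -pos
        return best

    # Sweep along sequence 1; at equal coordinates, starts (0) come before
    # ends (1) so that chaining requires end1 < start1 strictly.
    events = []
    for idx, (s1, e1, s2, e2, sc) in enumerate(alns):
        events.append((s1, 0, idx))
        events.append((e1, 1, idx))
    events.sort()

    best = [0] * len(alns)
    prev = [-1] * len(alns)
    for _, kind, idx in events:
        s1, e1, s2, e2, sc = alns[idx]
        if kind == 0:
            # Best chain among alignments already ended with end2 < start2.
            p_score, p_idx = query(bisect_left(ys, s2))
            best[idx], prev[idx] = p_score + sc, p_idx
        else:
            update(bisect_left(ys, e2) + 1, (best[idx], idx))

    end = max(range(len(alns)), key=best.__getitem__)
    total, chain = best[end], []
    while end != -1:
        chain.append(end)
        end = prev[end]
    return total, chain[::-1]


alns = [(0, 10, 0, 10, 5), (12, 20, 12, 22, 4), (5, 15, 30, 40, 6), (25, 30, 25, 35, 3)]
print(chain_alignments(alns))
```

In this made-up input, alignment 2 has the highest individual score but conflicts with the others in the second sequence, so the best chain is 0, 1, 3 with total score 12.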
LAGAN: 2. Construct a rough global map (cont’d)
• This step is shown in Fig 1,2
LAGAN: 3. Constructing the global alignment
• Use the rough global map to limit the area of dynamic programming (Fig 3,4)
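To illustrate how anchors shrink the dynamic programming area, the sketch below forces the alignment through a fixed list of anchors and runs Needleman-Wunsch only on the rectangles between them. This is a simplification and an assumption: the exact limited area LAGAN uses (Fig 3,4) is not reproduced here, the scoring values (+1, -1, linear gap -1) are made up, and `nw_score` and `anchored_score` are hypothetical names.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with linear gap penalty.

    Fills len(a) * len(b) cells, i.e. quadratic time.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def anchored_score(a, b, anchors, k):
    """Score an alignment forced through ungapped length-k anchors.

    anchors: (i, j) start positions, increasing and non-overlapping in both
    sequences (assumed). Returns (score, number of DP cells filled).
    """
    total, cells = 0, 0
    pi, pj = 0, 0
    for i, j in anchors:
        # Full DP only on the rectangle between the previous anchor and this one.
        total += nw_score(a[pi:i], b[pj:j])
        cells += (i - pi) * (j - pj)
        # The anchor itself is aligned position by position, without gaps.
        total += sum(1 if a[i + t] == b[j + t] else -1 for t in range(k))
        pi, pj = i + k, j + k
    total += nw_score(a[pi:], b[pj:])  # region after the last anchor
    cells += (len(a) - pi) * (len(b) - pj)
    return total, cells


a, b = "ACGTTTGGCA", "ACGTCCCGGCA"
print(anchored_score(a, b, [(0, 0), (6, 7)], k=4))
print(nw_score(a, b), len(a) * len(b))
```

In this toy case the only rectangle left for dynamic programming is TT against CCC, 6 cells instead of the 110 cells of the full matrix. In general, fixing anchors can miss the true optimum if an anchor is wrong, which is the trade-off for the smaller search area.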