1 / 42

Multiple Sequence Alignments

Multiple Sequence Alignments. Reading. Durbin’s book: Chapter 6.1-6.4 Gusfield’s book: Chapter 14.1, 14.2, 14.5, 14.6.1 Papers: avid, lagan Optional: All of Gusfield chapter 14, Papers: tcoffee, slagan, scl. Definition. Given N sequences x 1 , x 2 ,…, x N :

garin
Download Presentation

Multiple Sequence Alignments

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Multiple Sequence Alignments

  2. Reading Durbin’s book: Chapter 6.1-6.4 Gusfield’s book: Chapter 14.1, 14.2, 14.5, 14.6.1 Papers: avid, lagan Optional: All of Gusfield chapter 14, Papers: tcoffee, slagan, scl Lecture 12, Tuesday May 13, 2003

  3. Definition Given N sequences x1, x2,…, xN: • Insert gaps (-) in each sequence xi, such that • All sequences have the same length L • Score of the global map is maximum The sum-of-pairs score of an alignment is the sum of the scores of all induced pairwise alignments S(m) = k<l s(mk, ml) s(mk, ml): score of induced alignment (k,l) Lecture 12, Tuesday May 13, 2003

  4. Consensus -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC--- TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC CAG-CTATCAC--GACCGC----TCGATTTGCTCGAC CAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC • Find optimal consensus string m* to maximize S(m) = i s(m*, mi) s(mk, ml): score of pairwise alignment (k,l) Lecture 12, Tuesday May 13, 2003

  5. Multiple Sequence Alignments Algorithms

  6. Multidimensional Dynamic Programming • Example: in 3D (three sequences): • 7 neighbors/cell F(i,j,k) = max{ F(i-1,j-1,k-1)+S(xi, xj, xk), F(i-1,j-1,k )+S(xi, xj, - ), F(i-1,j ,k-1)+S(xi, -, xk), F(i-1,j ,k )+S(xi, -, - ), F(i ,j-1,k-1)+S( -, xj, xk), F(i ,j-1,k )+S( -, xj, xk), F(i ,j ,k-1)+S( -, -, xk) } Lecture 12, Tuesday May 13, 2003

  7. Progressive Alignment • Multiple Alignment is NP-complete • Most used heuristic: Progressive Alignment Algorithm: • Align two of the sequences xi, xj • Fix that alignment • Align a third sequence xk to the alignment xi,xj • Repeat until all sequences are aligned Running Time: O( N L2 ) Lecture 12, Tuesday May 13, 2003

  8. Progressive Alignment x • When evolutionary tree is known: • Align closest first, in the order of the tree Example: Order of alignments: 1. (x,y) 2. (z,w) 3. (xy, zw) y z w Lecture 12, Tuesday May 13, 2003

  9. Progressive Alignment: CLUSTALW CLUSTALW: most popular multiple protein alignment Algorithm: • Find all dij: alignment dist (xi, xj) • Construct a tree (Neighbor-joining hierarchical clustering) • Align nodes in order of decreasing similarity + a large number of heuristics Lecture 12, Tuesday May 13, 2003

  10. CLUSTALW & the CINEMA viewer Lecture 12, Tuesday May 13, 2003

  11. Iterative Refinement One problem of progressive alignment: • Initial alignments are “frozen” even when new evidence comes Example: x: GAAGTT y: GAC-TT z: GAACTG w: GTACTG Frozen! Now clear correct y = GA-CTT Lecture 12, Tuesday May 13, 2003

  12. Iterative Refinement Algorithm (Barton-Stenberg): • Align most similar xi, xj • Align xk most similar to (xixj) • Repeat 2 until (x1…xN) are aligned • For j = 1 to N, Remove xj, and realign to x1…xj-1xj+1…xN • Repeat 4 until convergence Note: Guaranteed to converge Lecture 12, Tuesday May 13, 2003

  13. 2. Iterative Refinement (cont’d) allow y to vary x,z fixed projection For each sequence y • Remove y • Realign y (while rest fixed) z x y Lecture 12, Tuesday May 13, 2003

  14. Iterative Refinement Example: align (x,y), (z,w), (xy, zw): x: GAAGTTA y: GAC-TTA z: GAACTGA w: GTACTGA After realigning y: x: GAAGTTA y: G-ACTTA + 3 matches z: GAACTGA w: GTACTGA Lecture 12, Tuesday May 13, 2003

  15. Iterative Refinement Example not handled well: x: GAAGTTA y1: GAC-TTA y2: GAC-TTA y3: GAC-TTA z: GAACTGA w: GTACTGA • Realigning any single yi changes nothing Lecture 12, Tuesday May 13, 2003

  16. Restricted MDP • Here is a final way to improve a multiple alignment: • Construct progressive multiple alignment m • Run MDP, restricted to radius R from m Running Time: O(2N RN-1 L) Lecture 12, Tuesday May 13, 2003

  17. 1. Restricted MDP • Run MDP, restricted to radius R from m z x y Running Time: O(2N RN-1 L) Lecture 12, Tuesday May 13, 2003

  18. Restricted MDP (2) x: GAAGTTA y1: GAC-TTA y2: GAC-TTA y3: GAC-TTA z: GAACTGA w: GTACTGA • Within radius 1 of the optimal  Restricted MDP will fix it. Lecture 12, Tuesday May 13, 2003

  19. MLAGAN: Multiple Alignment Multi-Anchoring Progressive Alignment Iterative Refinement

  20. 1. Multi-anchoring To anchor the (X/Y), and (Z) alignments: X Z Y Z X/Y Z Lecture 12, Tuesday May 13, 2003

  21. 2. Progressive Alignment Human Given N sequences, phylogenetic tree Align pairwise, in order of the tree (LAGAN) Baboon Mouse Rat Lecture 12, Tuesday May 13, 2003

  22. 3. Iterative Refinement x,z fixed projection For each sequence y • Remove y • Anchor “good” spots • Realign y using LAGAN z x y Lecture 12, Tuesday May 13, 2003

  23. Cystic Fibrosis (CFTR), 12 speciesThe “zoo” project • Human sequence length: 1.8 Mb • Total genomic sequence: 13 Mb Chicken Zebrafish Cow Pig Chimp Human Dog Rat Fugu Cat Baboon Mouse Lecture 12, Tuesday May 13, 2003

  24. Performance in the CFTR region Lecture 12, Tuesday May 13, 2003

  25. Alignment & Rearrangements

  26. Evolution at the DNA level Deletion Mutation …ACGGTGCAGTTACCA… SEQUENCE EDITS …AC----CAGTCCACCA… REARRANGEMENTS Inversion Translocation Duplication Lecture 12, Tuesday May 13, 2003

  27. Local & Global Alignment AGTGCCCTGGAACCCTGACGGTGGGTCACAAAACTTCTGGA AGTGACCTGGGAAGACCCTGAACCCTGGGTCACAAAACTC Global Local AGTGCCCTGGAACCCTGACGGTGGGTCACAAAACTTCTGGA AGTGACCTGGGAAGACCCTGAACCCTGGGTCACAAAACTC Lecture 12, Tuesday May 13, 2003

  28. Glocal Alignment Problem Find least cost transformation of one sequence into another using new operations AGTGCCCTGGAACCCTGACGGTGGGTCACAAAACTTCTGGA • Sequence edits • Inversions • Translocations • Duplications • Combinations of above AGTGACCTGGGAAGACCCTGAACCCTGGGTCACAAAACTC Lecture 12, Tuesday May 13, 2003

  29. S-LAGAN: Find Local Alignments • Find Local Alignments • Build Rough Homology Map • Globally Align Consistent Parts Lecture 12, Tuesday May 13, 2003

  30. S-LAGAN: Build Homology Map • Find Local Alignments • Build Rough Homology Map • Globally Align Consistent Parts Lecture 12, Tuesday May 13, 2003

  31. Building the Homology Map b a d c Chain (using Eppstein Galil); each alignment gets a score which is MAX over 4 possible chains. Penalties are affine (event and distance components) • Penalties: • regular • translocation c) inversion d) inverted translocation Lecture 12, Tuesday May 13, 2003

  32. S-LAGAN: Build Homology Map • Find Local Alignments • Build Rough Homology Map • Globally Align Consistent Parts Lecture 12, Tuesday May 13, 2003

  33. S-LAGAN: Global Alignment • Find Local Alignments • Build Rough Homology Map • Globally Align Consistent Fragments Lecture 12, Tuesday May 13, 2003

  34. S-LAGAN alignments Local Glocal Lecture 12, Tuesday May 13, 2003

  35. S-LAGAN alignments Hum/Mus Hum/Rat Lecture 12, Tuesday May 13, 2003

  36. S-LAGAN alignments (Chr 20) • Human Chr 20 v. homologous Mouse Chr 2. • 270 Segments of conserved synteny • 70 Inversions Lecture 12, Tuesday May 13, 2003

  37. Some more examples Hum/Rat Hum/Mus Lecture 12, Tuesday May 13, 2003

  38. Some more examples Hum/Rat Hum/Mus Lecture 12, Tuesday May 13, 2003

  39. Some more examples Hum/Rat Hum/Mus Lecture 12, Tuesday May 13, 2003

  40. Some more examples Hum/Rat Hum/Mus Lecture 12, Tuesday May 13, 2003

  41. Some more examples Hum/Mus Hum/Rat Lecture 12, Tuesday May 13, 2003

  42. Some more examples Hum/Rat Hum/Mus Lecture 12, Tuesday May 13, 2003

More Related