
Input Sensitive Algorithms for Multiple Sequence Alignment

Pankaj Agarwal (Duke), Yonatan Bilu (Hebrew University), Rachel Kolodny (Stanford). Multiple sequence alignment quantifies similarities among DNA or protein sequences and detects highly conserved motifs and remote homologues.


Presentation Transcript


  1. Input Sensitive Algorithms for Multiple Sequence Alignment Pankaj Agarwal @Duke Yonatan Bilu @Hebrew University Rachel Kolodny @Stanford

  2. Multiple Sequence Alignment • Quantifies similarities among [DNA, Protein] sequences • Detects highly conserved motifs & remote homologues • Evolutionary insights • Transfer of annotation • Representation of protein families

  3. Multiple Sequence Alignment • Input: k sequences • Output: optimal alignment • Gap-infused sequences (-), one per row • Restrictions on the column pattern
     Input: (1) GARFIELD MET NERMAL (2) ODIE AND HIS ASSOCIATE NERMAL MET GARFIELD AND HIS ASSOCIATE (3) GARFIELD AND HIS ASSOCIATE NERMAL
     Alignment:
     ----GARFIELD MET----------------- NERMAL ------------------------------
     ODIE------------AND HIS ASSOCIATE NERMAL MET GARFIELD AND HIS ASSOCIATE
     ----GARFIELD ---AND HIS ASSOCIATE NERMAL ------------------------------

  4. Multiple Sequence Alignment • Input: k sequences • Output: optimal alignment • Minimal width • Score function: summed over columns, e.g. sum of pairs
     Input: (1) GARFIELD MET NERMAL (2) ODIE AND HIS ASSOCIATE NERMAL MET GARFIELD AND HIS ASSOCIATE (3) GARFIELD AND HIS ASSOCIATE NERMAL
     Alignment:
     ----GARFIELD MET----------------- NERMAL ------------------------------
     ODIE------------AND HIS ASSOCIATE NERMAL MET GARFIELD AND HIS ASSOCIATE
     ----GARFIELD ---AND HIS ASSOCIATE NERMAL ------------------------------
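A sum-of-pairs score can be sketched as follows. This is a minimal illustration, not the scoring used in the talk: the function names and the match, mismatch, and gap values are assumptions chosen only to make the example concrete.

```python
from itertools import combinations

def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols (values are hypothetical)."""
    if a == "-" and b == "-":
        return 0  # two gaps facing each other contribute nothing
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch

def sum_of_pairs(rows):
    """Sum, over all columns, the scores of every pair of rows in that column."""
    width = len(rows[0])
    assert all(len(r) == width for r in rows), "rows must have equal width"
    total = 0
    for column in zip(*rows):
        total += sum(pair_score(a, b) for a, b in combinations(column, 2))
    return total

rows = [
    "GARFIELDMET---------------NERMAL",
    "GARFIELD---ANDHISASSOCIATENERMAL",
]
print(sum_of_pairs(rows))
```

Because the score is a sum over columns, it can be accumulated one column at a time, which is what makes a dynamic-programming formulation possible.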

  5. DP solves MSA • Build a score matrix • k-dimensional hypercube • An alignment is a path • Time: (num of nodes) × (num neighbors per node)
     Example: GARFIELDMETNERMAL and GARFIELDANDHISASSOCIATENERMAL align as
     GARFIELDMET---------------NERMAL
     GARFIELD---ANDHISASSOCIATENERMAL
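The textbook dynamic program over the k-dimensional grid can be sketched as below. It is a generic illustration rather than the authors' code: the scoring values and function names are assumptions. For k sequences of length n the grid has on the order of n^k nodes, and each node has up to 2^k - 1 predecessor directions, so the sketch is only practical for very small inputs.

```python
from functools import lru_cache
from itertools import combinations, product

def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols (values are hypothetical)."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch

def msa_score(seqs):
    """Best sum-of-pairs score over all alignments of seqs (exhaustive DP)."""
    k = len(seqs)
    # Every non-zero 0/1 vector is one direction: 2^k - 1 per node.
    moves = [m for m in product((0, 1), repeat=k) if any(m)]

    @lru_cache(maxsize=None)
    def best(pos):
        if not any(pos):
            return 0  # origin of the grid: nothing aligned yet
        result = float("-inf")
        for m in moves:
            prev = tuple(p - d for p, d in zip(pos, m))
            if min(prev) < 0:
                continue  # this direction would step outside the grid
            # Sequences that advance emit a symbol, the others emit a gap.
            column = [seqs[i][prev[i]] if m[i] else "-" for i in range(k)]
            score = sum(pair_score(a, b) for a, b in combinations(column, 2))
            result = max(result, best(prev) + score)
        return result

    return best(tuple(len(s) for s in seqs))

print(msa_score(["ACG", "AG", "ACG"]))
```

Each path from the origin to the far corner of the grid corresponds to one alignment, and the memoized recursion visits each node once.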

  6. Previous Work

  7. Pairwise Restriction • The “true” information: the aligned subsequences and their relative positioning • Study pairwise alignment first and restrict the alignment • Time: • Focus efforts on “true” tradeoffs (Example sequences: GARFIELDANDHISASSOCIATENERMAL and GARFIELDMETNERMAL)

  8. Segments Matching Graph (SMG) • Sequences are partitioned into segments • Segments are the nodes • Edges: self edges, and edges between two equal-length segments of different sequences; edges have scores • Defines allowed paths and their score
     Example segments: ODIE | ANDHISASSOCIATE | NERMAL | MET | GARFIELD | ANDHISASSOCIATE; GARFIELD | ANDHISASSOCIATE | NERMAL; GARFIELD | MET | NERMAL
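One possible in-memory representation of such a graph is sketched below. Everything about it is an assumption made for illustration: the `Segment` class, the `build_smg` helper, the rule that an edge joins segments with identical text, and the choice of segment length as edge score and 0 as self-edge score. The talk only states that edges join equal-length segments of different sequences and carry scores.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Segment:
    seq: int    # index of the sequence this segment belongs to
    start: int  # offset of the segment within that sequence
    text: str   # the segment's characters

def build_smg(segmented):
    """Build nodes and scored edges from pre-segmented sequences."""
    nodes = []
    for i, parts in enumerate(segmented):
        offset = 0
        for part in parts:
            nodes.append(Segment(i, offset, part))
            offset += len(part)
    # Self edges, with a hypothetical score of 0.
    edges = {(n, n): 0 for n in nodes}
    # Cross edges: here only between identical segments of different
    # sequences, scored by length (an assumption for this sketch).
    for a, b in combinations(nodes, 2):
        if a.seq != b.seq and a.text == b.text:
            edges[(a, b)] = len(a.text)
    return nodes, edges

segmented = [
    ["GARFIELD", "MET", "NERMAL"],
    ["ODIE", "ANDHISASSOCIATE", "NERMAL", "MET", "GARFIELD", "ANDHISASSOCIATE"],
    ["GARFIELD", "ANDHISASSOCIATE", "NERMAL"],
]
nodes, edges = build_smg(segmented)
cross = [e for e in edges if e[0] != e[1]]
print(len(nodes), len(cross))
```

Storing the start offset in each node keeps the two ANDHISASSOCIATE segments of the second sequence distinct, so each can carry its own edges.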

  9. (Figure) Sequences GARFIELDANDHISASSOCIATENERMAL and ODIEANDHISASSOCIATENERMALMETGARFIELDANDHISASSOCIATE, partitioned into segments GARFIELD | ANDHISASSOCIATE | NERMAL and ODIE | ANDHISASSOCIATE | NERMAL | MET | GARFIELD | ANDHISASSOCIATE

  10. (Figure, continued) Same sequences and segments as slide 9

  11. Extreme paths: (Figure) Same sequences and segments as slide 9

  12. Extreme paths: (Figure, continued) Same sequences and segments as slide 9

  13. Lemma: there is an optimal path that is extreme (Figure labels: All paths, Extreme paths, Optimal paths)

  14. Improved algorithm: DP on the segments (Example sequences: GARFIELDANDHISASSOCIATENERMAL and ODIEANDHISASSOCIATENERMALMETGARFIELDANDHISASSOCIATE)

  15. Transitive PR-MSA • DNA sequences • No scores in SMG, only matches

  16. Maximal Directions • Transitivity implies that for any point in the hypercube, the directions are partitioned into cliques • Defines maximal directions • The shortest path can be taken over maximal directions. • Pushes down the work per node
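One way to read the clique structure is sketched below for the match-only DNA setting. This is an interpretation, not the authors' algorithm: it assumes that two sequences "match" at a grid point exactly when their next characters are equal, so that matching is an equivalence relation and each class forms a clique. The function name `maximal_directions` is hypothetical.

```python
from collections import defaultdict

def maximal_directions(seqs, pos):
    """Group sequences by their next character at grid point `pos`.

    Each group is a clique under the equality relation; advancing the whole
    group at once gives one maximal direction (a 0/1 vector over sequences).
    """
    groups = defaultdict(list)
    for i, (s, p) in enumerate(zip(seqs, pos)):
        if p < len(s):  # exhausted sequences cannot advance
            groups[s[p]].append(i)
    k = len(seqs)
    return [
        tuple(1 if i in members else 0 for i in range(k))
        for members in groups.values()
    ]

seqs = ["ACG", "AGG", "CCG"]
print(maximal_directions(seqs, (0, 0, 0)))
```

At the origin of this example the next characters are A, A, and C, so only two maximal directions are considered instead of all 2^3 - 1 = 7 directions, which is the sense in which the work per node goes down.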

  17. Obvious Directions (Figure with panels "Obvious:" and "Non-Obvious:" over the segmented sequences ODIE | ANDHISASSOCIATE | NERMAL | MET | GARFIELD | ANDHISASSOCIATE, GARFIELD | ANDHISASSOCIATE | NERMAL, and GARFIELD | MET | NERMAL)

  18. Obvious Directions • Lemma: an optimal path is found even when making obvious decisions • Not all nodes are relevant • Work for every node increases to

  19. Special Vertices • Straight junction • Corner junction • (0,0)

  20. Thank you
