Phylogenetic Models for Motif Detection Pradipta Ray ( joint work advised by Eric Xing ) Student Research Symposium, LTI, CMU, 2005
What • A motif is a repeating, roughly conserved pattern within a biological sequence • “Pure” pattern-matching techniques produce a large number of false matches: low precision • Best modelled as a semi-supervised problem
The Problem Definition • Motif formalism • Positional Weight Matrix • Sequence of 4-nomials • Supervised : Simple MLE for multinomial
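The supervised case on this slide can be sketched in a few lines: given aligned, known motif instances, the maximum-likelihood estimate of each 4-nomial is the column-wise base frequency. This is a minimal sketch, not the authors' implementation; the function name `estimate_pwm`, the optional `pseudocount`, and the example sites are assumptions made for illustration.

```python
from collections import Counter

ALPHABET = "ACGT"

def estimate_pwm(sites, pseudocount=0.0):
    """Column-wise MLE of a positional weight matrix from aligned sites.

    With pseudocount=0 this is the plain multinomial MLE (relative
    frequencies); a positive pseudocount smooths away zero entries.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm

# Hypothetical aligned motif instances, for illustration only.
sites = ["AGGAA", "AGGTA", "ACGAA", "AGGAA"]
pwm = estimate_pwm(sites)
```

Each element of `pwm` is one 4-nomial, so the motif is literally a sequence of multinomials, one per position.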
Traditional Approaches • Unsupervised: • Gibbs Sampling • Expectation Maximization • De novo detection
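To make the Expectation Maximization bullet concrete, here is a minimal sketch of EM for de novo motif detection. It assumes exactly one motif occurrence per sequence, a fixed uniform background, and a seed word for initialization; none of these choices come from the slides, and the function names and example sequences are hypothetical.

```python
import math

ALPHABET = "ACGT"

def init_pwm(seed, strong=0.7):
    """Start from a seed word, smoothed so that no entry is zero."""
    rest = (1.0 - strong) / 3.0
    return [{b: (strong if b == c else rest) for b in ALPHABET} for c in seed]

def e_step(seqs, pwm, background):
    """Posterior over the motif start position in each sequence."""
    width = len(pwm)
    posteriors = []
    for seq in seqs:
        weights = []
        for start in range(len(seq) - width + 1):
            # Motif-vs-background likelihood ratio of this window.
            log_ratio = sum(
                math.log(pwm[j][seq[start + j]] / background[seq[start + j]])
                for j in range(width)
            )
            weights.append(math.exp(log_ratio))
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors

def m_step(seqs, posteriors, width, pseudocount=1.0):
    """Re-estimate the PWM from posterior-weighted base counts."""
    pwm = [{b: pseudocount for b in ALPHABET} for _ in range(width)]
    for seq, post in zip(seqs, posteriors):
        for start, p in enumerate(post):
            for j in range(width):
                pwm[j][seq[start + j]] += p
    for col in pwm:
        total = sum(col.values())
        for b in ALPHABET:
            col[b] /= total
    return pwm

def run_em(seqs, seed, iterations=10):
    background = {b: 0.25 for b in ALPHABET}  # assumed uniform background
    pwm = init_pwm(seed)
    posteriors = []
    for _ in range(iterations):
        posteriors = e_step(seqs, pwm, background)
        pwm = m_step(seqs, posteriors, len(seed))
    return pwm, posteriors

# Hypothetical sequences, each containing one planted copy of AGGAA.
seqs = ["TTAGGAACT", "CAGGAATTC", "GTCTAGGAA"]
pwm, posteriors = run_em(seqs, seed="AGGAA")
best_starts = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
```

Gibbs sampling follows the same structure but samples one start position per sequence from these posteriors instead of averaging over all of them.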
Knowledge is power • Motifs are conserved sites • Given aligned sequences, we may choose suitable regions
A look at motif evolution • Work by Kreitman et al. • Functional evolution : granularity of the evolving unit is larger than the nucleotide
Phylogenetic Trees ( T , L , P , M ) • T = ( V, E ) : Tree topology • L : E → ℝ : Edge lengths • P : Multinomial parameters for initial draw • M : CTMM parameters
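As an illustration of how the edge lengths L and the substitution parameters M interact, the sketch below computes branch transition probabilities. It reads CTMM as a continuous-time Markov model and swaps in the Jukes-Cantor (JC69) model, the simplest such model, purely as an example; the slides do not say which substitution model is used, and the function name is hypothetical.

```python
import math

ALPHABET = "ACGT"

def jc69_transition(branch_length):
    """JC69 transition probabilities along a branch.

    branch_length is measured in expected substitutions per site.
    Returns P[a][b] = probability that base a becomes base b.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    same = 0.25 + 0.75 * decay   # probability the base is unchanged
    diff = 0.25 - 0.25 * decay   # probability of each specific change
    return {a: {b: (same if a == b else diff) for b in ALPHABET}
            for a in ALPHABET}

# Hypothetical branch lengths, for illustration only.
short_branch = jc69_transition(0.1)
long_branch = jc69_transition(50.0)
```

On a short branch the matrix stays close to the identity, while on a very long branch every row approaches the uniform distribution, which is why edge lengths control how strongly a child node resembles its parent.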
Phylogenetic HMM [Figure: phylogenetic HMM diagram; surviving labels are C1, P1 and the sequence A G G A A]
The mixture of trees model
[Figure: diagram over the sequence A G G A A]
Learning • Learning the evolutionary trees • Learning the annotation trees
Inferencing the Tree [Figure: tree diagram with four nodes marked “?”]
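If the question marks on this slide stand for unobserved nucleotides at internal nodes, the standard way to sum them out is Felsenstein's pruning algorithm. The sketch below computes the likelihood of one alignment column under that reading. It is an assumption about what the slide depicts, not the authors' method; the JC69 substitution model, the tree encoding, and the species names and branch lengths are all hypothetical.

```python
import math

ALPHABET = "ACGT"

def jc69_transition(branch_length):
    """JC69 transition probabilities (assumed substitution model)."""
    decay = math.exp(-4.0 * branch_length / 3.0)
    same, diff = 0.25 + 0.75 * decay, 0.25 - 0.25 * decay
    return {a: {b: (same if a == b else diff) for b in ALPHABET}
            for a in ALPHABET}

def partial_likelihoods(node, column):
    """P(observed leaves below node | node has base a), for every a.

    A leaf is a species name; an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        observed = column[node]
        return {b: 1.0 if b == observed else 0.0 for b in ALPHABET}
    result = {a: 1.0 for a in ALPHABET}
    for child, length in node:
        child_like = partial_likelihoods(child, column)
        P = jc69_transition(length)
        for a in ALPHABET:
            # Sum over the unobserved base at the child end of the branch.
            result[a] *= sum(P[a][b] * child_like[b] for b in ALPHABET)
    return result

def column_likelihood(tree, column, root_prior):
    """Likelihood of one alignment column, ancestors marginalized out."""
    like = partial_likelihoods(tree, column)
    return sum(root_prior[a] * like[a] for a in ALPHABET)

# Hypothetical three-species tree and alignment column.
tree = [([("species_a", 0.1), ("species_b", 0.1)], 0.2), ("species_c", 0.3)]
root_prior = {b: 0.25 for b in ALPHABET}
column = {"species_a": "A", "species_b": "A", "species_c": "G"}
example_likelihood = column_likelihood(tree, column, root_prior)
```

The recursion visits each node once, so the cost per column is linear in the number of nodes rather than exponential in the number of unknown ancestors.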
Usage • Framework for answering questions about evolution of macro-entities • A specific and highly significant case would be that of motif finding
Conclusion • Currently being implemented • To be tested with data from Drosophila species