Fast Alignments of Metabolic Networks
Qiong Cheng, Georgia State University
Joint work with Piotr Berman (Penn State), Robert Harrison (GSU), Alexander Zelikovsky (GSU)
Metabolic pathway & pathways model
• Metabolic pathway: a portion of the pentose phosphate pathway
• Metabolic pathways model
[Figure: the pathway and its model, with enzyme nodes labeled by EC numbers 2.7.1.13, 1.1.1.34, 1.1.1.49, 3.1.1.31, 1.1.1.44]
Alignments of metabolic pathways
• Pattern P: query pathway
• Text T: pathway in database
• Enzyme similarity and pathway topology together represent the similarity of pathway functionality.
• Enzyme similarity
• Pathway topology similarity
[Figure: aligned pattern and text vertices, labeled match, match, match, mismatch/substitute]
Types of Pathway Alignments
Alignment f maps the Pattern to the Text.
Text:
• embedding: cost Σ_{v ∈ V_P} Δ(v, f_v(v))
• + enzyme insertions
• = edge subdividing
• λ-fine per insertion: cost λ Σ_{e ∈ E_P} (|f_e(e)| - 1)
• Pinter et al 2005
• + gene duplication and function sharing
• = vertex collapsing
Pattern:
• + enzyme deletion
• = bypass deletion: send vertex to b
• Kelly et al 2005
• + subpath deletion
• = strong deletion: send vertex to d
• Yang et al 2007
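The two cost terms on this slide (substitution cost per mapped vertex, plus a λ-fine per inserted enzyme) can be evaluated for a given alignment as in the minimal sketch below. The function name `alignment_cost`, the representation of each image path as a list of text vertices, and the toy Δ are assumptions made for illustration; they are not taken from the authors' software.

```python
def alignment_cost(vertex_map, edge_paths, delta, lam):
    """Cost of an embedding with edge subdivisions.

    vertex_map: pattern vertex -> text vertex (the map f_v)
    edge_paths: pattern edge (u, v) -> list of text vertices on its image path,
                endpoints included (the map f_e); this layout is an assumption
    delta:      function (pattern vertex, text vertex) -> dissimilarity
    lam:        fine charged per inserted text vertex
    """
    substitution = sum(delta(v, image) for v, image in vertex_map.items())
    # A path listed as k vertices has k - 1 edges, hence k - 2 inserted vertices,
    # which equals |f_e(e)| - 1 when |f_e(e)| counts the edges of the path.
    insertions = sum(len(path) - 2 for path in edge_paths.values())
    return substitution + lam * insertions


# Hypothetical toy alignment: pattern a -> b -> c, text x -> y -> z -> w.
vertex_map = {"a": "x", "b": "y", "c": "w"}
edge_paths = {("a", "b"): ["x", "y"], ("b", "c"): ["y", "z", "w"]}
toy_delta = lambda p, t: 1.0 if (p, t) == ("c", "w") else 0.0
print(alignment_cost(vertex_map, edge_paths, toy_delta, lam=0.5))
```

Here one mismatch (cost 1.0) and one inserted vertex z (fine 0.5) are charged.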
Optimal Alignment Problem Formulation
• Given:
• a metabolic pathway P = <V_P, E_P> (Pattern) and
• a metabolic network T = <V_T, E_T> (Text)
• Find a minimum cost alignment f : P → T
• f_v : every vertex in V_P is mapped to a vertex in V_T ∪ {b, d};
• f_l : every path l_P across vertices in f_v^{-1}(V_T) is mapped to a path l_T
• Minimize cost(f) = Σ_{u ∈ V_P} Δ(u, f_v(u)) + λ Σ_{l ∈ l_P} (|f_l(l)| - 1)
• DP solution when the pattern is a multisource tree
• Runtime for the DP solution with Fibonacci heaps:
• O(|V_P| (|E_T| + |V_T| log |V_T|)).
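To make the DP idea concrete, here is a simplified sketch for a pattern that is a single rooted out-tree. It is not the authors' algorithm: it ignores the deletion targets b and d, handles one source only, and uses plain breadth-first search instead of Fibonacci heaps, so it does not reach the runtime quoted above. All names (`bfs_distances`, `align_tree`) and the toy data are assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Number of edges on a shortest directed path from source to each reachable vertex."""
    dist, seen, queue = {}, {source}, deque([(source, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                dist[nxt] = d + 1
                queue.append((nxt, d + 1))
    return dist


def align_tree(children, root, text_adj, delta, lam):
    """Return (min cost, vertex map) for mapping a rooted out-tree pattern into the text."""
    text_nodes = sorted(set(text_adj) | {t for nbrs in text_adj.values() for t in nbrs})
    dist = {x: bfs_distances(text_adj, x) for x in text_nodes}
    best = {}    # best[(u, x)]: min cost of the subtree of u when u is mapped to x
    choice = {}  # choice[(u, x, c)]: image picked for child c in that optimum

    def solve(u):
        for c in children.get(u, ()):
            solve(c)
        for x in text_nodes:
            total = delta(u, x)
            for c in children.get(u, ()):
                # Edge (u, c) maps to a shortest x -> y path; d - 1 vertices are inserted.
                options = [(lam * (d - 1) + best[(c, y)], y) for y, d in dist[x].items()]
                if not options:
                    total = float("inf")
                    break
                cost, y = min(options)
                choice[(u, x, c)] = y
                total += cost
            best[(u, x)] = total

    solve(root)
    cost, image = min((best[(root, x)], x) for x in text_nodes)
    if cost == float("inf"):
        return cost, {}
    mapping, stack = {}, [(root, image)]
    while stack:  # walk back through the stored choices to recover f_v
        u, x = stack.pop()
        mapping[u] = x
        stack.extend((c, choice[(u, x, c)]) for c in children.get(u, ()))
    return cost, mapping


# Hypothetical example: pattern a -> b -> c, text x -> y -> z -> w, enzymes as labels.
pattern_label = {"a": "E1", "b": "E2", "c": "E3"}
text_label = {"x": "E1", "y": "E2", "z": "E9", "w": "E3"}
delta = lambda u, x: 0.0 if pattern_label[u] == text_label[x] else 1.0
print(align_tree({"a": ["b"], "b": ["c"]}, "a",
                 {"x": ["y"], "y": ["z"], "z": ["w"]}, delta, lam=0.5))
```

In this toy case the cheapest alignment skips over z, paying one insertion fine rather than a mismatch.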
Handling cycles
[Figure: a pattern with a cycle on vertices a, b, c, d, e and its acyclic reduction]
• DP does not work when the pattern has cycles
• "Fix" the images of some pattern vertices and reduce to the acyclic case
• Find a minimum feedback vertex set F(P):
• V_P - F(P) is acyclic
• NP-complete, but easy to approximate
• Runtime is increased by a factor of O(|V_T|^{|F(P)|})
• Total runtime: O(|V_T|^{|F(P)|} |V_P| (|E_T| + |V_T| log |V_T|))
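The reduction above can be illustrated with a short sketch: check that removing a candidate feedback set leaves the pattern acyclic, then enumerate every way of fixing the images of the feedback vertices. The enumeration has |V_T|^|F(P)| entries, which is where the extra runtime factor comes from. The helper names and the toy graph are assumptions, and the sketch does not compute a minimum feedback vertex set itself.

```python
from itertools import product


def is_acyclic_without(adj, removed):
    """Kahn's algorithm on the directed graph with the vertices in `removed` deleted."""
    nodes = (set(adj) | {w for nbrs in adj.values() for w in nbrs}) - set(removed)
    indegree = {v: 0 for v in nodes}
    for v in nodes:
        for w in adj.get(v, ()):
            if w in nodes:
                indegree[w] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while ready:
        v = ready.pop()
        visited += 1
        for w in adj.get(v, ()):
            if w in nodes:
                indegree[w] -= 1
                if indegree[w] == 0:
                    ready.append(w)
    # Every remaining vertex is visited exactly when no cycle survives.
    return visited == len(nodes)


def fixed_images(feedback_set, text_nodes):
    """Yield each assignment of text vertices to the feedback vertices."""
    feedback = sorted(feedback_set)
    for images in product(sorted(text_nodes), repeat=len(feedback)):
        yield dict(zip(feedback, images))


# Hypothetical pattern with the cycle a -> b -> c -> a and a tail c -> d.
pattern = {"a": ["b"], "b": ["c"], "c": ["a", "d"]}
print(is_acyclic_without(pattern, set()))   # cycle still present
print(is_acyclic_without(pattern, {"a"}))   # removing a breaks the cycle
print(sum(1 for _ in fixed_images({"a"}, ["x", "y", "z", "w"])))
```

Each yielded assignment would then be handed to the acyclic DP, and the cheapest result kept.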
Our software • http://alla.cs.gsu.edu:8080/MinePW/pages/gmapping/GMMain.html
Comparison of different methods
Alignment of tree pathways from different species with optimal homomorphism (HM) and optimal network alignment (NA). The average numbers of mismatches and gaps are reported for the common statistically significant matched pathways.
Significant deletion
• Aspartate superpathway in E. coli
• Lysine biosynthesis in T. thermophilus
• Mapping result: unmatched vertices are deleted.
Pathway holes: find and fill
• Hole = missing enzyme in a pathway description (in the database)
• Finding holes is a difficult task: comparison can help
Mapping of the formaldehyde oxidation V pathway in B. subtilis to the formylTHF biosynthesis pathway in E. coli
• Check whether there is such an enzyme in the pattern
• Find the closest protein in the same group
• If identity is high (> 80%), we expect a good filling
• Align to the previous and next enzymes: their functions may be taken over
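The selection step in the list above (closest protein, then the 80% identity check) might look like the sketch below. The function name, the input format, and the protein names are hypothetical; how percent identity is obtained is not stated on the slide and is left outside the sketch.

```python
def propose_hole_filling(candidates, threshold=80.0):
    """Pick the closest candidate protein for a pathway hole.

    candidates: dict mapping a protein name to its percent identity with the
                enzyme expected at the hole (assumed to be precomputed)
    Returns (name, identity, confident) or None when there are no candidates.
    """
    if not candidates:
        return None
    name = max(candidates, key=candidates.get)
    identity = candidates[name]
    # The slide expects a good filling when identity exceeds 80%.
    return name, identity, identity > threshold


# Hypothetical candidates from the same protein group.
print(propose_hole_filling({"protA": 62.5, "protB": 91.0, "protC": 78.0}))
```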
Resolving Ambiguity
Mapping of glutamate degradation VII pathways from B. subtilis to T. thermophilus (p<0.01). The shaded node reflects enzyme homology.
Future work
• Improve the method of filling pathway holes
• Discover critical metabolic elements/modules/motifs
• Describe the evolution of metabolic pathways
• Integrate with genome databases
Acknowledgments
• GSU Molecular Basis of Disease (MBD) fellowship
• Peter Karp, Oleg Rokhlenko, Florian Rasche, Amit Sabnis, Dipendra Kaur, Kelly Westbrooks, Irina Astrovskaya, Stefan Gremalschi, Jingwu He, Dumitru Brinza, Weidong Mao, Nisar Hudewale