Consistent alignment of metabolic pathways without abstraction
Ferhat Ay, Tamer Kahveci & Valerie de-Crecy Lagard
www.cise.ufl.edu/~fay
Metabolic Pathways
What and Why?
Metabolic Pathway Alignment: finding a mapping of the entities of the pathways
Applications
• Drug Target Identification
• Metabolic Reconstruction
• Phylogeny Prediction
[Figure: two example pathways built from compounds (C1-C5), reactions (R1, R2) and enzymes (E1, E2)]
Challenges
• Existing Algorithms
  • Heymans et al. (2003)
  • Clemente et al. (2005)
  • Pinter et al. (2005)
  • Singh et al. (2007)
  • ….
• Pathway Alignment is hard!
  - Even after abstraction, the Metabolic Pathway Alignment problem is NP-complete!
• Abstraction is a problem!
  - Where are the compounds?
  - E1 C1 E2 or E1 C2 E2?
[Figure: two pathways shown before and after abstraction, followed by graph alignment]
Outline
• Graph Model of Pathways
• Consistency of an Alignment
• Homological & Topological Similarities
• Eigenvalue Problem
• Similarity Score
• Experimental Results
Non-Redundant Graph Model
[Figure: example pathway graph with compound nodes (Pyruv., ThPP, 2-ThP, Lip-E, S-Ac, Di-hy, A-CoA), reaction nodes (R0014, R3270, R7618, R2569) and enzyme nodes (1.2.4.1, 1.8.1.4, 2.3.1.12)]
Consistency
1- Align only the entities of the same type (compatible)
2- The overall mapping should be 1-1
[Figure: example mappings between reactions (R1-R3) and compounds (C1, C2) illustrating the two rules]
Consistency
3- Align two entities ui, vi only if there exists an aligned entity pair uj, vj such that uj and vj are on the reachability paths of ui and vi respectively.
[Figure: two pathways (compounds C1-C5, reactions R1, R2) with legend: Aligned Entities, Backward Reachability Path, Forward Reachability Path]
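The three consistency rules can be checked mechanically. The sketch below is only an illustration under stated assumptions: pathways are plain adjacency dictionaries, entity types are given by the hypothetical `kind1`/`kind2` dictionaries, and rule 3 is read as "some other aligned pair lies forward of both entities, or backward of both". None of these names come from the slides.

```python
from collections import deque

def reachable(graph, start):
    """Nodes reachable from start by following edges (start itself excluded)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen

def reverse(graph):
    """Graph with every edge flipped, used for backward reachability."""
    rev = {}
    for src, targets in graph.items():
        rev.setdefault(src, [])
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def is_consistent(g1, g2, kind1, kind2, pairs):
    # Rule 1: only entities of the same type may be aligned.
    if any(kind1[u] != kind2[v] for u, v in pairs):
        return False
    # Rule 2: the mapping must be one-to-one.
    if len({u for u, _ in pairs}) != len(pairs) or len({v for _, v in pairs}) != len(pairs):
        return False
    # Rule 3 (assumed reading): each pair needs another aligned pair that is
    # forward-reachable from both entities or backward-reachable from both.
    r1, r2 = reverse(g1), reverse(g2)
    for u, v in pairs:
        fwd1, fwd2 = reachable(g1, u), reachable(g2, v)
        bwd1, bwd2 = reachable(r1, u), reachable(r2, v)
        if not any((a in fwd1 and b in fwd2) or (a in bwd1 and b in bwd2)
                   for a, b in pairs if (a, b) != (u, v)):
            return False
    return True

# Toy pathways (hypothetical): a chain, and a chain split into two pieces.
g1 = {"C1": ["R1"], "R1": ["C2"], "C2": ["R2"], "R2": ["C3"], "C3": []}
g2 = {"C1": ["R1"], "R1": ["C2"], "C2": [], "C4": ["R2"], "R2": ["C5"], "C5": []}
kind1 = {n: n[0] for n in g1}  # "C" = compound, "R" = reaction
kind2 = {n: n[0] for n in g2}

ok = is_consistent(g1, g2, kind1, kind2, [("C1", "C1"), ("R1", "R1"), ("C2", "C2")])
bad_type = is_consistent(g1, g2, kind1, kind2, [("C1", "R1"), ("R1", "C1")])
bad_reach = is_consistent(g1, g2, kind1, kind2, [("C1", "C1"), ("C3", "C5")])
```

Here `ok` is accepted, `bad_type` violates rule 1, and `bad_reach` violates rule 3 because C5 is not reachable from C1 in the second pathway.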
Problem Statement
Given a pair of metabolic pathways, our aim is to find the consistent alignment (mapping) of the entities (enzymes, reactions, compounds) such that the similarity between the pathways (SimP score) is maximized.
Pairwise Similarities (Homology of Entities)
Pairwise Similarities (Homology)
• Enzyme Similarity (SimE)
  • Hierarchical Enzyme Similarity - Webb EC. (2002)
  • Information-Content Enzyme Similarity - Pinter et al. (2005)
• Compound Similarity (SimC)
  • Identity Score for compounds
  • SIMCOMP Compound Similarity - Hattori et al. (2003)
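Pairwise Similarities
• Reaction Similarity (SimR)
\[
\mathrm{SimR}(R_1, R_2) =
\underbrace{\max\big(\mathrm{SimE}(E_1,E_3),\ \mathrm{SimE}(E_2,E_3)\big)}_{\text{Enzymes}}
+ \underbrace{\max\big(\mathrm{SimC}(C_1,C_4),\ \mathrm{SimC}(C_2,C_4)\big)}_{\text{Input Compounds}}
+ \underbrace{\max\big(\mathrm{SimC}(C_3,C_5),\ \mathrm{SimC}(C_3,C_6),\ \mathrm{SimC}(C_3,C_7)\big)}_{\text{Output Compounds}}
\]
[Figure: reaction R1 with enzymes E1, E2, input compounds C1, C2 and output compound C3; reaction R2 with enzyme E3, input compound C4 and output compounds C5, C6, C7]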
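The reaction similarity on this slide can be written as a few lines of code. This is a minimal sketch, assuming each reaction is a dictionary with `enzymes`, `inputs` and `outputs` lists and that SimE and SimC are supplied as functions; the lookup tables and their scores are made-up toy values, not results from the talk. When both reactions have several entities of one kind, the sketch simply takes the best score over all cross pairs, which is an assumption beyond the slide's example.

```python
def best_pair_score(sim, left, right):
    """Highest similarity over all pairs drawn from the two entity lists."""
    return max(sim(a, b) for a in left for b in right)

def sim_r(r1, r2, sim_e, sim_c):
    """Unweighted sum of the enzyme, input-compound and output-compound terms."""
    return (best_pair_score(sim_e, r1["enzymes"], r2["enzymes"])
            + best_pair_score(sim_c, r1["inputs"], r2["inputs"])
            + best_pair_score(sim_c, r1["outputs"], r2["outputs"]))

# Reactions laid out as in the slide's example.
R1 = {"enzymes": ["E1", "E2"], "inputs": ["C1", "C2"], "outputs": ["C3"]}
R2 = {"enzymes": ["E3"], "inputs": ["C4"], "outputs": ["C5", "C6", "C7"]}

# Hypothetical pairwise scores, for illustration only.
ENZYME_SCORES = {("E1", "E3"): 0.5, ("E2", "E3"): 0.8}
COMPOUND_SCORES = {("C1", "C4"): 0.2, ("C2", "C4"): 1.0,
                   ("C3", "C5"): 0.1, ("C3", "C6"): 0.4, ("C3", "C7"): 0.3}

def sim_e(a, b):
    return ENZYME_SCORES.get((a, b), 0.0)

def sim_c(a, b):
    return COMPOUND_SCORES.get((a, b), 0.0)

score = sim_r(R1, R2, sim_e, sim_c)  # best terms: 0.8 + 1.0 + 0.4
```

With these toy scores the enzyme term picks (E2, E3), the input term picks (C2, C4) and the output term picks (C3, C6).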
Topological Similarity (Topology of Pathways)
Neighborhood Graphs
[Figure: a pathway with compounds C1-C9, reactions R1-R4 and enzymes E1-E3, decomposed into separate neighborhood graphs for Reactions, Enzymes and Compounds]
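Topological Similarities
• Pathway 1: reactions R1, R2, R3, R4 (|R| = 4), with BN(R3) = {R1, R2} and FN(R3) = {R4}
• Pathway 2: reactions R1, R3, R4, R5 (|R| = 4), with BN(R3) = {R1} and FN(R3) = {R4, R5}
• Example entry of the AR matrix:
\[
A_R[R_3, R_3][R_2, R_1] = \frac{1}{2 \cdot 1 + 1 \cdot 2} = \frac{1}{4}
\]
• Size of the AR matrix: (|R| |R|) x (|R| |R|) = 16 x 16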
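The worked entry above suggests how the whole AR matrix could be filled in. The sketch below is one possible reading, not the authors' code: a pair (u, v) supports (i, j) when u and v are both backward neighbors, or both forward neighbors, of i and j, and every supporting entry gets the weight 1 / (|BN(i)| |BN(j)| + |FN(i)| |FN(j)|). The edge lists are inferred from the BN/FN sets on the slide, and the function names are hypothetical.

```python
def neighbor_sets(nodes, edges):
    """Backward (BN) and forward (FN) neighbor sets for each node."""
    bn = {n: set() for n in nodes}
    fn = {n: set() for n in nodes}
    for src, dst in edges:
        fn[src].add(dst)
        bn[dst].add(src)
    return bn, fn

def support_matrix(nodes1, edges1, nodes2, edges2):
    """Nested dict: matrix[(i, j)][(u, v)] is the support (u, v) gives to (i, j)."""
    bn1, fn1 = neighbor_sets(nodes1, edges1)
    bn2, fn2 = neighbor_sets(nodes2, edges2)
    pairs = [(i, j) for i in nodes1 for j in nodes2]
    matrix = {}
    for i, j in pairs:
        denom = len(bn1[i]) * len(bn2[j]) + len(fn1[i]) * len(fn2[j])
        row = {}
        for u, v in pairs:
            supports = ((u in bn1[i] and v in bn2[j])
                        or (u in fn1[i] and v in fn2[j]))
            # Pairs without any neighbor support keep an all-zero row here.
            row[(u, v)] = 1.0 / denom if supports and denom else 0.0
        matrix[(i, j)] = row
    return matrix

# Reaction graphs inferred from the BN/FN sets of R3 in each pathway.
nodes1, edges1 = ["R1", "R2", "R3", "R4"], [("R1", "R3"), ("R2", "R3"), ("R3", "R4")]
nodes2, edges2 = ["R1", "R3", "R4", "R5"], [("R1", "R3"), ("R3", "R4"), ("R3", "R5")]

A_R = support_matrix(nodes1, edges1, nodes2, edges2)
entry = A_R[("R3", "R3")][("R2", "R1")]
```

The row for (R3, R3) has four supporting entries of equal weight, so it sums to 1.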
Problem Formulation
Focus on R3 - R3 matching
• Iteration 0: Only pairwise similarity of R3 and R3
• Iteration 1: Support of aligned first degree neighbors added
• Iteration 2: Support of aligned second degree neighbors added
• Iteration 3: Support of aligned third degree neighbors added
[Figure: two reaction graphs (R1-R8) with the neighborhood of R3 growing at each iteration]
Problem Formulation
[Figure: the Initial Reaction Similarity Matrix is flattened into the HR0 vector, Power Method Iterations turn it into the HRs vector, and the HRs vector is reshaped into the Final Reaction Similarity Matrix]
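The slides do not spell out the update rule behind the power method iterations. The sketch below assumes a common form, H(k+1) = α A H(k) + (1 - α) H(0), rescaled to sum to 1 after every step; it is chosen because it matches the behaviour the deck describes for α = 0 (pairwise similarity only) and α = 1 (topology only). The matrix `A`, the vector `h0` and the unit-sum normalisation are illustrative assumptions.

```python
def power_iterate(A, h0, alpha, max_iters=100, tol=1e-9):
    """Iterate h <- alpha * A h + (1 - alpha) * h0 until the vector stops changing."""
    h = list(h0)
    for _ in range(max_iters):
        Ah = [sum(a * x for a, x in zip(row, h)) for row in A]
        new = [alpha * ah + (1 - alpha) * x0 for ah, x0 in zip(Ah, h0)]
        total = sum(new)
        new = [x / total for x in new]  # keep the scores on a fixed scale
        if max(abs(a - b) for a, b in zip(new, h)) < tol:
            return new
        h = new
    return h

# Toy 2x2 support matrix and initial similarity vector (hypothetical values).
A = [[0.5, 0.5],
     [0.2, 0.8]]
h0 = [0.75, 0.25]

only_pairwise = power_iterate(A, h0, alpha=0.0)  # stays at h0
blended = power_iterate(A, h0, alpha=0.7)        # mixes topology and homology
```

In the alignment setting, `h0` would be the flattened initial similarity matrix and the returned vector would be reshaped into the final similarity matrix.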
Max Weight Bipartite Matching
• Six Possible Orderings, ONLY 3 ARE UNIQUE
  • Reactions First
  • Enzymes First
  • Compounds First
• R First Pruning
• Consistency Assured!
[Figure: bipartite graphs over compounds (C1-C4), reactions (R1-R3) and enzymes (E1-E3) with legend: Weighted Edges, Aligned Entities, Inconsistent Edges]
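To make the matching step concrete, here is a brute-force sketch of maximum weight bipartite matching on a small similarity matrix. It enumerates every assignment, so it is only suitable for toy inputs; a practical implementation would use a polynomial-time algorithm such as the Hungarian method. The weights are made-up values, and the pruning of inconsistent edges shown on the slide is not modelled.

```python
from itertools import permutations

def max_weight_matching(weights):
    """Return (list of (row, column) pairs, total weight) of the best 1-1 assignment."""
    rows, cols = len(weights), len(weights[0])
    if rows > cols:
        raise ValueError("expects at least as many columns as rows")
    best_total, best_perm = float("-inf"), None
    # Try every way of giving each row a distinct column.
    for perm in permutations(range(cols), rows):
        total = sum(weights[r][c] for r, c in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return list(enumerate(best_perm)), best_total

# Hypothetical reaction similarity scores: rows = pathway 1, columns = pathway 2.
reaction_scores = [[0.9, 0.1, 0.2],
                   [0.8, 0.7, 0.1],
                   [0.2, 0.3, 0.6]]

assignment, total = max_weight_matching(reaction_scores)
```

For these scores the diagonal assignment wins, even though the second row's single best score (0.8) sits in a column already taken by the first row.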
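Alignment Score (SimP)
• 0 <= SimP <= 1
• SimP = 1 for identical pathways
\[
\mathrm{SimP} = \beta \, \frac{\mathrm{Sim}(C_1) + \mathrm{Sim}(C_2) + \mathrm{Sim}(C_4)}{3}
+ (1 - \beta) \, \frac{\mathrm{Sim}(E_1) + \mathrm{Sim}(E_2)}{2}
\]
[Figure: two aligned pathways (compounds C1-C5, reactions R1, R2, enzymes E1, E2) in which C1, C2, C4, E1 and E2 are aligned]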
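The score above is a weighted combination of two averages, which a short function makes explicit. This is a sketch with hypothetical inputs: the lists hold the similarity of each aligned compound pair and each aligned enzyme pair, and `beta` is the weight from the formula.

```python
def sim_p(compound_scores, enzyme_scores, beta):
    """beta * mean(aligned compound scores) + (1 - beta) * mean(aligned enzyme scores)."""
    if not compound_scores or not enzyme_scores:
        raise ValueError("both score lists must be non-empty")
    compound_mean = sum(compound_scores) / len(compound_scores)
    enzyme_mean = sum(enzyme_scores) / len(enzyme_scores)
    return beta * compound_mean + (1 - beta) * enzyme_mean

# Identical pathways: every aligned pair scores 1, so SimP is 1 for any beta.
identical = sim_p([1.0, 1.0, 1.0], [1.0, 1.0], beta=0.5)

# Made-up scores for three aligned compounds and two aligned enzymes.
partial = sim_p([0.9, 0.6, 0.3], [1.0, 0.5], beta=0.5)
```

Because each term is an average of scores in [0, 1] and the weights sum to 1, the result stays within [0, 1].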
Impact of Alpha
• α = 0: Only pairwise similarities of entities - no iterations
• α = 1: Only topology of the graphs
• α = 0.7 is good!
Alternative Entities & Paths
• Eukaryotes (e.g. H. sapiens): Mevalonate Path
• Bacteria (e.g. E. coli): Non-Mevalonate Path
References: Kim J. et al. (2007), Kuzuyama T. et al. (2006)
Phylogeny Prediction
[Figure: Our Prediction compared with the NCBI Taxonomy; labeled groups: Deuterostomia, Thermoprotei, Archaea, Eukaryota]
Effect Of Consistency Restriction
Running Time
Thank You
For source code and more information: www.cise.ufl.edu/~fay
Appendix
Error Tolerance
Phylogenetic Reconstruction
Effect Of Consistency Restriction
Z-Score Calculation
Challenges
• Alignment Problem is NP-complete!
• Abstraction is a Problem!
  - Where are the compounds?
  - E1 C1 E2 or E1 C2 E2?
[Figure: Pathway 1 and Pathway 2, each drawn with no abstraction (enzymes and compounds) and abstracted (enzymes only)]