Toward Automatically Drawn Metabolic Pathway Atlas with Peripheral Node Abstraction Algorithm. Myungha Jang, Arang Rhie, and Hyun-Seok Park*. Bioinformatics Laboratory, School of Engineering, Ewha Womans University, Seoul, Korea. IEEE BIBM, 18-21 Dec 2010, Hong Kong.
Table of Contents
I. Introduction
II. Topological Nature of Metabolic Networks at Peripheral Nodes
III. Node Abstraction Featured Scale-free Algorithm
IV. Experimental Results
V. Discussion and Future Work
I. INTRODUCTION
Automatic graph layout algorithms in systems biology
• Abstract graph structure ⇒ visual representation
• Graphical diagrams are intuitively helpful for understanding biochemical reaction networks
  - Node: compound; Edge: reaction
• Finding an optimal layout is an NP-hard problem
I. INTRODUCTION
Focusing on Global Metabolic Pathway
• A complete metabolic network indicates all the metabolic potential and capacity.
• The research focus has shifted from single pathways to multiple pathways.
• Visualization plays an important role in understanding large-scale metabolic networks.
• KEGG Atlas (http://www.genome.ad.jp/kegg), 2008
• Terms: Global (metabolic) pathway, Multiple pathway, Atlas
I. INTRODUCTION
Our Efforts Toward Automatic Global Layout
• Not enough to deal with the global pathway!
• How can we obtain a complete view?
• No attempts at automatic visualization of the Atlas
I. INTRODUCTION
How To Deal With Large-scale Metabolic Pathway?
• Related work: KEGG Atlas
• The map integration process is carried out manually by curators.
• It is based on the curators’ experience.
• However, the fact that metabolic networks are dynamic in nature should not be disregarded.
⇒ A systematic approach is necessary.
I. INTRODUCTION
How To Deal With Large-scale Metabolic Pathway? (cont'd)
• Our Strategy
• We provide a novel algorithmic approach to drawing multiple metabolic pathways by considering two properties:
  1. Automatic abstraction criteria: by analyzing the topological nature of metabolic networks based on the graphical property of relation distance, linear reactions were abstracted as a unit reaction.
  2. The consistency of highly connected nodes.
II. Topological Nature of Metabolic Networks at Peripheral Nodes
• We obtained data for 255 maps by parsing KEGG XML (KGML) documents of version 0.6 using our KGML Parser.
• Two terms were defined:
  1. Relation degree: the number of edges branching from a node
  2. Relation distance: a factor to measure the length between any two compounds encompassing nodes which all have relation degrees less than or equal to p (p = 2)
• A dedicated analysis of peripheral nodes with low connectivity was performed.
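The relation degree defined above can be illustrated with a minimal sketch. It assumes that each KGML `<reaction>` element contributes one undirected edge between every substrate and every product it lists. The sample document, compound names, and function names are hypothetical, and this is not the authors' KGML Parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily trimmed KGML-like fragment; real files carry more attributes.
SAMPLE_KGML = """
<pathway name="path:map00000">
  <reaction name="rn:R1" type="reversible">
    <substrate name="cpd:A"/>
    <product name="cpd:B"/>
  </reaction>
  <reaction name="rn:R2" type="irreversible">
    <substrate name="cpd:B"/>
    <product name="cpd:C"/>
  </reaction>
  <reaction name="rn:R3" type="irreversible">
    <substrate name="cpd:B"/>
    <product name="cpd:D"/>
  </reaction>
</pathway>
"""

def build_compound_graph(kgml_text):
    """Return an undirected adjacency map: compound -> set of neighbours."""
    graph = defaultdict(set)
    root = ET.fromstring(kgml_text.strip())
    for reaction in root.iter("reaction"):
        substrates = [s.get("name") for s in reaction.findall("substrate")]
        products = [p.get("name") for p in reaction.findall("product")]
        # Assumption: every substrate-product pair of a reaction is one edge.
        for s in substrates:
            for p in products:
                graph[s].add(p)
                graph[p].add(s)
    return dict(graph)

def relation_degree(graph, node):
    """Number of edges branching from a node."""
    return len(graph[node])

graph = build_compound_graph(SAMPLE_KGML)
```

In this toy graph, `cpd:B` takes part in three reactions with distinct partners, so it is the only node whose relation degree exceeds 2.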
II. Topological Nature of Metabolic Networks at Peripheral Nodes
Relation Distance Term Clarification
• Definition: The length between any two compounds encompassing nodes which all have relation degrees equal to p
• Here, p = 2
II. Topological Nature of Metabolic Networks at Peripheral Nodes
Relation Distance Example in Map
RD(C01290, C00369) = 7
• cpd:C01291
• cpd:C01290
• cpd:C16466
• cpd:C16475
• cpd:C16468
• cpd:C16470
• cpd:C16471
• cpd:C16469
• cpd:C00369
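One way to read the relation distance is as the number of edges on a path whose intermediate nodes all have relation degree at most p. The sketch below follows that reading with a breadth-first search. The function name and the chain `n0 ... n7` are hypothetical, and the chain is not the topology of the map above; it only has the same length (8 nodes, 7 edges).

```python
from collections import deque

def relation_distance(graph, source, target, p=2):
    """Edges on the shortest source-target path whose intermediate nodes
    all have relation degree <= p; None if no such path exists."""
    if source == target:
        return 0
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        for nxt in graph[node]:
            if nxt == target:
                return dist + 1
            # Only low-degree nodes may lie strictly inside the path.
            if nxt not in seen and len(graph[nxt]) <= p:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Hypothetical linear chain n0 - n1 - ... - n7.
names = [f"n{i}" for i in range(8)]
chain = {n: set() for n in names}
for a, b in zip(names, names[1:]):
    chain[a].add(b)
    chain[b].add(a)

print(relation_distance(chain, "n0", "n7"))  # prints 7
```

A path that would have to pass through a node with more than p edges yields `None` under this reading, which is what separates peripheral chains from the highly connected core.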
III. Node Abstraction Featured Scale-free Algorithm
Basic Motivation
• Observation: 66.83% of the total compounds within the complete metabolic pathways were of low connectivity, with a relation degree of less than 3.
• The number of compounds with a higher relation degree, i.e. more than 6 edges, was much smaller.
• Two resulting steps:
  A. Abstracting Compounds With Linear Interaction
  B. Layout Components according to High Connectivity
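The observation above is a statement about the degree distribution. A minimal sketch of how such shares could be measured on an adjacency map follows; the function name and the toy graph are hypothetical, and the thresholds (degree below 3, degree above 6) are taken from the slide.

```python
def degree_shares(graph, low_below=3, high_above=6):
    """Return (share of low-connectivity nodes, share of highly connected nodes)."""
    total = len(graph)
    low = sum(1 for nbrs in graph.values() if len(nbrs) < low_below)
    high = sum(1 for nbrs in graph.values() if len(nbrs) > high_above)
    return low / total, high / total

# Hypothetical toy graph: one hub with seven leaves.
leaves = [f"leaf{i}" for i in range(7)]
star = {"hub": set(leaves)}
for leaf in leaves:
    star[leaf] = {"hub"}

low_share, high_share = degree_shares(star)
print(f"low: {low_share:.3f}, high: {high_share:.3f}")  # prints low: 0.875, high: 0.125
```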
III. Node Abstraction Featured Scale-free Algorithm
A. Abstracting Compounds With Linear Interaction
• We abstracted and hid all those compounds that appear within these linear interactions.
• This approach could be called “chain reduction” (M. Chimani et al.)
• All green compounds in the figure will be hidden in the graph layout according to this approach.
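The idea of hiding compounds inside linear interactions can be sketched as follows. The sketch assumes that a hidden compound is any node with exactly two neighbours and that each maximal run of such nodes is replaced by one super edge between the nodes at its two ends. The function name and return format are hypothetical, and this is not the authors' implementation.

```python
def chain_reduction(graph):
    """Collapse maximal runs of degree-2 nodes into super edges.

    Returns (kept_nodes, super_edges); each super edge is a tuple
    (endpoint_a, endpoint_b, hidden_nodes_in_order). Direct edges between
    kept nodes are left alone and are not listed. Cycles made only of
    degree-2 nodes have no kept endpoint and are not reduced here.
    """
    is_chain = {n: len(nbrs) == 2 for n, nbrs in graph.items()}
    kept = {n for n, chain_node in is_chain.items() if not chain_node}
    super_edges = []
    visited = set()
    for start in sorted(kept):
        for first in sorted(graph[start]):
            if not is_chain[first] or first in visited:
                continue
            hidden, prev, cur = [], start, first
            # Walk along the chain until a non-chain node is reached.
            while is_chain[cur]:
                visited.add(cur)
                hidden.append(cur)
                nxt = next(n for n in graph[cur] if n != prev)
                prev, cur = cur, nxt
            super_edges.append((start, cur, hidden))
    return kept, super_edges

# Hypothetical example: H1 - a - b - c - H2, with extra leaves on H1 and H2.
example = {
    "H1": {"a", "x", "y"}, "H2": {"c", "z", "w"},
    "a": {"H1", "b"}, "b": {"a", "c"}, "c": {"b", "H2"},
    "x": {"H1"}, "y": {"H1"}, "z": {"H2"}, "w": {"H2"},
}
kept, super_edges = chain_reduction(example)
print(super_edges)  # prints [('H1', 'H2', ['a', 'b', 'c'])]
```

Keeping the hidden nodes in order inside each super edge means they can be restored later, for example when a viewer zooms into a peripheral path.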
III. Node Abstraction Featured Scale-free Algorithm
B. Layout Components according to High Connectivity
• Highly connected nodes: nodes with a relation degree greater than 6
• LayoutHighlyConnectedNode() algorithm steps:
  1. Find a highly connected node Nd
  2. Decompose each component connected to Nd into a sub-graph
  3. Treat each decomposed sub-graph as a super node and apply the spring-embedding algorithm
• Input: Metabolic pathway graph
• Output: Coordinates of each node

void LayoutPathway(Pathway graph)
{
    IF highly connected nodes (Nd) exist in graph
        LayoutHighlyConnectedNode(graph, Nd);
    ELSE IF any cycle (Nc) exists in graph AND size of cycle ≥ 6
        LayoutCircular(graph, Nc);
    ELSE
        LayoutHierarchic(graph);
}
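The dispatch logic of LayoutPathway can be sketched in Python. This sketch only decides which strategy applies and returns a label; the layout routines themselves are not implemented. All function names are hypothetical. The cycle test is an assumption of this sketch: it only inspects cycles closed by depth-first-search back edges, so it can miss a long cycle that the search tree does not expose.

```python
def highly_connected_nodes(graph, threshold=6):
    """Nodes whose relation degree is greater than the threshold."""
    return [n for n, nbrs in graph.items() if len(nbrs) > threshold]

def has_cycle_of_size(graph, min_size=6):
    """True if some DFS back edge closes a cycle with at least min_size nodes."""
    depth = {}
    for root in graph:
        if root in depth:
            continue
        depth[root] = 0
        stack = [(root, None, iter(graph[root]))]
        while stack:
            node, parent, neighbours = stack[-1]
            descended = False
            for nxt in neighbours:
                if nxt == parent:
                    continue
                if nxt in depth:
                    # Ancestor: cycle length is the depth difference plus one.
                    # Descendant: the expression is not positive, so it is ignored.
                    if depth[node] - depth[nxt] + 1 >= min_size:
                        return True
                else:
                    depth[nxt] = depth[node] + 1
                    stack.append((nxt, node, iter(graph[nxt])))
                    descended = True
                    break
            if not descended:
                stack.pop()
    return False

def choose_layout(graph):
    """Mirror the IF / ELSE IF / ELSE order of the pseudocode."""
    if highly_connected_nodes(graph):
        return "high_connectivity"
    if has_cycle_of_size(graph, 6):
        return "circular"
    return "hierarchic"

def ring(n):
    """Hypothetical helper: an undirected cycle with n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
```

The order of the checks matters: a graph that contains both a hub and a long cycle is handled by the high-connectivity branch, as in the pseudocode.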
IV. Experimental results
• Experiments: To compare the compression rate of compounds, we obtained the number of abstracted compounds and the number of edge crossings by applying two different layout algorithms.
• Result 1: Node compression rate performance
  Scope:
  1. 84 single metabolic pathways
  2. 8 major categorized metabolic pathways
  3. The global pathway
• Result 2: Comparison of the number of edge crossings between
  1. Conventional algorithm
  2. Our node abstraction featured scale-free layout algorithm
[Figure: single pathways, categorized pathways, global pathway]
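The edge crossing metric used in Result 2 can be computed from node coordinates. The sketch below is one straightforward way to do it, assuming straight-line edges and counting only proper crossings (edges that share a node or merely touch are not counted). The function names and the quadratic pairwise loop are choices of this sketch, not a description of how the slides' numbers were obtained.

```python
from itertools import combinations

def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 and segment q1-q2 properly intersect."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def count_edge_crossings(positions, edges):
    """Count pairs of edges that cross, given node -> (x, y) positions."""
    total = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges sharing a node are not counted as crossing
        if segments_cross(positions[a], positions[b], positions[c], positions[d]):
            total += 1
    return total

# Hypothetical layout: four nodes on the corners of a unit square.
positions = {"a": (0, 0), "b": (1, 1), "c": (0, 1), "d": (1, 0)}
print(count_edge_crossings(positions, [("a", "b"), ("c", "d")]))  # prints 1
```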
IV. Experimental results
Result 1B: Peripheral path as supplementary nodes
[Figure: the number of nodes before and after applying node abstraction]
IV. Experimental results
Result 1A: Peripheral path as super edges
[Figure: original network vs. abstracted network]
• Results drawn with Cytoscape, using conventional spring embedding
• The red-colored edges represent the abstracted edges (abstraction rate: 70%)
IV. Experimental results
Result 2: Edge Crossing Reduction
• In single metabolic pathways, the node abstraction featured algorithm reduced edge crossings by 63.31%.
• In the global metabolic pathway, the number of edge crossings was reduced by 58.08% in total.
• Our proposed algorithm with node abstraction resulted in 86,067 edge crossings, whereas the one without node abstraction resulted in 205,316 edge crossings.
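The 58.08% figure for the global pathway follows directly from the two crossing counts reported above. A short check, with a hypothetical helper name:

```python
def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100

# Edge crossings without and with node abstraction, as reported on the slide.
global_reduction = reduction_percent(205_316, 86_067)
print(f"{global_reduction:.2f}%")  # prints 58.08%
```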
V. Discussion
• Two approaches were used:
  1. Abstracting compound pairs according to consistent criteria
  2. Laying out components according to high connectivity
• Our experimental results show that the node abstraction feature reduced the number of compounds by approximately 23% in the global pathway.
• Further discussion is necessary regarding enzyme reactions.
V. Why is our work important?
• It is the first systematic approach to Atlas visualization that focuses on peripheral nodes.
• It is fundamental to building a hierarchical structure of the Atlas.
• Our approach adapts flexibly to changes in pathway databases, which are updated frequently.
• It is a crucial preliminary step toward an automatically drawn metabolic pathway atlas.
• Future research: the individual biological meaning of each peripheral node and abstracted path.