
Discovering Larger Network Motifs for Biological Insights

Explore methods for identifying large network motifs (>15 nodes) in biological networks, a proposed clustering-based solution, unsolved issues, and a compact representation for motif discovery.


Presentation Transcript


  1. DISCOVERING LARGER NETWORK MOTIFS Wooyoung Kim and Li Chen 4/24/2009 CSC 8910 Analysis of Biological Network, Spring 2009 Dr. Yi Pan

  2. OUTLINE • Project Topic • Related Works • Proposed Ideas • Unsolved Problems

  3. PROJECT TOPIC • Discovering Larger Network Motifs • Given a biological network (PPI network, transcriptional regulatory network, gene network, etc.), find network motifs that are large (more than 15 nodes).

  4. RELATED WORKS (1) • Network Motif Discovery using subgraph enumeration and symmetry breaking • motif size <=15 • Given a candidate subgraph, find all of its instances (isomorphic subgraphs) in the graph, using symmetry breaking so that each instance is found only once, then evaluate the candidate by checking its frequency. • Problem: How do we find candidate subgraphs? Proposed solution: Cluster the whole network and take the representative of each cluster as a candidate subgraph.

  5. RELATED WORKS (2) • Motif Discovery Algorithms • Exact algorithms for motifs with a small number of nodes: 1. Exhaustive Recursive Search (ERS) (motif size <= 4). 2. ESU: starts with individual nodes and adds one node at a time until the required size k is reached (motif size <= 14). 3. Compact Topological Motifs.
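
The ESU idea above (grow a subgraph one node at a time from each start node) can be sketched as follows. This is a minimal illustration and not the reference implementation: the function name `esu` and the adjacency-dictionary input are assumptions made for the example, and it only enumerates connected k-node vertex sets without classifying them by isomorphism class or testing significance.

```python
def esu(adj, k):
    """Enumerate every connected k-node vertex set exactly once (ESU-style).

    adj: dict mapping each node (comparable labels, e.g. ints) to the set of
    its neighbours in an undirected graph. Illustrative input format.
    """
    found = []

    def extend(sub, ext, root):
        if len(sub) == k:
            found.append(frozenset(sub))
            return
        ext = set(ext)  # work on a copy so the caller's set is untouched
        while ext:
            w = ext.pop()
            # Nodes already in sub or adjacent to sub are not "exclusive".
            covered = set(sub)
            for u in sub:
                covered |= adj[u]
            # Exclusive neighbours of w with a label larger than the root.
            new = {u for u in adj[w] if u > root and u not in covered}
            extend(sub | {w}, ext | new, root)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return found


# Triangle 0-1-2 with a tail node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triples = sorted(sorted(s) for s in esu(graph, 3))
```

Restricting extensions to labels larger than the root, and to exclusive neighbours only, is what keeps each vertex set from being reported more than once. The number of connected k-node sets still grows very quickly with k, which is consistent with the size limit quoted on the slide.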

  6. RELATED WORKS (3) • Approximate Algorithms • Search Algorithm Based on Sampling (MFINDER) • Rand-ESU • NeMoFINDER • Sub-graph Counting by Scalar Computation • A-priori-based Motif Detection

  7. RELATED WORKS (4) • Network Clustering • Compact representation of a network. • Type I: minimum number of clusters • Type II: maximum cohesiveness • Aggregation of topological motifs (combining smaller network motifs to observe the whole structure) However, in our proposed solution, the clustering task groups similar network patterns together, not similar nodes (sequences), nor is it used for aggregating motifs.

  8. PROPOSED IDEAS Given a graph G = (V,E), t (the desired motif size), and k (the number of motifs), find network motifs of size t. • List all graph patterns with t (or more than t) nodes. • Represent the network as an adjacency matrix A with entries in {1, -1, 0}. • Scan A for all t x t sub-matrices. • Cluster the subgraphs into k clusters. • Use any numerical clustering algorithm, such as K-means or NMF. • Find a representative subgraph for each cluster. • Use the symmetry-breaking technique to find the representative. • Each representative is a candidate network motif.
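
A rough sketch of the pipeline above (extract t x t sub-matrices, turn each into a numeric vector, cluster into k groups) might look like this. Several things here are assumptions made only for illustration: the graph is undirected with 0/1 entries rather than {1, -1, 0}, the feature vector is the sorted degree sequence of the induced subgraph (chosen because it does not depend on node order), the clustering is a tiny hand-written K-means, and every name (`submatrix`, `features`, `kmeans`, `cluster_subgraphs`) is hypothetical. Enumerating all t-node subsets is exponential, so this is only feasible for toy inputs.

```python
from itertools import combinations


def submatrix(A, nodes):
    """Induced t x t sub-matrix of adjacency matrix A on the given nodes."""
    return [[A[i][j] for j in nodes] for i in nodes]


def features(S):
    """Sorted row sums (degree sequence); an illustrative, order-invariant feature."""
    return tuple(sorted(sum(row) for row in S))


def kmeans(points, k, iters=20):
    """Minimal K-means; initial centres are the first k distinct points."""
    centers = []
    for p in points:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_subgraphs(A, t, k):
    """Return (node subset, cluster label) for every t-node subset of the graph."""
    subsets = list(combinations(range(len(A)), t))
    points = [features(submatrix(A, s)) for s in subsets]
    return list(zip(subsets, kmeans(points, k)))


# Path graph 0-1-2-3 as a 0/1 adjacency matrix.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
result = dict(cluster_subgraphs(A, t=3, k=2))
```

In this toy run the two connected triples land in one cluster and the two disconnected triples in the other. A degree sequence cannot distinguish all non-isomorphic subgraphs, so the choice of features remains the open question raised under the unsolved problems.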

  9. UNSOLVED PROBLEMS • How to cluster the graphs? • The choice of clustering algorithm depends on which features we use to describe the data. • What type of clustering algorithm? Type I or type II? • How to find the representative subgraph of each cluster? • Should we consider network alignment first? • Should we consider sequence similarities as well? • Is there any relationship between sequence motifs and network motifs? • Can sequence motifs be encoded in the vertex attribute matrix (cf. compact topological motifs)? • Large network motifs vs. small network motifs

  10. DISCOVERING TOPOLOGICAL MOTIFS USING A COMPACT NOTATION

  11. COMPACT NOTATION • Main Idea A topological motif can be represented either as the motif graph itself or as a collection of location lists of the motif's vertices. The method works in the space of location lists to discover motifs.

  12. COMPACT NOTATION • Method • Step 1: compute an exhaustive list of the potential vertex lists of motifs, stored as compact location lists. • Step 2: enlarge the collection of compact location lists computed in Step 1 by including all the non-empty intersections, along with the differences.
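
Step 2 can be illustrated with plain sets. The sketch below is a simplified reading and not the algorithm from the cited paper: it treats each location list as a set of vertices and repeatedly adds non-empty pairwise intersections and differences until nothing new appears. Iterating to a fixed point, the `frozenset` representation, and the name `enlarge` are all assumptions made for the example.

```python
def enlarge(location_lists):
    """Close a collection of location lists under non-empty intersection and difference.

    location_lists: iterable of vertex collections (illustrative input format).
    Returns a set of frozensets.
    """
    coll = {frozenset(l) for l in location_lists if l}
    changed = True
    while changed:
        changed = False
        for a in list(coll):
            for b in list(coll):
                # Candidate new lists derived from the pair (a, b).
                for c in (a & b, a - b):
                    if c and c not in coll:
                        coll.add(c)
                        changed = True
    return coll


# Two overlapping location lists.
closed = enlarge([{1, 2, 3}, {2, 3, 4}])
```

Here the two input lists yield the shared part {2, 3} and the two leftovers {1} and {4}. The loop terminates because only finitely many subsets of the vertex set exist, but the collection can grow quickly when many lists overlap.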

  13. COMPACT NOTATION • An Example Different colors indicate different vertex attributes.

  14. COMPACT NOTATION • G1’s adjacency matrices

  15. COMPACT NOTATION • Adjacency Matrix B1 (the conjugacy relationship of two lists is shown by “”) • L = {ℓ1, ℓ2, ℓ3, ℓ4}

  16. COMPACT NOTATION • Initialization Step

  17. COMPACT NOTATION • Iterative Step

  18. REFERENCES • [1] Bill Andreopoulos, Aijun An, Xiaogang Wang, and Michael Schroeder. A roadmap of clustering algorithms: finding a match for a biomedical application. Brief Bioinform, pages bbn058+, February 2009. • [2] Alberto Apostolico, Matteo Comin, and Laxmi Parida. Bridging Lossy and Lossless Compression by Motif Pattern Discovery. Electronic Notes in Discrete Mathematics, 21:219-225, 2005. General Theory of Information Transfer and Combinatorics. • [3] Giovanni Ciriello and Concettina Guerra. A review on models and algorithms for motif discovery in protein-protein interaction networks. Brief Funct Genomic Proteomic, 7(2):147-156, 2008. • [4] Jun Huan, Wei Wang, and Jan Prins. Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism. Data Mining, IEEE International Conference on, 0:549, 2003. • [5] Michihiro Kuramochi and George Karypis. Finding Frequent Patterns in a Large Sparse Graph. Data Mining and Knowledge Discovery, 11(3):243-271, November 2005. • [6] Laxmi Parida. Discovering Topological Motifs Using a Compact Notation. Journal of Computational Biology, 14(3):300-323, 2007.

  19. REFERENCES • [7] Radu Dobrin, Qasim K. Beg, Albert-Laszlo Barabasi, and Zoltan N. Oltvai. Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics, 5:10, 2004. • [8] McKay, B.D. Isomorph-free exhaustive generation. J. Algorithms, 26:306-324, 1998. • [9] Middendorf, M., Ziv, E., and Wiggins, C.H. Inferring network mechanisms: the Drosophila melanogaster protein interaction network. PNAS, 102(9):3192-3197, Mar 2005. • [10] Grochow, J. A. and Kellis, M. Network motif discovery using subgraph enumeration and symmetry-breaking. In RECOMB 2007, Lecture Notes in Computer Science 4453, pp. 92-106. Springer-Verlag, 2007.

  20. Thank you so much!
