Large Scale Metabolic Network Alignments by Compression

Large Scale Metabolic Network Alignments by Compression • Michael Dang, Ferhat Ay, Tamer Kahveci • ACM-BCB 2011 • Bioinformatics Lab, University of Florida

Network Alignment Bayati et al. ICDM 2009 Ferhat Ay

Metabolic Network Alignment
• Network Alignment
• Alignment with Heterogeneous Entities
• Subnetwork Mappings
• Querying Network Databases
• Functional Similarity of Reactions

Existing Work
• Heymans et al. (2003) – Undirected, Hierarchical Enzyme Similarity
• Pinter et al. (2005) – Directed, Only Multi-Source Trees
• Singh et al. (2007) – PPI Networks, Sequence Similarity
• Dost et al. (2007) – QNET, Color Coding, Tree queries of size at most 9
• Kuchaiev et al. (2010) – GRAAL, Solely Based on Network Topology
• Ay et al. (2011) – SubMAP, Considers Subnetwork Mappings
• Shih et al. – Next Talk!
• Clustering of input networks is necessary for aligning large networks

Alignment Phase - SubMAP (Ay et al., RECOMB 2010; JCB 2011)

Performance Bottleneck
• 2 Gigabytes
• 30 minutes

Alignment with Compression
• Compress
• Align
• Refine

Alignment with/without compression
• Without Compression
• With Compression

Outline of the method
• Compression Phase
• Minimum Degree Selection (MDS) method
• Optimality analysis
• Alignment Phase
• Refinement Phase
• Overall Complexity
• How Much Should We Compress?
• Experimental Results

Compression Phase
[Figure labels: Original Network; Compressed Network; Encapsulated View; What Alignment Algorithm Sees]

MDS – Minimum Degree Selection
[Figure panels: Before; After]

Overall Compression
[Figure labels: Step 1, Step 2, ..., Step i; Level 1, Level 2, ..., Level c]
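The level-wise compression named on the slides above can be illustrated with a minimal Python sketch. Everything beyond the names "Minimum Degree Selection" and "compression level c" is an assumption made for illustration: the network is treated as undirected and stored as a dict of neighbor sets, the minimum-degree node is merged with its minimum-degree unmerged neighbor, each node is merged at most once per level, and ties are broken by node name. The function names `compress_one_level` and `compress` are hypothetical and are not taken from the authors' software.

```python
def compress_one_level(adj):
    """One compression level (illustrative assumption, not the authors' code).

    adj maps each node to the set of its neighbors (undirected, symmetric).
    Returns the compressed adjacency dict and the list of merged groups.
    """
    remaining = set(adj)
    groups = []
    while remaining:
        # Pick the unmerged node with the fewest unmerged neighbors.
        u = min(remaining, key=lambda x: (len(adj[x] & remaining), str(x)))
        remaining.remove(u)
        candidates = adj[u] & remaining
        if candidates:
            # Assumed rule: merge with the minimum-degree unmerged neighbor.
            v = min(candidates, key=lambda x: (len(adj[x] & remaining), str(x)))
            remaining.remove(v)
            groups.append((u, v))
        else:
            # No unmerged neighbor is left, so the node stays on its own.
            groups.append((u,))

    # Build the compressed network: supernodes are linked when any of
    # their member nodes were linked in the input network.
    owner = {node: i for i, group in enumerate(groups) for node in group}
    compressed = {i: set() for i in range(len(groups))}
    for u, neighbors in adj.items():
        for v in neighbors:
            if owner[u] != owner[v]:
                compressed[owner[u]].add(owner[v])
                compressed[owner[v]].add(owner[u])
    return compressed, groups


def compress(adj, levels):
    """Apply `levels` compression levels; track original members per supernode."""
    members = {node: (node,) for node in adj}
    for _ in range(levels):
        adj, groups = compress_one_level(adj)
        members = {
            i: tuple(m for node in group for m in members[node])
            for i, group in enumerate(groups)
        }
    return adj, members


if __name__ == "__main__":
    # Hypothetical 4-node path network a - b - c - d.
    path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
    network, members = compress(path, levels=1)
    print(network)
    print(members)
```

Under these assumptions each level merges at most two nodes into one supernode, so a supernode holds at most 2^c original nodes after c levels.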

Optimality Condition for MDS
[Figure labels: Minimum Degree Node; Optimal?]

Optimality Condition for MDS
[Figure label: Minimum Degree Node]
Optimal?
- Can be optimal
- At most one node away

How Far Are We from Optimal Compression?
• Number of compression steps for the optimal compression and MDS
• Sizes of the compressed networks for the optimal compression and MDS
• By the inequality
“How far is our compression method from the optimal compression?”

Alignment Phase
[Figure labels: Network 1; Network 2; Compressed Network 1; Compressed Network 2; Network Alignment Algorithm; Alignment]

Refinement Phase
[Figure labels: Alignment Algorithm; Refine; Refined Alignment]

Complexity Analysis
• Compression Phase:
• Alignment Phase:
• Refinement Phase:
• Overall Complexity (with compression):
• Complexity of SubMAP (without compression):
k = largest subnetwork size, c = compression level, n, m = sizes of networks

How much should we compress?
• Examples
n=20, m=20, k=2 → c ~ 1.37
n=20, m=80, k=1 → c ~ 2.11
n=80, m=80, k=2 → c ~ 2.15
n=200, m=400, k=1 → c ~ 3.11

Experimental Results

To compress or not to compress?

Compression Rates in Practice
• KEGG Metabolic Networks with sizes ranging from 10 to 279

What Do We Gain by Compression?
[Figure panels: Subnetwork size k=1; Subnetwork size k=2]

What Do We Lose by Compression?
• Correlation of the mapping scores found by compressed alignment with the ones found by SubMAP
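To make the comparison of mapping scores concrete, here is a minimal sketch of a correlation computation. The slide does not say which correlation coefficient was used, so Pearson correlation is an assumption here, and the score lists in the usage example are made-up placeholder values, not results from the talk. The function name `pearson` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder scores for the same mappings under the two approaches
    # (hypothetical values for illustration only).
    uncompressed_scores = [0.91, 0.75, 0.60, 0.42]
    compressed_scores = [0.88, 0.70, 0.63, 0.40]
    print(pearson(uncompressed_scores, compressed_scores))
```

A value close to 1 would indicate that the compressed alignment ranks and scores mappings much like the uncompressed one.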

Conclusions
• We developed a scalable compression technique with optimality bounds.
• Unlike clustering-based methods, our method respects network topology while aligning the networks.
• It significantly improves the resource utilization of existing network alignment algorithms.

Future Directions
• Improving the scale of alignment to genome-wide metabolic networks (without initial clustering).
• Evaluating the performance of our compression technique on PPI networks.
• Improving the accuracy of compressed alignment w.r.t. the original alignment for larger levels of compression.
• Integrating our compression framework with other existing network alignment methods.

Acknowledgements
• Michael Dang
• Tamer Kahveci
• NSF IIS-0845439
• NSF CCF-0829867

A bit of advertisement
• Computing Innovation Fellow: http://cifellows.org/
• University of Washington, Department of Genome Sciences: http://www.gs.washington.edu/
• http://noble.gs.washington.edu/~wnoble/

THANK YOU. QUESTIONS?

APPENDIX
