Dynamic Networks: How Networks Change with Time? Vahid Mirjalili, CSE 891
Overview • Introduction • Methodology • DHAC: clustering in a single snapshot • MATCH-EM: cluster matching across time frames • Results • Discussion • Further improvement
Motivation • To infer the dynamic state of a cell in response to physiological changes • Two algorithms used: • DHAC: Dynamic Hierarchical Agglomerative Clustering, for clustering time-evolving networks • MATCH-EM: for matching corresponding clusters across time-points
Background • Current biological networks are static • Experimental methods: • Protein abundance (mass spec.) (mainly available for highly abundant proteins) • Transcript abundance (more readily available) • Previous work: combining transcript abundance and interaction networks to create a moving picture of the cell
Dynamic Networks • Probabilistic framework • The number of proteins can increase or decrease at each time-point • Proteins can switch interacting partners • Complexes can grow or shrink • Reveals temporal regulation of the cell's protein state
HAC: Hierarchical Agglomerative Clustering • Agglomerative = “bottom up” approach • Divisive = “top down” approach
HAC Features • Maximizes the likelihood of a hierarchical stochastic block model • Automatic selection of model size • Multi-scale networks • Outperforms other methods in link prediction • Extending HAC to dynamic networks: • How complexes inferred at one time point correspond to those at other time points • Transitions of a protein require dynamic coupling between network snapshots
DHAC: • Converts likelihood modularity from maximum likelihood to fully Bayesian statistics • Kernelizes likelihood modularity with an adaptive bandwidth to couple network clusters at different time points
Dynamic Network Clustering • Input: {G(t) = (V(t), E(t)), t = 1..T} • V: proteins • E: (undirected, unweighted) protein-protein interactions • Goal: find the stochastic block models {M(t), t = 1..T}, where M(t) is the network generative model for G(t) • Introducing coupling between time points improves dynamic network clustering
DHAC: notation • Probability of a structure model M • Probability that a vertex is in cluster k
Merging Clusters • To merge clusters 1 & 2 into 1’: • Maximum likelihood • Bayesian
Kernelization • Kernel reweighting: to couple nearby snapshots
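The coupling between nearby snapshots can be illustrated with a small sketch. The slides do not give the kernel itself, so the Gaussian form, the fixed `bandwidth` parameter, and the function names below are assumptions made for illustration only. DHAC as described uses an adaptive bandwidth, which this sketch does not attempt to reproduce.

```python
import math

def kernel_weights(t, num_snapshots, bandwidth):
    """Normalized weights that snapshot t assigns to every snapshot s.

    Assumption: a Gaussian kernel in the time index with a fixed bandwidth.
    """
    raw = [math.exp(-((s - t) ** 2) / (2.0 * bandwidth ** 2))
           for s in range(num_snapshots)]
    total = sum(raw)
    return [w / total for w in raw]

def smoothed_count(counts, t, bandwidth):
    """Kernel-weighted average of a per-snapshot statistic around snapshot t."""
    weights = kernel_weights(t, len(counts), bandwidth)
    return sum(w * c for w, c in zip(weights, counts))

# Weights seen from the middle snapshot of five.
weights = kernel_weights(t=2, num_snapshots=5, bandwidth=1.0)

# Hypothetical edge counts between two clusters in five snapshots.
smoothed = smoothed_count([4, 4, 4, 4, 4], t=2, bandwidth=1.0)
```

A small bandwidth makes each snapshot nearly independent, while a large one pools statistics over many time points.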
DHAC Algorithm • for t = 1:T do • Set each vertex to be a single cluster • Initialize the cumulative model comparison score • Compute merging scores of pairs having an edge or a shared neighbor • repeat • Pick the pair i, j with the maximum merging score • Update scores of affected pairs after merging i, j • Merge i, j into i' • Compute merging scores of i', j for all j that share an edge or a neighbor with i' • Update the cumulative score • until no pairs are left • Output the clustering at which the cumulative score was maximum • end for
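The greedy loop for a single snapshot can be sketched as follows. This is not the DHAC score: the merging score here is a simple stand-in (observed minus expected edges between two clusters, using the global edge density), and the function name `greedy_agglomerate` is hypothetical. For brevity the sketch also scores every pair of clusters instead of only pairs with an edge or a shared neighbor.

```python
from itertools import combinations

def greedy_agglomerate(vertices, edges):
    """Greedy bottom-up merging; returns the partition with the best cumulative score."""
    edge_set = {frozenset(e) for e in edges}
    n = len(vertices)
    density = len(edge_set) / (n * (n - 1) / 2)  # global edge density

    def merge_score(a, b):
        # Stand-in score (assumption): observed minus expected edges between a and b.
        between = sum(1 for u in a for v in b if frozenset((u, v)) in edge_set)
        return between - density * len(a) * len(b)

    clusters = [frozenset([v]) for v in vertices]  # each vertex starts alone
    cumulative = 0.0
    best_score, best_clusters = cumulative, list(clusters)
    while len(clusters) > 1:
        # Pick the pair with the maximum merging score and merge it.
        a, b = max(combinations(clusters, 2), key=lambda pair: merge_score(*pair))
        cumulative += merge_score(a, b)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        if cumulative > best_score:
            best_score, best_clusters = cumulative, list(clusters)
    return best_clusters

# Two triangles joined by a single edge.
example_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
partition = greedy_agglomerate([1, 2, 3, 4, 5, 6], example_edges)
```

Tracking the cumulative score and returning the best partition seen, rather than the final single cluster, is what gives the automatic selection of model size.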
Cluster Matching Algorithm • Searching through time-frames to see how complexes evolve • Goal: to find the most probable matching of cluster i to a global index k
Results: Drosophila development (gene expression data available) • DHAC-local: variable bandwidth • DHAC-const: constant bandwidth
Yeast Results • Yeast results identify protein complexes with asynchronous gene expression • 31 dynamic protein complexes were recovered • Many of the complexes have cluster-specific Gene Ontology annotations with P-value < 0.05 • Some of the complexes disappear and then reappear across time-points
Discussion • DHAC scales as O(EJ ln(V)) • Networks with 2000 vertices take up to 5 min. • A full-genome network (10000 to 100000 vertices) can be analyzed within a day to a week • This method permits proteins to switch between complexes over time • Naturally multi-scale: complexes, sub-complexes and proteins
Further improvement • Information from pathway to complex to sub-complex to finer structures could be used • The approach lacks a method to match the dynamically evolving hierarchical structures across snapshots • The authors focused only on the bottom-level complexes, rather than the hierarchical structure
MATCH-EM • Goal: match similar groups across time-points • Find the mapping of each cluster to a global index • There is one and only one global index for each cluster i • The assignment matrix • The probability that vertex u is in global index k
The matching probability under consistent indexing • Number of shared vertices between cluster i at time t and cluster j at time t+1 • Probability that a vertex can make a transition from k to k’ between two consecutive snapshots
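The shared-vertex counts between consecutive snapshots can be computed directly, as in the sketch below. The greedy one-to-one matching that follows is a deliberately simple stand-in and is not the EM procedure used by MATCH-EM. The function names are hypothetical.

```python
def shared_vertex_counts(clusters_t, clusters_next):
    """counts[i][j] = number of vertices shared by cluster i at t and cluster j at t+1."""
    return [[len(a & b) for b in clusters_next] for a in clusters_t]

def greedy_match(clusters_t, clusters_next):
    """Map each cluster j at t+1 to at most one cluster i at t, largest overlap first.

    Assumption: greedy matching as a stand-in for the probabilistic matching.
    """
    counts = shared_vertex_counts(clusters_t, clusters_next)
    pairs = sorted(
        ((counts[i][j], i, j)
         for i in range(len(clusters_t))
         for j in range(len(clusters_next))),
        key=lambda item: -item[0],
    )
    match, used_i = {}, set()
    for count, i, j in pairs:
        if count > 0 and i not in used_i and j not in match:
            match[j] = i
            used_i.add(i)
    return match  # clusters at t+1 missing from the result are treated as new

clusters_t = [{"a", "b", "c"}, {"d", "e"}]
clusters_next = [{"d", "e", "f"}, {"a", "b"}, {"g"}]
counts = shared_vertex_counts(clusters_t, clusters_next)
match = greedy_match(clusters_t, clusters_next)
```

In this example the cluster containing only "g" shares no vertices with any earlier cluster, so it would receive a fresh global index.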
Experimental Data • Combining gene expression time series with static protein interaction networks • The presence of a protein is assumed to be related to the abundance of the corresponding transcript at a nearby time • N x T matrix: transcription levels of N genes across T time points • The dynamics of the networks are generated from the transcription matrix, under the assumption that proteins in a complex have correlated gene expression profiles
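One simple way to turn a static interaction network and an expression matrix into snapshots is sketched below. The thresholding rule is an assumption chosen for illustration; the slides only say that protein presence is related to transcript abundance at a nearby time. The function name and the `threshold` parameter are hypothetical.

```python
def build_snapshots(static_edges, expression, threshold):
    """Return one (present_proteins, edges) pair per time point.

    Assumption: a protein is present at time t if its transcript level
    at t is at least `threshold`; an edge is kept if both endpoints are present.
    """
    num_timepoints = len(next(iter(expression.values())))
    snapshots = []
    for t in range(num_timepoints):
        present = {g for g, levels in expression.items() if levels[t] >= threshold}
        edges = {(u, v) for u, v in static_edges if u in present and v in present}
        snapshots.append((present, edges))
    return snapshots

# Hypothetical data: 3 genes, 3 time points.
static_edges = [("A", "B"), ("B", "C")]
expression = {
    "A": [1.0, 0.2, 0.9],
    "B": [0.8, 0.9, 0.7],
    "C": [0.1, 0.95, 0.6],
}
snapshots = build_snapshots(static_edges, expression, threshold=0.5)
```

Each snapshot can then be clustered on its own or coupled to its neighbors in time.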
Results: Held-out link prediction • Randomly select two connected vertices u and v, and remove the edge between them • After clustering, vertex u is assigned to cluster i, and vertex v to cluster j • The maximum-likelihood probability that u and v were connected is used as the prediction score
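The formula from the slide did not survive, so the sketch below uses the usual maximum-likelihood estimate for a stochastic block model: the fraction of possible vertex pairs between the two clusters that are actually connected. Treat this as an assumption about what the slide showed. The function name is hypothetical, and self-loops are assumed absent.

```python
def link_probability(u, v, assignment, edges):
    """Block-model estimate of the probability that u and v are connected."""
    gi, gj = assignment[u], assignment[v]
    size_i = sum(1 for g in assignment.values() if g == gi)
    size_j = sum(1 for g in assignment.values() if g == gj)
    edge_set = {frozenset(e) for e in edges}
    if gi == gj:
        # Pairs inside one cluster.
        possible = size_i * (size_i - 1) // 2
        observed = sum(1 for e in edge_set if all(assignment[x] == gi for x in e))
    else:
        # Pairs with one endpoint in each cluster.
        possible = size_i * size_j
        observed = sum(1 for e in edge_set if {assignment[x] for x in e} == {gi, gj})
    return observed / possible if possible else 0.0

# Hypothetical clustering after the edge (1, 3) was held out.
assignment = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}
remaining_edges = [(1, 2), (2, 3), (3, 4)]
p_within = link_probability(1, 3, assignment, remaining_edges)
p_between = link_probability(1, 4, assignment, remaining_edges)
```

A held-out edge inside a dense cluster receives a high score, while a pair spanning two sparsely connected clusters receives a low one.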
• AUPRC: area under the precision-recall curve • AUROC: area under the receiver operating characteristic curve (true-positive rate versus false-positive rate)
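Both metrics can be computed from a list of prediction scores and true labels. The sketch below uses the pairwise-ranking form of AUROC and average precision as an estimate of AUPRC. Average precision is one common estimator, and whether the original study used it is not stated in the slides. Tied scores are not handled specially in `average_precision`.

```python
def auroc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda item: -item[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical scores for five held-out vertex pairs (1 = true edge).
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]
roc = auroc(scores, labels)
ap = average_precision(scores, labels)
```

AUPRC is usually the more informative of the two for link prediction, because true edges are rare compared with non-edges.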
Yeast Metabolic Cycle (YMC) • Three dominant metabolic states: • Reductive Building (RB): 977 genes • Reductive Charging (RC): 1510 genes • Oxidative (OX): 1023 genes • 36 snapshots • Preprocessing: iterative degree cutoff, reducing the number of proteins from 1380 to 480±14
Macro-view of YMC (figure: RC phase, OX phase, RB phase)
Micro-views of YMC dynamics • Cluster #7: mitochondrial ribosome complex • RSMs: ribosomal small subunits of mitochondria • MRPs: mitochondrial ribosomal proteins • RSM22 is active at t=9, 20 & 32, while other proteins are not transcribed • Methylation of the 3’-end of the rRNA of the small mitochondrial subunit is required for the assembly and stability of the mitochondrial ribosome • Deleting RSM22 yields a viable cell with non-functional mitochondria • Hypothesis: early expression of RSM22 provides the methylation activity required for the assembly of the small subunits of the mitochondrial ribosome
Cluster #7: mitochondrial ribosome complex (figure: average expression levels during the three main phases)
Micro-views of YMC dynamics Cluster #16: nuclear pore • Active at t=9, 20 & 32 • Most genes are OX-responsive • Combines with subunits of other complexes • The co-expressed cores: • Nuclear pore complex (NPC) • Karyopherin proteins (KAP)
Cluster #16: nuclear pore complex • During the OX phase, SRP1 and SXM1 are additionally recruited
What did we learn from YMC? • RRP4 and RRP42 are part of the exosome, which edits RNA molecules; they transition between the nuclear pore and other complexes • RNA processing is tightly coupled to transport through the nuclear pore to the cytoplasm • Dynamic reorganization of the nuclear pore occurs during the metabolic cycle