Modularity in Biological networks

Traditional view of modularity: Modularity in Cellular Networks • Hypothesis: Biological functions are carried out by discrete functional modules. • Hartwell, L. H., Hopfield, J. J., Leibler, S., & Murray, A. W., Nature, 1999. • Question: Is modularity a myth, or is it a structural property of biological networks (that is, are biological networks fundamentally modular)?

Modularity in cell biology

Definition of a module • Loosely linked island of densely connected nodes • Groups of co-expressed genes

Concept of modules in a network

Definition of a module

Computational analysis of modular structures: Data clustering approach

Concept of data clustering analysis • Partitioning a data set into groups so that points in one group are similar to each other and as different as possible from the points in other groups. • The validity of a clustering is often in the eye of the beholder.

Concept of data clustering analysis • To decide whether two data points are similar, we need to define a similarity measure. • We also need a score function that expresses our objective. • A clustering algorithm then partitions the data set so that the score function is optimized.

Types of clustering algorithms • Partition-based clustering algorithms • Hierarchical clustering algorithms • Probabilistic model-based clustering algorithms

Partitioning problem • Given a network of n nodes, D={x(1),x(2),∙∙∙,x(n)}, the task is to find K clusters C={C1,C2,∙∙∙,CK} such that each node x(i) is assigned to a unique cluster Ck and the score function S(C1,C2,∙∙∙,CK) is optimized.

Community structure of a biological network [Figure: a network partitioned into Community 1, Community 2, and Community 3]

Score function for network clustering • Maximize the number of intra-group connections and minimize the number of inter-group connections.
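As a rough illustration of this objective, the sketch below counts intra-group and inter-group edges for a given partition. The function name `count_intra_inter` and the toy network (two triangles joined by one edge) are made up for this example and do not come from the slides.

```python
def count_intra_inter(edges, membership):
    """Return (intra, inter) edge counts for a partition.

    edges: iterable of (u, v) pairs of an undirected network.
    membership: dict mapping each node to its cluster label.
    """
    intra = inter = 0
    for u, v in edges:
        if membership[u] == membership[v]:
            intra += 1
        else:
            inter += 1
    return intra, inter


# Hypothetical toy network: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
bad = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}

print(count_intra_inter(edges, good))  # (6, 1)
print(count_intra_inter(edges, bad))   # (2, 5)
```

A partition that follows the two triangles keeps six edges inside groups and cuts only one, while the alternating labeling cuts five.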

Spectral analysis clustering algorithm

Adjacency Matrix • Aij=1 if the ith protein interacts with the jth protein • Aij=0 otherwise • Aij=Aji (undirected graph) • A is a sparse matrix: most of its elements Aij are zero
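A minimal sketch of how such a matrix could be built from an interaction list. The four-protein interaction list is invented for illustration; for a real, sparse network a dictionary of neighbor sets or a sparse-matrix type would usually be preferable to the dense list-of-lists used here.

```python
def adjacency_matrix(n, interactions):
    """Build a symmetric 0/1 adjacency matrix for an undirected network."""
    A = [[0] * n for _ in range(n)]
    for i, j in interactions:
        A[i][j] = 1
        A[j][i] = 1  # undirected: Aij = Aji
    return A


# Hypothetical interactions among four proteins, indexed 0..3.
interactions = [(0, 1), (0, 2), (1, 2), (2, 3)]
A = adjacency_matrix(4, interactions)

is_symmetric = all(A[i][j] == A[j][i] for i in range(4) for j in range(4))
nonzero = sum(map(sum, A))  # each interaction contributes two entries

for row in A:
    print(row)
```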

Spectral analysis

Algorithm (Spectral analysis) • Start from a randomly assigned vector X=(X1,X2,…,Xn) • Iterate X(k+1)=AX(k), renormalizing X at each step, until it converges • Repeat with another vector perpendicular to the previously found eigenspace
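The iteration above is the power method with deflation. The following is a minimal sketch under a few assumptions that the slide does not state: the vector is renormalized at every step, a fixed number of iterations stands in for a convergence test, and the eigenvalue is read off as the Rayleigh quotient. The test network (a 4-clique and a separate 3-clique) and all function names are illustrative only.

```python
import math
import random


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def project_out(x, found):
    """Remove from x its components along previously found eigenvectors."""
    for v in found:
        c = dot(x, v)
        x = [xi - c * vi for xi, vi in zip(x, v)]
    return x


def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]


def power_iteration(A, found=(), iterations=500, seed=0):
    """Return (eigenvalue, eigenvector) dominant in the space orthogonal to `found`."""
    rng = random.Random(seed)
    x = normalize(project_out([rng.random() for _ in A], found))
    for _ in range(iterations):
        x = normalize(project_out(matvec(A, x), found))
    return dot(x, matvec(A, x)), x  # Rayleigh quotient and eigenvector


def clique(nodes):
    return [(u, v) for k, u in enumerate(nodes) for v in nodes[k + 1:]]


# Hypothetical network: a 4-clique on nodes 0..3 and a 3-clique on nodes 4..6.
A = [[0] * 7 for _ in range(7)]
for i, j in clique([0, 1, 2, 3]) + clique([4, 5, 6]):
    A[i][j] = A[j][i] = 1

lam1, v1 = power_iteration(A)
lam2, v2 = power_iteration(A, found=[v1])


def support(v):
    """Nodes carrying a non-negligible component of the eigenvector."""
    return [i for i, c in enumerate(v) if abs(c) > 0.1]


print(round(lam1, 6), support(v1))  # 3.0 [0, 1, 2, 3]
print(round(lam2, 6), support(v2))  # 2.0 [4, 5, 6]
```

In this toy case each of the two largest positive eigenvalues has an eigenvector concentrated on one clique. Note that the plain power method converges to the eigenvalue of largest magnitude, so it can stall when eigenvalues +λ and -λ of equal magnitude are both present, as happens in bipartite-like networks.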

Topological Structure [Figure: the original network and its hidden topological structure]

An example Protein-protein interaction network of Saccharomyces cerevisiae

Data source • Each of 80000 interactions among 5400 yeast proteins is assigned a confidence value. • We take the 11855 interactions with high and medium confidence among 2617 proteins, including 353 proteins of unknown function.

[Figure: a quasi-bipartite (negative eigenvalue) and a quasi-clique (positive eigenvalue)]

With the spectral analysis, we obtain 48 quasi-cliques and 6 quasi-bipartites. • A quasi-clique can contain annotated, unannotated, and unknown proteins

Application—function prediction

Hierarchical clustering algorithm • A similarity (distance) measure between nodes i and j, d(i,j) • The similarity measure turns the network into a weighted network Wij.

Types of hierarchical clustering • Agglomerative hierarchical clustering • Divisive hierarchical clustering

Properties of similarity measure • d(i,j)≥0 • d(i,j)=d(j,i) • d(i,j)≤d(i,k)+d(k,j)
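The three properties listed above (non-negativity, symmetry, triangle inequality) can be checked by brute force on a small node set. The sketch below is illustrative only; the helper name `is_valid_distance` and the one-dimensional toy positions are assumptions made for this example.

```python
from itertools import product


def is_valid_distance(d, nodes, tol=1e-12):
    """Check d(i,j) >= 0, d(i,j) = d(j,i), and d(i,j) <= d(i,k) + d(k,j)."""
    for i, j in product(nodes, repeat=2):
        if d(i, j) < 0 or abs(d(i, j) - d(j, i)) > tol:
            return False
    for i, j, k in product(nodes, repeat=3):
        if d(i, j) > d(i, k) + d(k, j) + tol:
            return False
    return True


# Hypothetical nodes placed on a line.
positions = {"a": 0.0, "b": 1.0, "c": 3.0}
nodes = list(positions)


def absolute(i, j):
    return abs(positions[i] - positions[j])


def squared(i, j):
    return (positions[i] - positions[j]) ** 2


print(is_valid_distance(absolute, nodes))  # True
print(is_valid_distance(squared, nodes))   # False: d(a,c)=9 > d(a,b)+d(b,c)=5
```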

Similarity measure for agglomerative clustering • Correlation • Shortest path length • Edge betweenness

How good is agglomerative clustering?

Hierarchical tree (Dendrogram) [Figure: dendrogram cut at a threshold]

Distance between clusters: Single link [Figure: Cluster 1 and Cluster 2]

Distance between clusters: Complete link [Figure: Cluster 1 and Cluster 2]
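The two linkage rules differ only in which pair of points defines the distance between clusters: single link takes the closest pair and complete link the farthest pair. A small sketch, with made-up one-dimensional clusters:

```python
def single_link(c1, c2, d):
    """Distance between clusters = distance of their closest pair of members."""
    return min(d(a, b) for a in c1 for b in c2)


def complete_link(c1, c2, d):
    """Distance between clusters = distance of their farthest pair of members."""
    return max(d(a, b) for a in c1 for b in c2)


def d(a, b):
    return abs(a - b)


# Hypothetical clusters of points on a line.
cluster_1 = [0.0, 1.0]
cluster_2 = [3.0, 5.0]

print(single_link(cluster_1, cluster_2, d))    # 2.0 (between 1.0 and 3.0)
print(complete_link(cluster_1, cluster_2, d))  # 5.0 (between 0.0 and 5.0)
```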

Single link [Figure: dendrogram over leaves x2, x3, x1, x4, x5 with merge heights 1.5, 2.0, 2.2, 3.5]
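To make the dendrogram concrete, here is a minimal sketch of single-link agglomerative clustering that records the height of each merge and stops once the closest pair of clusters is farther apart than a threshold. The points below are invented for the example and are not the x1..x5 of the slide; the function name `agglomerate` is also hypothetical.

```python
def agglomerate(points, d, threshold):
    """Single-link agglomerative clustering, cut at a distance threshold.

    Returns the final clusters and the list of merge heights.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under single linkage.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = min(d(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        if dist > threshold:
            break  # cutting the dendrogram at this height
        heights.append(dist)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, heights


def d(a, b):
    return abs(a - b)


# Hypothetical points on a line.
points = [0.0, 1.0, 1.5, 5.0, 7.0]
clusters, heights = agglomerate(points, d, threshold=2.5)

print(sorted(sorted(c) for c in clusters))  # [[0.0, 1.0, 1.5], [5.0, 7.0]]
print(heights)                              # [0.5, 1.0, 2.0]
```

Raising the threshold above 3.5 would allow the last merge and return a single cluster, which is how the choice of cut determines the number of modules.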

Divisive hierarchical clustering • M. E. J. Newman and M. Girvan, Phys. Rev. E 69, 026113 (2004)

Definition of edge betweenness

Calculation of edge betweenness
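The slide's own calculation is not preserved in the transcript. The sketch below assumes the shortest-path definition used by Newman and Girvan: the betweenness of an edge is the number of shortest paths between pairs of nodes that run along it, with a pair's contribution split equally when it has several shortest paths. It uses one breadth-first search per source node; the toy network and function name are illustrative.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness of an undirected, unweighted network.

    adj: dict mapping each node to the set of its neighbors.
    Returns a dict keyed by frozenset({u, v}).
    """
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path fractions.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair was counted once from each endpoint.
    return {e: b / 2 for e, b in bet.items()}


# Hypothetical network: two triangles joined by the edge (2, 3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
bet = edge_betweenness(adj)

bridge = max(bet, key=bet.get)
print(sorted(bridge), bet[bridge])  # [2, 3] 9.0
```

The bridge carries all 3 x 3 = 9 shortest paths between the two triangles, so a divisive algorithm that removes the highest-betweenness edge first would split the network there.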

Quantitative measurement of network modularity: Modularity Q
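The formula on this slide did not survive in the transcript. The sketch below assumes the definition from the Newman and Girvan paper cited above: Q is the sum over communities of (e_cc - a_c^2), where e_cc is the fraction of edges inside community c and a_c is the fraction of edge ends attached to nodes of c. The toy network is invented for illustration.

```python
def modularity(edges, membership):
    """Newman-Girvan modularity Q of a partition of an undirected network."""
    m = len(edges)
    intra = {}   # number of edges inside each community
    degree = {}  # number of edge ends attached to each community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(
        intra.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree
    )


# Hypothetical network: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_module = {node: "A" for node in range(6)}

q_two = modularity(edges, two_modules)  # 2 * (3/7 - (7/14)**2) = 5/14
q_one = modularity(edges, one_module)   # 1 - 1 = 0

print(round(q_two, 4), round(q_one, 4))  # 0.3571 0.0
```

Evaluating Q at every level of the hierarchical tree and keeping the level with the largest value is one way to choose where to cut the dendrogram.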

Threshold selection

Karate club network

Examples of agglomerative hierarchical clustering

Can we identify the modules? • J(i,j): number of nodes that both i and j link to, plus 1 if there is a direct (i,j) link
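J(i,j) can be turned into a similarity score between 0 and 1. The sketch below assumes the normalization used in the topological overlap of Ravasz et al. (Science, 2002), which divides J(i,j) by the smaller of the two node degrees; that normalization is not shown on the slide, and the toy network is made up.

```python
def topological_overlap(adj, i, j):
    """J(i,j) / min(k_i, k_j) for an undirected network.

    adj: dict mapping each node to the set of its neighbors.
    """
    common = len(adj[i] & adj[j])      # nodes both i and j link to
    direct = 1 if j in adj[i] else 0   # +1 for a direct (i, j) link
    return (common + direct) / min(len(adj[i]), len(adj[j]))


# Hypothetical network: two triangles joined by the edge (2, 3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}

print(topological_overlap(adj, 0, 1))  # 1.0: same module
print(topological_overlap(adj, 2, 3))  # 0.333...: linked across modules
print(topological_overlap(adj, 0, 5))  # 0.0: different modules, no shared neighbor
```

The resulting overlap matrix can serve as the similarity measure for agglomerative clustering.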

Modules in the E. coli metabolism • E. Ravasz et al., Science, 2002 [Figure: pyrimidine metabolism module]

Yeast signaling proteins in MIPS • PNAS, vol. 100, p. 1128 (2003).

Spotted microarray for Saccharomyces cerevisiae: Similarity measure

Regulatory module network

Genome Biology, 9, R2, (2008).
