Motif Mining from Gene Regulatory Networks • Based on the publications of Uri Alon’s group • Presented by Pavlos Pavlidis, Tartu University, December 2005
Gene Regulatory Networks • From Wikipedia: A gene regulatory network is a collection of DNA segments in a cell which interact with each other and with other substances in the cell, thereby governing the rates at which genes in the network are transcribed into mRNA. • From DOE: Gene regulatory networks (GRNs) are the on-off switches and rheostats…dynamically orchestrate the level of expression for each gene….
How can networks regulate gene expression? • U. Alon and his group stress the importance of the building blocks of the network. • These building blocks are called motifs.
Motifs • They are also called n-node subgraphs of a directed graph (the work has also been extended to undirected graphs). • They are characterized by the number n of nodes and by the relations between them, i.e. the directed edges.
Feed-Forward Loop (FFL) • X regulates Y, and X and Y together regulate Z. • It rapidly regulates the production of Z.
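To make the feed-forward loop concrete as a 3-node subgraph, here is a minimal Python sketch that lists every FFL in a directed graph given as an edge list. It assumes the standard FFL wiring X→Y, X→Z, Y→Z (the slide text itself does not spell the edges out), and the function name `find_ffls` is only illustrative. It uses brute-force enumeration over node triples and is not the algorithm used by Alon's group.

```python
from itertools import permutations

def find_ffls(edges):
    """Return all (x, y, z) triples wired as a feed-forward loop.

    Assumed wiring: x -> y, x -> z and y -> z.
    Extra edges among the three nodes are not excluded, so this
    matches subgraphs that *contain* an FFL, not only induced ones.
    """
    edge_set = set(edges)
    nodes = {node for edge in edges for node in edge}
    ffls = []
    # Try every ordered choice of three distinct nodes (O(n^3)).
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (x, z) in edge_set and (y, z) in edge_set:
            ffls.append((x, y, z))
    return sorted(ffls)

# Toy network (hypothetical node names, not real genes).
toy_edges = [("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Y", "W"), ("Z", "W")]
ffls_in_toy = find_ffls(toy_edges)
print(ffls_in_toy)
```

The cubic enumeration is fine for small examples; for networks with hundreds of nodes one would normally iterate over edges and neighbor sets instead.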
Which motifs are they interested in? • Not in biologically significant ones. • They do not know a priori whether a motif is biologically significant. • They can calculate statistical significance. • The probability that a randomized network contains the same number of instances of a particular motif, or more, must be smaller than P. Here P is 0.01.
Randomized Network • A randomized network is not completely random. It preserves some properties of the real network: • The same number of nodes as in the real network. • For each node, the numbers of incoming and outgoing edges equal those in the real network.
Representation of the network as a matrix M • Randomization: randomly select two cells whose value is 1, e.g. A(1,3) and B(2,1). If A’(1,1) and B’(2,3) are both 0, then swap: set A and B to 0 and A’ and B’ to 1. • Goal: the randomized network must have the same column sums and row sums as the real network. • Columns: the number of outgoing edges. • Rows: the number of incoming edges.
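The swap rule on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the group's implementation: the matrix is a plain list of 0/1 lists, `M[i][j] == 1` is read as an edge from node j (column) to node i (row) to match the slide's row and column convention, and the names `swap_step` and `randomize` are hypothetical. Like the slide, the sketch does not forbid diagonal cells (self-loops).

```python
import random

def swap_step(M, rng):
    """Try one swap on the 0/1 matrix M (in place); return True if it happened."""
    ones = [(i, j) for i, row in enumerate(M) for j, v in enumerate(row) if v == 1]
    (r1, c1), (r2, c2) = rng.sample(ones, 2)   # cells A and B
    if r1 == r2 or c1 == c2:
        return False                           # swap would change nothing
    # A' = (r1, c2) and B' = (r2, c1) must both be empty.
    if M[r1][c2] == 0 and M[r2][c1] == 0:
        M[r1][c1] = M[r2][c2] = 0
        M[r1][c2] = M[r2][c1] = 1
        return True
    return False

def randomize(M, n_attempts, seed=None):
    """Return a randomized copy of M with unchanged row and column sums."""
    rng = random.Random(seed)
    R = [row[:] for row in M]                  # leave the input untouched
    for _ in range(n_attempts):
        swap_step(R, rng)
    return R

# Toy 4-node network (hypothetical data).
real = [[0, 0, 1, 0],
        [1, 0, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
shuffled = randomize(real, n_attempts=200, seed=1)
print([sum(row) for row in shuffled], [sum(col) for col in zip(*shuffled)])
```

Each accepted swap removes one 1 and adds one 1 in each of the two rows and each of the two columns involved, which is why every node keeps its number of incoming and outgoing edges.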
One more requirement: • If we are looking for n-node subgraphs, then the number of (n-1)-node subgraphs must be the same in the real and the randomized networks. • This is done to avoid assigning high significance to a structure only because it includes a highly significant substructure.
Significance of a motif • Three requirements: • P < 0.01. P was estimated (or bounded) by using 1000 randomized networks. • The number of times the motif appears in the real network with distinct sets of nodes is at least U = 4. • The number of appearances in the real network is significantly larger than in the randomized networks: Nreal – Nrand > 0.1 Nrand (Why??).
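The three requirements can be combined into one check, sketched below in Python. Two points are assumptions rather than statements from the slide: P is estimated as the fraction of randomized networks containing at least as many instances as the real network, and Nrand is taken to be the mean count over the randomized networks. The function name `is_motif` and all counts in the example are made up for illustration.

```python
from statistics import mean

def is_motif(n_real, n_distinct, rand_counts, p_max=0.01, u_min=4, margin=0.1):
    """Apply the three significance requirements to one subgraph type.

    n_real      -- appearances in the real network
    n_distinct  -- appearances with distinct sets of nodes
    rand_counts -- appearances in each randomized network (e.g. 1000 values)
    """
    # Requirement 1: empirical P (assumed estimator) must be below p_max.
    p_est = sum(c >= n_real for c in rand_counts) / len(rand_counts)
    # Requirement 2: at least u_min appearances on distinct node sets.
    enough_distinct = n_distinct >= u_min
    # Requirement 3: Nreal - Nrand > margin * Nrand (Nrand assumed to be the mean).
    n_rand = mean(rand_counts)
    clearly_larger = (n_real - n_rand) > margin * n_rand
    return p_est < p_max and enough_distinct and clearly_larger

# Made-up counts from 1000 hypothetical randomized networks.
rand_counts = [7] * 995 + [45] * 5
motif_check = is_motif(n_real=40, n_distinct=10, rand_counts=rand_counts)
print(motif_check)
```

Keeping the thresholds as keyword arguments makes it easy to see how the verdict changes when P, U, or the 0.1 margin is varied.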
What did they find? • In biological systems such as E. coli or S. cerevisiae, only certain types of motifs are statistically significant. • When they studied other systems, other kinds of motifs were statistically significant: • Food webs: the database of seven ecosystem food webs. • Neuronal networks: the neural system of C. elegans. • The WWW.
Motif types • FFL (feed-forward loop) • SIM (single-input module) • DOR (dense overlapping regulons)
FFL • Biological example: the L-arabinose utilization system. • Crp is the general transcription factor and AraC is the specific transcription factor.
FFL • Coherent • Incoherent • Important for the speed of response
Software • mDraw: network visualization tool (mfinder and network motifs visualization tool embedded)