The Structure, Function, and Evolution of Biological Systems

The Structure, Function, and Evolution of Biological Systems Instructor: Van Savage Spring 2010 Quarter 4/20/2010

Other important ways to construct gene networks: Gene regulation and motifs • Organism must be able to respond to environment and gene expression is one way to do this • One gene can create a protein that either activates or represses another gene X Y activation X Y repression X Y X* promoter region (easily evolvable)

Draw network for these regulations 2 • Arrows denote direction of regulation and give two types of edges or links for each possible gene pair • Edges are not color coded in this work, although they could be to denote activation or repression. More on that later. • Gene regulation and epistatic networks could be combined but has not been done. Regulation may be a type of epistasis. • New type of self link. Not possible for epistasis. How common are these? 1 3 5 4 6

Overrepresentation There are N2 possible edges between N genes, so chance of picking specific edge is 1/N2. If there are E edges in network, you have E chances to pick it, so p~E/N2 chance of picking edge. For a given subgraph, Nnchance of picking the n correct nodes and pg chance of picking correct g edges in subgraph. Expected number of subgraphs with n nodes and g edges is where a is symmetry number and λ=E/N is mean connectivity

Keep connectivity fixed while increasing network size For n=3 subgraphs Two edge patterns become more common Three edge patterns stay constant proportion Four and higher edge patterns become vanishingly small Only one n=3 subgraph is overrepresented: Feed Forward Loop FBL FFL 42 FFLs exist. We would expect on the order of λ3~1.7+/-1.1 0 FBLs exist. May even be selected against.

Milo et al.

Other types of networks can be represented this way

FFLs in real and random networks Maintains each node as having the same number of incoming and outgoing edges. Helps with overcounting and geometry.

Scaling of subgraph concentrations Scaling oft otal number of 3-node subgraphs is simply sum over scaling of each 3-node subgraphs

FFL concentration for real versus random ER networks

Examples of patterns found in different types of networks

What patterns do we see? Sensory information processing networks (electronic circuits, gene regulation, and neurons) seem to consist of FFLs and BiFans. Is design (circuits) or evolution (genes and neurons) better? Design has more FFLs. Will more FFLs evolve in time. Food Webs and world wide web are about fluxes of energy and information and function differently. What is purpose of bi-parallel? World wide web is less hierarchical. What about mean connectivity? How reliable are food web and gene regulation network data? Can generalize to other types of networks

Mean connectivity

Itzkovitz and Alon

Random ER vs. random geometric networks Geometric ER More likely to connect to closer nodes no notion of distance Distance could be in physical or correlation space Toroidal boundary conditions

For ER networks Scaling for geometric graphs For geometric networks <k> is the mean connectivity and what we called λ before

Connectivity function is like a pdf Example with (1-r/R)n Chance of connecting decreases from 1 at zero distance to 0 at distance R. Since we are doing a probabilistic model, this must be Normalized to be a probability distribution function (PDF). Multiply PDF by <k> to get connectivity function. Same way Itzkovitz’z two choices are constructed

Use connectivity function to calculateexpectation values of different types of subgraphs Expectation value of triangle at nodes x,y,z. Choose x as origin without loss of generality. Factor of 6 is symmetry number to deal with over counting.

Other example connectivity functions—Gaussian and step functions

Comparison of geometric and ER networks Starred graphs are overrepresented but because of geometry, not selection

Effects of radius of influence, R

Accounting for direction of arrowsmodifies field and symmetry factors

Look at ratios of subgraphs, find invariants, and express scaling in terms of them

New method for comparing patterns to geometric and ER graphs p=1 is isotropic p=0 is fully anisotropic Values in parenthesis are absolute counts for subgraph normalized to 1 in ratio

Effects of anisotropy, p, on frequencyof subgraphs and ratios p=1 is isotropic p=0 is fully anisotropic Anisotropy (directedness) increases these ratios

Problems with this approach All subgraphs are taken out of context. Is that reasonable or appropriate? How useful is it?

Second Homework set is due in two weeks (May 4, 2010).

The Structure, Function, and Evolution of Biological Systems