Reconstructing gene regulatory networks with probabilistic models. Dirk Husmeier, Marco Grzegorczyk
Network unknown High-throughput experiments Postgenomic data Machine learning Statistics
Overview • Introduction • Bayesian networks • Comparative evaluation • Integration of biological prior knowledge • A non-homogeneous Bayesian network for non-stationary processes • Current work
Description with differential equations: concentrations, kinetic parameters θ, rates
Given: gene expression time series. Can we infer the correct gene regulatory network?
Parameters θ known: numerically integrate the differential equations for different hypothetical networks
Model selection for known parameters θ: compare the gene expression time series predicted with different models against the measured gene expression time series. Highest likelihood: best model
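The comparison of predicted and measured time series can be sketched as follows. This is a minimal illustration, not the authors' model: the two-gene ODE, the parameter values (`PRODUCTION`, `DECAY`, `RATE`, `SIGMA`), the Euler integrator and the Gaussian noise model are all assumptions made for the example.

```python
import math
import random

# Hypothetical kinetic parameters, assumed known for this example.
PRODUCTION, DECAY, RATE, SIGMA = 1.0, 1.0, 1.0, 0.1

def simulate(edge_a_to_b, steps=400, dt=0.01, every=40):
    """Euler-integrate a toy two-gene ODE and return sampled (A, B) concentrations."""
    a, b = 0.5, 0.0
    series = []
    for step in range(1, steps + 1):
        da = PRODUCTION - DECAY * a
        db = (RATE * a if edge_a_to_b else 0.0) - DECAY * b
        a, b = a + dt * da, b + dt * db
        if step % every == 0:
            series.append((a, b))
    return series

def log_likelihood(measured, predicted, sigma=SIGMA):
    """Gaussian log-likelihood of the measurements given a predicted series."""
    norm = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(
        norm - (m - p) ** 2 / (2 * sigma ** 2)
        for m_point, p_point in zip(measured, predicted)
        for m, p in zip(m_point, p_point)
    )

# Synthetic "measured" data: the network with the edge A -> B plus noise.
rng = random.Random(0)
measured = [
    (a + rng.gauss(0, SIGMA), b + rng.gauss(0, SIGMA))
    for a, b in simulate(edge_a_to_b=True)
]

# Score each hypothetical network and pick the one with the highest likelihood.
hypotheses = {"A activates B": True, "no regulation": False}
scores = {name: log_likelihood(measured, simulate(edge))
          for name, edge in hypotheses.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

With known parameters the likelihood alone is enough to rank the candidate networks; no parameter fitting takes place in this sketch.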
Model selection for unknown parameters θ: compare the gene expression time series predicted with different models against the measured gene expression time series. Highest likelihood: over-fitting
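Bayesian model selection. Select the model with the highest posterior probability: $P(M \mid D) \propto P(D \mid M)\,P(M)$. This requires an integration over the whole parameter space: $P(D \mid M) = \int P(D \mid \theta, M)\,P(\theta \mid M)\,d\theta$. This integral is usually intractable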
Marginal likelihoods for the alternative pathways. Computationally expensive; network reconstruction ab initio unfeasible
Overview • Introduction • Bayesian networks • Comparative evaluation • Integration of biological prior knowledge • A non-homogeneous Bayesian network for non-stationary processes • Current work
Objective: Reconstruction of regulatory networks ab initio. Higher level of abstraction: Bayesian networks
Bayesian networks • Marriage between graph theory and probability theory. • Directed acyclic graph (DAG) representing conditional independence relations. • It is possible to score a network in light of the data: P(D|M), D: data, M: network structure. • We can infer how well a particular network explains the observed data. Figure: example DAG with nodes A to F connected by directed edges
Bayes net ODE model
Linear model: $[A] = w_1[P_1] + w_2[P_2] + w_3[P_3] + w_4[P_4] + \text{noise}$. Figure: parent nodes P1 to P4 pointing to node A, with edge weights w1 to w4
Nonlinear discretized model. Figure: activator P1 and repressor P2 acting on a target, with outcomes activation and inhibition. Allow for noise: probabilities. Conditional multinomial distribution
Model parameters θ: integral analytically tractable!
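A minimal sketch of why the integral becomes tractable for the discretized model: with a conditional multinomial likelihood and a Dirichlet prior, the marginal likelihood of one node reduces to a ratio of gamma functions. The uniform hyperparameter `alpha` and the function name are assumptions for illustration; the slides do not state which prior is used.

```python
import math

def log_marginal_likelihood(counts, alpha=1.0):
    """Log marginal likelihood of one node under a Dirichlet-multinomial model.

    counts[j][k] is the number of observations in which the node took state k
    while its parents were in configuration j. alpha is an assumed uniform
    Dirichlet hyperparameter per state.
    """
    total = 0.0
    for row in counts:
        alpha_j = alpha * len(row)
        n_j = sum(row)
        total += math.lgamma(alpha_j) - math.lgamma(alpha_j + n_j)
        for n_jk in row:
            total += math.lgamma(alpha + n_jk) - math.lgamma(alpha)
    return total

# Toy data: a binary child that copies its binary parent in 20 observations.
with_parent = log_marginal_likelihood([[10, 0], [0, 10]])  # one row per parent state
without_parent = log_marginal_likelihood([[10, 10]])       # single row, no parent
print(with_parent, without_parent)
```

The score with the parent is higher here because the parent explains the child's state perfectly, while the parameters have been integrated out in closed form rather than fitted.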
Example: 2 genes, 16 different network structures. Best network: maximum score
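One reading consistent with the count of 16 is that each of the four possible directed edges between two genes, self-loops included, can be present or absent (2^4 = 16), as in a time-lagged network. That reading is an assumption, and `toy_score` below is a hypothetical stand-in for the network score, used only to show exhaustive enumeration.

```python
from itertools import product

genes = ("X", "Y")
# All ordered pairs, including self-loops: X->X, X->Y, Y->X, Y->Y.
possible_edges = [(src, dst) for src in genes for dst in genes]

# Every subset of the possible edges is one candidate structure.
structures = [
    tuple(edge for edge, present in zip(possible_edges, mask) if present)
    for mask in product((False, True), repeat=len(possible_edges))
]

def toy_score(structure):
    """Hypothetical score: rewards the edge X->Y, penalizes every edge."""
    reward = 2.0 if ("X", "Y") in structure else 0.0
    return reward - 0.5 * len(structure)

best = max(structures, key=toy_score)
print(len(structures), best)
```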
Identify the best network structure Ideal scenario: Large data sets, low noise
Uncertainty about the best network structure. Limited number of experimental replications, high noise
Sample of high-scoring networks. Feature extraction, e.g. marginal posterior probabilities of the edges. Figure labels: uncertainty about edges, high-confidence edge, high-confidence non-edge
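Marginal posterior probabilities of the edges can be estimated as the fraction of sampled networks that contain each edge. The sketch below assumes an equally weighted sample (as produced by an MCMC run); the four sampled networks are made up for illustration.

```python
def edge_posteriors(sampled_networks, nodes):
    """Fraction of sampled networks containing each directed edge."""
    counts = {(a, b): 0 for a in nodes for b in nodes if a != b}
    for edges in sampled_networks:
        for edge in edges:
            counts[edge] += 1
    n = len(sampled_networks)
    return {edge: c / n for edge, c in counts.items()}

# Hypothetical sample of four high-scoring networks over nodes A, B, C.
sample = [
    {("A", "B")},
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "B")},
]
posteriors = edge_posteriors(sample, nodes=("A", "B", "C"))
print(posteriors)
```

In this toy sample A->B appears in every network (a high-confidence edge), A->C never appears (a high-confidence non-edge), and the edges between B and C remain uncertain.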
Can we generalize this scheme to more than 2 genes? In principle yes. However …
Plot: number of network structures against number of nodes
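The super-exponential growth can be made concrete by counting labelled directed acyclic graphs (without self-loops) with Robinson's recurrence. The slide does not say which formula produced its plot, so this is offered only as one standard way to obtain such counts.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in range(1, 11):
    print(n, count_dags(n))
```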
Complete enumeration unfeasible. Hill climbing: accept a move when the score increases
Local optimum Configuration space of network structures
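MCMC with local changes: if the score increases, accept the move; if it decreases, accept with probability $P(M_{\text{new}} \mid D) / P(M_{\text{old}} \mid D)$. Configuration space of network structures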
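The acceptance step can be sketched as a Metropolis rule on log scores. This sketch assumes a symmetric proposal; for network structures the neighbourhood sizes generally differ, which calls for an additional Hastings correction that is omitted here. The one-dimensional `LOG_SCORES` landscape is a hypothetical stand-in for the configuration space of network structures.

```python
import math
import random

def metropolis_accept(log_score_old, log_score_new, rng):
    """Metropolis rule for a symmetric proposal, working with log scores."""
    if log_score_new >= log_score_old:
        return True
    return rng.random() < math.exp(log_score_new - log_score_old)

# Toy one-dimensional "configuration space" with hypothetical log scores.
LOG_SCORES = [0.0, 1.0, 3.0, 1.0, 0.0]

def run_chain(n_iter=20000, seed=0):
    """Run a local-move chain and count how often each state is visited."""
    rng = random.Random(seed)
    state = 0
    visits = [0] * len(LOG_SCORES)
    for _ in range(n_iter):
        # Local change: step one position left or right.
        proposal = state + (1 if rng.random() < 0.5 else -1)
        in_range = 0 <= proposal < len(LOG_SCORES)
        # Out-of-range proposals are rejected, which keeps the proposal symmetric.
        if in_range and metropolis_accept(LOG_SCORES[state], LOG_SCORES[proposal], rng):
            state = proposal
        visits[state] += 1
    return visits

visits = run_chain()
print(visits)
```

Unlike hill climbing, the chain occasionally accepts downhill moves, so visit frequencies approximate the normalized scores rather than collapsing onto a single optimum.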
Problem: local changes mean small steps and hence slow convergence; valleys are difficult to cross. Configuration space of network structures
Problem: global changes mean large steps with low acceptance and hence slow convergence. Configuration space of network structures
Can we make global changes that jump onto other peaks and are likely to be accepted? Configuration space of network structures
MCMC trace plots for the conventional scheme and the new scheme, plotted against iteration number
Overview • Introduction • Bayesian networks • Comparative evaluation • Integration of biological prior knowledge • A non-homogeneous Bayesian network for non-stationary processes • Current work
Example: Protein signalling pathway. Figure labels: cell membrane, phosphorylation, nucleus, TF -> cell response
Evaluation on the Raf signalling pathway. Figure legend: receptor molecules, cell membrane, activation, inhibition, interaction in signalling pathway, phosphorylated protein. From Sachs et al., Science 2005
Flow cytometry data • Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins • 5400 cells have been measured under 9 different cellular conditions (cues) • Downsampling to 100 instances (5 separate subsets): indicative of microarray experiments
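The downsampling step can be sketched as drawing five random subsets of 100 cell indices out of 5400. Whether the original subsets were disjoint, or stratified across the 9 conditions, is not stated; this sketch simply draws each subset independently without replacement, and the function name and seed are assumptions.

```python
import random

def downsample(n_cells, subset_size, n_subsets, seed=0):
    """Draw n_subsets independent random subsets of cell indices."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_cells), subset_size))
        for _ in range(n_subsets)
    ]

subsets = downsample(n_cells=5400, subset_size=100, n_subsets=5)
print([len(s) for s in subsets])
```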