
Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme

Dirk Husmeier, Marco Grzegorczyk. 1) Biomathematics & Statistics Scotland; 2) Centre for Systems Biology at Edinburgh.


Presentation Transcript


  1. Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme. Dirk Husmeier, Marco Grzegorczyk. 1) Biomathematics & Statistics Scotland; 2) Centre for Systems Biology at Edinburgh

  2. Systems Biology

  3. Protein activation cascade. Diagram labels: cell membrane, phosphorylation, nucleus, TF, TF -> cell response

  4. Raf signalling network. From Sachs et al., Science 2005

  5. Diagram labels: unknown; high-throughput experiments; postgenomic data; machine learning; statistical methods

  6. Differential equation models • Multiple parameter sets can offer equally plausible solutions. • Multimodality in parameter space: point estimates become meaningless. • Overfitting problem -> not suitable for model selection. • Bayesian approach: computing the marginal likelihood is computationally challenging.

  7. Bayesian networks • Marriage between graph theory and probability theory. • Directed acyclic graph (DAG) representing conditional independence relations. • It is possible to score a network in light of the data: P(D|M), D: data, M: network structure. • We can infer how well a particular network explains the observed data. [Figure: example DAG with labelled NODES A, B, C, D, E, F and EDGES]

  8. Learning Bayesian networks: P(M|D) = P(D|M) P(M) / Z, where M is the network structure, D the data, and Z the normalization constant

  9. MCMC in structure space: Madigan & York (1995), Giudici & Castelo (2003)
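To make the idea of MCMC in structure space concrete, here is a minimal sketch of a Metropolis-Hastings sampler over DAGs that uses single-edge additions, deletions and reversals. It is an illustration under stated assumptions, not the implementation from the cited papers: the function names, the `frozenset`-of-edges representation and the `toy_log_score` function (a made-up penalty per edge standing in for log P(D|M) + log P(M)) are all hypothetical.

```python
import math
import random
from itertools import permutations

def is_acyclic(edges, n):
    """Return True if the directed graph on nodes 0..n-1 has no directed cycle."""
    parents = {v: {u for (u, w) in edges if w == v} for v in range(n)}
    remaining = set(range(n))
    while remaining:
        # nodes all of whose parents have already been removed
        roots = {v for v in remaining if not (parents[v] & remaining)}
        if not roots:
            return False
        remaining -= roots
    return True

def neighbours(edges, n):
    """All DAGs reachable by adding, deleting or reversing a single edge."""
    result = []
    for u, v in permutations(range(n), 2):
        if (u, v) in edges:
            result.append(edges - {(u, v)})               # deletion
            flipped = (edges - {(u, v)}) | {(v, u)}       # reversal
            if is_acyclic(flipped, n):
                result.append(flipped)
        elif (v, u) not in edges:
            added = edges | {(u, v)}                      # addition
            if is_acyclic(added, n):
                result.append(added)
    return result

def structure_mcmc(log_score, n, steps, seed=0):
    """Sample DAGs with probability proportional to exp(log_score(dag))."""
    rng = random.Random(seed)
    current = frozenset()                                 # start from the empty graph
    samples = []
    for _ in range(steps):
        nbrs = neighbours(current, n)
        proposal = rng.choice(nbrs)
        # Hastings ratio corrects for unequal neighbourhood sizes
        log_ratio = (log_score(proposal) - log_score(current)
                     + math.log(len(nbrs))
                     - math.log(len(neighbours(proposal, n))))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
        samples.append(current)
    return samples

def toy_log_score(dag):
    """Hypothetical score: a fixed penalty per edge (not a real marginal likelihood)."""
    return -0.5 * len(dag)

samples = structure_mcmc(toy_log_score, n=3, steps=2000)
print(len(set(samples)), "distinct DAGs visited")
```

In a real application the toy score would be replaced by a proper network score; the neighbourhood-size correction in the acceptance step is what keeps the chain targeting the intended distribution.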

  10. Alternative paradigm: order MCMC

  11. Instead of MCMC in structure space

  12. MCMC in order space

  13. Problem: Distortion of the prior distribution

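  14. [Figure: the two node orders of a two-node network (A before B, B before A) and the network structures over A and B]

  15. [Figure, build step: probability 0.5 assigned to each node order]

  16. [Figure, build step: probability 0.5 for structures given an order]

  17. [Figure, build step: further 0.5 labels added]

  18. [Figure, final build: values 0.5 on the orders; resulting values 0.25, 0.5 and 0.25 on the structures over A and B]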
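The distortion can be checked by brute force for the two-node case. The sketch below assumes a uniform prior over node orders and, for each order, a uniform prior over the DAGs compatible with it; that reading of the slides is an assumption, and the function name `induced_structure_prior` is made up for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

def induced_structure_prior(nodes):
    """Structure prior implied by a uniform prior over orders and, for each
    order, a uniform prior over the DAGs compatible with that order."""
    orders = list(permutations(nodes))
    prior = {}
    for order in orders:
        # edges allowed by the order: from an earlier node to a later one
        allowed = [(order[i], order[j])
                   for i in range(len(order))
                   for j in range(i + 1, len(order))]
        # every subset of the allowed edges is a DAG compatible with the order
        dags = [frozenset(e for e, keep in zip(allowed, mask) if keep)
                for mask in product([0, 1], repeat=len(allowed))]
        for dag in dags:
            weight = Fraction(1, len(orders) * len(dags))
            prior[dag] = prior.get(dag, Fraction(0)) + weight
    return prior

prior = induced_structure_prior(["A", "B"])
for dag, p in sorted(prior.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(dag), p)
# [] 1/2
# [('A', 'B')] 1/4
# [('B', 'A')] 1/4
```

The empty graph is compatible with both orders and so collects twice the mass of either single-edge graph, even though a uniform prior over the three structures would give each 1/3.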

  19. Current work with Marco Grzegorczyk Proposed new paradigm • MCMC in structure space rather than order space. • Design new proposal moves that achieve faster mixing and convergence.

  20. First idea. Propose new parents from the distribution: • Identify those new parents that are involved in the formation of directed cycles. • Orphan them, and sample new parents for them subject to the acyclicity constraint.

  21. 1) Select a node 2) Sample new parents 3) Find directed cycles 4) Orphan “loopy” parents 5) Sample new parents for these parents

  22. Path via illegal structure Problem: This move is not reversible

  23. Devise a simpler move that is reversible • Identify a pair of nodes X -> Y. • Orphan both nodes. • Sample new parents from the “Boltzmann distribution” subject to the acyclicity constraint such that the inverse edge Y -> X is included.

  24. 1) Select an edge 2) Orphan the nodes involved 3) Constrained resampling of the parents
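The three steps can be sketched as a proposal routine. This is only an illustration of the proposal, under several assumptions: the graph is stored as a dict mapping each node to its parent set, `local_log_score` is a hypothetical per-node score, parent sets are capped at `max_parents`, and the acceptance step with its Hastings factor is deliberately left out because the transcript does not give the formula.

```python
import math
import random
from itertools import combinations

def descendants(parents, node):
    """Nodes reachable from `node` along directed edges.
    `parents` maps each child to its set of parents."""
    found, stack = set(), [node]
    while stack:
        u = stack.pop()
        for child, pa in parents.items():
            if u in pa and child not in found:
                found.add(child)
                stack.append(child)
    return found

def sample_parents(parents, node, must_include, local_log_score, max_parents, rng):
    """Sample a parent set for `node` with probability proportional to
    exp(local score), restricted to sets that keep the graph acyclic."""
    allowed = set(parents) - {node} - descendants(parents, node) - must_include
    candidates = [frozenset(c) | must_include
                  for k in range(max_parents - len(must_include) + 1)
                  for c in combinations(sorted(allowed), k)]
    weights = [math.exp(local_log_score(node, pa)) for pa in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def propose_edge_reversal(parents, local_log_score, max_parents=2, seed=None):
    """Proposal only: returns the selected edge and the proposed graph.
    Assumes the graph has at least one edge."""
    rng = random.Random(seed)
    edges = [(x, y) for y, pa in parents.items() for x in sorted(pa)]
    x, y = rng.choice(edges)                        # 1) select an edge x -> y
    new = dict(parents)
    new[x], new[y] = frozenset(), frozenset()       # 2) orphan both nodes
    # 3) resample the parents of x so that the reversed edge y -> x is included
    new[x] = sample_parents(new, x, frozenset({y}), local_log_score, max_parents, rng)
    #    then resample the parents of y; x is now a descendant of y and is excluded
    new[y] = sample_parents(new, y, frozenset(), local_log_score, max_parents, rng)
    return (x, y), new

def toy_local_score(node, pa):
    """Hypothetical local score: a fixed penalty per parent."""
    return -float(len(pa))

graph = {"A": frozenset(), "B": frozenset({"A"}), "C": frozenset({"A", "B"})}
edge, proposed = propose_edge_reversal(graph, toy_local_score, seed=0)
print(edge, {node: sorted(pa) for node, pa in proposed.items()})
```

Excluding the descendants of a node from its candidate parents is what enforces the acyclicity constraint, so the proposed graph is always a valid DAG.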

  25. This move is reversible!

  26. 1) Select an edge 2) Orphan the nodes involved 3) Constrained resampling of the parents

  27. Simple idea. Mathematical challenge: • Show that the condition of detailed balance is satisfied. • Derive the Hastings factor … • … which is a function of various partition functions

  28. Acceptance probability

  29. Ergodicity • The new move is reversible but … • … not irreducible [figure: two-node networks over A and B] • Theorem: A mixture with an ergodic transition kernel gives an ergodic Markov chain. • REV-MCMC: at each step randomly switch between a conventional structure MCMC step and the proposed new move.
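The mixture theorem can be illustrated numerically on a toy state space. The two kernels below are made up for the purpose and have nothing to do with network structures: `R` moves only inside two separate blocks of states (reversible but not irreducible), `K` is a lazy random walk on a cycle (ergodic), and both leave the uniform distribution invariant. The 50/50 mixing weight is also an assumption.

```python
def step(dist, P):
    """One step of the chain: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def run(P, start, steps):
    """Distribution after `steps` transitions from the distribution `start`."""
    dist = list(start)
    for _ in range(steps):
        dist = step(dist, P)
    return dist

# R: moves only inside the blocks {0, 1} and {2, 3}, so it is not irreducible
R = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5, 0.5]]

# K: lazy random walk on the cycle 0-1-2-3-0, which is ergodic
K = [[0.50, 0.25, 0.00, 0.25],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.25, 0.00, 0.25, 0.50]]

# Mixture kernel: at each step pick R or K with probability 0.5 each
M = [[0.5 * R[i][j] + 0.5 * K[i][j] for j in range(4)] for i in range(4)]

start = [1.0, 0.0, 0.0, 0.0]
after_R = run(R, start, 200)   # mass never leaves the block {0, 1}
after_M = run(M, start, 200)   # mass spreads over all four states
print(after_R)
print(after_M)
```

Run alone, `R` can never reach states 2 and 3 from state 0, whereas the mixture converges to the uniform distribution from any starting state.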

  30. Evaluation • Does the new method avoid the bias intrinsic to order MCMC? • How do convergence and mixing compare to structure and order MCMC? • What is the effect on the network reconstruction accuracy?

  31. Results • Analytical comparison of the convergence properties • Empirical comparison of the convergence properties • Evaluation of the systematic bias • Molecular regulatory network reconstruction with prior knowledge

  32. Analytical comparison of the convergence properties • Generate data from a noisy XOR • Enumerate all 3-node networks

  33. Analytical comparison of the convergence properties • Generate data from a noisy XOR • Enumerate all 3-node networks • Compute the posterior distribution p° • Compute the Markov transition matrix A for the different MCMC methods • Compute the Markov chain p(t+1) = A p(t) • Compute the (symmetrized) KL divergence KL(t) = <p(t), p°>
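The last three steps can be sketched on a toy problem. Everything specific here is an assumption: the three-state `target` is a made-up stand-in for the posterior p° (not the noisy-XOR posterior over 3-node networks), the transition matrix comes from a plain Metropolis sampler with a uniform proposal, and the symmetrized KL divergence is taken as KL(p||q) + KL(q||p), without the factor 1/2 that some definitions use.

```python
import math

def metropolis_matrix(target):
    """Column-stochastic matrix A with A[j][i] = P(move from i to j),
    for a Metropolis sampler with a uniform proposal over the other states."""
    n = len(target)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                A[j][i] = min(1.0, target[j] / target[i]) / (n - 1)
        # remaining probability mass: the chain stays in state i
        A[i][i] = 1.0 - sum(A[j][i] for j in range(n) if j != i)
    return A

def sym_kl(p, q, eps=1e-300):
    """Symmetrized KL divergence KL(p||q) + KL(q||p); eps guards log(0)."""
    return sum((a - b) * (math.log(a + eps) - math.log(b + eps))
               for a, b in zip(p, q))

target = [0.6, 0.3, 0.1]       # made-up stand-in for the posterior
A = metropolis_matrix(target)
p = [0.0, 0.0, 1.0]            # start with all mass on the least likely state
trace = []
for t in range(50):
    trace.append(sym_kl(p, target))
    # p(t+1) = A p(t)
    p = [sum(A[j][i] * p[i] for i in range(3)) for j in range(3)]

print(trace[0], trace[-1])
```

Plotting `trace` against t for the transition matrices of different samplers gives the kind of convergence comparison described on the slide.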

  34. Solid line: REV-MCMC. Other lines: structure MCMC and different versions of inclusion-driven MCMC

  35. Results • Analytical comparison of the convergence properties • Empirical comparison of the convergence properties • Evaluation of the systematic bias • Molecular regulatory network reconstruction with prior knowledge

  36. Empirical comparison of the convergence and mixing properties • Standard benchmark data: Alarm network (Beinlich et al. 1989) for monitoring patients in intensive care • 37 nodes, 46 directed edges • Generate data sets of different size • Compare the three MCMC algorithms under the same computational costs: structure MCMC (1.0E6), order MCMC (1.0E5), REV-MCMC (1.0E5)
