
Representation, Learning and Inference in Models of Cellular Networks

These lecture slides cover models of cellular networks: the main kinds of subnetworks, the lac operon as a running example of transcriptional regulation, and Bayesian networks as a representation supporting inference, parameter learning, and structure learning.


Presentation Transcript


  1. Representation, Learning and Inference in Models of Cellular Networks BMI/CS 576 www.biostat.wisc.edu/bmi576/ Colin Dewey cdewey@biostat.wisc.edu Fall 2010

  2. Various Subnetworks within Cells • metabolic: describe reactions through which enzymes convert substrates to products • regulatory (genetic): describe interactions that control expression of particular genes • signaling: describe interactions among proteins and (sometimes) small molecules that relay signals from outside the cell to the nucleus • note: these networks are linked together and the boundaries among them are not crisp

  3. [Figure from the KEGG database; node legend: gene products, other molecules]

  4. Part of the E. coli Regulatory Network Figure from Wei et al., Biochemical Journal 2004

  5. A Signaling Network Figure from Sachs et al., Science 2005

  6. Two Key Tasks • learning: given background knowledge and high-throughput data, try to infer the (partial) structure/parameters of a network • inference: given a (partial) network model, use it to predict an outcome of biological interest (e.g. will the cells grow faster in medium x or medium y?) • both of these are challenging tasks because typically • data are noisy • data are incomplete – characterize a limited range of conditions • important aspects of the system not measured – some unknown structure and/or parameters

  7. Transcriptional Regulation Example: the lac Operon in E. coli E. coli can use lactose as an energy source, but it prefers glucose. How does it switch on its lactose-metabolizing genes?

  8. The lac Operon: Repression by LacI lactose absent → protein encoded by lacI represses transcription of the lac operon

  9. The lac Operon: Induction by LacI lactose present → protein encoded by lacI won’t bind to the operator (O) region

  10. The lac Operon: Activation by Glucose glucose absent → CAP protein promotes binding by RNA polymerase; increases transcription

  11. Network Model Representations • directed graphs • Boolean networks • differential equations • Bayesian networks and related graphical models • etc.

  12. Probabilistic Model of lac Operon • suppose we represent the system by the following discrete variables:
      L (lactose): present, absent
      G (glucose): present, absent
      I (lacI): present, absent
      C (CAP): present, absent
      lacI-unbound: true, false
      CAP-bound: true, false
      Z (lacZ): high, low, absent
  • suppose (realistically) the system is not completely deterministic • the joint distribution of the variables could be specified by 2^6 × 3 - 1 = 191 parameters
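
As a quick check of that count, a minimal sketch: the six binary variables and the ternary Z give 2^6 × 3 joint assignments, minus one because the probabilities must sum to 1.

```python
# Sketch: size of the full (unfactored) joint table for the seven
# lac-operon variables above: six binary variables and the ternary Z
# give 2^6 * 3 joint assignments, minus one sum-to-one constraint.
n_params = 2**6 * 3 - 1
print(n_params)   # 191
```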

  13. Motivation for Bayesian Networks • Explicitly state (conditional) independencies between random variables • Provide a more compact model (fewer parameters) • Use directed graphs to specify model • Take advantage of graph algorithms/theory • Provide intuitive visualizations of models

  14. A Bayesian Network for the lac System [figure: DAG with nodes L, I, G, C, lacI-unbound, CAP-bound, Z; labeled CPDs: Pr(L), Pr(lacI-unbound | L, I), Pr(Z | lacI-unbound, CAP-bound)]

  15. Bayesian Networks • Also known as Directed Graphical Models • a BN is a Directed Acyclic Graph (DAG) in which • the nodes denote random variables • each node X has a conditional probability distribution (CPD) representing P(X | Parents(X)) • the intuitive meaning of an arc from X to Y is that X directly influences Y • formally: each variable X is independent of its non-descendants given its parents
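
The factorization this independence property implies can be written out; for any BN over variables X1, …, Xn:

$$ \Pr(X_1, \dots, X_n) = \prod_{i=1}^{n} \Pr\bigl(X_i \mid \mathrm{Parents}(X_i)\bigr) $$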

  16. Bayesian Networks [figure: the lac-system DAG with nodes L, I, G, C, lacI-unbound, CAP-bound, Z] • a BN provides a factored representation of the joint probability distribution • this representation of the joint distribution can be specified with 20 parameters (vs. 191 for the unfactored representation)
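
A minimal sketch of where the 20 comes from: each node contributes (|domain| - 1) free parameters per configuration of its parents. The parent sets follow the network figure, with CAP-bound's parents assumed to be G and C (consistent with the total of 20).

```python
# Sketch: counting the free parameters of the factored lac model.
# Each node contributes (|domain| - 1) parameters per configuration
# of its parents. CAP-bound's parents are assumed to be G and C,
# consistent with the slide's total of 20.
domain = {"L": 2, "I": 2, "G": 2, "C": 2,
          "lacI-unbound": 2, "CAP-bound": 2, "Z": 3}
parents = {"L": [], "I": [], "G": [], "C": [],
           "lacI-unbound": ["L", "I"],
           "CAP-bound": ["G", "C"],
           "Z": ["lacI-unbound", "CAP-bound"]}

total = 0
for node, ps in parents.items():
    configs = 1
    for p in ps:
        configs *= domain[p]          # joint settings of the parents
    total += (domain[node] - 1) * configs
print(total)   # 4*1 + 4 + 4 + 8 = 20
```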

  17. Representing CPDs for Discrete Variables • CPDs can be represented using tables or trees • consider the following case with Boolean variables A, B, C, D [figure: decision tree for Pr(D | A, B, C): A=T gives Pr(D=T)=0.9; A=F, B=T gives Pr(D=T)=0.5; A=F, B=F, C=T gives Pr(D=T)=0.8; A=F, B=F, C=F gives Pr(D=T)=0.5]
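
A sketch of the two representations in Python, using the probabilities from the slide (tree branch order as reconstructed above); the repeated table rows are exactly the context-specific regularities the tree exploits.

```python
# Sketch: two equivalent representations of Pr(D = T | A, B, C),
# with the probabilities from the slide.
# Full table: one row per parent configuration (2^3 = 8 rows).
table = {
    (True,  True,  True):  0.9, (True,  True,  False): 0.9,
    (True,  False, True):  0.9, (True,  False, False): 0.9,
    (False, True,  True):  0.5, (False, True,  False): 0.5,
    (False, False, True):  0.8, (False, False, False): 0.5,
}

# Tree: test parents in order and stop early, so fewer distinct
# values need to be stored than the table has rows.
def pr_d_true(a, b, c):
    if a:
        return 0.9
    if b:
        return 0.5
    return 0.8 if c else 0.5

# the two representations agree on every parent configuration
assert all(pr_d_true(*k) == v for k, v in table.items())
```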

  18. Representing CPDs for Continuous Variables [figure: node X with parents U1, U2, …, Uk] • we can also model the distribution of continuous variables in Bayesian networks • one approach: linear Gaussian models • X is normally distributed around a mean that depends linearly on the values of its parents ui
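
A minimal sketch of a linear Gaussian CPD, X ~ Normal(a0 + Σi ai·ui, σ²); the weights and noise level below are made-up illustration values, not from the slides.

```python
# Sketch: a linear Gaussian CPD. X is normally distributed around a
# mean that depends linearly on its parents' values u_1..u_k:
#   X ~ Normal(a_0 + sum_i a_i * u_i, sigma^2)
# The weights a and noise level sigma are made-up illustration values.
import random

def sample_x(u, a0=0.1, a=(0.5, -0.3), sigma=0.2):
    mean = a0 + sum(ai * ui for ai, ui in zip(a, u))
    return random.gauss(mean, sigma)

print(sample_x((1.0, 2.0)))   # one draw of X given parents u = (1.0, 2.0)
```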

  19. The Inference Task in Bayesian Networks [figure: the lac-system network] Given: values for some variables in the network (evidence), and a set of query variables Do: compute the posterior distribution over the query variables • variables that are neither evidence variables nor query variables are hidden variables • the BN representation is flexible enough that any set can be the evidence variables and any set can be the query variables
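
The slides do not fix an algorithm here; a minimal sketch of the simplest exact method, inference by enumeration, on a three-variable chain with made-up CPDs: sum the joint over the hidden variable, then normalize.

```python
# Sketch: exact inference by enumeration on a tiny chain A -> B -> C
# (all Boolean, made-up CPDs). Posterior over the query variable A
# given evidence on C: sum out the hidden variable B, then normalize.
P_A_true = 0.3
P_B_true = {True: 0.9, False: 0.2}   # Pr(B = T | A)
P_C_true = {True: 0.8, False: 0.1}   # Pr(C = T | B)

def pr(p_true, value):
    """Pr(var = value) from a table giving Pr(var = True)."""
    return p_true if value else 1.0 - p_true

def posterior_a(c_obs):
    scores = {}
    for a in (True, False):
        total = 0.0
        for b in (True, False):                  # sum out hidden B
            total += (pr(P_A_true, a)
                      * pr(P_B_true[a], b)
                      * pr(P_C_true[b], c_obs))
        scores[a] = total
    z = sum(scores.values())                     # normalize
    return {a: s / z for a, s in scores.items()}

print(posterior_a(True))   # Pr(A | C = True)
```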

  20. The Parameter Learning Task • Given: a set of training instances, the graph structure of a BN • Do: infer the parameters of the CPDs • this is straightforward when there are no missing values or hidden variables [figure: the lac-system network]
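
With complete data, the maximum-likelihood estimate of each CPD entry is just a conditional count ratio. A sketch on made-up data, simplified to a single parent (the network above gives lacI-unbound two parents, L and I).

```python
# Sketch: maximum-likelihood CPD estimation from complete data,
# simplified to one parent. Each CPD entry is a conditional count
# ratio. The (L, lacI-unbound) pairs below are made-up toy data.
from collections import Counter

data = [(True, True), (True, True), (True, False),
        (False, False), (False, False), (False, True)]

pair_counts = Counter(data)
l_counts = Counter(l for l, _ in data)

# Pr(lacI-unbound = T | L) = count(L, lacI-unbound = T) / count(L)
for l in (True, False):
    p = pair_counts[(l, True)] / l_counts[l]
    print(f"Pr(lacI-unbound=T | L={l}) = {p:.2f}")
```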

  21. The Structure Learning Task • Given: a set of training instances • Do: infer the graph structure (and perhaps the parameters of the CPDs too)
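
The slides leave the method open; one common family of approaches is score-based search, where each candidate structure is fit by maximum likelihood and the candidates are compared with a penalized score such as BIC. A toy sketch for two Boolean variables:

```python
# Sketch: score-based structure learning in miniature. Two candidate
# structures over Boolean X, Y are fit by maximum likelihood and
# compared with BIC = log-likelihood - (k/2) * log n. Toy data only.
import math
from collections import Counter

data = [(0, 0), (0, 0), (1, 1), (1, 1), (1, 0), (0, 1)] * 5
n = len(data)

def score_independent():
    # candidate 1: no edge, X and Y independent (k = 2 free parameters)
    ll = 0.0
    for i in (0, 1):
        counts = Counter(row[i] for row in data)
        ll += sum(c * math.log(c / n) for c in counts.values())
    return ll - 0.5 * 2 * math.log(n)

def score_edge():
    # candidate 2: X -> Y (k = 3 free parameters)
    x_counts = Counter(x for x, _ in data)
    xy_counts = Counter(data)
    ll = sum(c * math.log(c / n) for c in x_counts.values())
    ll += sum(c * math.log(c / x_counts[x])
              for (x, y), c in xy_counts.items())
    return ll - 0.5 * 3 * math.log(n)

print("X independent of Y:", round(score_independent(), 2))
print("X -> Y:            ", round(score_edge(), 2))
```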
