Modeling and Simulation of Genetic Regulatory Systems


  1. Modeling and Simulation of Genetic Regulatory Systems • Paper by Hidde de Jong, reviewed by Ulrich Basters and Christian Hahn

  2. Seminar Bioinformatics - Modelling and Simulation of Genetic Regulatory Systems - Christian Hahn, Ulrich Basters, 08/27/2003 0.1 Overview • Introduction • Directed and undirected graphs • Bayesian networks • Boolean networks • Generalized logical networks • Non-linear ordinary differential equations • Piecewise linear differential equations • Qualitative differential equations • Partial differential equations • Stochastic master equations • Rule based formalisms • Conclusion

  3. 1.1 Genetic Regulatory Systems • In order to understand the functioning of organisms on the molecular level, we need to know which genes are expressed, when and where in the organism, and to what extent. • The regulation of gene expression is achieved through genetic regulatory systems, structured by networks of interactions between DNA, RNA, proteins, and small molecules. • An intuitive understanding of the overall dynamics is hard to obtain • Consequence: formal methods and computer tools for modeling and simulation offer a possible approach

  4. 1.2 Genetic Regulatory Systems • Genes influence each other because they produce proteins that act as activators or repressors of other genes. • Complex system in which different concentrations of an agent trigger different actions

  5. 1.3 Motivation • Genetic regulatory systems (GRSes) are hard to understand in their full complexity • GOAL: Complexity reduction by appropriate models and formalisms • Better understanding of GRSes • Intuitive visualization of GRSes • Better analysis of GRSes • Models can give hints about where to continue the search for dependencies • Models point out important parts of the system • Gaining an understanding of how complex patterns of behavior emerge from interactions between genes in a regulatory network

  6. 1.4 Modeling Life-Cycle • Process model describing how a technique that models a genetic regulatory network (GRN) is developed and refined:

  7. 2.1 Directed and Undirected Graphs – Motivation/Definition • Probably the most straightforward way to model a GRN • G = <V,E> • V is a set of vertices • E is a set of edges <i,j>, where i,j ∈ V are the head and tail of the edge • Additional labels denote positive/negative influence

  8. 2.2 Directed and Undirected Graphs – Summary Advantages: • Intuitive way of visualization • Common and well-explored graph algorithms can make biologically relevant predictions about GRSes: • paths between genes may reveal missing regulatory interactions or provide clues about redundancy • cycles in the network point to feedback relations • connectivity characteristics give an indication of the complexity • loosely connected subgraphs point to functional modules Disadvantages: • Time does not play a role • Too much abstraction: a very simplified model, far from reality
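
The remark that cycles point to feedback relations can be sketched in a few lines. The example below is only an illustration under stated assumptions: the three-gene network, the gene names and the helper functions `find_cycles` and `cycle_sign` are hypothetical and not taken from the talk. Edges carry a sign label (+1 for activation, -1 for repression), and the sign of a cycle is the product of its edge signs.

```python
from typing import Dict, List, Tuple

# Hypothetical three-gene network: (regulator, target) -> sign of the influence.
EDGES: Dict[Tuple[str, str], int] = {
    ("a", "b"): +1,  # a activates b
    ("b", "c"): +1,  # b activates c
    ("c", "a"): -1,  # c represses a
}


def successors(edges: Dict[Tuple[str, str], int], vertex: str) -> List[str]:
    """Targets directly regulated by `vertex`, in sorted order."""
    return sorted(t for (s, t) in edges if s == vertex)


def find_cycles(edges: Dict[Tuple[str, str], int]) -> List[List[str]]:
    """Enumerate simple cycles, each reported once, starting at its smallest vertex."""
    vertices = sorted({v for edge in edges for v in edge})
    cycles: List[List[str]] = []

    def dfs(start: str, current: str, path: List[str]) -> None:
        for nxt in successors(edges, current):
            if nxt == start:
                cycles.append(path[:])
            elif nxt > start and nxt not in path:
                dfs(start, nxt, path + [nxt])

    for v in vertices:
        dfs(v, v, [v])
    return cycles


def cycle_sign(edges: Dict[Tuple[str, str], int], cycle: List[str]) -> int:
    """Product of edge signs along the cycle: -1 = negative feedback, +1 = positive."""
    sign = 1
    for i, v in enumerate(cycle):
        sign *= edges[(v, cycle[(i + 1) % len(cycle)])]
    return sign
```

For this toy network, `find_cycles(EDGES)` finds the single loop a, b, c, and `cycle_sign` classifies it as negative feedback. The exhaustive search is fine for small graphs only.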

  9. 3.1 Bayesian Networks – Definition • Directed acyclic graph G = <V,E> • Vertices i ∈ V, 1 ≤ i ≤ n, represent genes or other elements and correspond to random variables Xi • Each Xi has a conditional distribution p(Xi | parents(Xi)), where parents(Xi) denotes the direct regulators of i • Conditional independency: i(Xi; Y | Z) expresses the fact that Xi is independent of Y given Z

  10. 3.2 Bayesian Networks – Markov Assumption • The graph encodes the Markov assumption, stating that for every gene i in G the conditional independency i(Xi; nondescendants(Xi) | parents(Xi)) holds • The method is used to analyse dependencies between genes; it is not applicable to simulating a system • Learning techniques rely on a matching score to evaluate networks and search for the network with the optimal score • Graphs are said to be equivalent if they imply the same set of independencies, thus forming an equivalence class (useful for determining important subgraphs) • Looking at Markov and order relations between pairs of genes may point to a relationship between the genes
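
Under the Markov assumption, the joint distribution factorizes into the conditional distributions p(Xi | parents(Xi)). The sketch below illustrates this for a hypothetical chain A → B → C with binary variables; the network structure and all probability values are invented for illustration and do not come from the talk.

```python
from itertools import product
from typing import Dict, Tuple

# Hypothetical network structure: A regulates B, B regulates C.
PARENTS: Dict[str, Tuple[str, ...]] = {"A": (), "B": ("A",), "C": ("B",)}

# CPT[gene][parent values] = P(gene = 1 | parents); the numbers are made up.
CPT: Dict[str, Dict[Tuple[int, ...], float]] = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.2, (1,): 0.9},
}


def joint_probability(assignment: Dict[str, int]) -> float:
    """p(A, B, C) as the product of p(Xi | parents(Xi)) over all genes."""
    prob = 1.0
    for gene, parents in PARENTS.items():
        parent_values = tuple(assignment[p] for p in parents)
        p_on = CPT[gene][parent_values]
        prob *= p_on if assignment[gene] == 1 else 1.0 - p_on
    return prob


# Summing the joint probability over all 2^3 assignments should give 1.
total = sum(
    joint_probability(dict(zip("ABC", values)))
    for values in product((0, 1), repeat=3)
)
```

Learning a network from expression data would mean searching over structures like `PARENTS` and scoring each candidate, which is the NP-hard part; this sketch only evaluates one fixed network.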

  11. 3.3 Bayesian Networks – Summary Advantages • Attractive because of their solid basis in statistics (which makes it possible to deal with stochastic aspects and noisy measurements in a natural way) • Applicable even if only incomplete knowledge about the system is available • Reveals important parts of the system: usually only a few genes play an important role in large systems Disadvantages • Incomplete knowledge under-determines the network (at best a few dozen experiments provide information on the transcription of thousands of genes) • Finding the optimal network is NP-hard. Heuristics are used, but they are not guaranteed to find a globally optimal solution • Static network that leaves out dynamic aspects → addressed by Dynamic Bayesian Networks

  12. 4.1 Boolean Networks – Definition • The state of a gene can be expressed by a Boolean variable indicating that it is active (=1) or inactive (=0) • Interactions between genes can be represented by Boolean functions that calculate the state of a gene from the activation of other genes • Results in a Boolean network:

  13. 4.2 Boolean Networks – Definition/Properties • The method is similar to digital circuitry • An n-vector of variables in a Boolean network represents the state of a regulatory system of n elements, each of which has the value 0 or 1 • So the system has 2^n possible states • The state of an element at time point t+1 is computed by a Boolean function, or rule, from the state of k of the n elements at time point t • This function maps k inputs to an output value

  14. 4.3 Boolean Networks – Properties • Transitions between states are deterministic and synchronous (the outputs of all elements are updated simultaneously) • A sequence of states forms a trajectory of the system • Because the number of states is finite, a trajectory will reach either a steady state (point attractor) or a state cycle (dynamic attractor)
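
The synchronous update and the search for an attractor can be sketched as follows. The three rules are a hypothetical example, not a network from the talk; `find_attractor` simply follows the trajectory until a state repeats, which must happen because there are only 2^n states.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]

# Hypothetical Boolean rules for three genes (x0, x1, x2).
RULES: List[Callable[[State], int]] = [
    lambda s: int(not s[2]),       # x0 is on unless x2 is on
    lambda s: s[0],                # x1 copies x0
    lambda s: int(s[0] and s[1]),  # x2 needs both x0 and x1
]


def step(state: State) -> State:
    """Synchronous update: every element is recomputed from the same old state."""
    return tuple(rule(state) for rule in RULES)


def find_attractor(state: State) -> Tuple[List[State], List[State]]:
    """Follow the trajectory from `state`; return (transient states, attractor states)."""
    seen: Dict[State, int] = {}
    trajectory: List[State] = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    start = seen[state]
    return trajectory[:start], trajectory[start:]
```

Starting from (0, 0, 0), this toy network has no transient and runs through a state cycle of length 5, that is, a dynamic attractor rather than a steady state.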

  15. 4.4 Boolean Networks – Summary Advantages • Efficient analysis of large regulatory networks • Positive/negative feedback cycles can be modeled with Boolean networks Disadvantages • Strong simplifying assumptions: a gene is either on or off, with no intermediate states • Transitions are assumed to occur synchronously, which is not usually the case, so certain behaviors may not be predicted by the simulation algorithm • There are situations where the Boolean idealisation is not appropriate and more general methods are required

  16. 5.1 Generalized Logical Networks (GLN) – Definition • Generalizes Boolean networks: variables may have more than 2 values • Transitions between states occur asynchronously • Uses discrete, so-called logical variables, which are abstractions of the real concentration values xi • The possible values of element i are defined by the thresholds of its influence on other elements: if the element influences p other elements, it may have p different thresholds

  17. 5.2 GLN – Definition Formally: • If an element i influences p other elements, then it will have p distinct thresholds θi(1) < θi(2) < ... < θi(p) • The logical variable x̂i has the possible values {0,...,p} and is defined by: x̂i = 0 if xi < θi(1), x̂i = 1 if θi(1) < xi < θi(2), ..., x̂i = p if xi > θi(p) • The vector x̂ denotes the logical state of the RN
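
The mapping from a real concentration to a logical value amounts to counting how many thresholds the concentration exceeds. A minimal sketch, assuming a hypothetical helper name `logical_value` and made-up threshold values:

```python
from bisect import bisect_right
from typing import Sequence


def logical_value(x: float, thresholds: Sequence[float]) -> int:
    """Number of thresholds lying at or below x, i.e. a value in {0, ..., p}.

    A concentration exactly equal to a threshold is counted as above it here;
    that boundary convention is an assumption of this sketch.
    """
    return bisect_right(sorted(thresholds), x)
```

With thresholds 1.0 and 2.5, for example, a concentration of 1.7 maps to the logical value 1.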

  18. 5.3 GLN – Definition • The pattern of interactions is described by logical equations of the form: X̂i(t) = bi(x̂(t)), 1 ≤ i ≤ n • X̂i is called the image of x̂i; it denotes the value towards which x̂i tends when the logical state is x̂ • Positive and negative feedback loops can be modeled • Refinement of the simple on/off variables of Boolean networks

  19. 5.4 GLN – Properties • A logical steady state occurs when the logical state equals its image: x̂ = X̂ • Since the number of logical states is finite, one can test every state for being a logical steady state; the other states are called transient logical states • If the system is in a transient logical state, it will make a transition into another logical state • Since a logical variable moves in the direction of its image, the successor states can be deduced by comparing the value of each logical variable with that of its image • The logical states and the transitions among them can be organized in a state transition graph • In the analysis of state transitions, time delays due to translation and transport can be taken into account • Improves on the standard Boolean network model
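
Deducing successor states by comparing each logical variable with its image can be sketched as below. The two-gene logical equations in `image` are hypothetical, and the rule that exactly one variable moves one level per transition is an assumption of this sketch about how the asynchronous update is carried out.

```python
from itertools import product
from typing import List, Tuple

LogicalState = Tuple[int, int]


def image(state: LogicalState) -> LogicalState:
    """Hypothetical logical equations: x1 takes values 0..2, x2 takes values 0..1."""
    x1, x2 = state
    image_x1 = 2 if x2 == 0 else 0  # gene 2 represses gene 1
    image_x2 = 1 if x1 >= 1 else 0  # gene 1 activates gene 2 above its first threshold
    return (image_x1, image_x2)


def is_steady(state: LogicalState) -> bool:
    """A logical steady state equals its own image."""
    return image(state) == state


def successors(state: LogicalState) -> List[LogicalState]:
    """Asynchronous successors: one variable moves one level towards its image."""
    target = image(state)
    result: List[LogicalState] = []
    for i, (value, goal) in enumerate(zip(state, target)):
        if value != goal:
            step = 1 if goal > value else -1
            result.append(state[:i] + (value + step,) + state[i + 1:])
    return result


ALL_STATES = list(product(range(3), range(2)))
STEADY_STATES = [s for s in ALL_STATES if is_steady(s)]
```

This toy system is a negative feedback loop and has no logical steady state, so every state is transient; collecting `successors` for all states yields the state transition graph.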

  20. 6.1 Nonlinear Ordinary Differential Equations – Definition • Model the concentrations of RNA, proteins and other molecules by time-dependent variables • Gene regulation is modeled by rate equations, expressing the rate of production of a component as a function of the concentrations of other components • Rate equations have the following form: dxi/dt = ƒi(x), 1 ≤ i ≤ n, where x = [x1, ..., xn] ≥ 0 denotes the vector of concentrations and ƒi: R^n → R is a usually non-linear function • Discrete time delays τi1, ..., τin > 0 can also be represented: dxi/dt = ƒi(x1(t − τi1), ..., xn(t − τin)), 1 ≤ i ≤ n

  21. 6.2 ODE – Definition • Goal: specifying the function ƒi, for example: dx1/dt = k1,n r(xn) − γ1 x1 and dxi/dt = ki,i-1 xi-1 − γi xi for 1 < i ≤ n, where k1,n, k2,1, ..., kn,n-1 > 0 are production constants and γ1, ..., γn > 0 are degradation constants • The rate equations express a balance between the number of molecules appearing and disappearing per unit time • For x1, a regulation function r: R → R is involved, whereas for i > 1 the production of xi increases linearly in xi-1 • An often used regulation function is the so-called Hill curve: h+(xj, θj, m) = xj^m / (xj^m + θj^m)

  22. 6.3 ODE – Definition • θj > 0 is the threshold for the regulatory influence of xj on a target gene • m > 0 is a steepness parameter • The function h+ ranges from 0 to 1 • An increase in xj will tend to increase the expression rate of the gene (activation); h+ approaches 1 as xj → ∞ • To express that an increase in xj will tend to decrease the expression rate (inhibition), the regulation function is replaced by: h−(xj, θj, m) = 1 − h+(xj, θj, m)
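
A rough numerical sketch of such rate equations is given below. It assumes the standard Hill form h+(x, θ, m) = x^m / (x^m + θ^m) with h− = 1 − h+, and a chain in which the last component inhibits the production of the first while every other component is produced in proportion to its predecessor. The function names, the parameter values and the choice of forward Euler integration are assumptions made for illustration.

```python
from typing import List, Sequence


def hill_plus(x: float, theta: float, m: float) -> float:
    """Activating Hill curve: 0 at x = 0, 0.5 at x = theta, approaches 1 for large x."""
    return x**m / (x**m + theta**m)


def hill_minus(x: float, theta: float, m: float) -> float:
    """Inhibiting counterpart: 1 - h+."""
    return 1.0 - hill_plus(x, theta, m)


def simulate(
    x0: Sequence[float],
    k: Sequence[float],
    gamma: Sequence[float],
    theta: float,
    m: float,
    dt: float = 0.01,
    steps: int = 5000,
) -> List[float]:
    """Forward Euler integration of a negative feedback chain (illustrative only)."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        dx = [0.0] * n
        # First component: production inhibited by the last one, minus degradation.
        dx[0] = k[0] * hill_minus(x[n - 1], theta, m) - gamma[0] * x[0]
        # Remaining components: production linear in the predecessor, minus degradation.
        for i in range(1, n):
            dx[i] = k[i] * x[i - 1] - gamma[i] * x[i]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x
```

A call such as `simulate([0.0, 0.0, 0.0], [1.0] * 3, [1.0] * 3, theta=0.5, m=2)` returns the concentrations after 50 time units. For stiff or large models an adaptive ODE solver would be a better choice than fixed-step Euler.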

  23. 6.4 ODE – Properties Advantages • More "realistic" way of modeling Disadvantages • Lack of in vivo or in vitro measurements of the kinetic parameters in the rate equations • Numerical parameter values are available for only a handful of well-studied systems (e.g. phage λ) • In most cases parameter values have had to be chosen such that the models reproduce the observed qualitative behavior • For larger models, finding appropriate values may be difficult Solution • The growing availability of data could alleviate the problem to some extent

  24. 7.1 Piecewise-Linear Differential Equations (PLDE) – Definition • Special case of the rate equations, based on two simplifications: • Interactions are modeled by directly relating the expression levels of the genes in the network • Continuous sigmoid curves are approximated by discontinuous step functions • PLDEs have the following form: dxi/dt = gi(x) − γi xi, 1 ≤ i ≤ n, where xi denotes the cellular concentration of the product of gene i and γi > 0 the degradation rate • The function gi: R^n≥0 → R≥0 is defined as: gi(x) = Σl kil bil(x), where kil > 0 is a rate parameter and bil: R^n≥0 → {0,1} is a combination of step functions • bil is the arithmetic equivalent of a Boolean function, expressing the conditions under which the gene is expressed at rate kil

  25. 7.2 PLDE – Graphical simplification • Consider an n-dimensional hyperbox defined by: 0 ≤ xi ≤ maxi, 1 ≤ i ≤ n • Assume that all threshold concentrations θik of the protein encoded by gene i satisfy θik < maxi • The (n−1)-dimensional hyperplanes defined by the thresholds divide the box into orthants • In each orthant the PLDEs reduce to ODEs with a constant production term μi composed of rate parameters in bi: dxi/dt = μi − γi xi

  26. 7.3 PLDE – Example • State equations corresponding to the orthant 0 ≤ x1 < θ21, θ12 < x2 ≤ max2 and θ33 < x3 ≤ max3
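
Inside one orthant the production term is constant, so each equation dxi/dt = μi − γi xi can be solved in closed form: xi(t) = μi/γi + (xi(0) − μi/γi) e^(−γi t). The sketch below uses a hypothetical two-gene mutual-inhibition system, not the three-gene example of the slide; all parameter values and function names are assumptions.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]

# Hypothetical parameters for two mutually inhibiting genes.
K: Vec = (2.0, 2.0)      # production rates
GAMMA: Vec = (1.0, 1.0)  # degradation rates
THETA: Vec = (1.0, 1.0)  # thresholds of x1 and x2


def step_plus(x: float, theta: float) -> int:
    """Step function s+: 1 above the threshold, 0 below it."""
    return 1 if x > theta else 0


def step_minus(x: float, theta: float) -> int:
    """Step function s-: the complement of s+."""
    return 1 - step_plus(x, theta)


def production(x: Vec) -> Vec:
    """Constant production terms (mu1, mu2) of the orthant containing x."""
    x1, x2 = x
    return (K[0] * step_minus(x2, THETA[1]), K[1] * step_minus(x1, THETA[0]))


def target_state(x: Vec) -> Vec:
    """Equilibrium mu_i / gamma_i that the solution approaches within the orthant."""
    return tuple(mu / g for mu, g in zip(production(x), GAMMA))


def evolve_within_orthant(x: Vec, t: float) -> Vec:
    """Closed-form solution after time t, valid only while no threshold is crossed."""
    target = target_state(x)
    return tuple(
        f + (xi - f) * math.exp(-g * t) for xi, f, g in zip(x, target, GAMMA)
    )
```

For the state (0.5, 1.5), gene 1 is switched off and gene 2 is fully on, so the target is (0.0, 2.0), which lies in the same orthant. A full simulation would switch to the neighbouring orthant's equations whenever a threshold is reached.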

  27. 8.1 Qualitative Differential Equations (QDE) – Definition • Incomplete understanding of GRNs and the absence of quantitative knowledge → need for qualitative simulation techniques • Idea behind QDEs: abstract a discrete description from a continuous model • The discrete abstraction is then used to draw conclusions about the dynamics of the system • QDEs are abstractions of ODEs of the form: dxi/dt = ƒi(x), 1 ≤ i ≤ n, where ƒi: R^n → R and x takes a qualitative value composed of a qualitative magnitude and a qualitative direction

  28. 8.2 QDE – Properties • The qualitative magnitude of a variable xi is a discrete abstraction of its real value; the qualitative direction is the sign of its derivative • The function ƒi is abstracted into a set of qualitative constraints • An algorithm (QSIM) generates a tree of qualitative behaviors from an initial qualitative state consisting of qualitative values • Each behavior in the tree describes a possible sequence of state transitions from the initial state • Every qualitatively distinct behavior of the ODE corresponds to a behavior in the tree generated from the QDE (the reverse may not be true)
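
The abstraction of a numerical state into a qualitative value can be sketched as follows. The representation chosen here (a magnitude that is either a landmark value or the open interval between two adjacent landmarks, and a direction written as "inc", "std" or "dec") is an assumption of this sketch; the helper `qualitative_value` is hypothetical and is not part of QSIM.

```python
import math
from typing import Sequence, Tuple, Union

Magnitude = Union[float, Tuple[float, float]]


def qualitative_value(
    x: float, dxdt: float, landmarks: Sequence[float]
) -> Tuple[Magnitude, str]:
    """Return (qualitative magnitude, qualitative direction) for a value and its derivative."""
    points = sorted(landmarks)
    if x in points:
        # Exactly on a landmark (exact float comparison, adequate for a sketch).
        magnitude: Magnitude = x
    else:
        lower = max((p for p in points if p < x), default=-math.inf)
        upper = min((p for p in points if p > x), default=math.inf)
        magnitude = (lower, upper)
    direction = "inc" if dxdt > 0 else "dec" if dxdt < 0 else "std"
    return magnitude, direction
```

With landmarks 0.0 and 1.0, a concentration of 0.7 that is rising is abstracted to the interval (0.0, 1.0) with direction "inc". Many different numerical states map to the same qualitative state, which is why a qualitative simulation can cover behaviors without numerical parameter values.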

  29. 8.3 QDE – Summary Problems • Limited scalability: behavior trees quickly grow out of bounds Solutions • Using a simulation algorithm tailored to the equations, larger networks with complex feedback loops can be treated Advantages • Allows weak numerical information to be used • Integrating numerical information is more difficult to achieve in logical approaches

  30. 8.4 QDE – HYPGENE / GENSIM • Qualitative process theory is used for the construction and revision of gene regulation models • A user definition and a knowledge base are used by GENSIM to simulate a proposed experiment • If the predictions do not match the observations, the HYPGENE algorithm generates hypotheses to explain the discrepancies • HYPGENE revises assumptions about the experimental conditions • Helps to refine the model • Both algorithms have been able to partially reproduce the experimental reasoning about the attenuation mechanism regulating the synthesis of tryptophan in E. coli

  31. 9.1 Partial Differential Equations (PDE) – Motivation • So far, regulatory systems have been assumed to be spatially homogeneous • In certain situations it is important to drop this assumption • Distinguish between different compartments of a cell, for example nucleus and cytoplasm, or between multiple cells affecting each other • Diffusion of regulatory proteins or metabolites from one compartment to another • This is a critical feature in embryonic development

  32. 9.2 Partial Differential Equations (PDE) – Definition • The reaction-diffusion equation (for a row of cells): • Can be adapted to other one- or higher-dimensional spatial configurations • If the number of cells is large enough, the discrete variable l can be replaced by a continuous variable λ ranging over the size of the system • The concentration variables are now defined as functions of λ and t, and the reaction-diffusion equations become a partial differential equation (PDE): • Using modes or eigenfunctions of the Laplacian operator gives:
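
The equations themselves are not reproduced in this transcript. As a rough sketch of the idea, the code below assumes the usual discrete reaction-diffusion form for one substance in a row of cells, in which the concentration in cell l changes by a local reaction term plus a diffusion term proportional to the difference with the two neighbouring cells. The diffusion constant `delta`, the no-flux treatment of the two end cells and the explicit Euler step are all assumptions of this sketch.

```python
from typing import Callable, List, Sequence


def reaction_diffusion_step(
    x: Sequence[float],
    delta: float,
    dt: float,
    reaction: Callable[[float], float] = lambda value: 0.0,
) -> List[float]:
    """One explicit Euler step for a single substance in a row of cells."""
    p = len(x)
    new: List[float] = []
    for l in range(p):
        # No-flux boundaries (assumption): an end cell sees its own value as the missing neighbour.
        left = x[l - 1] if l > 0 else x[l]
        right = x[l + 1] if l < p - 1 else x[l]
        diffusion = delta * (right - 2.0 * x[l] + left)
        new.append(x[l] + dt * (reaction(x[l]) + diffusion))
    return new


def run(x0: Sequence[float], delta: float, dt: float, steps: int) -> List[float]:
    """Apply pure diffusion (no reaction term) for a number of steps."""
    x = list(x0)
    for _ in range(steps):
        x = reaction_diffusion_step(x, delta, dt)
    return x
```

Starting from all substance in the first cell, `run([1.0, 0.0, 0.0, 0.0, 0.0], delta=1.0, dt=0.1, steps=100)` spreads it along the row while keeping the total amount constant. The explicit scheme is only stable for small `delta * dt` (here 0.1).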

  33. 9.3 Partial Differential Equations (PDE) – Definition • The product of gene 1, the activator, must positively regulate itself; the product of gene 2, the inhibitor, must negatively regulate gene 1 • Activator-inhibitor systems have been used extensively to study the emergence of segmentation patterns in the early Drosophila embryo • The observed spatial and temporal expression patterns of genes closely resemble the modes of the model • Numerical simulations demonstrated that some aspects of stripe formation in the Drosophila blastoderm can indeed be reproduced this way

  34. 9.4 Partial Differential Equations (PDE) – Properties • The formula shown is still not applicable in all situations; more complex formulas have been developed for several special cases • Predictions are quite sensitive to the shape of the spatial domain, the boundary conditions and the chosen parameter values • Models need to be simple and are usually strong abstractions of biological processes (i.e. only the concentrations of a few gene products are considered) • For larger and more complex models, the computational cost of finding an optimal fit between data and parameters may be prohibitively high

  35. 10.1 Stochastic Master Equations – Motivation • Differential equations describe gene regulation in great detail • Differential equations presuppose that the concentrations of substances vary continuously and deterministically • Both assumptions are questionable in the case of gene regulation • So a discrete and stochastic approach is preferred • Discrete amounts X of molecules are taken as state variables, and a joint probability distribution p(X, t) is introduced to express the probability that at time t the cell contains X1 molecules of the first species, X2 of the second, etc.

  36. 10.2 Stochastic Master Equations – Definition • The time evolution of p(X, t) can be expressed as: p(X, t+Δt) = p(X, t) (1 − Σj αj Δt) + Σj βj Δt, with the sums running over j = 1, ..., m • m is the number of reactions • αj Δt is the probability that reaction j will occur in the interval [t, t+Δt] given that the system is in state X at time t • βj Δt is the probability that reaction j will bring the system into state X from another state in [t, t+Δt] • Rearranging and taking the limit Δt → 0 gives the master equation: ∂p(X, t)/∂t = Σj (βj − αj p(X, t))

  37. 10.3 Stochastic Master Equations – Properties • Master equations can be approximated by stochastic differential equations • An alternative approach is to disregard the master equation and directly simulate the time evolution of the system • This is the stochastic simulation approach, which: • determines when the next reaction occurs and of which type it will be • revises the state in accordance with this reaction • continues at the resulting next state • Master equations deal with average behavior, whereas stochastic simulation provides information on individual behaviors
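
The three steps of stochastic simulation can be sketched for a minimal birth-death process, in which one molecular species is produced at a constant rate and each molecule degrades independently. The procedure follows the style of Gillespie's direct method; the slide does not name an algorithm, so that choice, the reaction system and the parameter names are assumptions of this sketch.

```python
import random
from typing import List, Tuple


def simulate_birth_death(
    k_prod: float, k_deg: float, x0: int, t_end: float, seed: int = 0
) -> Tuple[List[float], List[int]]:
    """Simulate one trajectory; returns the event times and the molecule counts."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod        # propensity of the production reaction
        a_deg = k_deg * x      # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break              # no reaction can occur any more
        # Step 1: when does the next reaction occur ...
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # ... and of which type is it?
        if rng.random() * a_total < a_prod:
            x += 1             # Step 2: revise the state (production)
        else:
            x -= 1             # Step 2: revise the state (degradation)
        # Step 3: continue from the resulting state.
        times.append(t)
        counts.append(x)
    return times, counts
```

Each call with a different `seed` gives one individual behavior; averaging many runs approximates the mean behavior, which for this process settles around `k_prod / k_deg` molecules.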

  38. 10.4 Stochastic Master Equations – Summary Advantages • Simulation results in closer approximations to the molecular reality of gene regulation Disadvantages • The use of stochastic simulation is not always evident • Requires detailed knowledge • Simulation is costly

  39. 11.1 Rule-Based Formalisms (RBF) – Definition • Knowledge-based or rule-based simulation formalisms allow a rich variety of knowledge about the system to be expressed in a single formalism • Consist of two components: facts and rules • The rules consist of two parts: condition and action Advantages • Capability to deal with a richer variety of biological knowledge Disadvantages • Difficulties in maintaining a consistent knowledge base • RBF cannot compete with the preceding formalisms in quantitative terms
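
The split into facts and condition/action rules can be illustrated with a very small forward-chaining loop. The fact names and the two rules below are hypothetical simplifications chosen for illustration, and `forward_chain` is a generic sketch rather than any particular rule-based simulation system.

```python
from typing import Callable, Iterable, List, Set, Tuple

Facts = Set[str]
Rule = Tuple[Callable[[Facts], bool], Callable[[Facts], Facts]]

# Hypothetical rules: (condition on the facts, action returning new facts).
RULES: List[Rule] = [
    (lambda f: "tryptophan_low" in f, lambda f: {"repressor_inactive"}),
    (lambda f: "repressor_inactive" in f, lambda f: {"trp_operon_transcribed"}),
]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Facts:
    """Apply every rule whose condition holds until no new fact is derived."""
    known: Facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            if condition(known):
                new_facts = action(known) - known
                if new_facts:
                    known |= new_facts
                    changed = True
    return known
```

Starting from the single fact `tryptophan_low`, the two rules fire in sequence and derive that the operon is transcribed. Keeping such a rule set consistent as it grows is the maintenance difficulty mentioned above.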

  40. 12.1 Conclusions Major difficulties in modeling and simulating genetic regulatory networks: • Biochemical reaction mechanisms are not known or are incompletely known • Quantitative information on molecular concentrations is only seldom available The formalisms discussed allow GRSes to be modeled in quite different ways, depending on the application:

  41. 12.2 Expectations • The emergence of new experimental techniques promises to relieve the data bottleneck • Increasing knowledge of molecular mechanisms allows regulatory systems to be modeled at a finer level of granularity • The use of quantitative models permits larger systems to be studied at a higher precision • These developments will bring researchers nearer to the ultimate goal: to use models that integrate gene regulation with metabolism, signal transduction, replication and repair, and a variety of other cellular processes • Each of the approaches above has its merits, but neither of them seems sufficient in itself • It can be expected that a combination of the two approaches, exploiting a wide range of structural and functional information on regulatory networks, will be most effective

  42. 13.1 References / Acknowledgements References (all images are taken from this paper) • Hidde de Jong, Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol. 2002;9(1):67-103. Review. Acknowledgements • Thanks to Marite Sirava and Thomas Schäfer at ZBI of Universität des Saarlandes for supporting us in preparing this talk
