Pathway Modeling and Problem Solving Environments. Cliff Shaffer Department of Computer Science Virginia Tech Blacksburg, VA 24061. The Fundamental Goal of Molecular Cell Biology. Application: Cell Cycle Modeling. How do cells convert genes into behavior? Create proteins from genes
Pathway Modeling and Problem Solving Environments Cliff Shaffer Department of Computer Science Virginia Tech Blacksburg, VA 24061
Application: Cell Cycle Modeling • How do cells convert genes into behavior? • Create proteins from genes • Protein interactions • Protein effects on the cell • Our study system is the cell cycle of the budding yeast Saccharomyces cerevisiae.
[Figure: the cell cycle, with phases G1, S (DNA replication), G2, and M (mitosis), followed by cell division]
[Figure: wiring diagram of the budding yeast cell cycle control network. Labeled components: Pds1, Esp1, SBF, MBF, Mcm1, Cdh1, Cdc20, PPX, Net1, Net1P, Tem1-GDP, Tem1-GTP, Cdc15/MEN, Lte1, Bub2, Cdc14, RENT, Sic1, Cln2, Cln3, Clb2, Clb5, Mad2, APC, APC-P, Swi5, SCF, Bck2, CDKs, inactive trimers. Labeled processes: growth, budding, DNA synthesis, mitosis, unaligned chromosomes, sister chromatid separation.]
Modeling Techniques • One method: Use ODEs that describe the rate at which each protein concentration changes • Protein B degrades protein A: … with initial condition [A](0) = A0. Parameter c determines the rate of degradation. • Sometimes modelers use “creative” rate laws to approximate subsystems
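A minimal sketch of the ODE approach, assuming a mass-action rate law d[A]/dt = -c[A][B] in which A is degraded at a rate proportional to both concentrations, matching the initial condition [A](0) = A0. The rate law itself, the constant [B], the function name simulate_decay, and all numerical values are assumptions made for illustration; they are not taken from the model.

```python
import math

def simulate_decay(a0, b, c, t_end, dt=0.001):
    """Integrate d[A]/dt = -c*[A]*[B] with classical RK4, holding [B] fixed.

    The mass-action form and the constant [B] are illustrative assumptions.
    """
    def rate(a):
        return -c * a * b

    a = a0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = rate(a)
        k2 = rate(a + 0.5 * dt * k1)
        k3 = rate(a + 0.5 * dt * k2)
        k4 = rate(a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

# Hypothetical values: A0 = 1.0, [B] = 2.0, c = 0.5, simulated for 3 time units.
a_numeric = simulate_decay(a0=1.0, b=2.0, c=0.5, t_end=3.0)

# With [B] constant the equation has the closed form A0 * exp(-c*[B]*t),
# which gives a check on the integrator.
a_exact = 1.0 * math.exp(-0.5 * 2.0 * 3.0)
```

With [B] held constant the numerical result can be checked against the closed-form exponential. A real model couples many such equations, so no closed form is available and a numerical solver is required.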
Mathematical Model [Figure: model equations with terms annotated as synthesis, degradation, binding, activation, and inactivation]
Simulation of the budding yeast cell cycle [Figure: simulated time courses of mass, CKI, Cln2, Clb2, Cdh1, and Cdc20 against time (min), with the G1 and S/M phases marked]
Parameter values: k1 = 0.0013, v2’ = 0.001, v2” = 0.17, k3’ = 0.02, k3” = 0.85, k4’ = 0.01, k4” = 0.9, J3 = 0.01, J4 = 0.01, k9 = 0.38, k10 = 0.2, k5’ = 0.005, k5” = 2.4, J5 = 0.5, k6 = 0.33, k7 = 2.2, J7 = 0.05, k8 = 0.2, J8 = 0.05, … [Figure: panels labeled Parameter values, Differential equations, and Experimental Data]
Tyson’s Budding Yeast Model • Tyson’s model contains over 30 ODEs, some nonlinear. • Events can cause concentrations to be reset. • About 140 rate constant parameters • Most are unavailable from experiment and must be set by the modeler
Fundamental Activities • Collect information • Search literature (databases), Lab notebooks • Define/modify models • A user interface problem • Run simulations • Equation solvers (ODEs, PDEs, deterministic, stochastic) • Compare simulation results to experimental data • Analysis
Our Mission: Build Software to Help the Modelers • Typical cycle time for changing the model used to be one month • Collect data on paper lab notebooks • Convert to differential equations by hand • Calibrate the model by trial and error • Inadequate analysis tools • Goal: Change the model once per day. • Bottleneck should shift to the experimentalists
Another View • Current models of simple organisms contain a few tens of equations. • Modeling mammalian systems might require two orders of magnitude more complexity. • We hope our current vision for tools can supply one order of magnitude. • The other order of magnitude is an open problem.
JigCell Current Primary Software Components: • JigCell Model Builder • JigCell Run Manager • JigCell Comparator • Automated Parameter Estimation (PET) • Bifurcation Analysis (Oscill8) http://jigcell.biol.vt.edu
[Figure: JigCell component diagram. Boxes: Model Builder, Run Manager, Comparator, Parameter Optimizer. Arrow labels: Parameter Values, Optimum Parameter Values.]
JigCell Model Builder • From a wiring diagram…
JigCell Model Builder • …to a reaction mechanism (N.B. parameters are given names, not numerical values!) • …to ordinary differential equations (ode files, SBML)
Mutations • Wild type cell • Mutations • Typically caused by gene knockout • Consider a mutant with no B to degrade A. • Set c = 0 • We have about 130 mutations • each requires a separate simulation run
Run Manager • Inheritance patterns [Figure: a Basal Set (wild-type) with Derived Sets for mutants A, B, C, A’, AB, and A’C]
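The basal/derived pattern can be sketched with layered parameter sets, where a derived set stores only the values that differ from its parent. This is an illustration, not the Run Manager's actual data structure: the helper derive, the parameter names other than c, all numerical values, and the choice of which mutants derive from which are assumptions.

```python
from collections import ChainMap

# Hypothetical wild-type (basal) parameter set; only the name c comes from
# the degradation example, the rest are made up for illustration.
basal = {"c": 0.5, "k_syn": 1.0, "k_deg": 0.1}

def derive(parent, **overrides):
    """Return a parameter set that overrides some values and inherits the rest."""
    return ChainMap(overrides, parent)

# A mutant with no B to degrade A: set c = 0 and inherit everything else.
mutant_a = derive(basal, c=0.0)
# A second hypothetical single mutant, and a double mutant built on mutant_a.
mutant_b = derive(basal, k_syn=0.0)
mutant_ab = derive(mutant_a, k_syn=0.0)

# A change to the basal set propagates to every derived set that
# does not override that parameter.
basal["k_deg"] = 0.2
```

Because each derived set records only its differences, recalibrating a wild-type rate constant updates every mutant at once instead of requiring separate edits for each simulation run.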
Phenotypes • Each mutant has some observed outcome (“experimental” data). Generally qualitative. • Cell lived • Cell died in G1 phase • Model should match the experimental data. • Model should not be overly sensitive to the rate constants. • Overly sensitive biological systems tend not to survive
Comparator • Visualize results [Figure: Comparator views labeled Kumagai1 and Kumagai2]
Optimization • How to decide on parameter values? • Key features of optimization • Each candidate solution (here, a set of parameter values) is a point in multidimensional space • Each point can be assigned a value by an objective function • The goal is to find the best point in the space as defined by the objective function • We usually settle for a “good” point
Parameter Optimization • Error function • Orthogonal distance regression • Levenberg-Marquardt algorithm
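To make the Levenberg-Marquardt idea concrete, here is a minimal one-parameter sketch in pure Python. It rests on several assumptions: the model is a simple exponential decay, the data are synthetic and noise-free, and the error function is ordinary least squares on the vertical residuals rather than orthogonal distance regression. The function names are hypothetical and this is not the PET implementation.

```python
import math

def model(t, k, a0=1.0):
    """Illustrative model: exponential decay a0 * exp(-k * t)."""
    return a0 * math.exp(-k * t)

def sum_sq(k, ts, ys):
    """Objective function: sum of squared residuals."""
    return sum((y - model(t, k)) ** 2 for t, y in zip(ts, ys))

def fit_levenberg_marquardt(ts, ys, k0, lam=1e-3, iters=100, tol=1e-12):
    """Fit the single rate constant k with a damped Gauss-Newton iteration."""
    k = k0
    cost = sum_sq(k, ts, ys)
    for _ in range(iters):
        residuals = [y - model(t, k) for t, y in zip(ts, ys)]
        jac = [-t * model(t, k) for t in ts]          # d(model)/dk
        jtj = sum(j * j for j in jac)
        jtr = sum(j * r for j, r in zip(jac, residuals))
        # Damped normal equation (scalar case): (JtJ + lam*JtJ) * step = Jtr
        step = jtr / (jtj * (1.0 + lam))
        if abs(step) < tol:
            break
        new_cost = sum_sq(k + step, ts, ys)
        if new_cost < cost:
            # Accept the step and trust the quadratic model more.
            k, cost, lam = k + step, new_cost, lam / 10.0
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    return k

# Synthetic, noise-free data generated with a hypothetical true rate of 0.7.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [model(t, 0.7) for t in ts]
k_fit = fit_levenberg_marquardt(ts, ys, k0=0.1)
```

The damping parameter lam interpolates between Gauss-Newton steps (small lam) and short, cautious steps (large lam). A fit of a full cell cycle model would use a vector of parameters and sum the objective over many experiments.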
Parameter Optimization Only 1 experiment shown here. The model must be fitted simultaneously to many different experiments.
Composition Motivation • Models are reaching the limits of manageability due to an increase in: • Size • Complexity • Making a model suitable for stochastic simulation increases the number of reactions by a factor of 3-5. • Models of the mammalian cell cycle will require 100-1000 reactions (even more for stochastic simulation).
Model Composition • Notice that the yeast cell diagram contains natural components
Composition Processes • Fusion • Merging two or more existing models • Composition • Build up model hierarchy from existing models by describing their interactions and connections • Aggregation • Connects modular blocks using controlled interfaces (ports) • Flattening • Convert hierarchy back into a single “flat” model for use with standard simulators
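As an illustration of fusion, the sketch below merges two small reaction lists using a species mapping table. The reaction representation, the function names rename and fuse, the species and rate names, and the rule that reactions with identical reactants and products count as duplicates are all assumptions made for the example; they do not describe the Composition Wizard's actual behavior.

```python
def rename(reaction, mapping):
    """Apply a species mapping to one reaction (reactants, products, rate name)."""
    reactants, products, rate = reaction
    return (
        tuple(sorted(mapping.get(s, s) for s in reactants)),
        tuple(sorted(mapping.get(s, s) for s in products)),
        rate,
    )

def fuse(model_a, model_b, species_map):
    """Merge model_b into model_a.

    species_map sends model_b species names to their model_a equivalents.
    Reactions with the same reactants and products are treated as duplicates,
    and the copy from model_a is kept.
    """
    fused, seen = [], set()
    candidates = [rename(r, {}) for r in model_a]
    candidates += [rename(r, species_map) for r in model_b]
    for reaction in candidates:
        key = reaction[:2]
        if key not in seen:
            seen.add(key)
            fused.append(reaction)
    return fused

# Two hypothetical models that describe the same degradation step
# under different species names.
model_a = [((), ("A",), "k_syn_A"), (("A", "B"), ("B",), "c")]
model_b = [((), ("protB",), "k_syn_B"), (("protA", "protB"), ("protB",), "c2")]
fused = fuse(model_a, model_b, {"protA": "A", "protB": "B"})
```

The result is a single flat reaction list in which the shared degradation step appears once. Composition and aggregation would instead keep the submodels intact and record how they connect.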
Composition Wizard • Final Species Mapping Table
Composition Wizard • Final Reaction Mapping Table
Composition in SBML • Virginia Tech’s proposed language features to support composition/aggregation are being written into the forthcoming SBML Level 3 definition
Stochastic Simulation • ODE-based (deterministic) models cannot explain behaviors introduced by the random nature of the system. • Variations in mass at division • Variations in time of events • Differences in gross outcomes
Gillespie’s Stochastic Simulation Algorithm • There is a population for each chemical species • There is a “propensity” for each reaction, determined in part by the populations • Each reaction changes the populations of its associated species • Loop: • Pick the next reaction (at random, weighted by propensity) • Update populations and propensities • Slow, but there are approximations to speed it up
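The loop above can be sketched as a small pure-Python version of Gillespie's direct method. The example system (species A degraded in the presence of B, with propensity c·A·B), the function name gillespie, and all numerical values are assumptions chosen for illustration.

```python
import random

def gillespie(populations, reactions, t_end, seed=None):
    """Direct-method stochastic simulation.

    reactions is a list of (propensity_function, change) pairs, where
    propensity_function(state) returns a non-negative rate and change maps
    species names to integer population changes.
    """
    rng = random.Random(seed)
    state = dict(populations)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        props = [prop(state) for prop, _ in reactions]
        total = sum(props)
        if total <= 0.0:
            break                       # no reaction can fire any more
        t += rng.expovariate(total)     # waiting time to the next reaction
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative, chosen = 0.0, None
        for a, (_, change) in zip(props, reactions):
            if a <= 0.0:
                continue
            chosen = change
            cumulative += a
            if threshold < cumulative:
                break
        for species, delta in chosen.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory

# Hypothetical system: each firing removes one molecule of A; B is unchanged.
c = 0.01
reactions = [(lambda s: c * s["A"] * s["B"], {"A": -1})]
trajectory = gillespie({"A": 100, "B": 10}, reactions, t_end=1000.0, seed=1)
```

Each run with a different seed gives a different trajectory, which is how variation in the timing of events appears. The cost grows with the number of reaction firings, which is why approximate methods are used for larger models.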
Comments on Collaboration • Domain team routinely underestimates how difficult it is to create reliable and usable software. • CS team routinely underestimates how difficult it is to stay focused on the needs of the domain team. • Partial solution: truly integrate.
How to Succeed in CBB • Programming skills are necessary but not sufficient • Math is usually the biggest bottleneck • Statistics for Bioinformatics • Numerical analysis, optimization, differential equations for computational biology • Chemistry/biochemistry are good choices for domain knowledge • You have to have an “interdisciplinary attitude”