little b, a language for building modular models
sri international, palo alto, ca
weds feb 6, 2008
aneil mallavarapu, department of systems biology, harvard medical school
today, models are monolithic and used only by a small cadre of computational biologists
• biological complexity requires detailed accounting at the level of mathematics
• how can models become a part of everyday scientific life, as gene sequences have become?

little b:
• a high-level, terse, modular modeling language
• designed for specifying biological systems
• and generating mathematical models
[figure: little b sits between a KNOWLEDGE REPRESENTATION (molecules & reactions, cellular structure, mechanisms and kinetic constants, pathways, …) and an ANALYTIC REPRESENTATION (ode: simulation or steady-state analysis; pde: simulation; stochastic; lattice; pi: model checking; … analysis methods), with matlab shown as a target]
representation: high and low
• knowledge representation (biocyc, ingenuity, biopax): high-level specification for search and inference
• model specification languages (little b, bng, kappa, sbml level 3): mid-level specification for model generation
• analytic languages (matlab, mathematica, pi calculus, maude, kappa, numerica, maple, etc.): low-level specification for model execution
motivations…
• molecular complexity
• location complexity
[figure: molecular complexity illustrated with monomers A and B; location complexity with species X, Y, Z]

motivations…
• reaction complexity: does order, distributivity, processivity matter?
• modeling assumptions affect the mathematical structure of the model
[figure: a one-step mechanism (E, S, P) vs a mechanism with an ES intermediate (E + S, ES, E + P)]
models combine reactions and species in compartments

model 1 (S → P, catalyzed by E):
$$\frac{dE}{dt} = 0, \qquad \frac{dS}{dt} = -[E][S]\,k_{ES}, \qquad \frac{dP}{dt} = [E][S]\,k_{ES}$$

model 2 (E + P → X):
$$\frac{dE}{dt} = -[E][P]\,k_{EP}, \qquad \frac{dP}{dt} = -[E][P]\,k_{EP}, \qquad \frac{dX}{dt} = [E][P]\,k_{EP}$$

changes required to add model 2 to model 1:
$$\frac{dE}{dt} = -[E][P]\,k_{EP}, \qquad \frac{dS}{dt} = -[E][S]\,k_{ES}, \qquad \frac{dP}{dt} = [E][S]\,k_{ES} - [E][P]\,k_{EP}, \qquad \frac{dX}{dt} = [E][P]\,k_{EP}$$
translating to matlab code introduces further bookkeeping requirements…

model 1, y = [E S P]:
    function dy = rates(t,y)
      kes = .2;                    % rate constant for E + S
      dy = zeros(3,1);
      dy(1) = 0;                   % E
      dy(2) = -y(1)*y(2)*kes;      % S
      dy(3) =  y(1)*y(2)*kes;      % P
    end

model 2, y = [E P X]:
    function dy = rates(t,y)
      kep = .5;                    % rate constant for E + P
      dy = zeros(3,1);
      dy(1) = -y(1)*y(2)*kep;      % E
      dy(2) = -y(1)*y(2)*kep;      % P
      dy(3) =  y(1)*y(2)*kep;      % X
    end

models 1 + 2, y = [E S P X]:
    function dy = rates(t,y)
      kep = .5;
      kes = .2;
      dy = zeros(4,1);
      dy(1) = -y(1)*y(3)*kep;                  % E
      dy(2) = -y(1)*y(2)*kes;                  % S
      dy(3) =  y(1)*y(2)*kes - y(1)*y(3)*kep;  % P
      dy(4) =  y(1)*y(3)*kep;                  % X
    end
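The bookkeeping above can be delegated to a program. The following is a minimal Python sketch, not little b's implementation: `mass_action_rhs`, `MODEL_1`, and `MODEL_2` are hypothetical names, and each reaction is assumed to follow mass-action kinetics. It derives the rate expressions from reaction lists, so combining the two models is list concatenation rather than hand-editing equations.

```python
from collections import Counter

def mass_action_rhs(reactions):
    """Return {species: rate expression} for a list of reactions.

    Each reaction is (reactants, products, rate_constant_name); the flux is
    the product of the reactant concentrations times the rate constant.
    """
    terms = {}
    for reactants, products, k in reactions:
        flux = "".join(f"[{r}]" for r in reactants) + k
        net = Counter(products)
        net.subtract(Counter(reactants))  # net stoichiometry per species
        for species, n in net.items():
            terms.setdefault(species, [])
            if n:  # a catalyst such as E in model 1 has net change 0
                sign = "+" if n > 0 else "-"
                coeff = "" if abs(n) == 1 else str(abs(n))
                terms[species].append(sign + coeff + flux)
    return {s: " ".join(t) if t else "0" for s, t in terms.items()}

MODEL_1 = [(("E", "S"), ("E", "P"), "kES")]   # S -> P, catalyzed by E
MODEL_2 = [(("E", "P"), ("X",), "kEP")]       # E + P -> X

# Combining the models is just concatenating their reaction lists.
combined = mass_action_rhs(MODEL_1 + MODEL_2)
for species in sorted(combined):
    print(f"d{species}/dt = {combined[species]}")
```

Adding a third model would mean appending its reactions to the list; no existing expression has to be edited by hand.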
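assumptions affect mathematical structure
1. mechanism (e.g., # steps)

one step (E, S, P):
$$\frac{dE}{dt} = 0, \qquad \frac{dS}{dt} = -[E][S]\,k_{[E][S]}, \qquad \frac{dP}{dt} = [E][S]\,k_{[E][S]}$$

two steps, via the intermediate ES (6 changes):
$$\frac{dE}{dt} = [ES]\,k_{[ES]} - [E][S]\,k_{[E][S]}, \qquad \frac{dS}{dt} = -[E][S]\,k_{[E][S]}$$
$$\frac{dP}{dt} = [ES]\,k_{[ES]}, \qquad \frac{dES}{dt} = [E][S]\,k_{[E][S]} - [ES]\,k_{[ES]}$$

reversible binding (3 more changes): add $+[ES]\,k_{[ES]\mathrm{rev}}$ to $dE/dt$ and to $dS/dt$, and $-[ES]\,k_{[ES]\mathrm{rev}}$ to $dES/dt$

6 changes + 3 = total of 9 changes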
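assumptions affect mathematical structure (2)
2. kinetics

mass-action: m1R1 + m2R2 … + mnRn → …
$$\frac{dE}{dt} = [ES]\,k_{[ES]} - [E][S]\,k_{[E][S]}, \qquad \frac{dS}{dt} = -[E][S]\,k_{[E][S]}$$
$$\frac{dP}{dt} = [ES]\,k_{[ES]}, \qquad \frac{dES}{dt} = [E][S]\,k_{[E][S]} - [ES]\,k_{[ES]}$$

hill: E + S → …
$$\frac{dE}{dt} = [ES]\,k_{[ES]} - [E]\left(\frac{[S]}{K+[S]}\right)h_{ES}, \qquad \frac{dS}{dt} = -[E]\left(\frac{[S]}{K+[S]}\right)h_{ES}$$
$$\frac{dP}{dt} = [ES]\,k_{[ES]}, \qquad \frac{dES}{dt} = [E]\left(\frac{[S]}{K+[S]}\right)h_{ES} - [ES]\,k_{[ES]}$$

in both cases the reversible-binding terms apply: $+[ES]\,k_{[ES]\mathrm{rev}}$ in $dE/dt$ and $dS/dt$, $-[ES]\,k_{[ES]\mathrm{rev}}$ in $dES/dt$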
today the modeling community thinks about
• representation
• analytic methods
• semantics
• syntax

modeling language “ilities”
• readability
• writeability
• shareability
• reusability
• modularity
• composability
• extensibility
• verifiability
• affordability
• adaptability
• dependability
• simplicity
“Programs must be written for people to read, and only incidentally for machines to execute.”
Abelson & Sussman, Structure and Interpretation of Computer Programs
toy egf receptor model - parts:
[figure: egf, egfr, “egfr+egf”, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk*]
the awesome power of lisp macros: code which writes code … develop concise notations for particular purposes
toy egf receptor model - reactions:
[figure: egf, egfr, “egfr+egf”, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk*]
toy egf receptor model - modular mechanism:
[figure: mapkk + mapk giving mapkk + mapk*, directly and via an ES complex]
[enzymatic-reaction {mapkk} {mapk} {mapk*} '(:irreversible)]
[enzymatic-reaction {mapkk} {mapk} {mapkk} '(:reversible :irreversible)]
…
enzymatic-reaction generates reaction-types on your behalf
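To illustrate what "generates reaction-types on your behalf" could look like, here is a Python sketch of a mechanism function. It is an assumption-laden analogue, not little b's `enzymatic-reaction`: the function name, the tuple format, and the `enzyme-substrate` naming of the complex are all hypothetical. One step yields a direct conversion; two steps introduce an ES complex.

```python
def enzymatic_reaction(enzyme, substrate, product, steps=("irreversible",)):
    """Return reactions as (reactants, products, reversible) tuples.

    steps has one entry per mechanistic step, each either "reversible"
    or "irreversible" (hypothetical encoding of the slide's keywords).
    """
    if len(steps) == 1:
        # E + S -> E + P in a single step
        return [((enzyme, substrate), (enzyme, product),
                 steps[0] == "reversible")]
    if len(steps) == 2:
        # E + S -> ES, then ES -> E + P
        es_complex = f"{enzyme}-{substrate}"  # hypothetical naming scheme
        return [
            ((enzyme, substrate), (es_complex,), steps[0] == "reversible"),
            ((es_complex,), (enzyme, product), steps[1] == "reversible"),
        ]
    raise ValueError("this sketch handles one- or two-step mechanisms only")

one_step = enzymatic_reaction("mapkk", "mapk", "mapk*")
two_step = enzymatic_reaction("mapkk", "mapk", "mapk*",
                              ("reversible", "irreversible"))
for reaction in two_step:
    print(reaction)
```

Switching mechanisms changes only the `steps` argument; the caller never writes the intermediate complex or its reactions by hand.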
modular kinetics
implemented as a function: …
an extensible system which permits customization and sharing of kinetics

in the future we can imagine …
• libraries of such components have been previously defined by experts, and are available
    • over the web
    • in a database in your lab
    • in your own personal collection
• b enables these parts to be combined

let’s describe a situation composed of predefined parts:
[figure: a dish containing cell-a, with egf, egfr, mapkkk, mapkkk*, mapkk, ES complex (mapkkk*-mapkk), mapkk*, mapk, ES complex (mapkk*-mapk), mapk*]
a base language and libraries
common lisp (ANSI X3J13) +
• additional syntax
    [ ] .   objects and field-access
    { }     infix (math) syntax
• in-memory object database
• logic rules
    (defrule monkey-grabs-banana
      [near (?m [monkey ?name]) (?b [banana ?bnum])]
      =>
      [has ?m ?b])
    [near [monkey 'bonzo] [banana 1]]
• definition language
    defcon - object definition
    defield - object methods
    defprop - properties
    defrule - logic rules
• extensible library system
    species, reactions, compartments
• interactive environment
• graph manipulation
• symbolic mathematics
• translator to matlab
• units and dimensions
little b builds symbolic mathematical expressions:
“object-oriented syntax meets symbolic math”
enables programmers and theorists to write & debug functions which translate between the world of objects and the world of mathematical expressions.
[figure: the dish / cell-a model with egf, egfr, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk* and the two ES complexes]
print-eval consistency:
• objects print as code
• fragments of expressions can be copied, pasted, evaluated
… a useful manipulation and inspection capability
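The same property can be shown outside Lisp. As a rough analogue (an illustration in Python, not little b; the `Species` class and its fields are hypothetical), a dataclass prints as the expression that constructs it, so the printed text can be pasted back and evaluated to obtain an equal object.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Species:
    """Hypothetical stand-in for a model object."""
    name: str
    location: str

s = Species("mapk", "cell-a")

printed = repr(s)        # the object prints as constructor code
rebuilt = eval(printed)  # evaluating the printed text rebuilds the object

print(printed)
print(rebuilt == s)
```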
set initial conditions…
[figure: code with a brace marking the shorthand for setting the initial condition of all species of a particular type]
and perform numerical integration in matlab

extend the model with a phosphatase:
[figure: mkp-gene and mkp added to the mapkkk, mapkk, mapk, mapk* cascade]

recap
• shareable, modular biochemical models
• symbolic language for describing and manipulating objects
• symbolic math system brings mathematics and objects together
• a terse, readable, writeable format
• in-memory database and reasoning capability

Section II
• molecular complexity
• location complexity
[figure: molecular complexity illustrated with monomers A and B; location complexity with species X, Y, Z]
molecular complexes
• graphs specify physical connectivity
• built from components called “monomers”:
(defmonomer erb L D C)
a monomer with 3 sites; defines a class of labelled undirected graphs built upon this graph
[figure: the erb receptor drawn as a node with sites L, D, C]

bond sites connect
(defmonomer erb L D C)
sites can be given by order, by name, or implicitly:
[erb 1 1 _] [erb L.1 D.1 C._] [R L.1 D.1] [R 1 1] [erb _ 1 1] [erb c.1 a._ b.1]

connecting monomers
[[erb D.1][erb D.1]]
[figure: two erb monomers (sites L, D, C) joined at their D sites]

connecting monomers
(defmonomer egf R)
[[erb L.1][egf R.1]]
[figure: egf (site R) joined to erb (sites L, D, C) at site L]
specifying classes of reactions with patterns using the wildcard *:
{[erb L._ D.* C.*] + [egf R._] ->> [[erb L.1 D.* C.*][egf R.1]]}
reaction arrows: ->> and <<->>
[figure: egf (site R) binding erb at site L, with sites D and C marked *]

the rest-bindings: ** , __
{[erb L._ D._ C._] + [egf R._] ->> [[erb L.1 D._ C._][egf R.1]]}
{[erb __] + [egf R._] ->> [[erb L.1 __][egf R.1]]}

{[erb L._ D.* C.*] + [egf R._] ->> [[erb L.1 D.* C.*][egf R.1]]}
{[erb L._ **] + [egf R._] ->> [[erb L.1 **][egf R.1]]}
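The paired patterns above suggest that a rest-binding stands for every site not named explicitly: `__` fills the remaining sites with `_`, and `**` fills them with `*`. The Python sketch below spells out that reading. It is inferred from the slide's examples only; `expand_pattern` and `render` are hypothetical helpers, not part of little b.

```python
ERB_SITES = ("L", "D", "C")  # from (defmonomer erb L D C)

# Assumed meaning of the rest-bindings, inferred from the paired examples:
# "**" fills unnamed sites with "*", "__" fills them with "_".
REST_FILL = {"**": "*", "__": "_"}

def expand_pattern(all_sites, named, rest=None):
    """Return a full {site: value} mapping for a monomer pattern."""
    expanded = {}
    for site in all_sites:
        if site in named:
            expanded[site] = named[site]
        elif rest in REST_FILL:
            expanded[site] = REST_FILL[rest]
        else:
            raise ValueError(f"site {site} unspecified and no rest-binding")
    return expanded

def render(monomer, bindings):
    """Print a pattern in the bracketed site.value notation."""
    body = " ".join(f"{site}.{value}" for site, value in bindings.items())
    return f"[{monomer} {body}]"

print(render("erb", expand_pattern(ERB_SITES, {"L": "_"}, "**")))
print(render("erb", expand_pattern(ERB_SITES, {}, "__")))
print(render("erb", expand_pattern(ERB_SITES, {"L": "1"}, "__")))
```

Under this reading, a pattern written with a rest-binding keeps working if a monomer later gains sites, since unnamed sites are filled in at expansion time.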
state sites encode state
(defmonomer mapk
  (b :documentation "binding site")
  (p :states (member :u :p)
     :documentation "phosphorylation site"))
state :u: [mapk _ :u] or [mapk p.u]
state :p: [mapk _ :p] or [mapk p.p]
[figure: mapk with sites b and p, with p in state :u or :p]

Four members of the ERB family
erbb1 erbb2 erbb3 erbb4 (ligands shown: egf, hrg)
• bind ligands
• form homo- and hetero-dimers
• complex with internal components and external components
• internalized into subcellular compartments
carlos lopez (sorger lab), will chen (sorger lab)
with-substitution-table
; define 4 erb receptors with common structure
(with-substitution-table
    ($NAME erbb1 erbb2 erbb3 erbb4)
  (defmonomer $name L D C))
Expands to
(progn (defmonomer erbb1 L D C)
       (defmonomer erbb2 L D C)
       (defmonomer erbb3 L D C)
       (defmonomer erbb4 L D C))

receptor dimerization
[figure: two ligand-bound receptors (egf/hrg/nrg at site R, bound to ErbBi and ErbBj, each with sites L, D, C) combining into a dimer]
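The expansion above is plain template substitution. As a rough Python analogue (string-based, whereas a Lisp macro rewrites code as data; `with_substitution_table` here is a hypothetical helper, not the little b macro), each value is substituted for the variable to produce one form.

```python
def with_substitution_table(variable, values, template):
    """Return one copy of template per value, with variable replaced."""
    return [template.replace(variable, value) for value in values]

forms = with_substitution_table(
    "$NAME",
    ["erbb1", "erbb2", "erbb3", "erbb4"],
    "(defmonomer $NAME L D C)",
)

for form in forms:
    print(form)
```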
with-data-table
(with-data-table (:rows ($R1 $L) :cols $R2 :cells ($Kf $Kr) :ignore _)
    ((               erbb1    erbb2    erbb3    erbb4)
     ((erbb1 egf)   (.1 .3)  (.1 .3)  (.2 .7)  (.4 .01))
     ((erbb3 hrg)   (.2 .2)  _        (.1 .1)  (.1 .1))
     ((erbb4 hrg)   (.3 .7)  (.1 .7)  (.4 .6)  (.8 .1)))
  [{[[$R1 L.1 **][$L R.1]] + [$R2 D._ __] <<->>
    [[$R1 L.1 D.2 **][$L R.1][$R2 D.2 __]]}
   :documentation "Receptor-ligand Binding"
   (.set-rate-function 'mass-action :fwd $Kf :rev $Kr)])
Expands to 11 reversible reactions

lisp is the metalanguage
erbb1.documentation     =R=>  (FLD ERBB1 :DOCUMENTATION)
erbb1.(in cell.membrane) =R=>  (FLD ERBB1 :IN (FLD CELL :MEMBRANE))
[[erb D.1][erb D.1]]    =R=>  (OBJECT COMPLEX-SPECIES-TYPE (QUOTE (ERB (FLD D 1)) (ERB (FLD D 1))))
{A + B}                 =R=>  (MATH (+OP A B))
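The count of 11 follows from the table: 3 rows times 4 columns gives 12 cells, and the one cell marked `_` is ignored. A Python sketch of that expansion, assuming each remaining cell yields one binding of the template variables (the `ROWS`/`COLS` layout and `expand_table` are hypothetical, with `None` standing in for `_`):

```python
COLS = ["erbb1", "erbb2", "erbb3", "erbb4"]  # values for $R2

# ((R1, L), cells): each cell is (Kf, Kr); None marks an ignored "_" cell.
ROWS = [
    (("erbb1", "egf"), [(.1, .3), (.1, .3), (.2, .7), (.4, .01)]),
    (("erbb3", "hrg"), [(.2, .2), None,     (.1, .1), (.1, .1)]),
    (("erbb4", "hrg"), [(.3, .7), (.1, .7), (.4, .6), (.8, .1)]),
]

def expand_table(rows, cols):
    """Return one variable binding per non-ignored table cell."""
    bindings = []
    for (r1, ligand), cells in rows:
        for r2, cell in zip(cols, cells):
            if cell is None:
                continue  # ignored cell: no reaction generated
            kf, kr = cell
            bindings.append(
                {"R1": r1, "L": ligand, "R2": r2, "Kf": kf, "Kr": kr}
            )
    return bindings

reactions = expand_table(ROWS, COLS)
print(len(reactions))
print(reactions[0])
```

Each binding would then be substituted into the reaction template, giving one reversible reaction with its own forward and reverse rate constants.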
377 species-types 862 reaction-types Schoeberl et al, Nature Biotech 2002
w/ Carlos Lopez, Will Chen, Peter Sorger
[figure: erbb1-4 signaling network with egf, hrg, ras, Pip3, shc, PI3K, grb, sos, raf, dep1, phosphatases, vesicle, endosome, mek, ptp1b, mkp, kinase cascade, erk, pdk1, pdk2, akt]
• 19 user-specified monomers
• 29 user-specified reaction-patterns
• 442 lines (incl. comments, spaces), ~6.5 pages of code
• 247 complex-reaction-types
• 742 species-types (complexes)
• 4947 reaction-types
• 975 species
• 10,187 lines of Matlab code
visualization B-USER >[[erb D.1][erb D.1 L.2][egf 2]].show
Section III
• molecular complexity
• location complexity
[figure: molecular complexity illustrated with monomers A and B; location complexity with species X, Y, Z]

multi-compartmental reactions
[figure: a dish containing cell-a and cell-b, with ions and ion-channels]

multicellular / multicompartmental
[figure: er, nucleus and mito compartments, with a brace marking membrane apposition]
von dassow et al. wrote “ingeneue” software to investigate multicellular models
• topology is relatively robust to parametric variation:
    • ~49 params
    • 1/200 randomly chosen parameter sets produce pattern successfully
The segment polarity network is a robust developmental module - von Dassow et al., 2000

accounting for realistic cellular lattices
• ingeneue’s modulo-6 arithmetic: S(c,m) reacts with S(c+1,mod(m+6,6))
• representing this realistic lattice requires reasoning about geometry and computing location-sensitive identities of species
matt thomson
[figure: three cells with faces numbered 1-6]