little b short course
august 2008
harvard medical school systems biology summer course
aneil mallavarapu
outline
• the modeler’s burden
• overview of a toy model
• using the bgui, the little b environment
• molecular complexes
• locations
• data-driven code generation
little b
• computer language which automates model construction
• reads text files which describe biochemistry
• produces code suitable for simulation / analysis (matlab / jacobian / others…)
• full-featured programming language based in Lisp
• programmer’s toolkit useful for building modeling tools
I: the modeler’s burden
the difficulty of creating models by hand
molecular complexity
[figure: complexes formed by KSR, MEK, 14-3-3 and C-TAK1, with phosphorylation (P) marks]
a small number of molecules and interactions results in a large number of actual reactions and complexes
affects mathematical structure
reaction-complexity…
[figure: enzyme E converting S to P, vs. E + S, ES and P drawn explicitly]
explicit representation of the enzyme-substrate complex
processive versus distributive reactions
the detailed mechanistic behavior of enzymes, and modeling assumptions
… also affects mathematical structure
location complexity
[figure: E, S and P in the cytoplasm and in the nucleus]
distinct mathematical variables account for species in different locations
reasoning about biochemistry
[figure: reaction diagrams involving E, S, P and X, labelled “implies” and “vs.”]
multicellular complexity
[figure: a network of hh, wg, ptc, en, cid and the nodes HG, PH, WG, EN, PTC, CID, CN, with nodes repeated many times]
geometric complexity
in the real world, lattice structure is irregular
computer-assisted model construction
PRO
• ↓ entry errors
• ↑ models → systematic model exploration
• reuse formal knowledge
• enables computer-assisted verification
CON
• bugs → computer-generated errors
• loss of control & understanding
• requires verification
models combine
reactions and species in compartments
species and reactions …

model 1 (species E, S, P)
dE/dt = 0
dS/dt = - [E][S] kES
dP/dt = [E][S] kES

model 2 (species E, P, X)
dE/dt = - [E][P] kEP
dP/dt = - [E][P] kEP
dX/dt = [E][P] kEP

models 1 + 2 (species E, S, P, X)
dE/dt = - [E][P] kEP
dS/dt = - [E][S] kES
dP/dt = [E][S] kES - [E][P] kEP
dX/dt = [E][P] kEP

3 changes required to add model 2 to model 1
translating to matlab code introduces further bookkeeping requirements…

model 1: y = [E; S; P]
dE/dt = 0
dS/dt = - [E][S] kES
dP/dt = [E][S] kES

    function dy = rates(t,y)
      kes = .2;                % rate constant, defined inside the function so it is in scope
      dy = zeros(3,1);
      dy(1) = 0;               % E
      dy(2) = -y(1)*y(2)*kes;  % S
      dy(3) = y(1)*y(2)*kes;   % P
    end

model 2: y = [E; P; X]
dE/dt = - [E][P] kEP
dP/dt = - [E][P] kEP
dX/dt = [E][P] kEP

    function dy = rates(t,y)
      kep = .5;
      dy = zeros(3,1);
      dy(1) = -y(1)*y(2)*kep;  % E
      dy(2) = -y(1)*y(2)*kep;  % P
      dy(3) = y(1)*y(2)*kep;   % X
    end

models 1 + 2: y = [E; S; P; X]
dE/dt = - [E][P] kEP
dS/dt = - [E][S] kES
dP/dt = [E][S] kES - [E][P] kEP
dX/dt = [E][P] kEP

    function dy = rates(t,y)
      kep = .5;
      kes = .2;
      dy = zeros(4,1);
      dy(1) = -y(1)*y(3)*kep;                 % E (P is now y(3), not y(2) as in model 2)
      dy(2) = -y(1)*y(2)*kes;                 % S
      dy(3) = y(1)*y(2)*kes - y(1)*y(3)*kep;  % P
      dy(4) = y(1)*y(3)*kep;                  % X
    end
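The index bookkeeping above is the kind of work a program can take over. As a minimal sketch in Python (not little b, and not how little b is implemented), the function below builds the rate equations from a list of mass-action reactions, so that combining two models is list concatenation. The names `mass_action_rhs`, `model_1` and `model_2` are made up for this illustration; the rate constants are the ones on the slide.

```python
def mass_action_rhs(species, reactions):
    """Build dy/dt for mass-action reactions given as (reactants, products, k)."""
    # The species -> index table is the bookkeeping done by hand on the slide.
    index = {name: i for i, name in enumerate(species)}

    def rhs(t, y):
        dy = [0.0] * len(species)
        for reactants, products, k in reactions:
            rate = k
            for name in reactants:
                rate *= y[index[name]]
            for name in reactants:
                dy[index[name]] -= rate
            for name in products:
                dy[index[name]] += rate
        return dy

    return rhs


# model 1: E converts S to P (E appears on both sides, so dE/dt = 0)
model_1 = [(("E", "S"), ("E", "P"), 0.2)]
# model 2: E + P -> X
model_2 = [(("E", "P"), ("X",), 0.5)]

# Combining the models is concatenation; the indices are worked out automatically.
combined = mass_action_rhs(["E", "S", "P", "X"], model_1 + model_2)
dE, dS, dP, dX = combined(0.0, [1.0, 2.0, 3.0, 4.0])
```

With this representation, adding model 2 to model 1 changes no existing equation by hand; the three changes counted on the slide fall out of the reaction list.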
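assumptions affect mathematical structure
1. mechanism (e.g., # steps)
[figure: E converting S to P in one step, vs. E + S, ES, P with an explicit complex]

one step
dE/dt = 0
dS/dt = - [E][S] k[E][S]
dP/dt = [E][S] k[E][S]

explicit ES complex
dE/dt = [ES] k[ES] - [E][S] k[E][S]
dS/dt = - [E][S] k[E][S]
dP/dt = [ES] k[ES]
dES/dt = [E][S] k[E][S] - [ES] k[ES]
6 changes

reversible binding adds one term to each of three equations
dE/dt: + [ES] k[ES]rev
dS/dt: + [ES] k[ES]rev
dES/dt: - [ES] k[ES]rev
+ 3 = total of 9 changes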
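A small Python sketch of the explicit-complex mechanism, written directly from the equations on this slide. The function and parameter names (`es_mechanism_rhs`, `k_bind`, `k_cat`, `k_rev`) are invented for the example; setting `k_rev` to zero gives the irreversible version. A hand-written model like this can be checked by confirming that total enzyme (E + ES) and total substrate material (S + ES + P) are conserved.

```python
def es_mechanism_rhs(y, k_bind, k_cat, k_rev=0.0):
    """Rates for E + S <-> ES -> E + P, with y = [E, S, ES, P]."""
    E, S, ES, P = y
    bind = k_bind * E * S   # [E][S] k[E][S]
    cat = k_cat * ES        # [ES] k[ES]
    unbind = k_rev * ES     # [ES] k[ES]rev
    dE = cat - bind + unbind
    dS = -bind + unbind
    dES = bind - cat - unbind
    dP = cat
    return [dE, dS, dES, dP]


dE, dS, dES, dP = es_mechanism_rhs([1.0, 2.0, 0.5, 0.0],
                                   k_bind=0.3, k_cat=0.7, k_rev=0.1)
enzyme_balance = dE + dES          # should be zero
substrate_balance = dS + dES + dP  # should be zero
```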
assumptions affect mathematical structure (2)
2. kinetics

mass-action: m1R1 + m2R2 … + mnRn → …
dE/dt = [ES] k[ES] - [E][S] k[E][S]
dS/dt = - [E][S] k[E][S]
dP/dt = [ES] k[ES]
dES/dt = [E][S] k[E][S] - [ES] k[ES]

hill: E + S → …
dE/dt = [ES] k[ES] - [E]([S]/(K+[S]))hes
dS/dt = - [E]([S]/(K+[S]))hes
dP/dt = [ES] k[ES]
dES/dt = [E]([S]/(K+[S]))hes - [ES] k[ES]

reversible binding terms: + [ES] k[ES]rev (dE/dt), + [ES] k[ES]rev (dS/dt), - [ES] k[ES]rev (dES/dt)
a common error when writing multicompartment reactions
m = area of membrane
c = volume of extra-cellular compartment
reaction: L (in c) + R (in m) -> L bound to R (in m), rate constant k

d[R]/dt = - [R][L] k    (moles / area / time)
d[L]/dt = - [R][L] k    (moles / volume / time)    !

• detailed mass balance requires accounting for exact # of moles
• reaction rates are expressed in units of moles / size / time
• size is an area or volume (or even length or point) that the reaction takes place in
Naïve use of mass-action kinetics results in incompatible dimensions for L and R
the right way to write multicompartment reactions
m = area of membrane
c = volume of extra-cellular compartment
L = moles of L
[L] = concentration (L/c)
reaction: L (in c) + R (in m) -> L bound to R (in m), rate constant k

first, calculate the rate of the reaction
RxnRate = [R][L] k        units = moles/area/time

next, calculate the rate fn for L and R in moles
dL/dt = - RxnRate * m = - R/m * L/c * k * m = - R * L * k / c

for concentration, divide by the size of the compartment
d[L]/dt = - RxnRate * m / c = - [R][L] k * m / c
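The recipe can be followed step by step in a few lines of Python. This is a sketch with invented names (`membrane_binding_rates` and its arguments) and arbitrary numbers. The rate for R is obtained by applying the same "divide by the size of the compartment" step with the membrane area, which is an extension of the slide rather than something it states.

```python
def membrane_binding_rates(R_moles, L_moles, k, area, volume):
    """Rates for L (in a volume) binding R (on a membrane of the given area)."""
    R_conc = R_moles / area      # [R], moles / area
    L_conc = L_moles / volume    # [L], moles / volume
    rxn_rate = R_conc * L_conc * k       # moles / area / time
    moles_per_time = -rxn_rate * area    # the same number of moles leaves L and R
    return {
        "dL_moles": moles_per_time,
        "dR_moles": moles_per_time,
        "dL_conc": moles_per_time / volume,  # divide by the size of L's compartment
        "dR_conc": moles_per_time / area,    # divide by the size of R's compartment
    }


rates = membrane_binding_rates(R_moles=2.0, L_moles=6.0, k=0.5, area=4.0, volume=3.0)
naive = -(2.0 / 4.0) * (6.0 / 3.0) * 0.5   # - [R][L] k used for both species
```

Here the naive value matches `dR_conc` but not `dL_conc`, which is the error described on the previous slide: the two concentration rates differ by the factor m / c.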
bookkeeping requirements result from
• model combination
• kinetic assumptions
• code generation
• molecular, location, multicellular and geometric complexity
summary
• bookkeeping requirements arise from:
  • model combination
  • kinetic assumptions
  • code generation
  • biological complexity
• complexity arises from molecular interaction, separation of compartments, reactions and multicellular structure
• computer-assisted model construction requires math and reasoning capabilities
[figure: KNOWLEDGE REPRESENTATION (molecules & reactions, cellular structure, mechanisms and kinetic constants, pathways, mechanisms …) → little b → ANALYTIC REPRESENTATION (ode / simulation or steady state analysis, pde / simulation, kappa | pi / model checking, matlab, … analysis methods)]
recap
• shareable, modular biochemical models
• symbolic language for describing and manipulating objects
• symbolic math system brings mathematics and objects together
• a terse, readable, writeable format
• in-memory database and reasoning capability
II: overview, a toy model
[figure: egf, egfr, “egfr+egf”, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk*]
toy egf receptor model - parts:
[figure: egf, egfr, “egfr+egf”, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk*]
toy egf receptor model - reactions:
[figure: egf, egfr, “egfr+egf”, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk*]
modular kinetics
implemented as a function: …
an extensible system which permits customization and sharing of kinetics
describing a situation
[figure: species egfr, mapkkk, mapkk, mapk, egf, mapkkk*, mapkk*, mapk* and the ES complexes (mapkkk*-mapkk), (mapkk*-mapk), with locations cell-a and dish]
a base language and libraries
common lisp (ANSI X3J13) +

additional syntax: [ ] . { }
• objects and field-access
• infix (math) syntax

in-memory object database
logic rules
    (defrule monkey-grabs-banana
      [near (?m [monkey ?name]) (?b [banana ?bnum])]
      =>
      [has ?m ?b])

    [near [monkey 'bonzo] [banana 1]]

definition language
• defcon – object definition
• defield – object methods
• defprop – properties
• defrule – logic rules

extensible library system
• species, reactions, compartments

interactive environment
graph manipulation
symbolic mathematics
translator to matlab
units and dimensions
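To make the logic-rule idea concrete, here is a toy forward-chaining loop in Python. It is only an analogy for what the monkey-grabs-banana rule expresses: facts are plain tuples, the rule is an ordinary function, and the names `monkey_grabs_banana` and `forward_chain` are made up for this sketch. It does not reflect how little b stores objects or matches patterns.

```python
def monkey_grabs_banana(facts):
    """If a monkey is near a banana, conclude that the monkey has the banana."""
    derived = set()
    for fact in facts:
        if len(fact) == 3 and fact[0] == "near":
            _, m, b = fact
            if m[0] == "monkey" and b[0] == "banana":
                derived.add(("has", m, b))
    return derived


def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known)
        new -= known
        if not new:
            return known
        known |= new


# Analogue of asserting [near [monkey 'bonzo] [banana 1]] into the database.
database = forward_chain(
    {("near", ("monkey", "bonzo"), ("banana", 1))},
    [monkey_grabs_banana],
)
```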
the awesome power of lisp macros: code which writes code … develop concise notations for particular purposes
little b builds symbolic mathematical expressions:
[figure: species egfr, mapkkk, mapkk, mapk, egf, mapkkk*, mapkk*, mapk* and the ES complexes (mapkkk*-mapkk), (mapkk*-mapk), with locations cell-a and dish]
“object-oriented syntax meets symbolic math”
enables programmers and theorists to write & debug functions which translate between the world of objects and the world of mathematical expressions.
print-eval consistency:
• objects print as code
• fragments of expressions can be copied, pasted, evaluated
… a useful manipulation and inspection capability
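The same property can be illustrated in Python, where a dataclass prints as the expression that would construct it. This is an analogy only; the `Species` class and its fields are hypothetical and are not little b objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    name: str
    location: str


original = Species("egfr", "cell-a")
printed = repr(original)   # the printed form is itself valid code
rebuilt = eval(printed)    # pasting it back yields an equal object
```

Because the printed form can be evaluated, any object shown in an interactive session can be copied into a file or an expression and inspected further.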
set initial conditions…
shorthand for setting the initial condition of all species of a particular type
and perform numerical integration in matlab
extend the model with a phosphatase:
[figure: mkp-gene, mkp, mapkkk, mapkk, mapk, mapk*]
recap
• shareable, modular biochemical models
• symbolic language for describing and manipulating objects
• symbolic math system brings mathematics and objects together
• a terse, readable, writeable format
• in-memory database and reasoning capability
the little b gui environment
[screenshot labels: Editor; Interactive input (aka “Listener”); Output; buttons: evaluate code in current tab, evaluate code in file, reset, stop, init – stronger reset]
• input: editor, command-line or file
• like Matlab, Mathematica, Perl and others
• read/eval/print loop:
  • a form is read (symbol, number, or matched parentheses (), {}, [])
  • the form is evaluated
  • the result is printed
  • next form is read
• based in Lisp
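The "read" step of the loop can be sketched in Python as a function that pulls one form at a time out of a string: either an atom (symbol or number) or a run of text with matched (), {}, []. This is a simplification written for illustration, with invented names `read_form` and `read_all`. It ignores strings, comments and the `.` field-access syntax, and it is not the Lisp reader that little b actually uses.

```python
PAIRS = {"(": ")", "{": "}", "[": "]"}
CLOSERS = set(PAIRS.values())


def read_form(text, start=0):
    """Return (form, next_index); form is None when no input is left."""
    i, n = start, len(text)
    while i < n and text[i].isspace():
        i += 1
    if i == n:
        return None, n
    if text[i] in CLOSERS:
        raise ValueError("unexpected closing bracket at %d" % i)
    if text[i] not in PAIRS:
        # Atom: read up to whitespace or any bracket.
        j = i
        while j < n and not text[j].isspace() and text[j] not in PAIRS and text[j] not in CLOSERS:
            j += 1
        return text[i:j], j
    # Bracketed form: track the expected closers on a stack.
    stack = []
    for j in range(i, n):
        ch = text[j]
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            if stack.pop() != ch:
                raise ValueError("mismatched bracket at %d" % j)
            if not stack:
                return text[i:j + 1], j + 1
    raise ValueError("unbalanced form starting at %d" % i)


def read_all(text):
    """Read forms one after another, as the loop would before evaluating each."""
    forms, pos = [], 0
    while True:
        form, pos = read_form(text, pos)
        if form is None:
            return forms
        forms.append(form)


forms = read_all("x 42 (a [b {c}]) {e + s}")
```

A full loop would evaluate and print each form before reading the next; only the reading step is shown here.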
save the file
• we’ll be saving all of our code in My Documents/littleb/work
• create the work directory
• and save the file…
evaluate the file
matlab files are written to disk

matlab must be told where to look for the files…
add the directory where the files were written to the Matlab path

integrate the model…
integrate over 100 time units
custom rate functions
quickly define and use custom rates using the custom-rate fn:

    {e + s -> p}.(set-rate-function 'custom-rate
                    {s ^ :hill / {s ^ :hill + :k ^ :hill}}
                    :hill 1.5 :k .5)

or… use define-custom-rate to define named rate functions
here is the definition of the hill function (from the little b library, b/biochem/std-rate-functions)

    (define-custom-rate hill (sub &key (hill 1) (k 1)) ()
      (store-param :hill hill)
      (store-param :k k)
      {sub ^ :hill / {:sub ^ :hill + :k ^ :hill}})

    {e + s -> p}.(set-rate-function 'hill s :hill 1.5 :k .5)
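For readers who want to see the numbers, the expression s ^ hill / (s ^ hill + k ^ hill) can be evaluated in plain Python. This sketch only reproduces the formula shown above. The function name `hill_rate` is made up, and whether little b multiplies this term by anything else (for example the enzyme concentration) is not shown on the slide.

```python
def hill_rate(s, hill=1.0, k=1.0):
    """Evaluate s^hill / (s^hill + k^hill), the expression in the slide."""
    return s ** hill / (s ** hill + k ** hill)


# The parameter values used on the slide: hill = 1.5, k = 0.5
at_k = hill_rate(0.5, hill=1.5, k=0.5)        # substrate equal to k
near_saturation = hill_rate(50.0, hill=1.5, k=0.5)
```

When the substrate level equals k the expression is one half, and it approaches one as the substrate level grows, which is why k and hill are the two parameters stored by the definition.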
IV: molecular complexes
molecular complexes are composed of “monomers”
• graphs specify physical connectivity
• built from components called “monomers”:
    (defmonomer erb L D C)    a monomer with 3 sites
[figure: the erb receptor drawn as a monomer erb with sites L, D and C]
defines a class of labelled undirected graphs built upon this graph
bond sites connect
    (defmonomer erb L D C)
[figure: two erb monomers with sites L, D, C and bonds labelled 1]
sites can be given by order, by name, or implicitly:
    [erb 1 1 _]    [erb L.1 D.1 C._]    [erb L.1 D.1]    [erb 3 3]
    [erb _ 1 1]    [erb D.1 L._ C.1]
• the bond label can be any number.
• sites labelled with the same number will be connected.
connecting monomers
    [[erb D.1][erb D.1]]
[figure: two erb monomers, each with sites L, D, C]
connecting monomers
    (defmonomer egf R)
    [[erb L.1][egf R.1]]
[figure: egf (site R) and erb (sites L, D, C)]
many ways to write the same thing
    [[egf 1 _ _][erb 1 _ _]]
    [[egf R.2][erb L.2]]
    [[egf R.999][erb L.999]]
    [[erb L.1][egf R.1]]
    [[erb L.1 D._ C._][egf R.1]]
    [[erb L.99][egf R.99]]
    … etc.
[figure: egf (site R) bound to erb (sites L, D, C)]
all represent the same entity, and the same object in memory… how?
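One plausible answer, sketched in Python, is to reduce every description to a canonical form before looking it up. The approach here is a brute-force assumption for small complexes, not a description of little b's actual algorithm: try every ordering of the monomers, renumber bond labels in order of first appearance, and keep the smallest result. The names `SITES`, `monomer` and `canonical_form` are invented for the example.

```python
from itertools import permutations

# Site names per monomer type, as in (defmonomer erb L D C) and (defmonomer egf R).
SITES = {"erb": ("L", "D", "C"), "egf": ("R",)}


def monomer(name, **bonds):
    """Describe a monomer by name; sites not mentioned are unbound (None)."""
    return (name, tuple(bonds.get(site) for site in SITES[name]))


def canonical_form(monomers):
    """Smallest encoding over all monomer orderings, with bonds renumbered."""
    best = None
    for ordering in permutations(monomers):
        renumber = {}
        encoded = []
        for name, labels in ordering:
            # 0 marks an unbound site; bonds become 1, 2, ... by first appearance.
            new_labels = tuple(
                0 if label is None else renumber.setdefault(label, len(renumber) + 1)
                for label in labels
            )
            encoded.append((name, new_labels))
        candidate = tuple(encoded)
        if best is None or candidate < best:
            best = candidate
    return best


a = canonical_form([monomer("erb", L=1), monomer("egf", R=1)])
b = canonical_form([monomer("egf", R=999), monomer("erb", L=999)])
c = canonical_form([monomer("erb", D=1), monomer("egf", R=1)])
```

Here `a` and `b` are equal although the monomer order and bond labels differ, while `c` (egf bound at site D instead of L) is different. A dictionary keyed by the canonical form could then return one shared object per distinct complex.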