CSC 599: Computational Scientific Discovery


  1. Lecture 8: Lagramge (sic) and Inductive Process Modeling CSC 599: Computational Scientific Discovery

  2. Outline Review • Processes • BACON • Other equation finders • The act of equation finding Lagramge (sic) Inductive Process Modeling

  3. Review: Processes Deal with changes over time • Range: (start, finish) or (start, duration) • Rates of change: dx/dt • Previous history: attribute[event[t]] = f(event[1], event[2], ..., event[t-1]) Attributes of processes (when time is included): • Quantity of "always on" forces as a function of time • Gravity, electromagnetism • Maximum limit of "homeostatic" forces • Friction, normal force • Misc. changes during the process (a minimal sketch of these ingredients follows below)
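A minimal sketch (mine, not the lecture's; the function f and the sample data are made up) of two of these ingredients in code: an attribute whose next value depends on the process's history, and a finite-difference estimate of dx/dt.

    # Hypothetical illustration of history-dependence and rates of change.
    def next_value(history):
        """attribute[event[t]] = f(event[1], ..., event[t-1]): here, a made-up
        f that sums the most recent history with a decay factor."""
        return 0.5 * sum(history[-3:])

    def rate_of_change(x, t):
        """Finite-difference estimate of dx/dt from sampled values."""
        return [(x[i+1] - x[i]) / (t[i+1] - t[i]) for i in range(len(x) - 1)]

    x, t = [1.0, 2.0, 3.5], [0.0, 1.0, 2.0]
    print(next_value(x))         # 3.25
    print(rate_of_change(x, t))  # [1.0, 1.5]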

  4. Review: BACON BACON was driven by data and domain knowledge: • Distinguishing between independent and dependent attributes (BACON 3) • Discovery of intrinsic properties of conceptual values (BACON 4) • Preference for symmetric equations, given knowledge that suggests one could exist (BACON 5)

  5. Equation Finders, 1990s [From Ljupco Todorovski's homepage: http://www-ai.ijs.si/~ljupco/] • COPER [Kokar 1986] Uses information about the dimension units of the system variables to restrict the space of possible equation structures. • BACON [Langley 1987] Pioneer among equation discovery systems. It uses a set of data-driven heuristics for finding regularities (constancies and trends) in data and for formulating hypotheses based on them.

  6. Equation Finders, 1990s (2) • Fahrenheit/EF [Langley and Zytkow 1989; Zembowitz and Zytkow 1992] Used as an equation discovery subsystem within a larger scientific discovery system. Discovers bivariate equations only; the user can specify the set of operators and functions to be used within equations. • ABACUS [Falkenhainer and Michalski 1990] Experiments with different search strategies through the space of equation structures. Also allows discovery of piecewise equations, using clustering to identify the boundaries between pieces. • IDS [Nordhausen and Langley 1990] • ARC [Moulet 1992]

  7. Equation Finders, 1990s (3) • E* [Schaffer 1993] Discovers bivariate equations using a small set of predefined equation structures. • LAGRANGE [Dzeroski and Todorovski 1995] Handled differential equations. • GOLDHORN [Krizman et al. 1995] Extended LAGRANGE toward discovery from noisy data. • SDS [Washio and Motoda 1997] Used information about the scale types of the system variables' dimension units to restrict the space of possible equations. • LAGRAMGE [Todorovski and Dzeroski 1997] Allowed the user to specify the space of possible equations with a context-free grammar.

  8. Issues with equation finders • How to incorporate time, especially derivatives • How to separate the heuristics used to search equation space from the grading function used to determine how well an equation fits. Compare: • True function = x^2 • Current function = -x^2 • Large "error" in numeric space • Small "error" in symbolic (equation) space (a worked example follows below) • How best to incorporate domain knowledge? The BACON programs did so, but in an ad hoc way
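A small worked illustration (my own, not from the lecture): numerically, -x^2 is far from x^2 over any symmetric interval, yet as parse trees the two differ by a single negation node.

    # Hypothetical illustration of numeric vs. symbolic error.
    xs = [k / 10.0 for k in range(-20, 21)]      # sample points in [-2, 2]
    true_y    = [x * x for x in xs]              # true function:  x^2
    current_y = [-x * x for x in xs]             # candidate:     -x^2

    # Numeric space: the sum of squared errors is large.
    sse = sum((a - b) ** 2 for a, b in zip(true_y, current_y))
    print(sse)  # ~578: large numeric error

    # Symbolic space: the trees ('^', x, 2) and ('neg', ('^', x, 2))
    # differ by one node, so a tree-edit-style distance would be 1.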

  9. Lagrange: The Person • Italian/French mathematician and physicist • 1736-1813 The System • Mid-1990s equation finder • Found differential equations • Dzeroski and Todorovski 1995

  10. Lagramge Lagramge = Lagrange (the system) + grammar Handling the issues: • How to account for time? Each vector of variable values represents the state of the system at a particular time; throw both attr and d[attr]/dt into the variable set (sketched below) • Separating the concerns: • Heuristics for searching equation space: an explicit user-defined function judges function similarity • Gauging equation fit: sum of squared errors
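A sketch (my own, assuming numpy; Lagramge's internals may differ) of the "derivatives are just another variable" idea: estimate d[attr]/dt numerically from the time series and append it as one more column of the measurement table.

    import numpy as np

    # Hypothetical measurement table: one row per time step.
    t = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
    x = np.array([1.00, 1.11, 1.23, 1.36, 1.50])

    # Estimate dx/dt by finite differences and append it as a new column,
    # so the equation finder can treat the derivative like any other variable.
    dxdt = np.gradient(x, t)
    table = np.column_stack([t, x, dxdt])
    print(table)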

  11. Lagramge, top algorithm (beam search over equation parse trees):

    void lagramge(Grammar g, int maxComplexity, int maxBeamWidth,
                  fnc fittingFnc(),           /* fits constants to data        */
                  fnc eqnPreferFnc(),         /* heuristic: symbolic closeness */
                  fnc stoppingCriteriaMet())
    {
        T0 = init symbol;                     /* the grammar's start symbol    */
        Q1 = {T0};                            /* the beam                      */
        do {
            Q = Q1;
            R = Q.generateRefinements();      /* expand each tree in the beam  */
            foreach r in R {
                r.calcBestFitConstants(fittingFnc);
                r.symbolicCloseness = eqnPreferFnc.calc(r);
            }
            Q1 = union(Q, R);
            Q1.keepBest(maxBeamWidth, symbolicCloseness);  /* prune the beam   */
        } while ((Q != Q1) && !stoppingCriteriaMet());
    }

  12. Heuristic functions fnc fittingFnc() Job = given an equation parse tree, find the best-fitting parameters: attrlearn = c1*attr1 + c2*attr2 + c3*attr1*attr2 + c4 Distinguish between: • the space of equation parse trees (countable) • the space of all floating-point parameters (in principle, uncountable) fnc eqnPreferFnc() Job = given this instantiated equation, how well does it fit the data? Traditional least squares, or some noise-tolerant variant (a fitting sketch follows below)
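A minimal sketch (assuming numpy; attr1/attr2 and the hidden coefficients are made up) of what fittingFnc does for the template above: once the parse tree is fixed, the constants c1..c4 fall out of an ordinary least-squares solve.

    import numpy as np

    # Made-up measurements of attr1, attr2 and the attribute to learn.
    rng = np.random.default_rng(0)
    attr1 = rng.uniform(0, 1, 50)
    attr2 = rng.uniform(0, 1, 50)
    target = 2.0*attr1 - 1.0*attr2 + 0.5*attr1*attr2 + 3.0   # hidden truth

    # Design-matrix columns mirror the template's terms:
    # attrlearn = c1*attr1 + c2*attr2 + c3*attr1*attr2 + c4
    X = np.column_stack([attr1, attr2, attr1*attr2, np.ones_like(attr1)])
    c, *_ = np.linalg.lstsq(X, target, rcond=None)
    print(c)  # approximately [2.0, -1.0, 0.5, 3.0]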

  13. Lagramge Handling Issues (2) 3. Incorporation of domain knowledge: use a grammar! Inputs: • A context-free grammar (more on the next slide) • Input data D = (V, vd, M), where V = the set of variables, vd = the variable in V to predict, and M = measurements of all variables:

    time  v1    v2    ...  vn
    t0    v1,0  v2,0  ...  vn,0
    t1    v1,1  v2,1  ...  vn,1
    ...

Output: an ordinary or differential equation

  14. Grammar Context-free grammar = <nonTerminalSet, terminalSet, productionSet, startSym> Example, with the Monod term defined in C (a generation sketch follows below):

    double monod(double c, double v)
    {
        return v / (v + c);
    }

    N = {E, F, M, v}
    T = {+, *, const, monod, "(", ")", ",", N, P, Z}
    P = {
        E -> const | const*F | E + const*F
        F -> v | M | v*M
        M -> monod(const, v)
        v -> N | P | Z
    }
    S = E
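A sketch (my construction, not Lagramge's code) of this grammar as a Python dictionary, with a generator that expands nonterminals into candidate equation strings; "const" stands for a parameter to be fitted later.

    import random

    # The slide's grammar: each nonterminal maps to its alternative productions.
    GRAMMAR = {
        "E": [["const"], ["const", "*", "F"], ["E", "+", "const", "*", "F"]],
        "F": [["v"], ["M"], ["v", "*", "M"]],
        "M": [["monod", "(", "const", ",", "v", ")"]],
        "v": [["N"], ["P"], ["Z"]],
    }

    def generate(symbol, rng, depth=0, max_depth=5):
        """Expand one symbol; fall back to the first (shallowest) production
        once max_depth is reached, so generation always terminates."""
        if symbol not in GRAMMAR:                   # terminal symbol
            return [symbol]
        prods = GRAMMAR[symbol]
        prod = prods[0] if depth >= max_depth else rng.choice(prods)
        return [tok for s in prod for tok in generate(s, rng, depth + 1, max_depth)]

    rng = random.Random(1)
    print(" ".join(generate("E", rng)))   # e.g. const * monod ( const , N )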

  15. Tree's "height" as its complexity The deeper the tree, the more complex the expression: y = f1(x), y = f1(f2(x)), y = f1(f2(f3(x))) A production's height is one more than its deepest symbol; a symbol's height is that of its shallowest production: for p = A -> A1 A2 ... Al, h(p) = 1 + max_i h(Ai) h(A) = min over productions q of A of h(q), if A is a nonterminal; else 0 For the grammar on the previous slide (computed in the sketch below):

    height  productions
    1       E -> const;  v -> N;  v -> P;  v -> Z
    2       M -> monod(const, v);  F -> v
    3       F -> M;  F -> v*M;  E -> const*F;  E -> E + const*F
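A short sketch (mine) that computes these heights for the slide's grammar by fixed-point iteration and reproduces the table above.

    import math

    # The GRAMMAR dict from the previous sketch, repeated for self-containment.
    GRAMMAR = {
        "E": [["const"], ["const", "*", "F"], ["E", "+", "const", "*", "F"]],
        "F": [["v"], ["M"], ["v", "*", "M"]],
        "M": [["monod", "(", "const", ",", "v", ")"]],
        "v": [["N"], ["P"], ["Z"]],
    }

    def symbol_heights(grammar):
        """Iterate h(A) = min over A's productions of 1 + max_i h(Ai) to a
        fixed point; terminals have height 0."""
        h = {A: math.inf for A in grammar}
        changed = True
        while changed:
            changed = False
            for A, prods in grammar.items():
                best = min(1 + max(h.get(s, 0) for s in prod) for prod in prods)
                if best < h[A]:
                    h[A], changed = best, True
        return h

    h = symbol_heights(GRAMMAR)
    for A, prods in GRAMMAR.items():
        for prod in prods:
            print(A, "->", " ".join(prod), ": height", 1 + max(h.get(s, 0) for s in prod))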

  16. Lagramge Refinement

    void refinement(Tree t, int maxHeight)
    {
        choose A, a nonterminal node in t;
        p_A,i = production currently applied at A;
        l = path length from the root of t to A;
        delete the subtrees of A;
        /* try A's next alternative production, if it fits the height limit */
        if ((i+1 <= A.numProductions()) && (l + height(p_A,i+1) <= maxHeight)) {
            p_A,i+1 = A -> A1 A2 ... Ak;
            replace p_A,i with p_A,i+1;
            expand A with successors A1, A2, ..., Ak;
            /* complete the tree: expand every remaining nonterminal leaf
               with its first (shallowest) production */
            do {
                choose a nonterminal leaf B in t;
                p_B,1 = B -> B1 B2 ... Bm;
                expand B with successors B1, B2, ..., Bm;
            } while (t.hasNonTerminals());
        }
    }

  17. Refinement Example

  18. Lagramge Findings (1): Aquatic ecosystems: • N = concentration of nutrient • P = concentration of phytoplankton • Z = concentration of zooplankton (simulated below)

    dN/dt = -N*P/(kN + N)
    dP/dt = N*P/(kN + N) - rP*P - P*Z/(kP + P)
    dZ/dt = P*Z/(kP + P) - rZ*Z

(The saturating terms are the grammar's monod form: monod(kP, P) = P/(P + kP).)
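A sketch (assuming scipy, with made-up parameter values; the real values come from Lagramge's fit) that simulates the discovered system, useful for checking that a found equation reproduces plausible dynamics.

    import numpy as np
    from scipy.integrate import odeint

    # Illustrative parameter values only.
    kN, kP, rP, rZ = 1.0, 1.0, 0.1, 0.1

    def dydt(y, t):
        """The three discovered ODEs for nutrient, phytoplankton, zooplankton."""
        N, P, Z = y
        dN = -N*P/(kN + N)
        dP =  N*P/(kN + N) - rP*P - P*Z/(kP + P)
        dZ =  P*Z/(kP + P) - rZ*Z
        return [dN, dP, dZ]

    ts = np.linspace(0, 50, 500)
    traj = odeint(dydt, [2.0, 0.5, 0.2], ts)   # initial N, P, Z
    print(traj[-1])                            # state at t = 50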

  19. Lagramge Findings (2): Two poles on a cart (inverted pendulum):

  20. Lagramge Discussion Problems it solves • Handling noisy data: sum of squared errors • Handling of time and derivatives: derivatives are just another variable; nothing special for the user to do • Distinguishes between error in symbol space and error in numeric-value space: the heuristic function handles the first, the grading function the second • Incorporation of domain knowledge: the symbol-space heuristic function and the grammar

  21. Lagramge Discussion (2) Hold on one second! Do you believe this? Incorporation of domain knowledge via a grammar. Grammar?!? • Computer scientists think in terms of grammars: • FSMs/regular expressions • PDAs/context-free • Turing machines/context-sensitive • Do scientists think in terms of grammars? (Besides linguists, of course)

  22. Lagramge Discussion (3) Remember the BACON 5 heuristic: • If you have structural knowledge that suggests a symmetric equation, then look for one first • Can we do that with Lagramge? A grammar for symmetry(?) A -> BCB (not quite: the two B's expand independently, so they need not be identical) A -> c*B + c*D (symmetric, but not that powerful) A -> (B) (ditto) Preference for symmetry • Jury-rig eqnPreferFnc() as needed • Are you happy with that?

  23. Inductive Process Modeling

  24. IPM We want it all! • Processes • Explicitly recognized objects (in the software-engineering sense) • Distinguishes between instances and classes: higher-level constructs with uninstantiated equations • Simulation equations constructed from process equations • Automatically deals with time • Exhaustive search, up to some maximal complexity • A better fitting function for time-series data

  25. IPM Generic Processes Library (pred_prey; encoded as data in the sketch below):

    generic process logistic_growth;
      variables S{species};
      parameters gr[0, 3], ic[0, 0.1];
      equations d[S,t,1] = gr * S * (1 - ic*S);

    generic process exponential_growth;
      variables S{species};
      parameters gr[0, 3];
      equations d[S,t,1] = gr * S;

    generic process exponential_decay;
      variables S{species};
      parameters dr[0, 2];
      equations d[S,t,1] = -1 * dr * S;

    generic process holling_1;
      variables S1{prey}, S2{predator};
      parameters ar[0.01, 10], ef[0.001, 0.8];
      equations d[S1,t,1] = -1 * ar * S1 * S2;
                d[S2,t,1] = ef * ar * S1 * S2;
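A sketch (my own encoding, not IPM's actual representation) of the library as plain Python data: each generic process declares typed variable slots, parameter ranges, and equation templates over those slots. The later sketches operate on this structure.

    # Hypothetical encoding of the slide's generic process library.
    GENERIC_PROCESSES = {
        "logistic_growth": {
            "variables": {"S": "species"},
            "parameters": {"gr": (0, 3), "ic": (0, 0.1)},
            "equations": ["d[S]/dt += gr * S * (1 - ic*S)"],
        },
        "exponential_decay": {
            "variables": {"S": "species"},
            "parameters": {"dr": (0, 2)},
            "equations": ["d[S]/dt += -dr * S"],
        },
        "holling_1": {
            "variables": {"S1": "prey", "S2": "predator"},
            "parameters": {"ar": (0.01, 10), "ef": (0.001, 0.8)},
            "equations": ["d[S1]/dt += -ar * S1 * S2",
                          "d[S2]/dt += ef * ar * S1 * S2"],
        },
    }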

  26. IPM Quantitative Processes: fills in constants from the equation templates:

    model PredatorPrey;
      variables aurelia{prey}, nasutum{predator};
      observable aurelia, nasutum;

      process aurelia_growth;
        equations d[aurelia,t,1] = 1.81 * aurelia * (1 - 0.0003*aurelia);
      process nasutum_decay;
        equations d[nasutum,t,1] = -1 * 1.04 * nasutum;
      process predation_holling_1;
        equations d[aurelia,t,1] = -1 * 0.03 * aurelia * nasutum;
                  d[nasutum,t,1] = 0.30 * 0.03 * aurelia * nasutum;

Summing the per-process contributions gives the simulation equations:

    d[aurelia]/dt = 1.81 * aurelia * (1 - 0.0003*aurelia) - 0.03 * aurelia * nasutum
    d[nasutum]/dt = -1.04 * nasutum + 0.30 * 0.03 * aurelia * nasutum

  27. IPM Algorithm 1. Find all permissible instantiations of generic processes with the specified variables (sketched below): • logistic_growth: S -> aurelia • logistic_growth: S -> nasutum • exponential_decay: S -> aurelia • exponential_decay: S -> nasutum • holling_1: S1 -> aurelia, S2 -> nasutum
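A sketch of step 1 under the encoding above (the prey/predator-are-subtypes-of-species assumption is mine): enumerate all type-consistent bindings of each generic process's variable slots to the observed variables.

    from itertools import product

    # Observed variables and their types; prey/predator taken as subtypes of species.
    VARS = {"aurelia": "prey", "nasutum": "predator"}
    SUBTYPES = {"species": {"prey", "predator"},
                "prey": {"prey"}, "predator": {"predator"}}

    def instantiations(proc_vars):
        """Yield every binding of a process's typed slots to compatible variables."""
        slots = list(proc_vars)
        choices = [[v for v, t in VARS.items() if t in SUBTYPES[proc_vars[s]]]
                   for s in slots]
        for combo in product(*choices):
            if len(set(combo)) == len(combo):   # no variable bound to two slots
                yield dict(zip(slots, combo))

    print(list(instantiations({"S": "species"})))
    # [{'S': 'aurelia'}, {'S': 'nasutum'}]
    print(list(instantiations({"S1": "prey", "S2": "predator"})))
    # [{'S1': 'aurelia', 'S2': 'nasutum'}]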

  28. IPM Algorithm (2) 2. Go from partially instantiated processes to generic models by exhaustive search of model structures, up to some complexity limit (enumerated in the sketch below):

    process logistic_growth;
      parameters gr[0, 3], ic[0, 0.1];
      equations d[aurelia,t,1] = gr * aurelia * (1 - ic*aurelia);
    process exponential_decay;
      parameters dr[0, 2];
      equations d[nasutum,t,1] = -1 * dr * nasutum;
    process holling_1;
      parameters ar[0.01, 10], ef[0.001, 0.8];
      equations d[aurelia,t,1] = -1 * ar * aurelia * nasutum;
                d[nasutum,t,1] = ef * ar * aurelia * nasutum;
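Continuing the sketch (my scaffolding, not IPM's code): candidate model structures are subsets of the instantiated processes, filtered by a complexity limit on how many processes a model may contain.

    from itertools import combinations

    # Instantiated processes from step 1 (names only, for brevity).
    INSTANTIATED = [
        "logistic_growth(aurelia)", "logistic_growth(nasutum)",
        "exponential_decay(aurelia)", "exponential_decay(nasutum)",
        "holling_1(aurelia, nasutum)",
    ]

    MAX_PROCESSES = 3   # the complexity limit on model structures

    structures = [combo
                  for k in range(1, MAX_PROCESSES + 1)
                  for combo in combinations(INSTANTIATED, k)]
    print(len(structures))   # 5 + 10 + 10 = 25 candidate model structures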

  29. IPM Algorithm (3) 3. Find the best-fitting parameters with least-squares fitting that does second-order gradient descent through parameter space • Full simulation • Applies standard tricks for avoiding local minima /or/, when all variables are observable, • Does teacher forcing (sketched below) • No full simulation • Given the attribute values at time t, minimize the error at t+1
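A minimal sketch of the teacher-forcing idea (the model, data, and grid search are all made up; real IPM uses a second-order method rather than a grid): instead of simulating the whole trajectory, plug the observed state at time t into the model and score the predicted step to t+1.

    import numpy as np

    # Made-up observations of one variable at uniform time steps.
    t = np.linspace(0, 2, 21)
    x_obs = 1.0 * np.exp(0.8 * t)            # hidden truth: dx/dt = 0.8 * x

    def teacher_forcing_sse(gr):
        """One-step-ahead error for the candidate model dx/dt = gr * x."""
        dt = t[1] - t[0]
        x_pred = x_obs[:-1] + dt * gr * x_obs[:-1]   # Euler step from observed state
        return np.sum((x_pred - x_obs[1:]) ** 2)

    # Crude search over the declared parameter range gr[0, 3].
    grid = np.linspace(0, 3, 301)
    best = grid[np.argmin([teacher_forcing_sse(g) for g in grid])]
    print(best)   # ~0.83, near the true 0.8 (Euler discretization bias)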

  30. IPM Discussion Finds sophisticated predator/prey relationships in synthetic and actual data • Multiple predator and prey species Successes • Applies domain knowledge in a scientist-friendly fashion • Groups equations and equation fragments into clumps that a scientist would find intuitive Improvement? • It's an algorithm, not an architecture • The algorithm could be incorporated into a larger architecture
