Learn how schema-based program synthesis generates programs that find the maximum of a function, using univariate and multivariate optimization schemas. Topics include generating multiple programs, controlling numeric solvers, and multivariate optimization algorithms.
Schema-based Program Synthesis and the AutoBayes System, Part II. Johann Schumann, SGT, NASA Ames
Example • Generate a program that finds the maximum value of a function f(x): max f(x) wrt x • Two cases: univariate and multivariate • Note: the function might be given as a formula or as a vector of data
Schemas for univariate optimization
schema(max F wrt X, C) :-
    ... as before

schema(max F wrt X, C) :-
    length(X, 1),
    % F is a vector of data points F(0..n)
    C = let(sequence([
              assign(mymax, 0),
              for(idx(I, 0, n),
                  if(select(F, I) > mymax,
                     assign(mymax, select(F, I)),
                     skip))
              ...
            ]),
            comment(['The maximum is found by iterating...']),
            mymax).

schema(max F wrt X, C) :-
    length(X, 1),
    % instantiate numeric solution algorithm
    % e.g., golden section search
    C = ...

schema(max F wrt X, C) :-
    ...
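The code that the data-vector schema describes can be sketched in C as follows. This is a hand-written illustration, not AutoBayes output; the name `vector_max` is hypothetical. Unlike the schema above, the sketch starts from the first data point rather than from 0, so it also works when all data points are negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not generated by AutoBayes):
   returns the maximum of the data points f[0..n-1]. */
double vector_max(const double *f, size_t n)
{
    assert(n > 0);                /* an empty vector has no maximum */
    double mymax = f[0];          /* start from the first data point */
    for (size_t i = 1; i < n; i++) {
        if (f[i] > mymax)
            mymax = f[i];
    }
    return mymax;
}
```

A caller would pass the data vector and its length, e.g. `vector_max(data, 100)`.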
Schema for univariate optimization
schema(max F wrt X, C) :-    % INPUT (Problem), OUTPUT (Code fragment)
    % guards
    length(X, 1),
    % calculate the first derivative
    simplify(deriv(F, X), DF),
    % solve the equation
    solve(true, X, 0 = DF, S),
    % possibly more checks
    % is that really a maximum?
    simplify(deriv(DF, X), DDF),
    (   solve(true, X, 0 > DDF, _)
    ->  true
    ;   writeln('Proof obligation not solved automatically')
    ),
    XP = ['The maximum for', expr(F), 'is calculated ...'],
    V = pv_fresh,
    C = let(assign(V, S, [comment(XP)]), V).
    ...
• build the derivative: df/dx
• set it to 0: 0 = df/dx
• solve that equation for x
• the solution is the desired maximum
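As a worked instance of these four steps, take an illustrative function chosen here (it is not from the slides), f(x) = -x^2 + 4x + 1. The second-derivative line corresponds to the "is that really a maximum?" check in the schema.

```latex
\begin{align*}
f(x) &= -x^2 + 4x + 1 \\
\frac{df}{dx} &= -2x + 4 \\
0 &= -2x + 4 \;\Longrightarrow\; x = 2 \\
\frac{d^2 f}{dx^2} &= -2 < 0 \quad \text{(so } x = 2 \text{ is a maximum)} \\
f(2) &= -4 + 8 + 1 = 5
\end{align*}
```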
Demo • Generation of multiple programs • -maxprog • -maxprog N -fastest (coarse approximation) • Control for numeric solvers • pragma schema_control_arbitrary_init_values • pragma schema_control_use_generic_optimize • Tracing pragmas • The necessity of constraints
Multivariate Optimization
• Task: minimize function F(X) wrt X
• Algorithm:
  • start somewhere
  • go down along the steepest slope
  • when you come to a flat area, return that (local) minimum
• Many design decisions
  • where to start?
  • how to move?
  • when to stop?

/* pseudocode: x0, x1, and step_dir stand for vectors */
double* minimize(F){
  double* x0 = pick_start();
  int converging = 1;
  while (converging){
    double step_length = 0.1;
    double* step_dir = -gradient(F, x0);
    double* x1 = x0 + step_length * step_dir;
    if (fabs(F(x1) - F(x0)) < 0.001)
      converging = 0;
    else
      x0 = x1;
  }
  return x0;
}
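A compilable version of this steepest-descent loop might look like the sketch below. The objective `F`, its analytic `gradient`, the two-dimensional setting, and the `max_iter` safeguard are all assumptions made for illustration; the slides leave the start value, step, and stopping rule open as design decisions.

```c
#include <assert.h>
#include <math.h>

/* Illustrative objective (an assumption): F(x, y) = (x - 1)^2 + 2 (y + 3)^2,
   with its minimum at (1, -3). */
static double F(const double x[2])
{
    return (x[0] - 1.0) * (x[0] - 1.0) + 2.0 * (x[1] + 3.0) * (x[1] + 3.0);
}

/* Analytic gradient of F. */
static void gradient(const double x[2], double g[2])
{
    g[0] = 2.0 * (x[0] - 1.0);
    g[1] = 4.0 * (x[1] + 3.0);
}

/* Fixed-step steepest descent. Overwrites x with the approximate minimizer
   and returns the number of iterations performed. */
int minimize(double x[2], double step_length, double tol, int max_iter)
{
    assert(step_length > 0.0 && tol > 0.0);
    int iter = 0;
    while (iter < max_iter) {
        double g[2], x1[2];
        gradient(x, g);
        x1[0] = x[0] - step_length * g[0];   /* move against the gradient */
        x1[1] = x[1] - step_length * g[1];
        double change = fabs(F(x1) - F(x));
        x[0] = x1[0];
        x[1] = x1[1];
        iter++;
        if (change < tol)                    /* flat area reached: stop */
            break;
    }
    return iter;
}
```

Starting from `{0.0, 0.0}` with `step_length = 0.1`, the iterates approach (1, -3). A step length that is too large for the objective would make the loop diverge, which is one reason the choice of step size is treated as a separate subproblem.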
Multivariate Optimization
schema(max F wrt X, C) :-    % IN, OUT
    % guard: X contains more than one variable
    length(X, Y), Y > 1,
    % divide and solve subproblems (recursive schema calls)
    schema(getStartValue(F, X), C_Start),
    schema(getStepDirection(F, X), C_Dir),
    schema(getStepSize(F, X), C_Size),
    % assemble code segment
    X0 = pvar_new(X),    % get a new PROGRAM variable
    C = block([local(X0, double)],
              series([
                  assign(X0, C_Start),
                  while_converging(X0,
                      assign(X0, +([X0, *([C_Dir, C_Size])])))
              ])).
Multivariate optimization II
generated code for max sin(v) wrt v

    X0 = pvar_new(X),
    C = block([local(X0, double)],
              series([
                  assign(X0, C_Start),
                  while_converging(X0,
                      assign(X0, +([X0, *([C_Dir, C_Size])])))
              ])).

    double v_0;
    double E;
    double y;
    v_0 = -99;
    E = 1e10;
    while (E > 0.001){
      y = sin(v_0);
      v_0 = v_0 + cos(v_0) * 0.01;   /* step along the gradient cos(v_0) to maximize */
      E = fabs(y - sin(v_0));
    }

• The schemas generate code in an intermediate language
  • procedural elements
  • local variables, lambda blocks
  • sum(..), while_converging(..) --> loops
• Important: variables in specification or program are NOT Prolog variables
Why schema-based synthesis? • [Figure: some possibilities for getStepDir] • Multiple algorithm variants can be constructed automatically • The “best” one is chosen by the user or selected via constraints
AB Schema Hierarchies • Schemas to break down statistical problem • Bayesian independence theorems -- works on Bayesian graphs • Schemas to solve complex statistical problems • instantiate (iterative) clustering algorithms • handling of time series problems • Schemas to solve atomic problems • instantiate PDF and maximize (symbolically) • instantiate numerical solvers (see last slides) • auxiliary schemas • initialization of clustering algorithms • data pre-processing (e.g., [0..1] normalization)
AB Schema Hierarchy • Static tree structure • AB uses two kinds of schemas • schemas for probabilistic problems • schemas for formulas
Schemas and AB Model • The AB schemas have to use all information from the input specification, which is stored in the Prolog data base (AB model) • Problem: schemas can modify the model, which must be undone during backtracking • add new statistical variables • remove dependencies for subproblems • Solutions: • add model as parameters: schema(Prob, C, M_in, M_out) and everywhere else • keep a model stack (similar to the dynamic calling environments in procedural languages) and use backtrackable asserts/retracts
Backtrackable Global Stuff
• Global data in Prolog are handled using assert/retract or flags. All other data are local to each clause
    p(X) :- q(X,Z), r(Z).    % X, Z local to clause
• Asserts are not backtrackable
    p(X) :- assert(keep(X)), ..., fail.
  The “keep(X)” is kept in the data base even after backtracking
• Work-around: add global variables as parameters to all predicates (impractical)
    p(X, GL_in, GL_out) :- GL_out = [keep(X)|GL_in], ...
• Backtrackable bassert/bretract requires some additional low-level C code (but has clean semantics)
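One common way to make global updates undoable is a trail: every change records the old value, and backtracking restores values down to a saved mark. The C sketch below illustrates that general technique for integer cells such as counters or flags. It is not the AutoBayes C extension, and the names `trail_mark`, `trail_set`, and `trail_undo` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TRAIL_MAX 64

/* One trail entry: which global cell was changed and its previous value. */
struct trail_entry {
    int *cell;
    int  old_value;
};

static struct trail_entry trail[TRAIL_MAX];
static size_t trail_top = 0;

/* Remember the current trail height; corresponds to creating a choice point. */
size_t trail_mark(void)
{
    return trail_top;
}

/* Backtrackable assignment: record the old value, then overwrite the cell. */
void trail_set(int *cell, int new_value)
{
    assert(trail_top < TRAIL_MAX);
    trail[trail_top].cell = cell;
    trail[trail_top].old_value = *cell;
    trail_top++;
    *cell = new_value;
}

/* Undo every assignment made since `mark`, newest first (on backtracking). */
void trail_undo(size_t mark)
{
    assert(mark <= trail_top);
    while (trail_top > mark) {
        trail_top--;
        *trail[trail_top].cell = trail[trail_top].old_value;
    }
}
```

A caller takes a mark before trying an alternative, performs updates through `trail_set`, and calls `trail_undo(mark)` if that alternative fails.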
Schema Control • schema applicability is controlled via guards • order of application: order in Prolog file • How to enforce/avoid certain schemas • autobayes pragmas, but that’s not really fun • doesn’t work for nested applications: • inner loop: symbolic solutions only • outer loop: enable numeric loop • generate them all and decide later or pick “fastest” • schema control language is a research topic • extend declarative AB language • how to talk about selection of iterative algorithm in a purely declarative language?
The AB Infrastructure • term utilities • rewriting engine • symbolic system: • simplifier • abstraction (range, sign, definedness) • solver • pretty printer (code, intermediate language) • comment generation
Term utilities • many functional-programming-style predicates, implemented on top of Prolog, for • lists, sets, bags, relations • terms, AC-terms • operations • term_substitute, subsumption, differences between term sets • ...
Rewriting Engine • A lot of stuff in AB is done using rewriting (but not all) • small rewriting engine implemented in Prolog • rewriting rules are Prolog clauses • conditional rewriting, AC-style rewriting • Evaluation: • eager: apply first top-down • lazy: apply bottom up • continuation: pure bottom-up or dove-tailing • handle for attachment of prover/constraint solver • compilation of rewriting rules for higher efficiency
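To make the bottom-up strategy concrete, here is a minimal C sketch of a rewriting pass over an expression tree. It is not the AutoBayes engine, which is written in Prolog and supports conditional and AC-style rules; the `struct expr` type and the three arithmetic identities are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum kind { NUM, VAR, ADD, MUL };

struct expr {
    enum kind kind;
    double value;              /* used when kind == NUM */
    char name;                 /* used when kind == VAR */
    struct expr *left, *right; /* used when kind == ADD or MUL */
};

static int is_num(const struct expr *e, double v)
{
    return e->kind == NUM && e->value == v;
}

/* Apply one rule at the root; return the rewritten node, or e if none fires. */
static struct expr *rewrite_root(struct expr *e)
{
    if (e->kind == ADD) {
        if (is_num(e->left, 0.0))  return e->right;   /* 0 + x -> x */
        if (is_num(e->right, 0.0)) return e->left;    /* x + 0 -> x */
    } else if (e->kind == MUL) {
        if (is_num(e->left, 0.0))  return e->left;    /* 0 * x -> 0 */
        if (is_num(e->right, 0.0)) return e->right;   /* x * 0 -> 0 */
        if (is_num(e->left, 1.0))  return e->right;   /* 1 * x -> x */
        if (is_num(e->right, 1.0)) return e->left;    /* x * 1 -> x */
    }
    return e;
}

/* Bottom-up pass: simplify the children first, then the node itself. */
struct expr *simplify(struct expr *e)
{
    assert(e != NULL);
    if (e->kind == ADD || e->kind == MUL) {
        e->left = simplify(e->left);
        e->right = simplify(e->right);
    }
    return rewrite_root(e);
}
```

For the tree `(x * 1) + (y * 0)`, the children rewrite to `x` and `0` first, and the root rule `x + 0 -> x` then returns the node for `x`. A top-down strategy would instead try the root rules before descending into the children.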
Rewriting Rules
% NAME, STRATEGY, PROVER, ASSUMPTIONS, IN, OUT
trig_simplify('sin-of-0', [eval=lazy|_], _, _, sin(0), 0) :- !.
trig_simplify('sin-of-pi-over-6', [eval=lazy|_], _, _, sin(*([1/6, pi])), 1/2) :- !.
trig_simplify('cos^2+sin^2', [eval=eager|_], _, _, +(Args), +([1|Args3])) :-
    select(cos(X)**2, Args, Args2),
    select(sin(X)**2, Args2, Args3),
    !.
• Can combine pure rewriting with Prolog programming in the body of the rewrite rule
Compilation and Rewriting
• Group and compile rewrite rules (statically)
    ?- rwr_compile(my_simplifications, [trig_simplify, remove_const_rules]).
• Call the rewriting engine
    rwr_cond(my_simplifications, true, S, T).
• Calling with time-out
Symbolic System • Symbolic system implemented on top of the rewriting engine + Prolog code for solvers, etc • assumption-based rewriting • X/Y -- (not(Y = 0)) --> X • simplification (lots of rules) • calculation of derivatives (deriv(F,X) as operator) • Taylor-series expansion, ... • equation solver • polynomial solver • Gauss-elimination for sets of linear equations • sequentialization of equation systems
The AB Intermediate language • strict separation between synthesis and code generation • small procedural intermediate language with some extensions • sum(..), prod(..), simul_assign(..), while_converging(...) • Annotations for comments, and pre/post/inv formulas • code generator for different languages/targets • C++/Octave • C/Matlab, C/standalone • ADA/SparkADA, Java (both “unsupported/in work/bad shape”) • Pretty-printer to ASCII, HTML, LaTeX
Extending AutoBayes • some extensions are straightforward: add textbook formulas • additional symbolic simplification rules might be required • adding schemas requires substantial work • “hard-coded” schema as first step • applicability constraints and control • functional mechanisms to handle scalar/vector/matrix cases are available • support for documentation generation • no schema language, Prolog syntax used
Non-Gaussian PDF • Data characteristics are modeled using probability density functions (PDFs) • Example: Gaussians, exponential, ... • AB contains a number of built-in PDFs, which can be extended (hands-on demo) • Having multiple PDFs adds a lot of power over libraries
Example • For clustering, a Gaussian distribution of the data is often assumed. • How about angles: 0 == 360 • you get 5 clusters • A different distribution (von Mises-Fisher) automatically solves this problem • In AutoBayes: just replace the “gauss” by “vonmises1” -- no programming required • multiple PDFs in one spec
Sample Generation • We have used: • MODEL ---> P ---(data)--> parameters • The model can be read the other way round: generate me random data, which are consistent with the model • MODEL ---> P ---(parameters)--> data • Very useful for • model debugging/development • debugging and assessment of synthesized algorithms
AutoBayes and Correctness • practical synthesis: forget about correct-by-construction, but • detailed math derivations, which can be checked externally (e.g., by Mathematica) • literature references in documentation/comments • generation of test harness and sample data • checking of safety properties (“AutoCert”) [Cade2002 slide set]
AutoBayes as a Prolog Program • AutoBayes is a pretty large program • ~180 Prolog files, 100,000 LoC (with AutoFilter) • Heavy use of • meta-programming (call, etc.) • rewriting (using an engine implemented in Prolog) • functional programming elements for all sorts of list/vector/array handling • backtracking and backtrackable global data structures • procedural (non-logical) elements, e.g., file I/O, flags, etc. • no use of modules but naming conventions • everything SWI Prolog + a few C extensions to handle backtrackable global counters and flags
AutoBayes Weak Points • The input parser is very inflexible (uses Prolog operators) • Very bad error messages–often just “no” • no “schema language”: AutoBayes extension only by union of Prolog/domain specialist • Only primitive control of schema selection: need for a schema-selection mechanism • not all schemas are fully documented • large code-base, which needs to be maintained
Summary • AutoBayes is suitable for a wide range of data analysis tasks • AutoBayes generates customized algorithms • AutoBayes: schema-based program synthesis + symbolic system • logic + functional + procedural elements used • AutoBayes extension: easy to very hard • AutoBayes debugging: a pain, but explanations and LaTeX output very helpful • AutoBayes is NASA OpenSource: bugfixes/extensions always welcome • AutoBayes has a 160+ page user's manual • AutoBayes is useful for anything from classroom projects to PhD projects