500 likes | 685 Views
Software Synthesis using Automated Reasoning. Ruzica Piskac. SVARM 2011 Saarbrücken, Germany. Software Synthesis. automated reasoning (theorem prover). specification (formula). code. Software Synthesis. val bigSet = .... val ( setA , setB ) = choose ((a: Set, b: Set) ) =>
E N D
Software Synthesis using Automated Reasoning Ruzica Piskac SVARM 2011 Saarbrücken, Germany
Software Synthesis automated reasoning (theorem prover) specification (formula) code
Software Synthesis valbigSet = .... val (setA, setB) = choose((a: Set, b: Set) ) => ( a.size == b.size && a union b == bigSet && a intersect b == empty)) Code val n = bigSet.size/2 valsetA = take(n, bigSet) valsetB = bigSet −− setA
Software Synthesis valbigSet = .... val (setA, setB) = choose((a: Set, b: Set) ) => ( a.size == b.size && a union b == bigSet && a intersect b == empty)) Code assert (bigSet.size % 2 == 0) val n = bigSet.size/2 valsetA = take(n, bigSet) valsetB = bigSet −− setA
Software Synthesis • Software synthesis = a technique for automatically generating code given a specification • Why? • ease software development • increase programmer productivity • fewer bugs • Challenges • synthesis is often a computationally hard task • new algorithms are needed
Software Synthesis – Then and Now • Studied for a long time • Church Synthesis Problem • Zohar Manna, Richard J. Waldinger: “A Deductive Approach to Program Synthesis” [1980] • Recent increased interests • due to increasing computational power and improvements in automated reasoning • Sumit Gulwani (MSR): search-based techniques • Ras Bodik (Berkley): Programming by Sketching
Complete Functional Synthesisjoint work with ViktoRKuncak, Mikael Mayer and Philippe Suter[PLDI 2010, CAV 2010, CACM 2011/2]
Synthesis as programming language construct … valx = readInteger() + 4 valy1 = println(“y1 = “ + y1) … choose(y ⇒ 5*x + 7*y == 31)
“choose” Construct • specification is part of the Scala language • two types of arguments: input and output • a call of the form • corresponds to constructively solving the quantifier elimination problemwhere a is a parameter valx1= choose(x ⇒ F( x, a ))
Complete Functional Synthesis • Note: pre(a) is the “best” possible • Definition (Synthesis Procedure) • A synthesis procedure takes as input formula F(x, a) and outputs: • a precondition formula pre(a) • list of terms Ψ • such that the following two implications are valid:
From Decision Procedure to Synthesis Procedure • based on quantifier elimination / model generating decision procedures • implemented for logic of linear integer (rational, real) arithmetic, for Boolan Algebra with Presburger Arithmetic (BAPA)
Complete Functional Synthesis • complete = the synthesis procedure is guaranteed to find code that satisfies the given specification • functional = computes a function that satisfies a given input / output relation • Important features: • code produced this way is correct by construction – no need for further verification • a user does not provide hints on the structure of the generated code
Synthesis for Linear Integer Arithmetic – Example / Overview choose((h: Int, m: Int, s: Int) ⇒ ( h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60 )) Returned code: assert (totalSeconds ≥ 0) val h = totalSecondsdiv 3600 valtemp = totalSeconds+ (-3600) * h valm = temp div 60 vals = totalSeconds + (-3600) * h + (-60)* m
Synthesis Procedure - Overview • process every equality: take an equality Ei, compute a parametric description of the solution set and insert those values in the rest of formula • for n output variables, we need n-1 fresh new variables • number of output variables decreased for 1 • based on Extended Euclidean Algorithm • compute preconditions
Synthesis Procedure - Overview • at the end there are only inequalities – similar procedure as in [Pugh 1992]
Parametric Solution of Equation Theorem For an equation with S we denote the set of solutions. • Let SH be a set of solutions of the homogeneous equality: SH = { y | } SH is an “almost linear” set, i.e. can be represented as a linear combination of vectors: SH = λ1s1 + ... λn-1sn-1 • Let w be any solution of the original equation • S = w + λ1s1 + ... λn-1sn-1 + preconditions: gcd(i)| C
Solution of a Homogenous Equation Theorem For an equation with SH we denote the set of solutions. where values Kij are computed as follows: • if i < j, Kij = 0 (the matrix K is lower triangular) • if i =j • for remaining Kij values, find any solution of the equation
Finding any Solution (n variables) • Inductive approach • 1x1 + 2x2 +... + nxn = C 1x1 + gcd(2,...,n )[λ2x2 +... + λnxn] = C 1x1 + xF = C • find values for x1 (w1) and xF (wF) and then solve inductively: λ2x2 +... + λnxn = wF
Finding any Solution (2 variables) • based on Extended Euclidean Algorithm (EEA) • for every two integers n and m finds numbers p and q such that n*p + m*q = gcd(n, m) • problem: 1x1 + 2x2 = C • solution: • apply EEA to compute p and q such that 1p + 2q = gcd(1, 2) • solution: x1 = p*C/ gcd(1, 2) x2 = q*C/ gcd(1, 2)
Linear Integer Arithemtic: Example val (x1, y1) = choose(x: Int, y: Int => 2*y − b =< 3*x + a && 2*x − a =< 4*y + b) valkFound = false for k = 0 to 5 do {val v1 = 3 * a + 3 * b − kif (v1 mod 6 == 0) {val alpha = ((k − 5 * a − 5 * b)/8).ceilingval l = (v1 / 6) + 2 * alphaval y = alphavalkFound = true break } } if (kFound) val x = ((4 * y + a + b)/2).floor else throw new Exception(”No solution exists”) Precondition: ∃k. 0 ≤ k ≤ 5 ∧ 6|3a + 3b − k
Handling of Inequalities • Solve for one by one variable: • separate inequalities depending on polarity of x: • Ai ≤ αix • βjx ≤ Bj • define values a = maxi⌈Ai/αi⌉ and b = minj ⌊Bj/ βj⌋ • if b is defined, return x = b else return x = a • further continue with the conjunction of all formulas ⌈Ai/αi⌉ ≤ ⌊Bj/ βj⌋
Algorithm for more than one Variable • ⌈(2y − b − a)/3⌉ ≤ ⌊(4y + a + b)/2⌋ • ⇔ ⌈(2y − b − a) ∗ 2/6⌉ ≤ ⌊(4y + a + b) ∗ 3/6⌋ • ⇔ (4y − 2b − 2a)/6 ≤ [(12y + 3a + 3b) − (12y + 3a + 3b) mod 6]/6 • ⇔ (12y + 3a + 3b) mod 6 ≤ 8y + 5a + 5b • ⇔ 12y + 3a + 3b = 6 ∗ l + k ∧ k ≤ 8y + 5a + 5b Consider the formula 2y − b ≤ 3x + a ∧ 2x − a ≤ 4y + b
Algorithm for more than one Variable • 12y + 3a + 3b = 6 ∗ l + k ∧ k ≤ 8y + 5a + 5b • upon applying the equality, we obtain • preconditions: 6|3a + 3b − k • solutions: l = 2λ + (3a + 3b − k)/6 and y = λ • substituting those values in the inequality results in k − 5a − 5b ≤ 8λ • final solution: λ = ⌈(k − 5a − 5b)/8⌉ Consider the formula 2y − b ≤ 3x + a ∧ 2x − a ≤ 4y + b
Synthesis for Sets val (setA, setB) = choose((a: Set[O], b: Set[O]) ) => (−maxDiff <= a.size − b.size && a.size − b.size <= maxDiff && a union b == bigSet && a intersect b == empty)) Code val kA = ((bigSet.size + maxDiff)/2).floor valsetA = take(kA, bigSet) valsetB = bigSet −− setA Precondition ⌈|bigSet| − maxDiff/2⌉ ≤ ⌊|bigSet| + maxDiff/2⌋
Interactive Synthesis of code snippetsJoint work with TihomirGvero and Viktor Kuncak[Under Submission, CAV2011]
isynth - Interactive Synthesis of Code Snippets deffopen(name:String):File = { ... } deffread(f:File, p:Int):Data = { ... } varcurrentPos : Int = 0 varfname : String= null ... defgetData():Data = Returned value: fread(fopen(fname),currentPos)
Interactive Synthesis of Code Snippets Complete functional synthesis • for specific theories (linear arith, sets, multisets) • based on generating models isynth: • works for any operations in API • based on finding proofs • code completion using automated reasoning • specification is given through type constraints and test cases
isynth – Polymorphic Types def map[A,B](f:A => B, l:List[A]): List[B] = { ... } defstringConcat(lst : List[String]): String = { ... } ... defprintInts(intList:List[Int], prn: Int => String): String = Returned value: stringConcat(map[Int, String](prn, intList))
isynth - Interactive Synthesis of Code Snippets • supports method combinations, type polymorphism, user preferences • based on first-order resolution – combines forward and backward reasoning • ranking of returned solutions is obtained through a system of weights • http://lara.epfl.ch/w/isynth
System of weights in isynth • Symbol weights • Term weights • Recalculate weight of terms with user preferred symbols Low High User preferred Method and field symbols API symbols Arrow Local symbols
Related Work / Reading Material Comfusy project: • M. Mayer, R. Piskac, P. Suter: “Complete Functional Synthesis”, PLDI 2010 • V. Kuncak, M. Mayer, R. Piskac, P. Suter: “Comfusy: A Tool for Complete Functional Synthesis”, CAV 2010 • A. Solar-Lezama, L. Tancau, R. Bodík, S. A. Seshia, V. A. Saraswat: “Combinatorial sketching for finite programs”. ASPLOS 2006 • S. Jha, S. Gulwani, S. A. Seshia, A. Tiwari: “Oracle-guided component-based program synthesis”. ICSE (1) 2010 • S. Srivastava, S. Gulwani, J.S. Foster: From program verification to program synthesis. POPL 2010 • S. Gulwani: Dimensions in program synthesis. PPDP 2010 isynth project: • T. Gvero, R.Piskac, V. Kuncak. “Interactive Synthesis of Code Snippets”, CAV 2011 • The Agda Project [http://wiki.portal.chalmers.se/agda/] • D. Mandelin, L. Xu, R. Bodık, D. Kimelman: “Jungloid mining: helping to navigate the API jungle”, PLDI '05
Conclusions Software Synthesis • method to obtain correct software from the given specification • Complete Functional Synthesis: extendingdecision procedures into synthesis algorithms • Interactive Synthesis of Code Snippetsfinding a term of a given type Contributions to/from automated reasoning • decision procedures • methods to combine them
Quantifier Elimination for Linear Integer Arithmetic • Problem of great interest: • [Presburger, 1929], [Cooper, 1972] • [Pugh, 1992], • [Weispfenning, 1997] • [Nipkow 2008] – verified in Isabelle • Our algorithm for integers: • Efficient handling equalities • Handling of inequalities as in [Pugh 1992] • Computes witness terms , builds a program from them
Synthesis for Linear Integer Arithmetic – Example / Overview choose((h: Int, m: Int, s: Int) ⇒ ( h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60 )) Returned code: val h = totalSecondsdiv 3600 valtemp = totalSeconds + ((-3600) * h) valm = min(temp div 60, 59) vals = totalSeconds + ((-3600) * h) + (-60 * m)
Synthesis Procedure for Linear Integer Arithmetic • Works on disjunctive normal form • Preprocessing: • standard elimination of negations and divisibility constraints: • From now on: we only consider formulas that are a conjunction of equalities and inequalities
Overview of a Synthesis Procedure • process every equality: take an equality Ei, compute a parametric description of the solution set and insert those values in the rest of formula • for n output variables, we need n-1 fresh new variables • number of output variables decreased for 1 • compute preconditions • at the end there are only inequalities – same procedure as in [Pugh 1992]
Synthesis for Linear Integer Arithmetic – Example / Overview choose((x, y) ⇒ 5 * x + 7 * y == a && x ≤ y)) • Corresponding quantifier elimination problem: ∃ x ∃ y . 5x + 7y = a ∧ x ≤ y • give parametric description for all solutions of 5x + 7y = a: x = -7z + 3a y = 5z - 2a • Rewrite inequation x ≤ y in terms of z: 5a ≤ 12z → z ≥ ceil(5a/12) • a
Synthesis for Linear Integer Arithmetic – Example / Overview choose((x, y) ⇒ 5 * x + 7 * y == a && x ≤ y)) • Obtain synthesized program: valz = ceil(5*a/12) valx = -7*z + 3*a valy = 5*z + -2*a
Parametric Solution of Equation Theorem For an equation with S we denote the set of solutions. • Let SH be a set of solutions of the homogeneous equality: SH = { y | } SH is an “almost linear” set, i.e. can be represented as a linear combination of vectors: SH = λ1s1 + ... λn-1sn-1 • Let w be any solution of the original equation • S = w + λ1s1 + ... λn-1sn-1 + preconditions: gcd(i)| C
Solution of a Homogenous Equation Theorem For an equation with SH we denote the set of solutions. where values Kij are computed as follows: • if i < j, Kij = 0 (the matrix K is lower triangular) • if i =j • for remaining Kij values, find any solution of the equation
Finding any Solution (n variables) • Inductive approach • 1x1 + 2x2 +... + nxn = C 1x1 + gcd(2,...,n )[λ2x2 +... + λnxn] = C 1x1 + xF = C • find values for x1 (w1) and xF (wF) and then solve inductively: λ2x2 +... + λnxn = wF
Finding any Solution (2 variables) • based on Extended Euclidean Algorithm (EEA) • for every two integers n and m finds numbers p and q such that n*p + m*q = gcd(n, m) • problem: 1x1 + 2x2 = C • solution: • apply EEA to compute p and q such that 1p + 2q = gcd(1, 2) • solution: x1 = p*C/ gcd(1, 2) x2 = q*C/ gcd(1, 2)
Handling of Inequalities • Solve for one by one variable: • separate inequalities depending on polarity of x: • Ai ≤ αix • βjx ≤ Bj • define values a = maxi⌈Ai/αi⌉ and b = minj ⌊Bj/ βj⌋ • if b is defined, return x = b else return x = a • further continue with the conjunction of all formulas ⌈Ai/αi⌉ ≤ ⌊Bj/ βj⌋
Algorithm for more than one Variable • ⌈(2y − b − a)/3⌉ ≤ ⌊(4y + a + b)/2⌋ • ⇔ ⌈(2y − b − a) ∗ 2/6⌉ ≤ ⌊(4y + a + b) ∗ 3/6⌋ • ⇔ (4y − 2b − 2a)/6 ≤ [(12y + 3a + 3b) − (12y + 3a + 3b) mod 6]/6 • ⇔ (12y + 3a + 3b) mod 6 ≤ 8y + 5a + 5b • ⇔ 12y + 3a + 3b = 6 ∗ l + k ∧ k ≤ 8y + 5a + 5b Consider the formula 2y − b ≤ 3x + a ∧ 2x − a ≤ 4y + b
Algorithm for more than one Variable • 12y + 3a + 3b = 6 ∗ l + k ∧ k ≤ 8y + 5a + 5b • upon applying the equality, we obtain • preconditions: 6|3a + 3b − k • solutions: l = 2λ + (3a + 3b − k)/6 and y = λ • substituting those values in the inequality results in k − 5a − 5b ≤ 8λ • final solution: λ = ⌈(k − 5a − 5b)/8⌉ Consider the formula 2y − b ≤ 3x + a ∧ 2x − a ≤ 4y + b
Linear Integer Arithemtic: Example val (x1, y1) = choose(x: Int, y: Int => 2*y − b =< 3*x + a && 2*x − a =< 4*y + b) valkFound = false for k = 0 to 5 do {val v1 = 3 * a + 3 * b − kif (v1 mod 6 == 0) {val alpha = ((k − 5 * a − 5 * b)/8).ceilingval l = (v1 / 6) + 2 * alphaval y = alphavalkFound = true break } } if (kFound) val x = ((4 * y + a + b)/2).floor else throw new Exception(”No solution exists”) Precondition: ∃k. 0 ≤ k ≤ 5 ∧ 6|3a + 3b − k
Synthesis for Sets val (setA, setB) = choose((a: Set[O], b: Set[O]) ) => (−maxDiff <= a.size − b.size && a.size − b.size <= maxDiff && a union b == bigSet && a intersect b == empty)) Code val kA = ((bigSet.size + maxDiff)/2).floor valkB = bigSet.size − kA valsetA = take(kA, bigSet) valsetB = take(kB, bigSet −− setA) Precondition ⌈|bigSet| − maxDiff/2⌉ ≤ ⌊|bigSet| + maxDiff/2⌋
Generic Techniques • Multiple output variables - computed one by one • Disjunctions – computed one by one, combined using if ... else ... expressions (NP-hardness) output variables: y1, ..., yn assert pren+1(x) val y1= Ψ1(x) val y2= Ψ2 (x, y1) .... valyn= Ψn (x , y1, ..., yn-1) pre1(x, y1, ..., yn-1) pren(x,y1)