Synthesizing and Repairing Expressions using Types and Weights
Tihomir Gvero, Viktor Kuncak, Ivan Kuraj and Ruzica Piskac
Motivation
• Large APIs and libraries, e.g. the classes in the Java 6.0 standard library
• Using those APIs (for the first time) can be
  • Tedious
  • Time consuming
• Developers should focus on solving creative tasks
• Manual Solution
  • Read Documentation
  • Inspect Examples
• Automation = Code synthesis + Code completion
Our Solution
• InSynth: Interactive Synthesis of Code Snippets
• Input:
  • Scala partial program
  • Cursor point
• We automatically extract:
  • Declarations in scope (with/without statistics from corpus)
  • Desired type
• Algorithm
  • Complete
  • Efficient: output N expressions in less than T ms
  • Effective: favor useful expressions over obscure ones
  • Generates expressions with higher order functions
• Output
  • Ranked list of expressions
Sequence of Streams
    def main(args: Array[String]) = {
      var body: String = "email.txt"
      var sig: String = "signature.txt"
      var inStream: SeqInStr =
      …
    }
Suggested completions:
    new SeqInStr(new FileInStr(sig), new FileInStr(sig))
    new SeqInStr(new FileInStr(sig), new FileInStr(body))
    new SeqInStr(new FileInStr(body), new FileInStr(sig))
    new SeqInStr(new FileInStr(body), new FileInStr(body))
    new SeqInStr(new FileInStr(sig), System.in)
Sequence of Streams
    def main(args: Array[String]) = {
      var body: String = "email.txt"
      var sig: String = "signature.txt"
      var inStream: SeqInStr = new SeqInStr(new FileInStr(sig), new FileInStr(body))
      …
    }
• Imported over 3300 declarations
• Executed in less than 250ms
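The chosen completion can be tried out in a small runnable sketch. It assumes that `SeqInStr` and `FileInStr` on the slide abbreviate the JDK classes `java.io.SequenceInputStream` and `java.io.FileInputStream`; the helper `readInOrder` and the temporary files are hypothetical, added only so the example runs on its own.

```scala
import java.io.{FileInputStream, SequenceInputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.Files
import scala.io.Source

object SequenceOfStreams {
  // Hypothetical helper: reads `sig` and then `body` through one concatenated stream,
  // using the same expression shape as the selected suggestion.
  def readInOrder(sig: String, body: String): String = {
    val inStream = new SequenceInputStream(new FileInputStream(sig), new FileInputStream(body))
    try Source.fromInputStream(inStream, "UTF-8").mkString
    finally inStream.close()
  }

  def main(args: Array[String]): Unit = {
    // Temporary files stand in for "signature.txt" and "email.txt".
    val sig = Files.createTempFile("signature", ".txt")
    val body = Files.createTempFile("email", ".txt")
    Files.write(sig, "-- signature\n".getBytes(UTF_8))
    Files.write(body, "Hello\n".getBytes(UTF_8))
    print(readInOrder(sig.toString, body.toString))
  }
}
```

Swapping the two constructor arguments gives another of the listed suggestions; all of them type-check, which is why ranking matters.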
TreeFilter (HOF)
    def filter(p: Tree => Boolean): List[Tree] = {
      val ft: FilterTreeTraverser =
      ft.traverse(tree)
      ft.hits.toList
    }
Suggested completions:
    new FilterTreeTraverser(x => p(x))
    new FilterTreeTraverser(x => isType)
    new FilterTreeTraverser(x => p(tree))
    new FilterTreeTraverser(x => new Wrapper(x).isType)
    new FilterTreeTraverser(x => p(new Wrapper(x).tree))
TreeFilter (HOF)
    def filter(p: Tree => Boolean): List[Tree] = {
      val ft: FilterTreeTraverser = new FilterTreeTraverser(x => p(x))
      ft.traverse(tree)
      ft.hits.toList
    }
• Imported over 4000 declarations
• Executed in less than 300ms
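To see the selected higher-order completion in a runnable setting, here is a minimal sketch. The `Tree` and `FilterTreeTraverser` classes below are simplified stand-ins written for this example, not the Scala compiler's own classes, and `filter` takes the tree as an explicit parameter instead of reading it from the enclosing scope.

```scala
import scala.collection.mutable.ListBuffer

object TreeFilterSketch {
  // Stand-in tree type (assumption): a label plus child nodes.
  final case class Tree(label: String, children: List[Tree] = Nil)

  // Stand-in traverser (assumption): collects every node accepted by `p`, in pre-order.
  class FilterTreeTraverser(p: Tree => Boolean) {
    val hits: ListBuffer[Tree] = ListBuffer.empty[Tree]
    def traverse(t: Tree): Unit = {
      if (p(t)) hits += t
      t.children.foreach(traverse)
    }
  }

  def filter(tree: Tree, p: Tree => Boolean): List[Tree] = {
    // The completed expression from the slide: a constructor call whose argument is a lambda.
    val ft: FilterTreeTraverser = new FilterTreeTraverser(x => p(x))
    ft.traverse(tree)
    ft.hits.toList
  }
}
```

For example, `TreeFilterSketch.filter(tree, _.label.startsWith("a"))` returns the nodes whose label starts with "a".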
COMPLETION = INHABITATION
• ENVIRONMENT: the declarations in scope
    def m1: T1
    …
    def mn: Tn
  form $\Gamma = \{ m_1 : T_1, \ldots, m_n : T_n \}$
• DESIRED TYPE: the type T at the cursor
    val a: T = ?
• Completion is the inhabitation query $\Gamma \vdash\ ? : T$
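Simply Typed Lambda Calculus
$$\frac{x : T \in \Gamma}{\Gamma \vdash x : T}\ \text{AX}$$
$$\frac{\Gamma,\ x : T_1 \vdash t : T}{\Gamma \vdash \lambda x.t : T_1 \to T}\ \text{ABS}$$
$$\frac{\Gamma \vdash e_1 : T_1 \to T \qquad \Gamma \vdash e_2 : T_1}{\Gamma \vdash e_1(e_2) : T}\ \text{APP}$$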
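Read forward, the three rules amount to a type checker. The sketch below is one possible rendering in Scala; the names `Base`, `Arrow`, `Var`, `Lam`, `App` and `typeOf` are chosen for this example, and lambdas carry a parameter type annotation (an assumption) so that the checker does not need inference.

```scala
object Stlc {
  sealed trait Type
  final case class Base(name: String) extends Type
  final case class Arrow(from: Type, to: Type) extends Type

  sealed trait Term
  final case class Var(name: String) extends Term
  final case class Lam(param: String, paramType: Type, body: Term) extends Term
  final case class App(fun: Term, arg: Term) extends Term

  // The environment maps each declared name to its type.
  type Env = Map[String, Type]

  def typeOf(env: Env, term: Term): Option[Type] = term match {
    // AX: a variable has the type recorded in the environment.
    case Var(x) => env.get(x)
    // ABS: type the body with the parameter added, then build the arrow type.
    case Lam(x, t1, body) => typeOf(env + (x -> t1), body).map(t => Arrow(t1, t))
    // APP: the function must be an arrow whose domain equals the argument type.
    case App(e1, e2) =>
      (typeOf(env, e1), typeOf(env, e2)) match {
        case (Some(Arrow(t1, t)), Some(t2)) if t1 == t2 => Some(t)
        case _ => None
      }
  }
}
```

Completion runs these rules in the opposite direction: the type is given and the term is the unknown.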
Simply Typed Lambda Calculus: Backward Search
• Goal: $\Gamma \vdash\ ? : T$
• Applying APP backward:
$$\frac{\Gamma \vdash\ ? : T_1 \qquad \Gamma \vdash\ ? : T_1 \to T}{\Gamma \vdash\ ? : T}\ \text{APP}$$
• Infinitely many candidates for $T_1$
• No bound on types in derivation tree(s).
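Long Normal Form
$$\frac{\Gamma,\ x_1 : T_1, \ldots, x_n : T_n \vdash t : T}{\Gamma \vdash \lambda x_1{:}T_1, \ldots, x_n{:}T_n.\,t : T_1 \to \cdots \to T_n \to T}\ \text{ABS}$$
$$\frac{f : T_1 \to \cdots \to T_n \to T \in \Gamma \qquad \Gamma \vdash a_1 : T_1 \ \ \ldots\ \ \Gamma \vdash a_n : T_n}{\Gamma \vdash f(a_1, \ldots, a_n) : T}\ \text{APP}$$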
Comparison between LNF and classic APP
• OLD (classic APP): we derive the EXPRESSION $e_1$ in function position from $\Gamma$
$$\frac{\Gamma \vdash e_1 : T_1 \to T \qquad \Gamma \vdash e_2 : T_1}{\Gamma \vdash e_1(e_2) : T}\ \text{APP}$$
• NEW (LNF APP): we take the DECLARATION $f$ in function position from $\Gamma$
$$\frac{f : T_1 \to \cdots \to T_n \to T \in \Gamma \qquad \Gamma \vdash a_1 : T_1 \ \ \ldots\ \ \Gamma \vdash a_n : T_n}{\Gamma \vdash f(a_1, \ldots, a_n) : T}\ \text{APP}$$
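Long Normal Form: Backward Search
• Goal: $\Gamma \vdash\ ? : T$
• APP backward, with a declaration $f$ from $\Gamma$:
$$\frac{f : (T_1 \to T_2) \to T \in \Gamma \qquad \Gamma \vdash\ ? : T_1 \to T_2}{\Gamma \vdash f(?) : T}\ \text{APP}$$
  • Only one candidate argument type, fixed by $f$
  • Narrows the search space
• ABS backward on the argument:
$$\frac{\Gamma,\ x_1 : T_1 \vdash\ ? : T_2}{\Gamma \vdash \lambda x_1{:}T_1.\,? : T_1 \to T_2}\ \text{ABS}$$
• Continuing until every hole is filled:
$$\dfrac{f : (T_1 \to T_2) \to T \in \Gamma \qquad \dfrac{\dfrac{\cdots}{\Gamma,\ x_1 : T_1 \vdash e : T_2}\ \text{APP}}{\Gamma \vdash \lambda x_1{:}T_1.\,e : T_1 \to T_2}\ \text{ABS}}{\Gamma \vdash f(\lambda x_1{:}T_1.\,e) : T}\ \text{APP}$$
• Finitely many types in derivation tree(s)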
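The backward search over long normal forms can be sketched as a small enumerator. This is a depth-bounded recursive search written for illustration, not the graph-based algorithm the slides describe; the names `search`, `uncurry` and `combos`, the string rendering of terms, and the fresh-variable scheme `x<n>` (which assumes no declaration already uses such a name) are all assumptions of this example.

```scala
object LnfSearch {
  sealed trait Type
  final case class Base(name: String) extends Type
  final case class Arrow(from: Type, to: Type) extends Type

  // Declarations in scope, newest first.
  type Env = List[(String, Type)]

  // Splits T1 -> ... -> Tn -> B into (List(T1, ..., Tn), B).
  def uncurry(t: Type): (List[Type], Base) = t match {
    case b: Base => (Nil, b)
    case Arrow(from, to) =>
      val (args, res) = uncurry(to)
      (from :: args, res)
  }

  // All ways of picking one element from each list of choices.
  def combos(choices: List[List[String]]): List[List[String]] = choices match {
    case Nil => List(Nil)
    case first :: rest => for (a <- first; as <- combos(rest)) yield a :: as
  }

  // Enumerates long-normal-form terms of type `goal`, using at most `depth` nested steps.
  def search(env: Env, goal: Type, depth: Int): List[String] =
    if (depth == 0) Nil
    else {
      val (paramTypes, result) = uncurry(goal)
      // ABS backward: bind one fresh variable per parameter of the goal type.
      val params = paramTypes.zipWithIndex.map { case (t, i) => (s"x${env.size + i}", t) }
      val inner = params.reverse ::: env
      // APP backward: only declarations whose result type matches can head the term.
      val bodies = for {
        (f, fType) <- inner
        (argTypes, fResult) = uncurry(fType)
        if fResult == result
        args <- combos(argTypes.map(search(inner, _, depth - 1)))
      } yield if (args.isEmpty) f else s"$f(${args.mkString(", ")})"
      if (params.isEmpty) bodies
      else bodies.map(b => s"(${params.map(_._1).mkString(", ")}) => $b")
    }
}
```

With declarations `f : (T1 → T2) → T` and `g : T1 → T2` and goal `T`, a depth of 3 is enough to reach the term `f((x2) => g(x2))`, which mirrors the derivation on the slide. The depth bound is what guarantees termination here; the slides obtain termination differently, by building a finite graph.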
Algorithm
• Algorithm builds finite graph (with cycles) that
  • Represents all (infinitely many) solutions
  • Later we use it to construct expressions
  • Less than 10ms
• Algorithm Properties
  • Graph generation terminates
  • Type inhabitation is decidable
  • Complete: generates all solutions
  • PSPACE-complete
• Techniques
  • Succinct type representation
  • Backward search
  • Weights mechanism
Weights and Corpus
• Weight of a declaration based on:
  • Frequency
    • Corpus based on 18 Scala projects (e.g. Scala compiler)
    • Over 7500 declarations, and over 90000 uses
    • Higher the frequency, lower the weight
  • Proximity
    • [Figure: weight scale from High to Low covering method and field symbols, API symbols, and local symbols]
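The rule "higher the frequency, lower the weight" can be illustrated with any decreasing function of the corpus count. The formula and the constants `Base` and `Scale` below are hypothetical, picked only to make the shape concrete; the slides do not state the actual function.

```scala
object FrequencyWeight {
  // Hypothetical constants, chosen for illustration only.
  val Base = 1.0
  val Scale = 10.0

  // Decreasing in `frequency`: an unused declaration gets Base + Scale,
  // and a very frequently used one approaches Base.
  def weight(frequency: Int): Double = Base + Scale / (1.0 + frequency)
}
```

Under these assumed constants, a declaration with no uses in the corpus weighs 11.0 and one with nine uses weighs 2.0.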
Algorithm with Weights
$$\dfrac{f : (T_1 \to T_2) \to T \in \Gamma \qquad \dfrac{\dfrac{\cdots}{\Gamma,\ x_1 : T_1 \vdash e : T_2}\ \text{APP}}{\Gamma \vdash \lambda x_1{:}T_1.\,e : T_1 \to T_2}\ \text{ABS}}{\Gamma \vdash f(\lambda x_1{:}T_1.\,e) : T}\ \text{APP}$$
• Choice based on WEIGHT
• Ranking based on $w(f(\lambda x_1{:}T_1.\,e)) = w(f) + w(x_1) + w(e)$
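The ranking formula sums the weights of the symbols that make up an expression. A minimal sketch of that computation follows; the expression type (`Ref`, `Call`, `Lambda`) and the weight table passed in as a `Map` are assumptions of this example, and the table must contain every name that occurs in the expression.

```scala
object WeightedRanking {
  sealed trait Expr
  final case class Ref(name: String) extends Expr                   // a variable or an argument-free declaration
  final case class Call(fun: String, args: List[Expr]) extends Expr // f(a1, ..., an)
  final case class Lambda(param: String, body: Expr) extends Expr   // x => body

  // w(f(a1, ..., an)) = w(f) + w(a1) + ... + w(an); w(x => e) = w(x) + w(e).
  def weight(w: Map[String, Double], e: Expr): Double = e match {
    case Ref(n) => w(n)
    case Call(f, args) => w(f) + args.map(weight(w, _)).sum
    case Lambda(x, body) => w(x) + weight(w, body)
  }

  // Lower total weight ranks first.
  def rank(w: Map[String, Double], candidates: List[Expr]): List[Expr] =
    candidates.sortWith((a, b) => weight(w, a) < weight(w, b))
}
```

Applied to `f(x1 => e)`, the two `Call` and `Lambda` cases unfold to exactly `w(f) + w(x1) + w(e)`.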
Benchmarks
• 50 Java examples translated into Scala
  • Illustrate correct usage of API functions
• We generalized the import statements
  • To include more declarations
• In every example:
  • Arbitrarily chose some expression
  • Removed it
  • Marked it as goal expression
• Measure whether InSynth can recover it
Results
• Without weights, the expected expression appears
  • Among top 10 suggestions in only 4 benchmarks (8%)
• With weights (only proximity)
  • Among top 10 suggestions in 48 benchmarks (96%)
  • As a top suggestion in 26 benchmarks (52%)
• With weights (proximity + frequency)
  • Among top 10 suggestions in 48 benchmarks (96%)
  • As a top suggestion in 32 benchmarks (64%)
• Average execution time 145ms
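The percentages follow from the 50 benchmarks reported on the Benchmarks slide:

```latex
\frac{4}{50} = 8\%, \qquad
\frac{48}{50} = 96\%, \qquad
\frac{26}{50} = 52\%, \qquad
\frac{32}{50} = 64\%
```

Adding frequency to proximity therefore moves 6 more benchmarks (32 versus 26) to the top position, while the top-10 count stays at 48.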
Sequence of Streams
User writes:
    def main(args: Array[String]) = {
      var body: String = "email.txt"
      var sig: String = "signature.txt"
      var inStream: SeqInStr = new SeqInStr(sig, body)
      …
    }
    // new SeqInStr: FileInStr → FileInStr → SeqInStr
• Type Mismatch
• We propose a polynomial algorithm that finds the best repair
• Backbone Expression
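One simple way to picture a repair of `new SeqInStr(sig, body)` is to wrap each mismatched argument in the cheapest single conversion to the expected type. The sketch below does only that; it is a simplified illustration and not the polynomial repair algorithm the slide refers to, whose details are not given here. Types are plain strings, and `Conv`, `repairArg` and `repairCall` are hypothetical names.

```scala
object RepairSketch {
  // A unary declaration usable as a conversion, with an assumed weight.
  final case class Conv(name: String, from: String, to: String, weight: Double)

  // Keeps a well-typed argument as is, otherwise wraps it in the lowest-weight matching conversion.
  def repairArg(arg: String, actual: String, expected: String, convs: List[Conv]): Option[String] =
    if (actual == expected) Some(arg)
    else convs
      .filter(c => c.from == actual && c.to == expected)
      .sortWith(_.weight < _.weight)
      .headOption
      .map(c => s"${c.name}($arg)")

  // `args` pairs each argument expression with its actual type; `expected` lists the parameter types.
  def repairCall(fun: String, args: List[(String, String)], expected: List[String],
                 convs: List[Conv]): Option[String] =
    if (args.length != expected.length) None
    else {
      val fixed = args.zip(expected).map { case ((a, t), e) => repairArg(a, t, e, convs) }
      if (fixed.forall(_.isDefined)) Some(s"$fun(${fixed.flatten.mkString(", ")})") else None
    }
}
```

Given a conversion `new FileInStr : String → FileInStr`, calling `repairCall("new SeqInStr", List("sig" -> "String", "body" -> "String"), List("FileInStr", "FileInStr"), convs)` yields `new SeqInStr(new FileInStr(sig), new FileInStr(body))`, the expression accepted earlier in the deck.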