Banana Algebra: Syntactic Language Extension via an Algebra of Languages and Transformations
Jacob Andersen [ jacand@cs.au.dk ], Aarhus University
Claus Brabrand [ brabrand@itu.dk ], IT University of Copenhagen
The Banana Algebra Café
• The "Banana Algebra Café":
  • Located in Costa Rica
  • City of Cahuita (pop. 3,000)
[Photos: "Banana Breakfast!", "Cahuita!"]
Outline
• Introduction: "What is a Banana?"
• Bananas for Language Transformation
• Language Extension Pattern
• Banana Algebra
• Examples
• Implementation
• Related Work
• Conclusion
What is a 'Banana'?
• Datatype "list":
    list = Num int
         | Cons int * list
• Banana ("sum-of-list"), of type list → int:
    [Num n]    = n
    [Cons n l] = n + [l]
  • Implicit recursion on input structure
  • Separation of recursion and evaluation
  • Bottom-up re-combination of intermediate results
• Also known as a "catamorphism"; in banana-bracket notation: (| λn.n, λ(n,l).n+l |)
• Another example: "length-of-list"
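The sum-of-list banana can be sketched in Python as follows. This is only an illustration of the idea: the names `cata`, `sum_of_list`, and `length_of_list` are hypothetical and not part of the Banana Algebra tool. The recursion lives in `cata`; the two functions passed in only say how to combine results.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar, Union


@dataclass(frozen=True)
class Num:
    n: int


@dataclass(frozen=True)
class Cons:
    n: int
    tail: "IntList"


IntList = Union[Num, Cons]
R = TypeVar("R")


def cata(on_num: Callable[[int], R],
         on_cons: Callable[[int, R], R]) -> Callable[[IntList], R]:
    """Build a banana: the recursion is implicit, the arguments only evaluate."""
    def run(xs: IntList) -> R:
        if isinstance(xs, Num):
            return on_num(xs.n)
        # Recurse on the tail first, then recombine bottom-up.
        return on_cons(xs.n, run(xs.tail))
    return run


# (| λn.n, λ(n,l).n+l |)
sum_of_list = cata(lambda n: n, lambda n, acc: n + acc)
# The "length-of-list" banana mentioned on the slide.
length_of_list = cata(lambda n: 1, lambda n, acc: 1 + acc)

example = Cons(1, Cons(2, Num(3)))
print(sum_of_list(example))     # 6
print(length_of_list(example))  # 3
```

Only the two small lambdas change between the two bananas; the traversal is shared.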
Language Transformation
• Bananas (statically typed): (| LS -> LT [τ] c |)
  • Source language: 'LS'
  • Target language: 'LT'
  • Nonterminal typing: 'τ'
  • Reconstructors: 'c'
• Example, with LS = list and LT = tree:
    list = Num int | Cons int * list
    tree = Nil | Leaf int | Node tree * tree
    [list -> tree]
    [Num n]    = Leaf n
    [Cons n l] = Node (Leaf n) [l]
• Type-checkable!
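A minimal Python sketch of the list-to-tree banana above, assuming plain dataclasses for both languages. The table `RECONSTRUCTORS` and the function `list_to_tree` are illustrative names, not the tool's API; each entry plays the role of one reconstructor, and the driver supplies the recursion.

```python
from dataclasses import dataclass


# Source language: list = Num int | Cons int * list
@dataclass(frozen=True)
class Num:
    n: int


@dataclass(frozen=True)
class Cons:
    n: int
    tail: object


# Target language: tree = Nil | Leaf int | Node tree * tree (Nil is unused here)
@dataclass(frozen=True)
class Leaf:
    n: int


@dataclass(frozen=True)
class Node:
    left: object
    right: object


# One reconstructor per source constructor; `sub` is the already
# transformed sub-list, i.e. the "[l]" of the slide.
RECONSTRUCTORS = {
    Num: lambda n: Leaf(n),
    Cons: lambda n, sub: Node(Leaf(n), sub),
}


def list_to_tree(xs):
    """Apply the reconstructors bottom-up over the input list."""
    if isinstance(xs, Num):
        return RECONSTRUCTORS[Num](xs.n)
    return RECONSTRUCTORS[Cons](xs.n, list_to_tree(xs.tail))


print(list_to_tree(Cons(1, Cons(2, Num(3)))))
```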
"Growing Languages with Metamorphic Syntax Macros"[ Claus Brabrand | Michael Schwartzbach ] ( PEPM 2002 ) "The metafront System: Safe and Extensible Parsing and Transformation"[ Claus Brabrand | Michael Schwartzbach ] ( LDTA 2003, SCP J. 2007 ) Statically reduce: Banana Algebra (term) Banana (const) Banana Properties • Banana properties: • Simple(corresponds to: “simple recursion”) • Safe(syntactically safe + always terminate) • Efficient(linear time in size of input + output) • (Expressive)(…enough for interesting extensions) • Banana Algebra “for free” (16 banana ops): • Modular • Incremental • Simple • Safe • Efficient • (Expressive) Been around for many years We now propose
Language Extension Pattern
• Source language 'LS' (lambda calculus with numeral extension):
    Exp : var Id
        : lam Id * Exp
        : app Exp * Exp
        : zero
        : succ Exp
        : pred Exp
• Target language 'LT' (lambda calculus):
    Exp : var Id
        : lam Id * Exp
        : app Exp * Exp
• Nonterminal typing 'τ': [Exp -> Exp]
• Reconstructors 'c':
    [var V]     = var V
    [lam V E]   = lam V [E]
    [app E1 E2] = app [E1] [E2]
    [zero]      = lam z (var z)
    [succ E]    = lam s [E]
    [pred E]    = app [E] (lam z (var z))
• Catamorphism: (| LS -> LT [τ] c |)
• Using a very simple numeral encoding
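The reconstructors above can be read as a single recursive function. The following Python sketch assumes terms are encoded as tagged tuples such as `("succ", e)`; this encoding and the name `desugar` are illustrative assumptions, not the tool's representation.

```python
def desugar(e):
    """Translate a lambda term with numerals into the plain lambda calculus."""
    tag = e[0]
    if tag == "var":                       # [var V] = var V
        return e
    if tag == "lam":                       # [lam V E] = lam V [E]
        return ("lam", e[1], desugar(e[2]))
    if tag == "app":                       # [app E1 E2] = app [E1] [E2]
        return ("app", desugar(e[1]), desugar(e[2]))
    if tag == "zero":                      # [zero] = lam z (var z)
        return ("lam", "z", ("var", "z"))
    if tag == "succ":                      # [succ E] = lam s [E]
        return ("lam", "s", desugar(e[1]))
    if tag == "pred":                      # [pred E] = app [E] (lam z (var z))
        return ("app", desugar(e[1]), ("lam", "z", ("var", "z")))
    raise ValueError(f"unknown constructor: {tag}")


term = ("pred", ("succ", ("zero",)))
print(desugar(term))
```

The first three cases are the identity part of the transformation; only the last three carry the extension.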
Algebraic Solution
• ln+l -> l  =  (l -> l) + (ln -> l)
  • l -> l:  idx(l)
  • ln -> l:
      (| ln -> l [Exp -> Exp]
         [zero]   = lam z (var z)
         [succ E] = lam s [E]
         [pred E] = app [E] ...
      |)
• Language l:
    Exp : var Id
        : lam Id * Exp
        : app Exp * Exp
• Language ln:
    Exp : zero
        : succ Exp
        : pred Exp
Extending Java
• java+repeat -> java  =  (java -> java) + (repeat -> java)
  • java -> java:  idx(java)
  • repeat -> java:
      (| repeat -> java [Stm -> Stm, Exp -> Exp]
         [repeat S E] = do-while [S] (not [E])
      |)
• Language java: the Java grammar (575 lines!)
• Language repeat:
    ...
    Stm : "repeat" Stm "until" "(" Exp ")" ";"
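Banana Algebra
• Languages (L):
    L ::= l                  constant language: { CFG }
        | v
        | L \ L
        | L + L
        | src(X)
        | tgt(X)
        | let v = L in L
        | let w = X in L
• Transformations (X):
    X ::= x                  constant transformation: (| L -> L [τ] c |)
        | w
        | X \ L
        | X + X
        | X o X
        | idx(L)
        | let v = L in X
        | let w = X in X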
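To make the language half of the algebra concrete, here is a small Python sketch of an evaluator for language terms. It rests on simplifying assumptions that are not from the slides: a language is modelled as a dictionary from production names to right-hand sides, `L \ L'` is taken to remove the productions that `L'` names, and `+` requires shared productions to agree. Transformation terms (`src`, `tgt`, `let w = X in L`) are left out, and `eval_lang` is a hypothetical name.

```python
def compatible(left, right):
    """Assumed compatibility check: shared production names must agree."""
    return all(left[name] == right[name] for name in left.keys() & right.keys())


def eval_lang(term, env=None):
    """Reduce a language term to a constant language (a dict of productions)."""
    env = {} if env is None else env
    op = term[0]
    if op == "const":                      # { CFG }
        return dict(term[1])
    if op == "var":                        # v
        return env[term[1]]
    if op == "add":                        # L + L
        left, right = eval_lang(term[1], env), eval_lang(term[2], env)
        if not compatible(left, right):
            raise ValueError("incompatible languages")
        return {**left, **right}
    if op == "restrict":                   # L \ L
        left, right = eval_lang(term[1], env), eval_lang(term[2], env)
        return {name: rhs for name, rhs in left.items() if name not in right}
    if op == "let":                        # let v = L in L
        _, name, bound, body = term
        return eval_lang(body, {**env, name: eval_lang(bound, env)})
    raise ValueError(f"unknown operator: {op}")


L = ("const", {"Exp.var": "Id", "Exp.app": '"(" Exp Exp ")"'})
LN = ("const", {"Exp.zero": '"zero"', "Exp.succ": '"succ" Exp', "Exp.pred": '"pred" Exp'})

program = ("let", "l", L, ("let", "ln", LN, ("add", ("var", "ln"), ("var", "l"))))
result = eval_lang(program)
print(sorted(result))
```

Every term reduces to a constant language before anything is parsed, which mirrors the "statically reduce" idea of the talk.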
Algebraic Laws
• Idempotency of '+':    L ≡ L + L
• Commutativity of '+':  L1 + L2 ≡ L2 + L1
• Associativity of '+':  L1 + (L2 + L3) ≡ (L1 + L2) + L3
• Source-identity:       L ≡ src(idx(L))
• Target-identity:       L ≡ tgt(idx(L))
• …
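The laws can be checked on a toy model. The Python sketch below assumes a language is a set of (production name, right-hand side) pairs, that `+` is set union, and that `idx(L)` is a pair of source and target language. This is a deliberate simplification for illustration, not the tool's definition.

```python
def add(l1, l2):
    """L1 + L2, modelled as the union of production sets."""
    return l1 | l2


def idx(lang):
    """Identity transformation, modelled only by its (source, target) pair."""
    return (lang, lang)


def src(x):
    return x[0]


def tgt(x):
    return x[1]


L1 = frozenset({("Exp.var", "Id")})
L2 = frozenset({("Exp.zero", '"zero"')})
L3 = frozenset({("Exp.true", '"true"')})

laws = {
    "idempotency": add(L1, L1) == L1,
    "commutativity": add(L1, L2) == add(L2, L1),
    "associativity": add(L1, add(L2, L3)) == add(add(L1, L2), L3),
    "source-identity": src(idx(L1)) == L1,
    "target-identity": tgt(idx(L1)) == L1,
}
for name, holds in laws.items():
    print(name, holds)
```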
Example Revisited
--- "l.l" ---
{ Id = [a-z] [a-z0-9]* ;
  Exp.var : Id ;
  Exp.lam : "\\" Id "." Exp ;
  Exp.app : "(" Exp Exp ")" ;
}
--- "ln.l" ---
{ Exp.zero : "zero" ;
  Exp.succ : "succ" "(" Exp ")" ;
  Exp.pred : "pred" "(" Exp ")" ;
}
--- "ln2l.x" ---
let l = "l.l" in
let ln = "ln.l" in
  idx(l) + (| ln -> l [Exp -> Exp]
              Exp.zero = '\z.z' ;
              Exp.succ = '\s.$1' ;
              Exp.pred = '($1 \z.z)' ;
           |)
Numerals + Booleans
• …with Nums:          ln+l -> l     =  idx(l) + (ln -> l)
• …with Bools:         lb+l -> l     =  idx(l) + (lb -> l)
• …with Nums & Bools?  l+ln+lb -> l  =  (ln+l -> l) + (lb+l -> l)
Java + Repeat
--- "java.l" --- (575 lines)
{ Java ... "try" Stm "catch" ... Name.id : Id ; }
--- "repeat.l" ---
{ Stm.repeat : "repeat" Stm "until" "(" Exp ")" ";" ; }
--- "repeat2java.x" --- (7 lines!)
let java = "java.l" in
let repeat = "repeat.l" in
  idx(java) + (| repeat -> java [Exp -> Exp, Stm -> Stm]
                 Stm.repeat = 'do $1 while (!($2));' ;
              |)
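The effect of the `Stm.repeat` reconstructor can be illustrated with plain string substitution. This Python sketch is an assumption-laden toy: the actual tool parses the template against the target grammar rather than pasting strings, and `instantiate` is a hypothetical helper name.

```python
import re


def instantiate(template, args):
    """Replace $1, $2, ... in a template by the already transformed sub-terms."""
    return re.sub(r"\$(\d+)", lambda m: args[int(m.group(1)) - 1], template)


REPEAT_TEMPLATE = "do $1 while (!($2));"

# Source statement: repeat { i++; } until (i >= 10);
body = "{ i++; }"
condition = "i >= 10"
print(instantiate(REPEAT_TEMPLATE, [body, condition]))
# do { i++; } while (!(i >= 10));
```

The parentheses around `$2` in the template keep the negation applied to the whole condition, whatever expression is substituted.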
Concrete vs. Abstract Syntax
• Concrete syntax:
    Stm.repeat = 'do $1 while (!($2));' ;
• Abstract syntax:
    Stm.repeat = Stm.do(<1>, Exp.exp1( Exp1.exp2( Exp2.exp3( Exp3.exp4( Exp4.exp5( Exp5.exp6( Exp6.exp7( Exp7.neg( Exp8.par(<2>) ))))))))) ;
• Exp (with explicit assoc./prec.):
    Exp.or    : Exp1 "||" Exp ;
       .exp1  : Exp1 ;
    Exp1.and  : Exp2 "&&" Exp1 ;
        .exp2 : Exp2 ;
    Exp2.add  : Exp3 "+" Exp2 ;
        .exp3 : Exp3 ;
    …
    Exp7.neg  : "!" Exp8 ;
        .exp8 : Exp8 ;
    Exp8.par  : "(" Exp ")" ;
        .var  : Id ;
        .num  : IntConst ;
• (unambiguous: concrete ⇔ abstract)
• NB: the tool supports BOTH!
"FUN" Example The "FUN" Language: used for Teaching Functional Programming (at Aarhus University) Fun Basically The Lambda Calculus with…: numerals, booleans, arithmetic, boolean logic, local definitions, pairs, literals,lists, signs, comparisons, dynamic types, fixed-point combinators, … Fun grammar transform Literals Literals→Nums Unsigned arithmetic + booleans + definitions + pairs Nums→λ Bools→λ Defs→λ Pairs→λ + + + Lambda Calculus
"FUN" Example Component re-use Fun Fun + FunSigned Fun grammar transform Fun grammar transform + FunSigned GT Literals Literals→Nums Literals→Nums Signed arith→Nums Unsigned arithmetic + booleans + definitions + pairs Nums→λ Bools→λ Defs→λ Pairs→λ + + + Lambda Calculus
"FUN" Example Fun + FunSigned + FunCompare + FunTypesafe Fun GT + FunSigned GT + FunCompare GT + FunTypesafe GT 245x Banana Algebra ops 4 MB Banana ! Unsigned arithmetic + booleans + definitions + pairs Nums→λ Bools→λ Defs→λ Pairs→λ + + + Lambda Calculus
"FUN" Usage Statistics • Usage statistics (245x operators) in "FUN": • 58x { …cfg… }Constant languages • 51x "file.l"Language inclusions • 28x L + LLanguage additions • 23x vLanguage variables • 17x (|LL[]c|)Constant transformations • 17x X + XTransformation additions • 14x "file.x"Transformation inclusions • 10x let-inLocal definitions • 9x idx(L)Identity transformations • 8x XXCompositions • 4x L \ L Language restriction • 4x wTransformation variables • 2x src(X)Source extractions
EXERCISE: Incremental Development
--- "l.l" ---
{ Id = [a-z] [a-z0-9]* ;
  Exp.var : Id ;
  Exp.lam : "\\" Id "." Exp ;
  Exp.app : "(" Exp Exp ")" ;
}
--- "li.l" ---
{ Exp.id : "id" ; }
--- "li2l.x" ---
let l = "l.l" in
  idx(l) + (| "li.l" -> l [Exp -> Exp]
              Exp.id : '\z.z' ;
           |)
--- "ln.l" ---
{ Exp.zero : "zero" ;
  Exp.succ : "succ" Exp ;
  Exp.pred : "pred" Exp ;
}
--- "ln2l.x" (written directly) ---
let l = "l.l" in
  idx(l) + (| "ln.l" -> l [Exp -> Exp]
              Exp.zero : '\z.z' ;
              Exp.succ : '\x.$1' ;
              Exp.pred : '($1 \z.z)' ;
           |)
--- "ln2li.x" ---
let l = "l.l" in
  idx(l) + (| "ln.l" -> l+"li.l" [Exp -> Exp]
              Exp.zero : 'id' ;
              Exp.succ : '\x.$1' ;
              Exp.pred : '($1 id)' ;
           |)
--- "ln2l.x" (by composition) ---
"li2l.x" o "ln2li.x"
Example cont'd
• Both statically reduce to the same catamorphism:
(| { Id = [a-z] [0-9a-z]* ;
     Exp.app : "(" Exp Exp ")" ;
     Exp.lam : "\" Id "." Exp ;
     Exp.pred : "pred" Exp ;
     Exp.succ : "succ" Exp ;
     Exp.var : Id ;
     Exp.zero : "zero" ;
   }
   ->
   { Id = [a-z] [0-9a-z]* ;
     Exp.app : "(" Exp Exp ")" ;
     Exp.lam : "\" Id "." Exp ;
     Exp.var : Id ;
   }
   [Exp -> Exp, Id -> Id]
   Exp.app : Exp.app($1, $2) ;
   Exp.lam : Exp.lam($1, $2) ;
   Exp.pred : Exp.app($1, Exp.lam(Id("z"), Exp.var(Id("z")))) ;
   Exp.succ : Exp.lam(Id("x"), $1) ;
   Exp.var : Exp.var($1) ;
   Exp.zero : Exp.lam(Id("z"), Exp.var(Id("z"))) ;
|)
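The claim that the composed and the direct definition reduce to the same catamorphism can be mimicked on a toy encoding. In the Python sketch below, a transformation is a table from constructor names to templates with numbered holes, and composition applies the outer table to each template of the inner one. The encoding and the names `hole`, `fill`, `apply_rules`, and `compose` are illustrative assumptions, not the tool's implementation.

```python
def hole(i):
    """Placeholder for the i-th transformed sub-term ($i on the slides)."""
    return ("hole", i)


def fill(template, args):
    """Instantiate a template, replacing each hole by the matching argument."""
    if not isinstance(template, tuple):
        return template                     # identifier such as "z"
    if template[0] == "hole":
        return args[template[1] - 1]
    return (template[0],) + tuple(fill(t, args) for t in template[1:])


def apply_rules(rules, term):
    """Run a transformation over a term; holes and identifiers pass through."""
    if not isinstance(term, tuple) or term[0] == "hole":
        return term
    args = [apply_rules(rules, sub) for sub in term[1:]]
    return fill(rules[term[0]], args)


def compose(outer, inner):
    """Statically reduce 'outer o inner' to a single rule table."""
    return {name: apply_rules(outer, tmpl) for name, tmpl in inner.items()}


ID_RULES = {                                # idx(l)
    "var": ("var", hole(1)),
    "lam": ("lam", hole(1), hole(2)),
    "app": ("app", hole(1), hole(2)),
}
LI2L = {**ID_RULES, "id": ("lam", "z", ("var", "z"))}
LN2LI = {**ID_RULES,
         "zero": ("id",),
         "succ": ("lam", "x", hole(1)),
         "pred": ("app", hole(1), ("id",))}
LN2L = {**ID_RULES,
        "zero": ("lam", "z", ("var", "z")),
        "succ": ("lam", "x", hole(1)),
        "pred": ("app", hole(1), ("lam", "z", ("var", "z")))}

print(compose(LI2L, LN2LI) == LN2L)  # True
```

Because composition works on the rule tables alone, no input term is needed to obtain the combined transformation.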
Other Examples
• Self-application (the tool on itself!):
    [L1 << L2] = '(L1 \ L2) + L2'
    [X1 << X2] = '(X1 \ src(X2)) + X2'
• SQL embedding (in <bigwig>):
    Stm.select = 'factor (<2>) { if (<3>) return ( # \+ (<1>) ); }'
• My-Java (endless variations):
    java ( + sql) ( \ loops) o syntaxe_francais
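The derived override operator `L1 << L2` can be sketched on the same toy model of languages as dictionaries of productions. The model, the reading of `\` as "drop the productions named by the right operand", and the function names are assumptions made for illustration only.

```python
def restrict(l1, l2):
    """L1 \\ L2: keep only the productions of L1 that L2 does not name."""
    return {name: rhs for name, rhs in l1.items() if name not in l2}


def add(l1, l2):
    """L1 + L2: union of productions; shared names must agree."""
    clashes = [n for n in l1.keys() & l2.keys() if l1[n] != l2[n]]
    if clashes:
        raise ValueError(f"incompatible productions: {sorted(clashes)}")
    return {**l1, **l2}


def override(l1, l2):
    """[L1 << L2] = (L1 \\ L2) + L2"""
    return add(restrict(l1, l2), l2)


base = {"Exp.zero": '"zero"', "Exp.succ": '"succ" "(" Exp ")"'}
variant = {"Exp.succ": '"succ" Exp'}

print(override(base, variant))
# {'Exp.zero': '"zero"', 'Exp.succ': '"succ" Exp'}
```

Plain `add(base, variant)` would be rejected here because the two definitions of `Exp.succ` disagree; restricting first is what makes the override well-defined.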
Implementation
• The 'Banana Algebra' tool (3,600 lines of O'Caml): [ http://www.itu.dk/people/brabrand/banana-algebra/ ]
• Uses (underlying technologies):
  • 'dk.brics.grammar': for parsing, unparsing, and ambiguity analysis!
  • 'XSugar': for transformation "concrete syntax ↔ abstract XML syntax"
  • 'XSLT': for transformation "XML → XML"
Related Work (I/III)
• Macro Systems:
  • "Growing Languages with Metamorphic Syntax Macros" [ Claus Brabrand | Michael Schwartzbach ] ( PEPM 2002 )
  • "The metafront System: Safe and Extensible Parsing and Transformation" [ Claus Brabrand | Michael Schwartzbach ] ( LDTA 2003, SCP J. 2007 )
Related Work (II/III)
• Attribute Grammars:
  • Language transformation (and extension)…
  • …via computation on ASTs (using "inherited" or "synthesized" or … attributes)
  • E.g., Eli, JastAdd, Silver, …
• Rewrite Systems:
  • Language transformation (and extension)…
  • …via syntactic rewriting, using encodings…:
    • gradually rewrite "S-syntax" to "T-syntax"
  • E.g., Elan, TXL, ASF+SDF, Stratego/XT, …
• Both, compared to bananas:
  • More ambitious (expressivity)
  • No termination guarantees (safety)
  • Transformation "indirect" (simplicity)
Related Work (III/III)
• Functional Programming:
  • Catas mimicked by a "disciplined style" of functional programming
  • …aided by:
    • Traversal functions (auto-synthesized from datatypes)
    • Combinator libraries
    • "Shortcut fusion" (to eliminate ' ' at compile-time)
• Category Theory:
  • A lot of this work can be viewed as Category Theory:
• Basically ye olde issue: GPL vs. DSL
Conclusion
• IF bananas are sufficiently:
  • (Expressive)
• THEN you get…:
  • Banana Algebra "for free" (16 banana ops):
    • Incremental
    • Modular
    • Simple
    • Safe
    • Efficient
• Statically reduce: Banana Algebra (term) → Banana (const)
• "Niche"
BONUS SLIDES - Reduction Semantics -
• If you want all the details:
  • "Syntactic Language Extension via an Algebra of Languages and Transformations" [ Jacob Andersen | Claus Brabrand ] ( ITU Technical Report, Dec. 2008 )
Reduction Semantics
• Environments:
    ENVL = VARL → EXPL      environment of languages
    ENVX = VARX → EXPX      environment of transformations
• Reduction relations:
    '⇒L' ⊆ ENVL × ENVX × EXPL × EXPL
    '⇒X' ⊆ ENVL × ENVX × EXPX × EXPX
• Abbreviations:
    α,β |- L ⇒L l    as a short-hand for: (α,β,L,l) ∈ '⇒L'
    α,β |- X ⇒X x    as a short-hand for: (α,β,X,x) ∈ '⇒X'
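Semantics (L)
[CONL]   |-wfl l
         ----------------------------
         α,β |- l ⇒L l

[VARL]   ----------------------------
         α,β |- v ⇒L α(v)

[RESL]   α,β |- L ⇒L l      α,β |- L' ⇒L l'
         ----------------------------
         α,β |- L \ L' ⇒L l \_l l'

[ADDL]   α,β |- L ⇒L l      α,β |- L' ⇒L l'      l ~_l l'
         ----------------------------
         α,β |- L + L' ⇒L l +_l l'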
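Semantics (L)
[SRCL]   α,β |- X ⇒X (| lS -> lT [τ] c |)
         ----------------------------
         α,β |- src(X) ⇒L lS

[TGTL]   α,β |- X ⇒X (| lS -> lT [τ] c |)
         ----------------------------
         α,β |- tgt(X) ⇒L lT

[LETL]   α,β |- L ⇒L l      α[v=l],β |- L' ⇒L l'
         ----------------------------
         α,β |- let v = L in L' ⇒L l'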
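Semantics (X)
[CONX]   α,β |- LS ⇒L lS      α,β |- LT ⇒L lT      |-wfx (| lS -> lT [τ] c |)
         ----------------------------
         α,β |- (| LS -> LT [τ] c |) ⇒X (| lS -> lT [τ] c |)

[VARX]   ----------------------------
         α,β |- w ⇒X β(w)

[RESX]   α,β |- X ⇒X x      α,β |- L ⇒L l
         ----------------------------
         α,β |- X \ L ⇒X x \_x l

[ADDX]   α,β |- X ⇒X x      α,β |- X' ⇒X x'      x ~_x x'
         ----------------------------
         α,β |- X + X' ⇒X x +_x x'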
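Semantics (X)
[COMPX]  α,β |- X ⇒X (| lS -> lT [τ] c |)      α,β |- X' ⇒X (| lS' -> lT' [τ'] c' |)      lT ⊆_l lS'
         ----------------------------
         α,β |- X' o X ⇒X (| lS -> lT' [τ' o τ] c' o c |)

[IDXX]   α,β |- L ⇒L l
         ----------------------------
         α,β |- idx(L) ⇒X (| l -> l [id(l)] idc(l) |)

[LETX]   α,β |- X ⇒X x      α,β[w=x] |- X' ⇒X x'
         ----------------------------
         α,β |- let w = X in X' ⇒X x'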
BONUS SLIDES - More Examples -
Numeral & Boolean Extension
• Numeral Extension (catamorphism):
  • Source language:
      Exp : var Id
          : lam Id * Exp
          : app Exp * Exp
          : zero
          : succ Exp
          : pred Exp
  • Target language:
      Exp : var Id
          : lam Id * Exp
          : app Exp * Exp
  • Reconstructors:
      [var V]     = var [V]
      [lam V E]   = lam [V] [E]
      [app E1 E2] = app [E1] [E2]
      [zero]      = lam z (var z)
      [succ E]    = lam s [E]
      [pred E]    = app [E] (lam z (var z))
• Boolean Extension (catamorphism):
  • Source language:
      Exp : var Id
          : lam Id * Exp
          : app Exp * Exp
          : true
          : false
          : if Exp Exp Exp
  • Target language:
      Exp : var Id
          : lam Id * Exp
          : app Exp * Exp
  • Reconstructors:
      [var V]        = var [V]
      [lam V E]      = lam [V] [E]
      [app E1 E2]    = app [E1] [E2]
      [true]         = lam a (lam b (var a))
      [false]        = lam a (lam b (var b))
      [if E1 E2 E3]  = app (app [E1] [E2]) [E3]
Lambda with Booleans
• lb+l -> l  =  (l -> l) + (lb -> l)
  • l -> l:  idx(l)
  • lb -> l:
      (| lb -> l [Exp -> Exp]
         [true]        = '\a.\b.a'
         [false]       = '\a.\b.b'
         [if E1 E2 E3] = '(([E1] [E2]) [E3])'
      |)
• Language l:
    Exp : var Id
        : lam Id * Exp
        : app Exp * Exp
• Language lb:
    Exp : true
        : false
        : if Exp Exp Exp
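Why this encoding behaves like a conditional can be seen by running it. The Python sketch below writes the three target terms as Python lambdas; the names `TRUE`, `FALSE`, and `IF` are illustrative. Note that Python evaluates both branches eagerly, unlike a lazy reading of the lambda calculus.

```python
# [true]  = \a.\b.a   selects its first argument
TRUE = lambda a: lambda b: a
# [false] = \a.\b.b   selects its second argument
FALSE = lambda a: lambda b: b


def IF(cond, then_branch, else_branch):
    """[if E1 E2 E3] = (([E1] [E2]) [E3]): apply the boolean to both branches."""
    return cond(then_branch)(else_branch)


print(IF(TRUE, "then", "else"))   # then
print(IF(FALSE, "then", "else"))  # else
```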
Usage Scenarios
• Programmers:
  • May extend existing languages (~ syntax macros)
• Developers:
  • May embed DSLs into host languages (SQL in Java)
• Developers (and teachers):
  • May incrementally specify multi-layered languages
• Compiler writers:
  • May rely on the tool and implement only a small core
  • (and then specify the rest externally as extensions)
BONUS SLIDES - Parsing & Error Reporting -
Parsing
• Parsing (XSugar):
  • Variant of Earley's algorithm: cubic in the input length, O(n³)
  • Can parse any context-free grammar
  • Closed under union of languages
  • Support for production priority
• The tool easily adapts to other parsing algorithms
Ambiguity: parsing vs. unparsing
• Parsing:
  • Grammar ambiguity
• Unparsing:
  • Canonical whitespace
[Diagrams: mappings between L and AST_L / ~_L]
Ambiguity Analysis
• Ambiguity analysis:
  • Using the "dk.brics.grammar" implementation [ by Anders Møller ] on:
    • Source language;
    • Target language; and/or
    • …all intermediate languages (somewhat expensive)
  • (Note: ambiguity analysis comes with the XSugar tool)
• "Analyzing Ambiguity of Context-Free Grammars" [ Claus Brabrand | Robert Giegerich | Anders Møller ] ( CIAA 2007 )
Error Reporting
• Error reporting (prototype; could be improved):
  • Static parse-error (O'Caml-lex):
      *** In ln2l.x (4,4)-(4,7): Parse error at "Exp"
  • Static transformation error (XSugar):
      *** Parse error at character 6 (line 1, column 7) in /tmp/shape84e645.txt
    • (is actually a parse-error in a cata reconstructor)
  • Dynamic parse-error (XSugar):
      *** Parse error at character 23 (line 1, column 24) in /dev/stdin
  • Dynamic transformation error:
    • impossible :-)