Analysis of the XDuce Data Types and Implementation of the Related Algorithms. Fabrizio Bisi.
Keywords: Web Services, orchestration, XML, MS Highwire, Bologna Pi, MS Biztalk, fusion calculus, Hipi, xlang, linear logic, tree regular expressions, graphical IDE.
Context: Web Services, orchestration, XML, MS Highwire, Bologna Pi, MS Biztalk, fusion calculus, Hipi, xlang, linear logic, tree regular expressions, graphical IDE

Languages built on tree regular expressions (tre):
• XDuce (by Hosoya) = tre + ML
• CDuce = tre + higher-order
• XStatic = tre + C#
• XRel = tre in Java (my thesis!)
Regular Expression Types

XML Value:
    <person>
      <name>
        <first>Fabrizio</first>
        <last>Bisi</last>
      </name>
      <email>bisi@cs.unibo.it</email>
      <email>bisif@tin.it</email>
      <telephone/>
    </person>

Tree Regular Expression Value:
    person[
      name[ first["Fabrizio"], last["Bisi"] ],
      email["bisi@cs.unibo.it"],
      email["bisif@tin.it"],
      telephone[]]

DTD:
    <!ELEMENT person (name, email*, tel?)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT email (#PCDATA)>
    <!ELEMENT tel (#PCDATA)>

Tree Regular Expression Types:
    typedef Person = Name, Email*, Tel?;
    typedef Name = name[String];
    typedef Email = email[String];
    typedef Tel = tel[String];

    typedef X = a[f],a[g];
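The correspondence between an XML value and its tree regular expression value can be sketched in a few lines of Python. This is only an illustration of the notation on the slide, not XRel or XDuce code: the function name `to_tree_value` is hypothetical, and mixed content (text next to child elements) is ignored.

```python
import xml.etree.ElementTree as ET

def to_tree_value(elem):
    """Render an XML element in the label[content] notation of the slide."""
    children = [to_tree_value(child) for child in elem]
    text = (elem.text or "").strip()
    if not children and text:
        # Leaf element with character data: show the text as a quoted string.
        content = '"' + text + '"'
    else:
        # Element with children (or empty): comma-separated sequence.
        content = ", ".join(children)
    return f"{elem.tag}[{content}]"

xml = (
    "<person><name><first>Fabrizio</first><last>Bisi</last></name>"
    "<email>bisi@cs.unibo.it</email><telephone/></person>"
)
value = to_tree_value(ET.fromstring(xml))
print(value)
```

An empty element such as `<telephone/>` becomes `telephone[]`, matching the value shown on the slide.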
Syntax

Types:
    T ::= ()         empty sequence
        | l[T]       labelling
        | X          type name
        | T , T      sequence
        | T | T      choice
        | T*         repetition
        | T+         one or more
        | T?         zero or one

Patterns:
    P ::= () | l[P] | X | P , P | P | P | P* | P+ | P?
        | P as x     variable binder

• Types are non-recursive at top level
• Patterns must be linear
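The slide only states that patterns must be linear. As a rough sketch, the check below uses the formulation common in the regular expression pattern matching literature, which is an assumption here: the two sides of a sequence bind disjoint variables, both branches of a choice bind the same variables, and nothing is bound under `*`, `+` or `?`. The tuple encoding of patterns and the names `bound_vars`, `is_linear` and `NonLinear` are hypothetical.

```python
class NonLinear(Exception):
    """Raised when a pattern violates the (assumed) linearity rules."""

def bound_vars(p):
    """Return the variables bound by pattern p, or raise NonLinear."""
    kind = p[0]
    if kind in ("eps", "name"):          # () and type names bind nothing
        return set()
    if kind == "lab":                    # ("lab", label, P)
        return bound_vars(p[2])
    if kind == "seq":                    # ("seq", P1, P2): disjoint binders
        left, right = bound_vars(p[1]), bound_vars(p[2])
        if left & right:
            raise NonLinear(f"bound twice: {sorted(left & right)}")
        return left | right
    if kind == "alt":                    # ("alt", P1, P2): same binders
        left, right = bound_vars(p[1]), bound_vars(p[2])
        if left != right:
            raise NonLinear("branches of a choice bind different variables")
        return left
    if kind in ("star", "plus", "opt"):  # no binders under *, + or ?
        if bound_vars(p[1]):
            raise NonLinear("binder under a repetition or option")
        return set()
    if kind == "as":                     # ("as", P, x)
        inner = bound_vars(p[1])
        if p[2] in inner:
            raise NonLinear(f"bound twice: {p[2]}")
        return inner | {p[2]}
    raise ValueError(f"unknown pattern node: {kind}")

def is_linear(p):
    try:
        bound_vars(p)
        return True
    except NonLinear:
        return False

F = ("lab", "f", ("eps",))
# a[f],(b[f] as x)
PAT1 = ("seq", ("lab", "a", F), ("as", ("lab", "b", F), "x"))
print(is_linear(PAT1), bound_vars(PAT1))
print(is_linear(("star", ("as", ("lab", "a", F), "x"))))
```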
XRel: XML Regular Expression Language
• XML processing language
• Test harness for tree regular expression types and pattern matching

    import a[f|g],b[f|g] as v;                    (input pattern)
    typeswitch (v) {                              (pattern matching statement)
      case a[f],(b[f] as x) : printf(x);
      case a[f],b[g] : …
      case (a[g]* | b[g]*)* as x : printf(x);
    }
Subtyping
• XML types are regular expressions which denote sets. Subtyping is set inclusion.

    import a[f|g],b[f|g] as v;
    typeswitch (v) {
      case a[f],(b[f] as x) : printf(x);
      case a[f],b[g] : …
      case (a[g]* | b[g]*)* as x : printf(x);
    }

Then v is an element of this set:
    { a[f],b[f]
      a[f],b[g]
      a[g],b[f]
      a[g],b[g] }
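For types without repetition, "subtyping is set inclusion" can be checked directly by enumerating the denoted sets. The sketch below does exactly that and nothing more; it is not the automaton-based check a real implementation would need for `*` or for recursive types. The tuple encoding and the helper names (`lab`, `seq`, `alt`, `language`, `is_subtype`) are hypothetical.

```python
def lab(label, content=("eps",)):
    return ("lab", label, content)

def seq(left, right):
    return ("seq", left, right)

def alt(left, right):
    return ("alt", left, right)

def language(t):
    """Set of values (tuples of (label, children) trees) denoted by type t."""
    kind = t[0]
    if kind == "eps":
        return {()}
    if kind == "lab":
        return {((t[1], inner),) for inner in language(t[2])}
    if kind == "seq":
        return {u + v for u in language(t[1]) for v in language(t[2])}
    if kind == "alt":
        return language(t[1]) | language(t[2])
    raise ValueError(f"unsupported type node: {kind}")

def is_subtype(s, t):
    """S is a subtype of T when every value of S is a value of T."""
    return language(s) <= language(t)

F, G = lab("f"), lab("g")
V = seq(lab("a", alt(F, G)), lab("b", alt(F, G)))   # a[f|g],b[f|g]
CASE2 = seq(lab("a", F), lab("b", G))               # a[f],b[g]

print(len(language(V)))        # number of values of the input type
print(is_subtype(CASE2, V))
print(is_subtype(V, CASE2))
```

The input type of the running example denotes four values, and the second case pattern denotes a subset of them, but not the other way round.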
PATTERN-MATCHING ALGORITHM
• (from Hosoya)
  • Turn the pattern into a tree automaton which matches this particular tree regexp
  • Annotate the tree automaton with binders
  • Check whether v is accepted by the automaton. If so, collect the binders.

    import a[f|g],b[f|g] as v
    typeswitch (v) {
      case a[f],(b[f] as x) : printf(x);            PAT1
      case a[f],b[g] : …
      case (a[g]* | b[g]*)* as x : printf(x);
    }

[Figure: tree automaton for PAT1, with transitions a[Y] and x:b[Y] in sequence, where Y accepts f]
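To make "check whether v is accepted and collect the binders" concrete, here is a small backtracking matcher. It is not the tree-automaton construction described on the slide; it only reproduces the observable behaviour on small inputs. Values are tuples of `(label, children)` trees, patterns are tuples, and all names are hypothetical. Bindings made under `*` are dropped, which is harmless if patterns are linear.

```python
def matches(p, vals, i=0):
    """Yield (j, env) for every way pattern p matches vals[i:j]."""
    kind = p[0]
    if kind == "eps":
        yield i, {}
    elif kind == "lab":                      # ("lab", label, P)
        if i < len(vals) and vals[i][0] == p[1]:
            children = vals[i][1]
            for j, env in matches(p[2], children):
                if j == len(children):       # content must match entirely
                    yield i + 1, env
    elif kind == "seq":
        for j, env1 in matches(p[1], vals, i):
            for k, env2 in matches(p[2], vals, j):
                yield k, {**env1, **env2}
    elif kind == "alt":
        yield from matches(p[1], vals, i)
        yield from matches(p[2], vals, i)
    elif kind == "star":
        yield i, {}
        for j, _ in matches(p[1], vals, i):
            if j > i:                        # skip empty iterations: terminates
                yield from matches(p, vals, j)
    elif kind == "as":                       # ("as", P, x): bind x to the slice
        for j, env in matches(p[1], vals, i):
            yield j, {**env, p[2]: vals[i:j]}
    else:
        raise ValueError(f"unknown pattern node: {kind}")

def match(p, vals):
    """Return the bindings of the first full match, or None."""
    for j, env in matches(p, vals):
        if j == len(vals):
            return env
    return None

f, g = ("f", ()), ("g", ())
a_f, b_f, b_g = ("a", (f,)), ("b", (f,)), ("b", (g,))
F = ("lab", "f", ("eps",))
# PAT1: a[f],(b[f] as x)
PAT1 = ("seq", ("lab", "a", F), ("as", ("lab", "b", F), "x"))

print(match(PAT1, (a_f, b_f)))   # binds x to the b[f] element
print(match(PAT1, (a_f, b_g)))   # no match
```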
IRREDUNDANCY
• (from Hosoya)
  • Take the full automaton of v
  • Subtract the previous pattern PAT1
  • Intersect with PAT2
  • If the result is empty, PAT2 is redundant
• (from Fabrizio)
  • The same, but using a correct subtract algorithm! ("difference algorithm")

    import a[f|g],b[f|g] as v                       PAT0
    typeswitch (v) {
      case a[f],(b[f] as x) : printf(x);            PAT1
      case a[f],b[g] : …                            PAT2
      case (a[g]* | b[g]*)* as x : printf(x);
    }

Hosoya:
    PAT0 = a[f|g],b[f|g] = { a[f],b[f]   a[f],b[g]   a[g],b[f]   a[g],b[g] }
    PAT1 = a[f],b[f]
    PAT0 \ PAT1 = a[g],b[g]
    PAT2 = a[f],b[g]
    (PAT0 \ PAT1) ∩ PAT2 = ∅ (empty)

Hosoya's articles say that pattern 2 is redundant! (It's not. And the XDuce code doesn't think so either.)
IRREDUNDANCY (with the corrected difference algorithm)

Fabrizio:
    PAT0 = a[f|g],b[f|g] = { a[f],b[f]   a[f],b[g]   a[g],b[f]   a[g],b[g] }
    PAT1 = a[f],b[f]
    PAT0 \ PAT1 = a[(f|g)\f],b[f|g] | a[f|g],b[(f|g)\f]
                = { a[f],b[g]   a[g],b[f]   a[g],b[g] }
    PAT2 = a[f],b[g]
    (PAT0 \ PAT1) ∩ PAT2 = a[f],b[g], which is not empty, so PAT2 is not redundant.
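The two subtraction results can be reproduced on the running example with plain sets. In this sketch a pair `(A, B)` stands for the type `a[A],b[B]`; `componentwise_diff` is a guess at a subtraction that yields the `a[g],b[g]` result attributed to Hosoya's articles on the slide, and `product_diff` follows the formula `a[A\C],b[B] | a[A],b[B\D]` shown for the corrected algorithm. All function names are hypothetical.

```python
from itertools import product

def expand(prods):
    """Set of (x, y) pairs denoted by a union of products a[A],b[B]."""
    return {xy for A, B in prods for xy in product(A, B)}

def componentwise_diff(p, q):
    """Subtract each component separately (gives the too-small result)."""
    (A, B), (C, D) = p, q
    return [(A - C, B - D)]

def product_diff(p, q):
    """a[A],b[B] \\ a[C],b[D] = a[A\\C],b[B] | a[A],b[B\\D]."""
    (A, B), (C, D) = p, q
    return [(A - C, B), (A, B - D)]

PAT0 = ({"f", "g"}, {"f", "g"})   # a[f|g],b[f|g]
PAT1 = ({"f"}, {"f"})             # a[f],b[f]
PAT2 = ({"f"}, {"g"})             # a[f],b[g]

naive = expand(componentwise_diff(PAT0, PAT1))
correct = expand(product_diff(PAT0, PAT1))

# PAT2 is redundant only if nothing it matches is left after PAT1.
print(naive & expand([PAT2]))     # empty: PAT2 wrongly reported as redundant
print(correct & expand([PAT2]))   # non-empty: PAT2 is not redundant
```

The corrected formula holds because a pair lies outside `C x D` exactly when its first component lies outside `C` or its second lies outside `D`.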
AMBIGUITY
• (from Hosoya)
  • Strong: the pattern is ambiguous whenever a value has more than one decomposition into the regexp. (Requires epsilon-transitions to be retained in the automaton, to retain ambiguity.)
• (from Fabrizio)
  • Weak: the pattern is ambiguous whenever a value has more than one way of traversing the labelled transitions. (No need for epsilon-transitions.)
  • Justification: by linearity, there were never going to be any binders under the * anyway.

    import a[f|g],b[f|g] as v
    typeswitch (v) {
      case a[f],(b[f] as x) : printf(x);
      case a[f],b[g] : …
      case (a[g]* | b[g]*)* as x : printf(x);       PAT3
    }

Hosoya: PAT3 is ambiguous, because a[g],a[g] can be matched by the inner * used once and the outer * used twice, or by the inner * used twice and the outer * used once.

Fabrizio: PAT3 is unambiguous, because the automaton for PAT3 has only one transition labelled a[g]. Therefore any way of matching the value a[g],a[g] always goes through this transition twice.
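The two decompositions of a[g],a[g] under the strong notion can be counted with a small enumerator. This sketch only illustrates the strong notion; it does not implement the weak check, which works on the automaton's labelled transitions. Each element such as `a[g]` is treated as an atomic symbol, iterations of `*` that consume nothing are not counted (otherwise every value would have infinitely many decompositions), and the names `ways` and `decompositions` are hypothetical.

```python
def ways(p, vals, i=0):
    """Yield an end index j once per distinct decomposition of vals[i:j] by p."""
    kind = p[0]
    if kind == "sym":                     # ("sym", "a[g]"): one atomic element
        if i < len(vals) and vals[i] == p[1]:
            yield i + 1
    elif kind == "alt":
        yield from ways(p[1], vals, i)
        yield from ways(p[2], vals, i)
    elif kind == "star":
        yield i                           # zero iterations
        for j in ways(p[1], vals, i):
            if j > i:                     # ignore iterations that consume nothing
                yield from ways(p, vals, j)
    else:
        raise ValueError(f"unknown pattern node: {kind}")

def decompositions(p, vals):
    """Number of ways p decomposes the whole sequence vals."""
    return sum(1 for j in ways(p, vals) if j == len(vals))

# PAT3: (a[g]* | b[g]*)*
PAT3 = ("star", ("alt", ("star", ("sym", "a[g]")), ("star", ("sym", "b[g]"))))

print(decompositions(PAT3, ("a[g]",)))
print(decompositions(PAT3, ("a[g]", "a[g]")))
```

For a[g],a[g] the count is two: one outer iteration with the inner `*` used twice, and two outer iterations with the inner `*` used once each.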
EXHAUSTIVITY
• (from Hosoya)
  • Take the full automaton of v
  • Subtract all patterns PAT1, PAT2, PAT3
  • If the result is empty, the typeswitch was exhaustive.
• (from Fabrizio)
  • This again suffers from Hosoya's subtract error.

    import a[f|g],b[f|g] as v                       PAT0
    typeswitch (v) {
      case a[f],(b[f] as x) : printf(x);            PAT1
      case a[f],b[g] : …                            PAT2
      case (a[g]* | b[g]*)* as x : printf(x);       PAT3
    }

    PAT0 = a[f|g],b[f|g] = { a[f],b[f]   a[f],b[g]   a[g],b[f]   a[g],b[g] }
    PAT1 = a[f],b[f]
    PAT2 = a[f],b[g]
    PAT3 = (a[g]* | b[g]*)* = { ε   a[g]   a[g],b[g]   … }
    PAT0 \ PAT1 \ PAT2 \ PAT3 = a[g],b[f]

Compile-time error: this typeswitch is not exhaustive.
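Because the input type of the running example is finite, the exhaustivity result can be cross-checked by brute force instead of automaton subtraction. The sketch below flattens each element to a two-letter token (`a[f]` becomes `af`) and uses Python's `re` module as the matcher. PAT3 is written as `(?:ag|bg)*`, which denotes the same language as `(a[g]* | b[g]*)*`. The encoding and the name `uncovered` are assumptions made for illustration.

```python
import re
from itertools import product

# PAT0: a[f|g],b[f|g], flattened so that a[f],b[g] becomes "afbg".
PAT0 = {"a" + x + "b" + y for x, y in product("fg", repeat=2)}

PATTERNS = [
    "afbf",         # PAT1: a[f],b[f]
    "afbg",         # PAT2: a[f],b[g]
    "(?:ag|bg)*",   # PAT3: same language as (a[g]* | b[g]*)*
]

def uncovered(values, patterns):
    """Values of the input type that no case pattern matches."""
    return {v for v in values
            if not any(re.fullmatch(p, v) for p in patterns)}

missing = uncovered(PAT0, PATTERNS)
print(missing)   # the typeswitch is exhaustive only if this is empty
```

The only value left over is `agbf`, that is a[g],b[f], which agrees with the compile-time error on the slide.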
Conclusions
• XDuce analysis and re-implementation:
  • static type checks
  • run-time pattern matching
• New weak ambiguity check algorithm
• Difference algorithm error detection & fix