Matchete: Paths through the Pattern Matching Jungle
Martin Hirzel, Nate Nystrom, Bard Bloom, Jan Vitek
PADL, 7+8 January 2008

What is Pattern Matching?
Examples:
• Switch in C/Java
• Exception handlers
• ML-style patterns
• Regular expressions
• XPath patterns
• Bit masks
Selection: if match, then execute handler
• E.g. is this a float? 22.341
Bindings: give names to parts
• E.g. integral part: 22, fractional part: 341

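Selection and bindings can be seen in plain Java with java.util.regex. This is a minimal sketch, not Matchete code; the class name FloatMatch, the method parts, and the simplified float syntax (digits, a dot, digits) are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FloatMatch {
    // Assumed, simplified float syntax: digits, a dot, digits.
    private static final Pattern FLOAT = Pattern.compile("([0-9]+)\\.([0-9]+)");

    /** Returns {integral part, fractional part}, or null if the input does not match. */
    static String[] parts(String input) {
        Matcher m = FLOAT.matcher(input);
        if (!m.matches()) {
            return null;                                  // selection: no match, no handler
        }
        return new String[] { m.group(1), m.group(2) };   // bindings: names for the parts
    }

    public static void main(String[] args) {
        String[] p = parts("22.341");
        System.out.println("integral part: " + p[0] + ", fractional part: " + p[1]);
    }
}
```
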
Example: Lists
-- list multiplication
mult([3, -1, 0, 4])
  = 3 * mult([-1, 0, 4])
  = 3 * -1 * mult([0, 4])
  = 3 * -1 * 0 * mult([4])
  = 3 * -1 * 0 * 4 * mult(nil)
  = 3 * -1 * 0 * 4 * 1
  = 0
-- list construction
cons(3, cons(-1, cons(0, cons(4, null)))) = [3, -1, 0, 4]

Matching Structured Terms
int mult(List ls) {
  match(ls) {
    cons~(0, _): return 0;
    cons~(int h, List t): return h * mult(t);
    null: return 1;
  }
  return 1;
}
• Selection: the case patterns pick the handler
• Bindings: int h, List t name the parts
• Central feature of ML, Haskell
• Hardly a jungle!

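For comparison, the same function without match needs explicit tests and field accesses. A minimal sketch in plain Java; the class IntList is hypothetical, and it models the empty list as null, as in the list construction example.

```java
// Hypothetical immutable list of ints; null stands for the empty list.
final class IntList {
    final int head;
    final IntList tail;

    IntList(int head, IntList tail) {
        this.head = head;
        this.tail = tail;
    }

    /** Product of all elements; stops early when it sees a 0. */
    static int mult(IntList ls) {
        if (ls == null) {
            return 1;                      // corresponds to the case null
        }
        if (ls.head == 0) {
            return 0;                      // corresponds to cons~(0, _)
        }
        return ls.head * mult(ls.tail);    // corresponds to cons~(int h, List t)
    }

    public static void main(String[] args) {
        IntList ls = new IntList(3, new IntList(-1, new IntList(0, new IntList(4, null))));
        System.out.println(mult(ls));
    }
}
```
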
Why Unify?
• Given list of strings: sue 10, bob 15, ann 11
• Given String variable: name
• Find name, extract int age
• Match list: deconstructor pattern cons~(…, …)
• Match string: nested RegExp /([a-z]+) ([0-9]+)/(name, int age)

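Without a unified pattern language, the lookup mixes list traversal, regular expression matching, and string-to-int conversion by hand. A minimal sketch in plain Java; AgeLookup and ageOf are hypothetical names, and a java.util.List stands in for the cons list.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AgeLookup {
    // Same regular expression as on the slide.
    private static final Pattern ENTRY = Pattern.compile("([a-z]+) ([0-9]+)");

    /** Finds the first entry whose name equals the given name and returns its age. */
    static OptionalInt ageOf(List<String> entries, String name) {
        for (String entry : entries) {                     // walk the list
            Matcher m = ENTRY.matcher(entry);              // match the string
            if (m.matches() && m.group(1).equals(name)) {  // compare with the value of name
                // Assumes the digits fit in an int; otherwise parseInt throws.
                return OptionalInt.of(Integer.parseInt(m.group(2)));
            }
        }
        return OptionalInt.empty();                        // no entry matched
    }
}
```
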
Matchete (Java Extension)
• Integrates pattern sublanguages
• Common set of primitive patterns
• Nesting composite patterns
• Simple uniform semantics

Composite Patterns
Name            Examples
Deconstructor   cons~(0, _)
Parameterized   re("([0-9]+)")~(int i)
Array           int[]{1, x, int y}
XPath           <bib/book>(NodeList n)
RegExp          /([a-z]) ([0-9]+)/(chr,int f)
BitLevel        [[(0x2cf9:16) 01 (int x:14)]]

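The BitLevel row can be compared with hand-written bit masks. The sketch below assumes one reading of the pattern that the slide does not spell out: a 32-bit word whose top 16 bits equal 0x2cf9, followed by the two literal bits 01, followed by 14 bits bound to x. The class and method names are hypothetical.

```java
import java.util.OptionalInt;

class BitMatch {
    /**
     * Hand-coded equivalent of [[(0x2cf9:16) 01 (int x:14)]] under the
     * assumed reading: fields are laid out from the most significant bit.
     */
    static OptionalInt match(int word) {
        if ((word >>> 16) != 0x2cf9) {
            return OptionalInt.empty();        // top 16 bits differ
        }
        if (((word >>> 14) & 0x3) != 0b01) {
            return OptionalInt.empty();        // next 2 bits are not 01
        }
        return OptionalInt.of(word & 0x3FFF);  // low 14 bits become x
    }
}
```
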
Deconstructor Definition
class List {
  private int head;                 // Fields
  private List tail;
  public List(int h, List t) {      // Constructor
    head = h; tail = t;
  }
  public cons~(int h, List t) {     // Deconstructor
    h = head; t = tail;
  }
}
• Match on receiver object
• Out parameters = subjects for nested patterns

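Plain Java has no out parameters, so a deconstructor can only be approximated. The sketch below is one possible emulation, not what the Matchete compiler generates: the names Cell, Parts, and startsWithZero are hypothetical, and a null receiver is treated as a failed match by assumption.

```java
// Plain-Java stand-in for the slide's List class.
final class Cell {
    private final int head;
    private final Cell tail;   // null marks the end of the list

    Cell(int head, Cell tail) {
        this.head = head;
        this.tail = tail;
    }

    /** Holder for the two out parameters of the deconstructor. */
    static final class Parts {
        final int h;
        final Cell t;

        Parts(int h, Cell t) {
            this.h = h;
            this.t = t;
        }
    }

    /** Emulates cons~(int h, List t): returns head and tail, or null if the match fails. */
    static Parts cons(Cell receiver) {
        if (receiver == null) {
            return null;
        }
        return new Parts(receiver.head, receiver.tail);
    }

    /** Emulates the nested pattern cons~(0, _). */
    static boolean startsWithZero(Cell ls) {
        Parts p = cons(ls);             // outer pattern on the receiver
        return p != null && p.h == 0;   // h is the subject of the nested pattern 0; t is ignored
    }
}
```
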
Nesting
Subject: list of strings sue 10, bob 15, ann 11
Pattern: cons~(/([a-z]+) ([0-9]+)/(name,int age),_)
Deconstructor cons
  RegExp ([a-z]+) ([0-9]+)
    Value name
    Binder int age
  Wildcard _

Subjects flow to children
Deconstructor cons (subject: sue 10, bob 15, ann 11)
  RegExp ([a-z]+) ([0-9]+) (subject: sue 10)
    Value name (subject: sue)
    Binder int age (subject: 10)
  Wildcard _ (subject: bob 15, ann 11)

Decisions and bindings flow to textual successor
Deconstructor cons → RegExp ([a-z]+) ([0-9]+) → Value name → Binder int age → Wildcard _ → Handler print(age)

Compilation
• Matchete source code
• Matchete compiler (built on Rats! parser generator)
• Generated Java source, debugging information
• Other Java source, runtime library
• Java compiler
• Postprocessor
• Java class files

Implemented Examples
• Balance red-black tree
• Process TCP/IP network packet
• Pretty-print XML bibliography
• … + smaller regression tests

Discussion: Typing
Matchete uses strong dynamic typing
• No runtime errors, just failed matches
• If Matchete compiler gives no error, then Java compiler gives no error either
Why not (more) static typing?
• Data formats mismatch
• Test bed for a new scripting language

Discussion: Integration
Choice              Example               Advantage
Loose integration   /a(b)c(d)/ (p,q)      Sublanguage reuse
Tight integration   /a(p:b)c(q:d)/        No need to count
No integration      re("a(b)c(d)")~(p,q)  Simpler language
Matchete chooses tight integration for BitLevel, loose integration for RegExp and XPath, no integration for XML as terms

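The "no need to count" advantage has a rough analogue in java.util.regex: numbered groups must be counted, while named groups carry their names inside the expression. A minimal sketch for comparison only; it is not Matchete syntax, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupStyles {
    /** Bindings by position, as in /a(b)c(d)/ (p,q): the reader counts parentheses. */
    static String[] byPosition(String s) {
        Matcher m = Pattern.compile("a(b)c(d)").matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    /** Bindings by name, closer in spirit to /a(p:b)c(q:d)/: names sit next to the groups. */
    static String[] byName(String s) {
        Matcher m = Pattern.compile("a(?<p>b)c(?<q>d)").matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("p"), m.group("q") };
    }
}
```
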
Related Work
• Structured terms
  • Algebraic types: ML, Haskell, …
  • Objects: Tom, OOMatch, JMatch, …
  • Letting users define patterns: F#, Scala
• Strings: Perl; SNOBOL
• Bit-level data: Erlang; DataScript; PADS
• XML:
  • As trees: XSLT, XJ (XPath)
  • As terms: XDuce, HydroJ, …

Conclusions
• Pattern matching applies to terms, strings, XML, and raw bits
• Matchete offers path to unification