Derive an executable representation of a SeQUEL (HSQL) query that fits a dynamic service-based environment, has well-defined semantics, preferably yields optimal execution, and is expressed in a stand-alone language.
Problem statement • Given a query in SeQUEL (resp. HSQL), derive an executable representation
Desirable properties • Fit to a dynamic service-based environment • Well-defined semantics • Suitable for implementation yet human readable • Preferably yields optimal execution • Not too costly to derive • Expressed in a stand-alone language
Overview (figure) • Elements: Data model, SeQUEL query, Algebraic expression, Query workflow, Service coordination • Arrow labels: uses, transform, build
Data model • Complex values: nested tuples, sets, and basic types • Restriction*: flat tuples of atomic complex values (≈relations) * In terms of full support currently
Algebra • Supported / Formalized (two operator lists) • Operators: Equi-join, Bind-join, Time-based window, Tuple-based window, Recursive selection, Recursive projection, Recursive renaming, Group, Ungroup, Union, Intersection, Set difference • Operators: Equi-join, Bind-join, Time-based window, Tuple-based window, Selection, Projection
SeQUEL (HSQL) • SELECT-FROM-WHERE clauses • + windows and function calls • Compatible with supported operators
Query workflow model • Formalized by a graph model • Equivalent to algebraic expressions: expression ↔ workflow ↔ function composition • Transformation formalized by construction rules
Service coordination • Workflow activities are bound to • Data services • Simple and composite computation services • Binding to data services is enabled by a registry • Binding to computation services is static
SeQUEL query → Join algebraic expression → Join workflow → Complete query workflow → Service coordination • Identify join attributes • Create hypergraph • Produce parse tree • Traverse parse tree
Data services • Service interface → n data operations • Binding-pattern representation • Example
Example query • Find friends who are no more than 3 km away from my current location (48.85889, 2.29583), considering their locations over the last 10 min, who are also over 21 years old and interested in art.
SELECT p.nickname, p.age, p.gender, p.email
FROM profile AS p, location [range 10] AS l, interests AS i
WHERE p.age >= 21
  AND l.nickname = p.nickname
  AND i.nickname = p.nickname
  AND i.tag = 'art'
  AND distance(lat, lon, 48.85889, 2.29583) <= 3.0;
Identify join attributes • Generate symbols for join attributes l.nickname = p.nickname AND i.nickname = p.nickname
Create a hypergraph • Vertices: X = {N, L, L’, A, G, E, T, S} • Hyperedges: E = {{N, L, L’}, {N, A, G, E}, {N, T, S}}
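The hypergraph of the running example can be written down directly: each relation becomes one hyperedge over its attribute symbols. The following is a minimal Python sketch, assuming the relation schemas shown on the join workflow slide (location(N L L’), profile(N A G E), interests(N T S)). The names `hyperedges`, `vertices`, and `shared` are illustrative and do not come from the slides.

```python
# One hyperedge per relation; N is the symbol generated for the join attribute.
hyperedges = {
    "location": frozenset({"N", "L", "L'"}),
    "profile": frozenset({"N", "A", "G", "E"}),
    "interests": frozenset({"N", "T", "S"}),
}

# The vertex set X is the union of all hyperedges.
vertices = frozenset().union(*hyperedges.values())


def shared(edge_a, edge_b):
    """Return the attribute symbols that two hyperedges have in common."""
    return hyperedges[edge_a] & hyperedges[edge_b]


print(sorted(vertices))
print(sorted(shared("location", "profile")))
```

Two hyperedges that share at least one symbol are candidates for a join; here every pair shares exactly N.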
Produce and traverse parse tree • Hyperedges consumed: NLL’, NAGE, NTS • Resulting join expression: ((NLL’⋈NAGE)⋈NTS) • Parse tree traversal algorithm
SeQUEL query → Join algebraic expression → Join workflow → Complete query workflow → Service coordination • Translate to postorder • Construct workflow by stack-based evaluation
Construct workflow by stack-based evaluation • Join expression: ((NLL’⋈NAGE)⋈NTS) • Postorder: NLL’ NAGE ⋈ NTS ⋈
Construct workflow by stack-based evaluation • Postorder input: NLL’ NAGE ⋈ NTS ⋈ • Stack after the first join: NLL’⋈NAGE • Resulting join workflow: location(N L L’) ⋈ profile(N A G E) ⋈ interests(N T S)
SeQUEL query → Join algebraic expression → Join workflow → Complete query workflow → Service coordination • Apply rules and heuristics
Rules for additional operators • Windows are placed immediately after data stream operations • Selections are pushed down • Function calls may involve multiple attributes • A single projection operation is added at the end • * Additional heuristics could be considered
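The rules above can be illustrated by a small plan builder that wraps each data operation with its window and selections, joins the results in the given order, and adds one projection at the end. This is a sketch only: the function `complete_workflow`, its parameters, and the textual plan notation are assumptions for illustration, and attaching the distance selection to location alone is an assumption as well, since function calls may involve attributes of several relations.

```python
def complete_workflow(join_order, windows, selections, projection):
    """Build a textual plan from a join order plus window/selection/projection info."""

    def leaf(name):
        expr = name
        if name in windows:
            # Rule: the window follows the data stream operation directly.
            expr = f"{expr}[{windows[name]}]"
        for sel in selections.get(name, []):
            # Rule: selections are pushed down below the joins.
            expr = f"σ_{sel}({expr})"
        return expr

    plan = leaf(join_order[0])
    for name in join_order[1:]:
        plan = f"({plan} ⋈ {leaf(name)})"
    # Rule: a single projection is added at the end.
    return f"π_{{{', '.join(projection)}}}({plan})"


plan = complete_workflow(
    join_order=["location", "profile", "interests"],
    windows={"location": "range 10"},
    selections={"location": ["dist"], "profile": ["age"], "interests": ["tag"]},
    projection=["nickname", "age", "gender", "email"],
)
print(plan)
```

The join order passed in is the one obtained from the join workflow; the builder does not reorder joins.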
Example (figure: complete query workflow) • Data operations: location, profile, interests • Query operators: [time win], σdist, σage, σtag, two ⋈, final π
SeQUEL query → Join algebraic expression → Join workflow → Complete query workflow → Service coordination • Bind workflow activities to services
Example (figure) • Workflow of data operations and query operators: location, profile, interests; [time win], σdist, σage, σtag, two ⋈, final π • Dynamic binding to data services and computation services
Service provisioning • Generic service instance: an interface with operation 1, operation 2, . . . , operation n, where operation i(in1, in2, ..) → output • Service types: on-demand data services, stream data services, simple computation services, composite computation services • Figure: the example workflow (π, σdist, [time win], σtag, σage, ⋈, ⋈ over location, profile, interests)
Data services • Stream data service: location, subscribe(dest, time) delivers t1=〈nick:Alice, coor:〈lat,lon〉〉, t2=〈nick:Bob, coor:〈lat,lon〉〉, . . . , tn=〈nick:Mike, coor:〈lat,lon〉〉 • On-demand data service: profile, profile(nick:’Bob’) returns out ={〈nick:Bob, age:23, sex:M,…〉}
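Tuples delivered by a stream data service such as location are consumed through a time-based window (location [range 10] in the example query). A minimal sketch of such a window follows. The class `TimeWindow`, its method names, the use of seconds as the time unit, and the half-open interval (now - range, now] are all assumptions of this sketch, and timestamps are assumed to arrive in nondecreasing order.

```python
from collections import deque


class TimeWindow:
    """Keep only the tuples whose timestamp lies in (now - range, now]."""

    def __init__(self, range_seconds):
        self.range = range_seconds
        self.buffer = deque()  # (timestamp, tuple) pairs, oldest first

    def insert(self, timestamp, tup):
        """Add a tuple and return the tuples that fell out of the window."""
        self.buffer.append((timestamp, tup))
        expired = []
        while self.buffer and self.buffer[0][0] <= timestamp - self.range:
            expired.append(self.buffer.popleft()[1])
        return expired

    def contents(self):
        """Return the tuples currently inside the window, oldest first."""
        return [tup for _, tup in self.buffer]


window = TimeWindow(range_seconds=600)  # 10 minutes
window.insert(0, {"nick": "Alice"})
window.insert(300, {"nick": "Bob"})
print(window.insert(600, {"nick": "Mike"}))
print(window.contents())
```

Returning the expired tuples lets a downstream operator retract results that depended on them, if it needs to.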
Simple computation service • σdist (σdistance) applied after [time win] • input(): tuple 〈nick:Bob, coor〉 and mycoor • Geo-distance service: out = distance(lat1, lon1, lat2, lon2) • Selection condition: out <= 3
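The slide does not say how the Geo-distance service computes its result. As one possible illustration, the sketch below uses the haversine great-circle formula, which is an assumption, as are the Earth radius constant, the helper `select_dist`, and the sample coordinates for Bob and Alice. Only the signature distance(lat1, lon1, lat2, lon2), the 3 km threshold, and the reference location come from the slides.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def select_dist(tuples, my_coor, max_km=3.0):
    """Yield the tuples whose 'coor' lies within max_km of my_coor."""
    for tup in tuples:
        lat, lon = tup["coor"]
        if distance(lat, lon, *my_coor) <= max_km:
            yield tup


my_coor = (48.85889, 2.29583)
stream = [
    {"nick": "Bob", "coor": (48.86, 2.30)},    # hypothetical nearby position
    {"nick": "Alice", "coor": (49.50, 2.30)},  # hypothetical distant position
]
print([t["nick"] for t in select_dist(stream, my_coor)])
```

The selection calls the distance computation once per input tuple, which mirrors a selection operator delegating to a simple computation service.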
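Composite computation service • Symmetric hash join service [⋈] • Inputs: inputOp1() and inputOp2() receive 〈inputTuple1〉 . . . 〈inputTuplen〉 (in the example, from σdist and profile) • Each input tuple is hashed into the index of its own side (Hash-index1, Hash-index2) and probed against the index of the other side • Outputs: 〈outTuple(s)1〉, 〈outTuple(s)2〉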
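The hash-and-probe behaviour of a symmetric hash join can be sketched in a few lines. This is a generic textbook-style sketch, not the service's actual implementation: the class `SymmetricHashJoin`, the `insert` method, and the dictionary tuple representation are assumptions, and handling of deletions or end-of-input is left out.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Equi-join of two streams on one key, producing results as tuples arrive."""

    def __init__(self, key):
        self.key = key
        # One hash index per input side.
        self.index = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, tup):
        """Hash the tuple into its own index, then probe the opposite index."""
        other = "right" if side == "left" else "left"
        k = tup[self.key]
        self.index[side][k].append(tup)
        matches = self.index[other].get(k, [])
        # Each match yields one joined output tuple.
        return [{**tup, **match} for match in matches]


join = SymmetricHashJoin(key="nick")
print(join.insert("left", {"nick": "Bob", "coor": (48.86, 2.30)}))
print(join.insert("right", {"nick": "Bob", "age": 23}))
```

Because both inputs are indexed, neither side has to be fully read before results appear, which suits stream inputs.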
Composite computation service • Workflow model based on Abstract State Machines (ASM) • Parallel and sequential composition, conditionals, and iteration • Figure (state diagram): states uninitialized, active, terminated; activities createHIs, insert_tuple, delete_tuple, probeLeft, probeRight, destroyHIs; conditions (Y/N) leftInput, rightInput, negTuple, match, endofInput
Service interoperation and communication • Figure: computation services A, B, and C and their service instance(s), exchanging tuples through the input operations inputOp1() and inputOp2() • Tuple labels: 〈inputTupleA〉, 〈outTupleA〉, 〈inputTupleB1〉, 〈inputTupleB2〉, 〈outTupleB〉
Example query
SELECT *
FROM ABC, BF, BCD, DEG, CDE, BH
WHERE ABC.B1 = BCD.B3 and ABC.C1 = BCD.C3
  and BCD.B3 = BF.B2
  and BCD.C3 = CDE.C5 and BCD.D3 = CDE.D5
  and DEG.D4 = CDE.D5 and DEG.E4 = CDE.E5
  and BCD.B3 = BH.B6;
Attribute symbol generation
• Join predicates: ABC.B1 = BCD.B3, ABC.C1 = BCD.C3, BCD.B3 = BF.B2, BCD.C3 = CDE.C5, BCD.D3 = CDE.D5, DEG.D4 = CDE.D5, DEG.E4 = CDE.E5, BCD.B3 = BH.B6
• ABC(A1!, B1!, C1!) → ABC(A1!, B!, C!)
• BF(B2?, F2!) → BF(B?, F2!)
• BCD(B3?, C3?, D3!) → BCD(B?, C?, D!)
• DEG(D4!, E4!, G4!) → DEG(D!, E!, G4!)
• CDE(C5!, D5!, E5!) → CDE(C!, D!, E!)
• BH(B6!, H6!) → BH(B!, H6!)
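One way to obtain these symbols is to treat each join predicate as an equality between two qualified attributes and to give every equivalence class a single symbol. The sketch below does this with a small union-find structure. The function `generate_symbols`, the choice of union-find, and the naming rule (attribute name with its trailing index removed) are assumptions made for illustration, and the ! and ? annotations are left out.

```python
import re


def generate_symbols(schemas, predicates):
    """Replace attributes that take part in joins by one symbol per equality class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for left, right in predicates:
        parent[find(left)] = find(right)  # merge the two equality classes

    result = {}
    for relation, attributes in schemas.items():
        renamed = []
        for attr in attributes:
            qualified = f"{relation}.{attr}"
            if qualified in parent:
                # Join attribute: name the class after its representative, index removed.
                representative = find(qualified).split(".")[1]
                renamed.append(re.sub(r"\d+$", "", representative))
            else:
                renamed.append(attr)  # non-join attributes keep their name
        result[relation] = renamed
    return result


schemas = {
    "ABC": ["A1", "B1", "C1"], "BF": ["B2", "F2"], "BCD": ["B3", "C3", "D3"],
    "DEG": ["D4", "E4", "G4"], "CDE": ["C5", "D5", "E5"], "BH": ["B6", "H6"],
}
predicates = [
    ("ABC.B1", "BCD.B3"), ("ABC.C1", "BCD.C3"), ("BCD.B3", "BF.B2"),
    ("BCD.C3", "CDE.C5"), ("BCD.D3", "CDE.D5"), ("DEG.D4", "CDE.D5"),
    ("DEG.E4", "CDE.E5"), ("BCD.B3", "BH.B6"),
]
symbols = generate_symbols(schemas, predicates)
print(symbols)
```

After renaming, relations that must be joined share a symbol, so the renamed schemas can serve directly as hyperedges.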
Join dependencies as a hypergraph
• Vertices: A1, B, C, D, E, F2, G4, H6
• Hyperedges: ABC(A1!, B!, C!), BF(B?, F2!), BCD(B?, C?, D!), DEG(D!, E!, G4!), CDE(C!, D!, E!), BH(B!, H6!)
• Join predicates: ABC.B1 = BCD.B3, ABC.C1 = BCD.C3, BCD.B3 = BF.B2, BCD.C3 = CDE.C5, BCD.D3 = CDE.D5, DEG.D4 = CDE.D5, DEG.E4 = CDE.E5, BCD.B3 = BH.B6
Parse tree construction
• Hyperedges: DEG(D!, E!, G4!), CDE(C!, D!, E!), BCD(B?, C?, D!), ABC(A1!, B!, C!), BF(B?, F2!), BH(B!, H6!)
• DEG (D!E!G4!): root; result attributes A1!B!C!D!E!F2!G4!H6!
• CDE (C!D!E!): child of DEG, SHARED:{D,E}; result attributes A1!B!C!D!E!F2!H6!
• BCD (B?C?D!): child of CDE, SHARED:{C,D}; result attributes A1!B!C!D!, then A1!B!C!D!F2!, then A1!B!C!D!F2!H6!
• ABC (A1!B!C!): child of BCD, SHARED:{B,C}
• BF (B?F2!): child of BCD, SHARED:{B}
• BH (B!H6!): child of BCD, SHARED:{B}
• ISOLATED labels in the figure: {B}, {C}, {F2}, {A1}, {H6}
Join expression construction
• Each node denotes a join expression
• Leaves: ABC (A1!B!C!), BF (B?F2!), BH (B!H6!)
• Intermediate results: A1!B!C!D!F2!H6! after joining BCD with ABC, BF, and BH; A1!B!C!D!E!F2!H6! after joining CDE; A1!B!C!D!E!F2!G4!H6! after joining DEG
• Root: ( DEG ⋈ ( CDE ⋈ ( ( ( BCD ⋈ ABC ) ⋈ BF ) ⋈ BH ) ) )
Join expression construction • Inorder expression: ( DEG ⋈ ( CDE ⋈ ( ( ( BCD ⋈ ABC ) ⋈ BF ) ⋈ BH ) ) ) • Postorder expression: DEG CDE BCD ABC ⋈ BF ⋈ BH ⋈ ⋈ ⋈ • Process the postorder expression using a stack to generate the join workflow
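Processing the postorder expression with a stack can be sketched as follows: operands are pushed, and each ⋈ pops two entries and pushes their join. The function `build_join_workflow` and the string representation of a join are assumptions for illustration; an actual workflow generator would create workflow activities instead of strings.

```python
def build_join_workflow(postorder):
    """Evaluate a postorder join expression; return the root and the joins in creation order."""
    stack, steps = [], []
    for token in postorder:
        if token == "⋈":
            right = stack.pop()  # the most recently pushed entry is the right operand
            left = stack.pop()
            joined = f"({left}⋈{right})"
            steps.append(joined)
            stack.append(joined)
        else:
            stack.append(token)  # data operation: push as operand
    if len(stack) != 1:
        raise ValueError("malformed postorder expression")
    return stack[0], steps


root, steps = build_join_workflow("DEG CDE BCD ABC ⋈ BF ⋈ BH ⋈ ⋈ ⋈".split())
for step in steps:
    print(step)
```

The joins are created bottom-up, so the first step is the innermost join and the last step is the root of the join workflow.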
Workflow generation
• Postorder input: DEG CDE BCD ABC ⋈ BF ⋈ BH ⋈ ⋈ ⋈
• Joins generated, in order:
• BCD⋈ABC
• (BCD⋈ABC)⋈BF
• ( ( BCD⋈ABC )⋈BF )⋈BH
• CDE⋈( ( ( BCD⋈ABC )⋈BF )⋈BH )
• ( DEG⋈( CDE⋈( ( ( BCD⋈ABC )⋈BF )⋈BH ) ) )
Join expression construction
createJoinTree( Node node ) {
  if ( node.children() = ∅ ) return node;
  else {
    Queue queue := new Queue();
    foreach( child ∈ node.children() ) queue.add(child);
    while( ¬ queue.isEmpty() ) {
      Node leftNode := node;
      Node rightNode := createJoinTree(queue.remove());
      Node joinedNode := joinNodes(leftNode, rightNode);
      node := joinedNode;
    }
    return node;
  }
}
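A runnable Python rendering of the pseudocode above is given below as a sketch. The `Node` dataclass, the `operands` field, the string labels, and the `postorder` helper are assumptions added so that the example is self-contained; `join_nodes` stands in for joinNodes, whose definition is not shown on the slide.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # parse tree children
    operands: tuple = ()                          # (left, right) for join nodes


def join_nodes(left, right):
    """Create a join node over two (possibly already joined) nodes."""
    return Node(f"({left.label}⋈{right.label})", operands=(left, right))


def create_join_tree(node):
    """Join a parse tree node with the join trees of its children, left to right."""
    if not node.children:
        return node
    queue = deque(node.children)
    current = node
    while queue:
        right = create_join_tree(queue.popleft())
        current = join_nodes(current, right)
    return current


def postorder(node):
    """List the operands and ⋈ operators of a join tree in postorder."""
    if not node.operands:
        return [node.label]
    left, right = node.operands
    return postorder(left) + postorder(right) + ["⋈"]


tree = Node("DEG", [Node("CDE", [Node("BCD", [Node("ABC"), Node("BF"), Node("BH")])])])
root = create_join_tree(tree)
print(root.label)
print(" ".join(postorder(root)))
```

Applied to the parse tree of the six-relation example, the sketch yields the same inorder and postorder expressions as shown on the join expression construction slides.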