
PSoup

PSoup. Kevin Menard, CS 561, 4/11/2005. Slides are modified versions of the following original presentation: "Streaming Queries over Streaming Data", Sirish Chandrasekaran (UC Berkeley) with Michael J. Franklin, August 20, 2002, VLDB 2002.


Presentation Transcript


  1. PSoup. Kevin Menard, CS 561, 4/11/2005

  2. Slides are modified versions of the following original presentation: "Streaming Queries over Streaming Data", Sirish Chandrasekaran (UC Berkeley) with Michael J. Franklin, August 20, 2002, VLDB 2002

  3. PSoup Insight #1
     • Queries and data are duals
       • Store new queries, apply to data that arrived earlier
       • Store new data, apply to queries that arrived earlier
     • Multiquery processing = "join" of query and data
     • Supports all three types of queries: queries over the past, continuous queries (landmark and sliding window), and hybrid queries
     [Diagram labels: Query, Result; Index, Data; Index, Queries]

  4. PSoup Insight #1
     • Queries and data are duals
       • Store new queries, apply to data that arrived earlier
       • Store new data, apply to queries that arrived earlier
     • Multiquery processing = "join" of query and data
     • Supports all three types of queries: queries over the past, continuous queries (landmark and sliding window), and hybrid queries
     [Diagram labels: Data, Result; Index, Data; Index, Queries]

  5. Motivation?
     • Why another model for continuous queries?
     • What is wrong with how Aurora and STREAM supply responses?

  6. Motivation: Disconnected Operation
     • Previous solutions stream out answers immediately
       • Not feasible or suitable for all applications
     • Intermittent connectivity: e.g., applications on hand-held devices (as in this morning’s keynote address)
     • Even if connected: not always interested in streaming answers

  7. PSoup Insight #2
     • Separate computation from delivery
       • Query answers continuously generated in background
       • Apply windows on demand to transmit "current" results
     • Efficient support for disconnected operation
     • Low response time; shared computation and storage across invocations
     [Diagram: a Query is registered and later invoked; a Queries table (ID, Predicate), a Data table (ID, R.a, R.b), and a Results Structure of T/F entries indexed by Data and Queries]

  8. PSoup Query Model
       SELECT select_list
       FROM from_list
       WHERE where_clause
       BEGIN begin_time
       END end_time
     • WHERE clause: conjunction of boolean factors
     • BEGIN-END clause: system clock or sequence numbers
     • (begin_time, end_time):
       • (constant, constant): snapshot query
       • (constant, variable): landmark window query
       • (variable, variable): sliding window query
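
The three (begin_time, end_time) forms on the slide above can be illustrated with a small sketch. The `Now` class, the `classify` and `resolve` functions, and the choice to model a variable bound as "current time minus an offset" are assumptions made for illustration; the slides do not specify how variable bounds are written.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Now:
    """Hypothetical variable bound: the current time minus a fixed offset."""
    offset: int = 0


# An int models a constant bound, a Now instance models a variable bound.
Bound = Union[int, Now]


def classify(begin: Bound, end: Bound) -> str:
    """Name the query type according to the slide's three cases."""
    begin_var = isinstance(begin, Now)
    end_var = isinstance(end, Now)
    if not begin_var and not end_var:
        return "snapshot"
    if not begin_var and end_var:
        return "landmark window"
    if begin_var and end_var:
        return "sliding window"
    raise ValueError("(variable, constant) is not one of the listed forms")


def resolve(bound: Bound, now: int) -> int:
    """Turn a bound into a concrete time (or sequence number) at invocation."""
    return bound if isinstance(bound, int) else now - bound.offset


print(classify(10, 20))            # snapshot
print(classify(10, Now()))         # landmark window
print(classify(Now(100), Now()))   # sliding window
```

Because variable bounds are only resolved when `resolve` is called, the same registered query yields a different window each time it is invoked.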

  9. Query Registration
       SELECT select_list FROM from_list WHERE where_clause   } Standing Query Clause (SQC): to the Symmetric Join
       BEGIN begin_time END end_time                          } to the Windows_Table
     • QueryID: handle for future query invocations

  10. Selections over Single Stream: Arrival of New Query Specification
      (a) Initial State

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8
      23  R.a=4 and R.b=3          51  0    0
                                   52  8    4

  11. Selections over Single Stream: Arrival of New Query Specification
      (b) Arrival of new Query

      New query: Select * From R Where R.a<=4 and R.b>=3

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8
      23  R.a=4 and R.b=3          51  0    0
                                   52  8    4

  12. Selections over Single Stream: Arrival of New Query Specification
      (c) Building Query Store

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8
      23  R.a=4 and R.b=3          51  0    0
      24  R.a<=4 and R.b>=3        52  8    4
      (BUILD: query 24 added)

  13. Selections over Single Stream: Arrival of New Query Specification
      (d) Probing Data Store

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3    match
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8    match
      23  R.a=4 and R.b=3          51  0    0
      24  R.a<=4 and R.b>=3        52  8    4
      (PROBE: query 24 probes the Data Store)

  14. Selections over Single Stream: Arrival of New Query Specification
      (e) Inserting Results

      Matching data tuples (ID, R.a, R.b): 48 (4, 3) and 50 (3, 8)

      Results Structure (rows: Data, columns: Queries)
      Data  20  21  22  23  24
      48                    ?
      49                    ?
      50                    ?
      51                    ?
      52                    ?

  15. Selections over Single Stream: Arrival of New Query Specification
      (e) Inserting Results

      Matching data tuples (ID, R.a, R.b): 48 (4, 3) and 50 (3, 8)

      Results Structure (rows: Data, columns: Queries)
      Data  20  21  22  23  24
      48                    T
      49                    F
      50                    T
      51                    F
      52                    F
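
The build-then-probe sequence for a new query can be sketched as follows, using the stores from the slides above. The dictionary layout, the lambda encoding of predicates, and the name `register_query` are assumptions for illustration and are not taken from the PSoup implementation.

```python
# Data Store from the slides: ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}

# Query Store from the slides: ID -> predicate over (R.a, R.b)
query_store = {
    20: lambda a, b: 0 < a <= 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,  # transcribed exactly as shown on the slide
    23: lambda a, b: a == 4 and b == 3,
}

# Results Structure: (data ID, query ID) -> True/False
results = {}


def register_query(query_id, predicate):
    """BUILD the new query into the Query Store, then PROBE the Data Store."""
    query_store[query_id] = predicate
    for data_id, (a, b) in data_store.items():
        results[(data_id, query_id)] = predicate(a, b)


# New query from the slides: Select * From R Where R.a<=4 and R.b>=3
register_query(24, lambda a, b: a <= 4 and b >= 3)

column_24 = {d: results[(d, 24)] for d in sorted(data_store)}
print(column_24)
# {48: True, 49: False, 50: True, 51: False, 52: False}
```

The printed column agrees with the T/F entries the slide shows for query 24. A linear scan stands in for the index probe here; the slides describe indexes over both stores.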

  16. Selections over Single Stream: Arrival of New Data
      (a) Initial State

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8
      23  R.a=4 and R.b=3          51  0    0
      24  R.a<=4 and R.b>=3        52  8    4

  17. Selections over Single Stream: Arrival of New Data
      (b) Arrival of new Data

      New data (ID, R.a, R.b): 53, 3, 6

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8
      23  R.a=4 and R.b=3          51  0    0
      24  R.a<=4 and R.b>=3        52  8    4

  18. Selections over Single Stream: Arrival of New Data
      (c) Building Data Store

      Query Store                  Data Store
      ID  Predicate                ID  R.a  R.b
      20  0<R.a<=5                 48  4    3
      21  R.a>4 and R.b=3          49  7    3
      22  0>R.b>4                  50  3    8
      23  R.a=4 and R.b=3          51  0    0
      24  R.a<=4 and R.b>=3        52  8    4
                                   53  3    6
      (BUILD: tuple 53 added)

  19. Selections over Single Stream: Arrival of New Data
      (d) Probing Query Store

      Query Store                         Data Store
      ID  Predicate                       ID  R.a  R.b
      20  0<R.a<=5             match      48  4    3
      21  R.a>4 and R.b=3                 49  7    3
      22  0>R.b>4                         50  3    8
      23  R.a=4 and R.b=3                 51  0    0
      24  R.a<=4 and R.b>=3    match      52  8    4
                                          53  3    6
      (PROBE: tuple 53 probes the Query Store)

  20. Selections over Single Stream: Arrival of New Data
      (e) Inserting Results

      Matching queries: 20 (0<R.a<=5) and 24 (R.a<=4 and R.b>=3)

      Results Structure (rows: Data, columns: Queries)
      Data  20  21  22  23  24
      48
      49
      50
      51
      52
      53    ?   ?   ?   ?   ?

  21. Selections over Single Stream: Arrival of New Data
      (e) Inserting Results

      Matching queries: 20 (0<R.a<=5) and 24 (R.a<=4 and R.b>=3)

      Results Structure (rows: Data, columns: Queries)
      Data  20  21  22  23  24
      48
      49
      50
      51
      52
      53    T   F   F   F   T
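
The symmetric case, a new tuple arriving after the queries, can be sketched the same way: the tuple is built into the Data Store and then probes the Query Store. As before, the dictionary layout and the name `insert_tuple` are illustrative assumptions.

```python
# Data Store before the new tuple arrives: ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}

# Query Store from the slides, including query 24
query_store = {
    20: lambda a, b: 0 < a <= 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,  # transcribed exactly as shown on the slide
    23: lambda a, b: a == 4 and b == 3,
    24: lambda a, b: a <= 4 and b >= 3,
}

# Results Structure: (data ID, query ID) -> True/False
results = {}


def insert_tuple(data_id, a, b):
    """BUILD the new tuple into the Data Store, then PROBE the Query Store."""
    data_store[data_id] = (a, b)
    for query_id, predicate in query_store.items():
        results[(data_id, query_id)] = predicate(a, b)


# New data from the slides: ID 53, R.a = 3, R.b = 6
insert_tuple(53, 3, 6)

row_53 = [results[(53, q)] for q in sorted(query_store)]
print(row_53)
# [True, False, False, False, True]
```

The printed row agrees with the "T F F F T" entries the slide shows for tuple 53. The function has the same shape as query registration with the roles of the two stores swapped, which is the duality stated in Insight #1.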

  22. Query Invocation
      • System returns the results corresponding to the current value of the BEGIN-END clause
          BEGIN begin_time
          END end_time

      Results Structure (rows: Data, columns: Queries)
      Data  20  21  22  23  24
      48                    T
      49                    F
      50                    T
      51                    F
      52                    F
      53    T   F   F   F   T
      [A brace labeled "Current Window" marks a range of Data rows.]
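
Invocation can be sketched as a lookup over the already-computed Results Structure, restricted to the current window. The sketch assumes that windows are expressed over sequence numbers and that the data IDs serve as those sequence numbers; the function name `invoke` and the two example windows are hypothetical.

```python
# Column of the Results Structure for query 24, as shown on the slide:
# data ID -> whether the tuple satisfies the query
results_24 = {48: True, 49: False, 50: True, 51: False, 52: False, 53: True}


def invoke(results_column, begin, end):
    """Return the IDs of matching tuples inside the window [begin, end].

    No predicate is evaluated here: the answers were computed in the
    background when queries and data arrived.
    """
    return [
        data_id
        for data_id in sorted(results_column)
        if begin <= data_id <= end and results_column[data_id]
    ]


# Hypothetical sliding window over the last three tuples, invoked at 53
print(invoke(results_24, 51, 53))   # [53]

# Hypothetical landmark window starting at 48, invoked at 53
print(invoke(results_24, 48, 53))   # [48, 50, 53]
```

Keeping the window out of the stored results is what lets the same column serve repeated invocations with different BEGIN-END values.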

  23. Joins over R and S: Arrival of New Query Specification
      (a) Initial State

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5
      25  3    3            14  3    3
      36  4    4            31  4    1
      49  5    5            48  9    7

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2

  24. Joins over R and S: Arrival of New Query Specification
      (b) Arrival of new Query

      New query: 23  R.a<5 and R.a>S.a and S.b>1

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5
      25  3    3            14  3    3
      36  4    4            31  4    1
      49  5    5            48  9    7

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2

  25. Joins over R and S: Arrival of New Query Specification
      (c) Building Query Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5
      25  3    3            14  3    3
      36  4    4            31  4    1
      49  5    5            48  9    7

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<5 and R.a>S.a and S.b>1
      (BUILD: query 23 added)

  26. Joins over R and S: Arrival of New Query Specification
      (d) Probing R-Data Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5    match
      25  3    3            14  3    3    match
      36  4    4            31  4    1    match
      49  5    5            48  9    7

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<5 and R.a>S.a and S.b>1
      (PROBE: query 23 probes the R-Data Store)

  27. Joins over R and S: Arrival of New Query Specification
      (e) Constructing Hybrid Structs

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5    match
      25  3    3            14  3    3    match
      36  4    4            31  4    1    match
      49  5    5            48  9    7

      Hybrid Structs
      R.ID  Q.ID  Q.Predicate
      10    23    2>S.a and S.b>1
      14    23    3>S.a and S.b>1
      31    23    4>S.a and S.b>1

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<5 and R.a>S.a and S.b>1

  28. Joins over R and S: Arrival of New Query Specification
      (f) Probing S-Data Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5
      25  3    3            14  3    3
      36  4    4            31  4    1
      49  5    5            48  9    7

      Hybrid Structs
      R.ID  Q.ID  Q.Predicate
      10    23    2>S.a and S.b>1
      14    23    3>S.a and S.b>1
      31    23    4>S.a and S.b>1
      (PROBE: hybrid structs probe the S-Data Store)

      Results (R, S, Q) for the matches
      ?
      ?
      ?

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<5 and R.a>S.a and S.b>1

  29. Joins over R and S: Arrival of New Query Specification
      (f) Probing S-Data Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      21  2    2            10  2    5
      25  3    3            14  3    3
      36  4    4            31  4    1
      49  5    5            48  9    7

      Hybrid Structs
      R.ID  Q.ID  Q.Predicate
      10    23    2>S.a and S.b>1
      14    23    3>S.a and S.b>1
      31    23    4>S.a and S.b>1
      (PROBE: hybrid structs probe the S-Data Store)

      Results (R, S, Q) for the matches
      14, 21, 23
      31, 21, 23
      31, 25, 23

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<5 and R.a>S.a and S.b>1
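
The join walkthrough for a new query can be sketched in two probes: the R-only part of query 23 probes the R-Data Store, each match is combined with the rest of the predicate into a hybrid struct, and the hybrid structs then probe the S-Data Store. Splitting the predicate by hand into `q23_r_only` and `q23_residual` is an assumption made to keep the sketch short; both names are hypothetical.

```python
# Stores from the slides: ID -> (a, b)
r_store = {10: (2, 5), 14: (3, 3), 31: (4, 1), 48: (9, 7)}
s_store = {21: (2, 2), 25: (3, 3), 36: (4, 4), 49: (5, 5)}


# Query 23: R.a<5 and R.a>S.a and S.b>1, split by hand for illustration.
def q23_r_only(ra, rb):
    """Boolean factor that mentions only R: R.a<5."""
    return ra < 5


def q23_residual(ra, rb):
    """Bind an R tuple, leaving a predicate over S: R.a>S.a and S.b>1."""
    return lambda sa, sb: ra > sa and sb > 1


# Probe the R-Data Store and build one hybrid struct per matching R tuple:
# (R.ID, Q.ID, predicate over S with the R values substituted).
hybrids = [
    (r_id, 23, q23_residual(ra, rb))
    for r_id, (ra, rb) in r_store.items()
    if q23_r_only(ra, rb)
]

# Probe the S-Data Store with every hybrid struct.
join_results = [
    (r_id, s_id, q_id)
    for r_id, q_id, s_predicate in hybrids
    for s_id, (sa, sb) in s_store.items()
    if s_predicate(sa, sb)
]

print([r_id for r_id, _, _ in hybrids])   # [10, 14, 31]
print(join_results)
# [(14, 21, 23), (31, 21, 23), (31, 25, 23)]
```

The three (R, S, Q) triples agree with the Results column on the slide. Tuple 10 produces a hybrid struct but no result, since no S tuple has S.a below 2.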

  30. Joins over R and S: Arrival of New Data
      (a) Initial State

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<4 and R.b<S.b

  31. Joins over R and S: Arrival of New Data
      (b) Arrival of new Data

      New data (ID, R.a, R.b): 53, 5, 4

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<4 and R.b<S.b

  32. Joins over R and S: Arrival of New Data
      (c) Building R-Data Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8
                            53  5    4
      (BUILD: tuple 53 added to the R-Data Store)

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<4 and R.b<S.b

  33. Joins over R and S: Arrival of New Data
      (c) Probing Query Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8
                            53  5    4

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b                 match
      21  R.a>4 and R.b<S.b and S.a<10      match
      22  R.b=4 and R.a+5>S.a and S.b>2     match
      23  R.a<4 and R.b<S.b
      (PROBE: tuple 53 probes the Query Store)

  34. Joins over R and S: Arrival of New Data
      (d) Constructing Hybrid Structs

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8
                            53  5    4

      Hybrid Structs (partially filled in on this slide)
      R.ID  Q.ID  Q.Predicate
      ?     ?     4<S.b
      53    21    ?
      53    22    ?

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b                 match
      21  R.a>4 and R.b<S.b and S.a<10      match
      22  R.b=4 and R.a+5>S.a and S.b>2     match
      23  R.a<4 and R.b<S.b

  35. Joins over R and S: Arrival of New Data
      (d) Constructing Hybrid Structs

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8
                            53  5    4

      Hybrid Structs
      R.ID  Q.ID  Q.Predicate
      53    20    4<S.b
      53    21    4<S.b and S.a<10
      53    22    10>S.a and S.b>2

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b                 match
      21  R.a>4 and R.b<S.b and S.a<10      match
      22  R.b=4 and R.a+5>S.a and S.b>2     match
      23  R.a<4 and R.b<S.b

  36. Joins over R and S: Arrival of New Data
      (e) Probing S-Data Store

      S-Data Store          R-Data Store
      ID  S.a  S.b          ID  R.a  R.b
      48  4    4            47  4    3
      49  5    3            50  5    3
      52  3    2            51  3    8
                            53  5    4

      Hybrid Structs
      R.ID  Q.ID  Q.Predicate
      53    20    4<S.b
      53    21    4<S.b and S.a<10
      53    22    10>S.a and S.b>2
      (PROBE: hybrid structs probe the S-Data Store)

      Results (R, S, Q) for the matches
      53, 48, 22
      53, 49, 22

      Query Store
      ID  Predicate
      20  R.a=5 and R.b<S.b
      21  R.a>4 and R.b<S.b and S.a<10
      22  R.b=4 and R.a+5>S.a and S.b>2
      23  R.a<4 and R.b<S.b
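
The arrival of a new R tuple in the join case can be sketched in the same style: the tuple probes the Query Store, each matching query yields a hybrid struct holding the remaining predicate over S, and those structs probe the S-Data Store. Storing each query as a hand-split pair (R-only test, residual builder) and the name `insert_r` are assumptions for illustration.

```python
# Stores from the slides: ID -> (a, b)
s_store = {48: (4, 4), 49: (5, 3), 52: (3, 2)}
r_store = {47: (4, 3), 50: (5, 3), 51: (3, 8)}

# Query Store: ID -> (R-only test, function binding R to a predicate over S).
# The split of each predicate is done by hand here.
queries = {
    20: (lambda ra, rb: ra == 5,
         lambda ra, rb: lambda sa, sb: rb < sb),
    21: (lambda ra, rb: ra > 4,
         lambda ra, rb: lambda sa, sb: rb < sb and sa < 10),
    22: (lambda ra, rb: rb == 4,
         lambda ra, rb: lambda sa, sb: ra + 5 > sa and sb > 2),
    23: (lambda ra, rb: ra < 4,
         lambda ra, rb: lambda sa, sb: rb < sb),
}


def insert_r(r_id, ra, rb):
    """BUILD into the R-Data Store, PROBE queries, then PROBE the S-Data Store."""
    r_store[r_id] = (ra, rb)
    # One hybrid struct per query whose R-only part accepts the new tuple.
    hybrids = [
        (r_id, q_id, residual(ra, rb))
        for q_id, (r_only, residual) in queries.items()
        if r_only(ra, rb)
    ]
    matches = [
        (h_r, s_id, h_q)
        for h_r, h_q, s_predicate in hybrids
        for s_id, (sa, sb) in s_store.items()
        if s_predicate(sa, sb)
    ]
    return [q_id for _, q_id, _ in hybrids], matches


hybrid_queries, join_results = insert_r(53, 5, 4)
print(hybrid_queries)   # [20, 21, 22]
print(join_results)     # [(53, 48, 22), (53, 49, 22)]
```

Queries 20 and 21 reduce to a predicate requiring 4<S.b, which no S tuple satisfies, so only query 22 contributes results, as on the slide.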

  37. Other Queries
      • N-way Joins
        • Similar to 2-way joins
        • Probe, generate hybrid structs, repeat
        • Can be executed without intermediate tables
      • Aggregations
        • Performed at query invocation
        • Uses n-ary ranked tree, clustered on time

  38. Telegraph Background: CACQ
      • CACQ [MSHR02]
        • Shared execution of multiple queries with one Eddy
        • Tuple lineage
        • Query indices
      • Queries and data treated very differently
      • Only landmark continuous queries
      • No support for disconnected operation

  39. PSoup in Telegraph
      • Leverage SteMs to store and index queries
      • Changes to Eddies
      • Encode queries as tuples
        • Break WHERE clause into individual boolean factors (BF)
        • Encode each BF as R.a relop [R.b|S.b] [+|-] constant
      • Stream Prefix Consistency
        • A new query or data tuple is completely processed before any other tuple: no holes in the Results Structure
      • Results Structure: buffers the results
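
One way to picture "queries as tuples" is a record per boolean factor with the fields named on the slide: a left attribute, a relational operator, an optional right attribute, and a signed constant. The `BooleanFactor` class, its field names, and the dictionary-based row format are assumptions for illustration and do not reflect the actual Telegraph encoding.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Optional

RELOPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


@dataclass(frozen=True)
class BooleanFactor:
    """One factor of the form: left relop [right] [+|-] constant."""
    left: str              # attribute name, e.g. "R.a"
    relop: str             # key of RELOPS
    right: Optional[str]   # optional second attribute, e.g. "S.b"
    constant: int = 0      # signed constant; 0 when absent

    def holds(self, row: Dict[str, int]) -> bool:
        rhs = (row[self.right] if self.right else 0) + self.constant
        return RELOPS[self.relop](row[self.left], rhs)


def satisfies(factors: List[BooleanFactor], row: Dict[str, int]) -> bool:
    """A WHERE clause is a conjunction of boolean factors."""
    return all(factor.holds(row) for factor in factors)


# Selection query 24 from the slides: R.a<=4 and R.b>=3
q24 = [BooleanFactor("R.a", "<=", None, 4), BooleanFactor("R.b", ">=", None, 3)]

# Join query 21 from the slides: R.a>4 and R.b<S.b and S.a<10
q21 = [
    BooleanFactor("R.a", ">", None, 4),
    BooleanFactor("R.b", "<", "S.b"),
    BooleanFactor("S.a", "<", None, 10),
]

print(satisfies(q24, {"R.a": 4, "R.b": 3}))                          # True
print(satisfies(q21, {"R.a": 5, "R.b": 4, "S.a": 4, "S.b": 4}))      # False
```

Because every factor has the same fixed shape, factors can be stored and indexed like ordinary tuples, which is the point the slide makes about reusing the data-side machinery for queries.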

  40. Experiments and Results
      • Alternatives
        • NoMat: no background processing
        • PSoup-Partial: background processing, apply current window on invocation
        • PSoup-Complete: current windows are also continuously applied in the background
      • Experimental Parameters
        • Unloaded server with two Intel Pentium III 666 MHz processors and 768 MB RAM
        • Data arrives as fast as possible, in domain [0,255]
        • Queries of form R.a relop C, where C in [0,255]
        • Join queries of form R.a relop S.b +/- C

  41. Experiments: Response Time vs. Window Size
      • Interval predicates, selection queries

  42. Experiments: Response Time vs. Window Size
      • Equality predicates, selection queries

  43. Experiments: Max data arrival rate vs. #SQCs
      • Window size = 1000 tuples

  44. PSoup in a Traditional Query Processor
      • PSoup = SQL query over data and client query streams?
      • Joins = expression evaluators
      • Notes
        • Conventional QPs do not have tuple lineage
        • Conventional QPs always use intermediate tables

  45. Conclusions
      • Treating queries and data the same
        • Combines approaches for previously studied queries
          • Queries over the past and continuous queries
        • Allows new functionality: hybrid queries
      • Separating result generation and delivery
        • Makes disconnected operation feasible
        • Efficient support for repeated query invocations
