Processing Nested Complex Sequence Pattern Queries over Event Streams
Mo Liu (1), Medhabi Ray (1), Elke A. Rundensteiner (1), Dan Dougherty (1), Chetan Gupta (2), Song Wang (2), Ismail Ari (3), and Abhay Mehta (2)
(1) Worcester Polytechnic Institute, USA; (2) HP Labs, USA; (3) Ozyegin University, Turkey
DMSN 2010, Singapore
Acknowledgements: This work is partly supported by HP Innovations Award, NSF 1018443 and NSF IIS 0917017, Turkish National Science Foundation TUBITAK under career award 109E194.
Event Processing: The Big Picture
[Figure: Event Producer → Event Processing → Event Consumer]
Hospital Disease and Hygiene Control
[Figure: Data Sources (RFID inputs) → Data Streams → Query Results]
Query Results:
• Put on mask for H1N1 contagious patients
• Wash your hands before touching next patients
• Put on surgical gloves
• Track workers
• Detect hygiene violations
• Aggregate statistics for a hospital
D. Wang, E. Rundensteiner, and R. Ellison III, “Active complex event processing: applications in real-time health care,” VLDB (demonstration paper), 2010.
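CEP Basics
• A primitive event instance is an occurrence of interest at a point in time, written e(t).
• A composite event instance occurs over an interval, written e([t1, t2]).
[Figure: two timelines, one marking a point t for e(t) and one marking an interval from t1 to t2 for e([t1, t2])]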
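The two kinds of event instance can be sketched as small data types. This is only an illustration: the class names, the integer timestamps, and the rule that a composite event's interval runs from its earliest to its latest constituent are assumptions made here, not definitions taken from the talk.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PrimitiveEvent:
    """An occurrence of interest at a single time point: e(t)."""
    event_type: str
    ts: int  # abstract time tick (integer assumed for illustration)


@dataclass(frozen=True)
class CompositeEvent:
    """A match built from primitive events; it spans an interval e([t1, t2])."""
    parts: Tuple[PrimitiveEvent, ...]

    @property
    def interval(self) -> Tuple[int, int]:
        # Assumption: t1 is the earliest constituent timestamp, t2 the latest.
        stamps = [p.ts for p in self.parts]
        return (min(stamps), max(stamps))


match = CompositeEvent((PrimitiveEvent("Recycle", 1),
                        PrimitiveEvent("Washing", 2),
                        PrimitiveEvent("Operating", 18)))
print(match.interval)  # -> (1, 18)
```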
Outline • Motivation • NEEL: The Nested Complex Event Language • Nested CEP Query Processing • Performance Evaluation • Nested Query Optimization with Evaluation • Conclusion
Why Nested Queries? • Compact • Incremental • Convenient
NEEL: The Nested Complex Event Language
PATTERN <event-expression> WITHIN <window>
• Support nested SEQ, NEGATION, AND, OR
• Specify condition on attributes
• Assume value-based comparison
• Specify time period
M. Liu, E. A. Rundensteiner, D. Dougherty, C. Gupta, S. Wang, I. Ari, and A. Mehta, “NEEL: The Nested Complex Event Language for Real-Time Event Analytics,” BIRTE, 2010.
NEEL: The Nested Complex Event Language
[Figure: example with the labels “Time” and “Nested sub-query”]
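Nested CEP Query Plan
[Figure: operator tree, listed here from top to bottom]
• Output: Complex Events
• Predicate: (r.id = w.id = o.id)
• WinSeq(Recycle r, Washing w, <inner sub-plan>, Operating o), with inputs Recycle, Washing, Operating
• Predicate: (s.id = d.id = c.id = o.id)
• WinAND(Sharpening s, Disinfection d, Checking c), with inputs Sharpening, Disinfection, Checking
• Source: RFID readings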
Outline • Motivation • NEEL: The Nested Complex Event Language • Nested CEP Query Processing • Processing Nested Queries with Negation • Processing Nested Queries with Predicate • Performance Evaluation • Nested Query Optimization with Evaluation • Conclusion
Nested CEP Query Processing
PATTERN SEQ(Recycle r, Washing w, SEQ(Sharpening s, Disinfection d, Checking c), Operating o) WITHIN 10 minutes
[Figure: query plan, listed here from top to bottom]
• Output: Complex Events
• Outer: WinSeq(Recycle r, Washing w, <inner sub-plan>, Operating o), with inputs Recycle, Washing, Operating
• Inner: WinSeq(Sharpening s, Disinfection d, Checking c), with inputs Sharpening, Disinfection, Checking
Nested CEP Query Processing: Example
Event streams (the number in each instance name is its timestamp):
• Recycle: r1, r5
• Washing: w2, w12
• Operating: o18
• Sharpening: s3, s11
• Disinfection: d10
• Checking: c12, c16
Step 1. The outer WinSeq produces the partial outer query results <r1, w2, o18>, <r1, w12, o18>, <r5, w12, o18>.
Step 2. For <r1, w2, o18>, the inner WinSeq runs over the tightened sub-window [2, 18]. Inner query results: <s3, d10, c12>, <s3, d10, c16>. Output: <r1, w2, s3, d10, c12, o18>, <r1, w2, s3, d10, c16, o18>.
Step 3. For <r1, w12, o18>, the inner WinSeq runs over the tightened sub-window [12, 18]. Inner query results: empty, so this partial result yields no output.
[ECube] M. Liu, E. A. Rundensteiner, K. Greenfield, C. Gupta, S. Wang, I. Ari, and A. Mehta, “E-Cube: Multi-dimensional event sequence processing using concept and pattern hierarchies,” ICDE, 2010 (demo).
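The walkthrough above can be reproduced with a minimal sketch. It assumes events are plain integer timestamps, treats the tightened sub-window as the open interval between w and o, and leaves out the WITHIN 10 minutes check; `win_seq` and `nested_seq` are hypothetical helper names, not operators of the actual system.

```python
from itertools import product

# Event traces from the example: type -> timestamps (abstract ticks).
STREAM = {
    "Recycle": [1, 5], "Washing": [2, 12], "Operating": [18],
    "Sharpening": [3, 11], "Disinfection": [10], "Checking": [12, 16],
}


def win_seq(types, lo, hi, stream=STREAM):
    """All strictly increasing timestamp tuples, one per type, inside (lo, hi)."""
    candidates = [[t for t in stream[ty] if lo < t < hi] for ty in types]
    return [combo for combo in product(*candidates)
            if all(a < b for a, b in zip(combo, combo[1:]))]


def nested_seq():
    results = []
    # Step 1: partial outer matches <r, w, o>, ignoring the nested slot.
    for r, w, o in win_seq(["Recycle", "Washing", "Operating"], 0, float("inf")):
        # Step 2: run the inner query only inside the tightened sub-window.
        for s, d, c in win_seq(["Sharpening", "Disinfection", "Checking"], w, o):
            results.append((r, w, s, d, c, o))
    return results


print(nested_seq())
# -> [(1, 2, 3, 10, 12, 18), (1, 2, 3, 10, 16, 18)]
```

Only the partial result with w2 contributes output; for w12 no Sharpening event falls between w and o, so the inner query returns nothing.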
Processing Nested Queries with Negation
• Bounded by outer query.
PATTERN SEQ(Recycle r, Washing w, !SEQ(Sharpening s, Disinfection d, Checking c), Operating o) WITHIN 10 minutes
Bounding interval for the negated sub-query: [w, o]
Processing Nested Queries with Negation: Example
PATTERN SEQ(Recycle r, Washing w, !SEQ(Sharpening s, Disinfection d, Checking c), Operating o)
Event streams (the number in each instance name is its timestamp):
• Recycle: r1, r5
• Washing: w2, w12
• Operating: o18
• Sharpening: s3, s11
• Disinfection: d10
• Checking: c12, c16
Step 1. WinSeq(outer) produces the outer query results <r1, w2, o18>, <r1, w12, o18>, <r5, w12, o18>.
Step 2. For <r1, w2, o18>, WinSeq(inner) runs over the tightened sub-window [2, 18]. Inner query results: <s3, d10, c12>, <s3, d10, c16>. The inner result is not empty, so the negation fails and <r1, w2, o18> is discarded.
Step 3. For <r1, w12, o18>, WinSeq(inner) runs over the tightened sub-window [12, 18]. Inner query results: empty, so the negation holds and <r1, w12, o18> is kept.
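A negated sub-query bounded by the outer query can be sketched as an emptiness test on the inner result. The sketch makes the same assumptions as the example data suggests: integer timestamps, an open sub-window between w and o, and no WITHIN check. The helper names are hypothetical.

```python
from itertools import product

# Event traces from the example: type -> timestamps (abstract ticks).
STREAM = {
    "Recycle": [1, 5], "Washing": [2, 12], "Operating": [18],
    "Sharpening": [3, 11], "Disinfection": [10], "Checking": [12, 16],
}


def win_seq(types, lo, hi, stream=STREAM):
    """All strictly increasing timestamp tuples, one per type, inside (lo, hi)."""
    candidates = [[t for t in stream[ty] if lo < t < hi] for ty in types]
    return [combo for combo in product(*candidates)
            if all(a < b for a, b in zip(combo, combo[1:]))]


def seq_with_negated_inner():
    kept = []
    for r, w, o in win_seq(["Recycle", "Washing", "Operating"], 0, float("inf")):
        # The negated sub-query is bounded by the outer match: look between w and o.
        inner = win_seq(["Sharpening", "Disinfection", "Checking"], w, o)
        if not inner:  # negation holds only when the inner result is empty
            kept.append((r, w, o))
    return kept


print(seq_with_negated_inner())
# -> [(1, 12, 18), (5, 12, 18)]
```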
Processing Nested Queries with Negation
• Bounded by adjacent query.
PATTERN SEQ(Recycle r, SEQ(Washing w, !Sharpening s), SEQ(Disinfection d, Checking c), Operating o) WITHIN 10 minutes
Bounding interval for the negated event: [w, d]
Challenge: The bounds are not yet known when a subquery is processed, because the subqueries are mutually dependent.
Processing Nested Queries with Negation: Example
PATTERN SEQ(Recycle r, SEQ(Washing w, !Sharpening s), SEQ(Disinfection d, Checking c), Operating o)
Event streams (the number in each instance name is its timestamp):
• Recycle: r1
• Operating: o18
• Washing: w2, w12
• Sharpening: s11
• Disinfection: d6
• Checking: c12, c16
Step 1. WinSeq(outer) produces the outer query result <r1, o18>, giving the tightened sub-window [1, 18].
Step 2. WinSeq(inner) over Washing and !Sharpening yields the potential query results <w2> + (n. <s11>).
Step 3. WinSeq(inner) over Disinfection and Checking yields the inner query results <d6, c12>, <d6, c16>.
Step 4. Resolve at upper level. Output: <r1, w2, d6, c12, o18>, <r1, w2, d6, c16, o18>.
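One way to read "resolve at upper level" is that each inner sub-query first returns candidates, and the negation is checked only once both neighbours are known. The sketch below follows that reading under stated assumptions: integer timestamps, open intervals, no WITHIN check, and hypothetical helper names. It is not claimed to be the authors' exact procedure.

```python
from itertools import product

# Event traces from the example: type -> timestamps (abstract ticks).
STREAM = {
    "Recycle": [1], "Operating": [18], "Washing": [2, 12],
    "Sharpening": [11], "Disinfection": [6], "Checking": [12, 16],
}


def win_seq(types, lo, hi, stream=STREAM):
    """All strictly increasing timestamp tuples, one per type, inside (lo, hi)."""
    candidates = [[t for t in stream[ty] if lo < t < hi] for ty in types]
    return [combo for combo in product(*candidates)
            if all(a < b for a, b in zip(combo, combo[1:]))]


def resolve_adjacent_negation():
    out = []
    for r, o in win_seq(["Recycle", "Operating"], 0, float("inf")):
        # Both inner sub-queries run over the sub-window given by the outer match.
        washes = [w for (w,) in win_seq(["Washing"], r, o)]      # potential results
        rights = win_seq(["Disinfection", "Checking"], r, o)     # inner results
        for w in washes:
            for d, c in rights:
                # Upper level: the bound [w, d] is known only now, so check here
                # that no Sharpening event lies between w and d.
                if w < d and not win_seq(["Sharpening"], w, d):
                    out.append((r, w, d, c, o))
    return out


print(resolve_adjacent_negation())
# -> [(1, 2, 6, 12, 18), (1, 2, 6, 16, 18)]
```

Here s11 lies outside the interval between w2 and d6, so it does not block the match, and w12 is dropped because it does not precede d6.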
Nested CEP Query Processing with Predicates
PATTERN SEQ(Recycle r, Washing w, SEQ(Sharpening s, Disinfection d, Checking c, s.id = d.id = c.id = o.id), Operating o, r.id = w.id = o.id) WITHIN 10 minutes
• Pass down interval attribute values from outer to inner.
• Resolve predicate correlation as early as possible.
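A minimal sketch of passing values down: the outer match supplies both the sub-window and the correlated value o.id, and the inner query filters on that id before it joins anything. The event ids below are made up for illustration, and `Ev`, `win_seq`, and `nested_with_predicates` are hypothetical names.

```python
from itertools import product
from typing import NamedTuple


class Ev(NamedTuple):
    etype: str
    ts: int
    eid: int  # stands in for the "id" attribute; values below are invented


EVENTS = [
    Ev("Recycle", 1, 7), Ev("Washing", 2, 7), Ev("Sharpening", 3, 7),
    Ev("Sharpening", 4, 9), Ev("Disinfection", 10, 7), Ev("Checking", 12, 7),
    Ev("Checking", 16, 9), Ev("Operating", 18, 7),
]


def win_seq(types, lo, hi, wanted_id=None):
    """Time-ordered matches in (lo, hi); filter on id before joining if given."""
    cands = [[e for e in EVENTS
              if e.etype == ty and lo < e.ts < hi
              and (wanted_id is None or e.eid == wanted_id)]
             for ty in types]
    return [m for m in product(*cands)
            if all(a.ts < b.ts for a, b in zip(m, m[1:]))]


def nested_with_predicates():
    out = []
    for r, w, o in win_seq(["Recycle", "Washing", "Operating"], 0, float("inf")):
        if not (r.eid == w.eid == o.eid):  # outer predicate r.id = w.id = o.id
            continue
        # Pass the interval (w.ts, o.ts) and the correlated value o.id down,
        # so s.id = d.id = c.id = o.id is resolved while selecting candidates.
        for s, d, c in win_seq(["Sharpening", "Disinfection", "Checking"],
                               w.ts, o.ts, wanted_id=o.eid):
            out.append(tuple(e.ts for e in (r, w, s, d, c, o)))
    return out


print(nested_with_predicates())
# -> [(1, 2, 3, 10, 12, 18)]
```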
Outline • Motivation • NEEL: The Nested Complex Event Language • Nested CEP Query Processing • Performance Evaluation • Nested Query Optimization with Evaluation • Conclusion
Experimental Setup
• Implemented in the ECube Event Analytics System
• Use real stock trades data
• Sample queries
[ECube] M. Liu, E. A. Rundensteiner, K. Greenfield, C. Gupta, S. Wang, I. Ari, and A. Mehta, “E-Cube: Multi-dimensional event sequence processing using concept and pattern hierarchies,” in ICDE, 2010, pp. 1097–1100.
[Stock] I. Inetats, “Stock trade traces,” http://www.inetats.com/.
Performance Evaluation
Observation: The number of children, the query length, and the number of nesting levels all affect query performance.
Outline • Motivation • NEEL: The Nested Complex Event Language • Nested CEP Query Processing • Performance Evaluation • Nested Query Optimization with Evaluation • Conclusion
Inefficiency in Nested CEP Query Processing
Outer query block: PATTERN SEQ(Recycle r, Washing w, <inner-query block>, Checking c, Operating o) WITHIN 10 minutes
Inner-query block: SEQ(Sharpening s, Disinfection d), bounded by <w, c>
• Observations:
• For one outer invocation, the outer matches may share the same <w, c>.
• Different outer invocations may invoke the inner subquery with the same <w, c>.
Selective Result Caching: helps to avoid duplicate invocations.
Nested Query Optimization
• Cache Design. A semantic descriptor, the interval [leftbound, rightbound], indicates the time validity of the current cache content.
• Cache Usage. Reuse cached results if the query interval matches the cache interval.
• Cache Maintenance.
• Interval-driven Cache Expansion: stream insertion
• Interval-driven Cache Reduction: window purge
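The reuse rule can be illustrated with a deliberately simplified sketch: inner results are memoized under the exact interval they were computed for, and entries are dropped on window purge. This swaps in plain exact-match memoization; it does not model the incremental cache expansion on stream insertion described above. The class name, the purge rule, and the small event trace are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical trace: four outer matches <r, w, c, o> share the same <w, c>.
STREAM = {"Recycle": [1, 5], "Washing": [6], "Sharpening": [7],
          "Disinfection": [8], "Checking": [9], "Operating": [10, 11]}


def win_seq(types, lo, hi):
    """All strictly increasing timestamp tuples, one per type, inside (lo, hi)."""
    candidates = [[t for t in STREAM[ty] if lo < t < hi] for ty in types]
    return [combo for combo in product(*candidates)
            if all(a < b for a, b in zip(combo, combo[1:]))]


class IntervalCache:
    """Memoize inner-query results by the [left, right] interval they cover."""

    def __init__(self, evaluate):
        self.evaluate = evaluate  # function (left, right) -> list of results
        self.entries = {}         # (left, right) -> cached results
        self.hits = 0
        self.misses = 0

    def lookup(self, left, right):
        key = (left, right)
        if key in self.entries:   # query interval matches a cached interval
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = self.evaluate(left, right)
        return self.entries[key]

    def purge(self, window_start):
        """Window purge: drop entries that reach back before the window start."""
        self.entries = {k: v for k, v in self.entries.items()
                        if k[0] >= window_start}


cache = IntervalCache(lambda lo, hi: win_seq(["Sharpening", "Disinfection"], lo, hi))
results = []
for r, w, c, o in win_seq(["Recycle", "Washing", "Checking", "Operating"],
                          0, float("inf")):
    for s, d in cache.lookup(w, c):  # inner query is bounded by <w, c>
        results.append((r, w, s, d, c, o))

print(cache.misses, cache.hits, len(results))  # -> 1 3 4
```

In this trace the inner sub-query is evaluated once and reused three times, which is the duplicate-invocation saving the caching aims for.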
Preliminary Result: Evaluating Optimized Nested Execution
Observation: Selective result caching significantly improves performance.
• Future work:
• Refine caching to cover the full set of NEEL query features, such as negation.
• Design additional optimization strategies, such as query decorrelation.
Related Work
• Complex Event Processing Systems [SASE, CEDR, …]
• Query Decorrelation [Decorrelation96, Dayal87, …]
• Semantic Caching [Dar96, Chen02, …]
[SASE] E. Wu, Y. Diao, and S. Rizvi, “High-performance complex event processing over streams,” in SIGMOD, 2006, pp. 407–418.
[CEDR] R. S. Barga, J. Goldstein, M. Ali, and M. Hong, “Consistent streaming through time: A vision for event stream processing,” in CIDR, 2007, pp. 363–374.
[Decorrelation96] P. Seshadri, H. Pirahesh, and T. Y. C. Leung, “Complex query decorrelation,” in ICDE ’96: Proceedings of the Twelfth International Conference on Data Engineering. IEEE Computer Society, 1996, pp. 450–458.
[Dayal87] U. Dayal, “Of nests and trees: A unified approach to processing queries that contain nested subqueries, aggregates, and quantifiers,” in VLDB, 1987, pp. 197–208.
[Dar96] S. Dar, M. J. Franklin, B. T. Jónsson, et al., “Semantic data caching and replacement,” in VLDB, 1996, pp. 330–341.
[Chen02] L. Chen, E. Rundensteiner, et al., “XCache: A semantic caching system for XML queries,” in ACM SIGMOD, 2002, pp. 618–618.
Conclusion • Introduce an algebraic query plan for queries expressed in NEEL. • Design an iterative execution strategy for NEEL. • Evaluate our proposed execution strategy on real data streams. • Demonstrate promise of selective caching in NEEL execution.