Event Stream Processing with Out-of-Order Data Arrival

Explore solutions for processing event streams with out-of-order data arrivals, including analysis of limitations, goals, and contributions. Presentation covers problem identification, implementation, experiments, and related works.

Presentation Transcript


  1. Event Stream Processing with Out-of-Order Data Arrival • Ming Li and Mo Liu • Department of Computer Science, Worcester Polytechnic Institute, Worcester, Massachusetts • CS 525 Project Final Presentation • 12.14.2006

  2. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival • Solution • Implementation • Experiment • Conclusion • Related Work

  3. Outline • Introduction: Event Stream Processing; SASE System; Limitation of SASE; Goal and Contribution • Preliminary • Problem with Out-of-Order Event Arrival • Solution • Implementation • Experiment • Conclusion • Related Work

  4. Introduction: Event Stream Processing • Rising interest in the database community • Wide-ranging and growing applications • Retail management, where the sensor isn’t moving: (Shelf, !CheckoutCounter, Exit) • Traffic control, where the sensor is moving: (police vehicle, ambulance, reporter vehicle)

  5. Introduction: SASE System • Event stream processing engine • A stream engine built specifically for event stream queries: generic for detecting and extracting expected pattern sequences • Performance gain compared to stream systems that use joins to handle event sequence queries • [Figure: SASE approach vs. TelegraphCQ approach]

  6. Introduction: SASE System (Cont.) • Query: EVENT SEQ(A, B, !C, D) WITHIN 10 seconds • Event stream (type followed by timestamp): b1 a3 c5 b6 a7 d10 b11 f12 c13 d15 a16 c17 a18 … • Query plan, bottom to top: • SSC: SS (A, B, D) with WD in SS: W = 10, then SC (A, B, D) • WD in SC: D.time – A.time < 10 secs; output a3 b6 d10, a7 b11 d15, … (a3 b6 d15 and a3 b11 d15 will not be selected) • NG: !C (B.time < C.time < D.time); output a3 b6 d10, … • TF: sequence to composite event; output a3 b6 d10, …

  7. Introduction: SASE System (Cont.) • Query: EVENT SEQ(A, B, !C, D) WHERE [attr_1] WITHIN 10 seconds • Event stream (type(attr_1)@timestamp): b(1)@1 a(2)@3 c(2)@5 b(2)@6 a(3)@7 d(2)@10 b(3)@11 f(3)@12 c(3)@13 d(3)@15 a(4)@16 c(4)@17 a(5)@18 … • Query plan, bottom to top: • SSC: SS (A, B, D) with WD in SS: W = 10, then SC (A, B, D) • WD in SC: D.time – A.time < 10 secs; output a(2) b(2) d(2), a(3) b(3) d(3), … • SL: [attr]; output a(2) b(2) d(2), a(3) b(3) d(3), … • NG: !C (B.time < C.time < D.time ∧ B.attr_1 = C.attr_1); output a(2) b(2) d(2), … • TF: sequence to composite event; output <a(2) b(2) d(2)>, …

  8. Introduction: Limitation in SASE • Total order assumption on event arrivals • The order in which events are received by the query system is the same as their timestamp order • Under this assumption, “later arrival” means “larger timestamp” • Example • e1.timestamp = 5:15pm, e1.received_time = 5:17pm • e2.timestamp = 5:19pm, e2.received_time = 5:20pm • e2 is received later than e1 ⇒ e2’s timestamp is larger than e1’s • In the case of out-of-order event arrival • Missing results • Spurious results • Unbounded memory requirement
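
The total order assumption can be checked mechanically. The sketch below is illustrative only and is not part of the SASE system: `find_out_of_order` is a hypothetical helper, timestamps are encoded as minutes since midnight, and `e3` is an invented late event added to the slide's e1 and e2 to show a violation.

```python
def find_out_of_order(arrivals):
    """Return the events whose timestamp is smaller than one already received.

    `arrivals` is a list of (name, timestamp) pairs in received order.
    """
    late = []
    high_water = None  # largest timestamp received so far
    for name, ts in arrivals:
        if high_water is not None and ts < high_water:
            late.append((name, ts))  # violates "later arrival = larger timestamp"
        else:
            high_water = ts
    return late


# e1 and e2 come from the slide; e3 is a hypothetical event stamped 5:16pm
# that is received after e2.
arrivals = [("e1", 17 * 60 + 15), ("e2", 17 * 60 + 19), ("e3", 17 * 60 + 16)]
late_events = find_out_of_order(arrivals)
```

With only e1 and e2 the list of late events is empty, so the total order assumption holds; adding e3 breaks it.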

  9. Introduction: Goal and Contributions • Goal • Propose a solution for sequence query processing with out-of-order event arrival • Contributions • Study of the problems caused by OOO event arrival • A solution framework covering all of these problems • Solution for Sequence Scan • Solution for Negation • Solution for Window in SS

  10. Outline • Introduction • Preliminary: Event, Event Stream and Query; SASE Evaluation - SSC; SASE Evaluation - Negation; SASE Evaluation - Window • Problem with Out-of-Order Event Arrival • Solution • Implementation • Experiment • Conclusion • Related Work

  11. Preliminary: Event, Event Stream and Event Sequence Query Language • Event and Event Stream • An event is defined to be an instantaneous, atomic (happens completely or not at all) occurrence of interest at a point in time • Each event, denoted by a lower-case letter (e.g., “a”), consists of the name of its type, denoted by an upper-case letter, and a set of values corresponding to the attributes of the type • Each event carries a timestamp, under the total order assumption • Event stream: a sequence of events of different types • Example: a(attr_1 = 2, timestamp = 4), c(attr_1 = 1, timestamp = 5)… • SASE Query Language • EVENT <event pattern> [WHERE <qualification>] [WITHIN <window>] • Example: EVENT SEQ(A, B, !C, D) WHERE [attr_1] WITHIN 10 seconds
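
To make the query shape concrete, here is a minimal parsing sketch. It is an assumption-laden illustration, not the SASE parser: `SeqQuery`, `QUERY_RE`, and `parse_query` are hypothetical names, and the regular expression covers only the form shown on the slide (a SEQ pattern, an optional bracketed attribute list, an optional window in seconds).

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SeqQuery:
    positive: List[str]        # event types that must occur, in order
    negated: List[str]         # event types marked with "!"
    equiv_attrs: List[str]     # attributes listed in WHERE [...]
    window_secs: Optional[int]


# Only the query shape used in the slides is recognised (an assumption).
QUERY_RE = re.compile(
    r"EVENT\s+SEQ\s*\((?P<pattern>[^)]*)\)"
    r"(?:\s+WHERE\s+\[(?P<attrs>[^\]]*)\])?"
    r"(?:\s+WITHIN\s+(?P<window>\d+)\s+seconds?)?\s*$",
    re.IGNORECASE,
)


def parse_query(text):
    m = QUERY_RE.match(text.strip())
    if m is None:
        raise ValueError("unsupported query: " + text)
    items = [p.strip() for p in m.group("pattern").split(",") if p.strip()]
    positive = [p for p in items if not p.startswith("!")]
    negated = [p[1:] for p in items if p.startswith("!")]
    attrs = [a.strip() for a in (m.group("attrs") or "").split(",") if a.strip()]
    window = int(m.group("window")) if m.group("window") else None
    return SeqQuery(positive, negated, attrs, window)


query = parse_query("EVENT SEQ(A, B, !C, D) WHERE [attr_1] WITHIN 10 seconds")
```

The sketch separates positive and negated types but does not record where in the pattern the negation sits, which a real evaluator would need.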

  12. Preliminary: SASE Evaluation – SSC • SSC: SS (Sequence Scan) and SC (Sequence Construction) • NFA with AIS (Active Instance Stack) • RIP (most Recent Instance in Previous stack) field • Example: EVENT SEQ(A, B, D) WITHIN 10 seconds • [Figure: NFA with states 0 to 3, transitions A, B, D and two * self-loops; stacks with each entry prefixed by its RIP: S1 = {[] a3, [] a7, [] a16}, S2 = {[a3] b6, [a7] b11}, S3 = {[b6] d10, [b11] d15}; constructed sequences: a3 b6 d10, a3 b6 d15, a3 b11 d15, a7 b11 d15; event stream: b1 a3 c5 b6 a7 d10 b11 f12 c13 d15 a16 c17 a18 …]
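
The stack-and-RIP mechanism on this slide can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the SASE implementation: events are `(type, timestamp)` tuples, each type occurs once in the pattern, an event is dropped when the previous stack is still empty, and the window is applied as a filter after construction. `sequence_scan` and `construct` are hypothetical names.

```python
def construct(stacks, i, idx):
    """Yield every sequence ending at instance `idx` of stack `i`, following RIPs."""
    event, rip = stacks[i][idx]
    if i == 0:
        yield (event,)
        return
    for j in range(rip + 1):  # instances 0..rip of the previous stack
        for prefix in construct(stacks, i - 1, j):
            yield prefix + (event,)


def sequence_scan(stream, pattern, window):
    stacks = [[] for _ in pattern]  # one Active Instance Stack per pattern position
    results = []
    for event in stream:
        etype, ts = event
        if etype not in pattern:
            continue  # irrelevant event type (c, f in the example)
        i = pattern.index(etype)
        # RIP: index of the most recent instance in the previous stack.
        rip = len(stacks[i - 1]) - 1 if i > 0 else -1
        if i > 0 and rip < 0:
            continue  # assumption: no predecessor yet, so the event is dropped
        stacks[i].append((event, rip))
        if i == len(pattern) - 1:  # accepting state reached: build sequences
            for seq in construct(stacks, i, len(stacks[i]) - 1):
                if seq[-1][1] - seq[0][1] < window:  # D.time - A.time < window
                    results.append(seq)
    return results


stream = [("b", 1), ("a", 3), ("c", 5), ("b", 6), ("a", 7), ("d", 10),
          ("b", 11), ("f", 12), ("c", 13), ("d", 15)]
matches = sequence_scan(stream, ["a", "b", "d"], window=10)
```

On the slide's stream, construction yields a3 b6 d10, a3 b6 d15, a3 b11 d15, and a7 b11 d15, and the 10-second window leaves a3 b6 d10 and a7 b11 d15 in `matches`.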

  13. Preliminary: SASE Evaluation – NG • Negation (NG) • Example: EVENT SEQ(A, B, !C, D) WITHIN 10 seconds • Event stream: b1 a3 c5 b6 a7 d10 b11 f12 c13 d15 a16 c17 a18 … • a3 b6 d10 [3, 10] √ • a7 b11 d15 [10, 15] Χ
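
The negation check can be sketched as a filter over candidate sequences. This is an illustrative reading of the NG condition !C (B.time < C.time < D.time) from the earlier plan slides, with `apply_negation` as a hypothetical helper and sequences reduced to timestamp triples.

```python
def apply_negation(sequences, negated_events):
    """Keep (a, b, d) timestamp triples with no negated event strictly between b and d."""
    kept = []
    for a_ts, b_ts, d_ts in sequences:
        if not any(b_ts < c_ts < d_ts for c_ts in negated_events):
            kept.append((a_ts, b_ts, d_ts))
    return kept


candidates = [(3, 6, 10), (7, 11, 15)]  # SSC + window output on the slide's stream
c_events = [5, 13]                      # timestamps of the C events c5 and c13
survivors = apply_negation(candidates, c_events)
```

Here c5 precedes b6, so a3 b6 d10 survives, while c13 falls between b11 and d15 and removes a7 b11 d15. The check is only as good as the set of C events known when it runs, which is exactly what out-of-order arrival undermines.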

  14. Preliminary: SASE Evaluation – Purge • Purge • Purge in SSC • Purge in NG • Example: EVENT SEQ(A, B, D) WITHIN 10 seconds • PG in SS: on seeing d15, purge a3 and other events that have fallen out of the window • By a similar mechanism, c5 and other expired events are cleaned • [Figure: NFA with states 0 to 3 and transitions A, B, D; stacks with RIPs in parentheses: S1 = {() a3, () a7}, S2 = {(a3) b6, (a7) b11}, S3 = {(b6) d10, (b11) d15}, with the window WD marked; sequences a3 b6 d10, a3 b6 d15, a3 b11 d15, a7 b11 d15; event stream: b1 a3 c5 b6 a7 d10 b11 f12 c13 d15 a16 c17 a18 …]

  15. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival: Sequence Scan; Negation; Window in Sequence Scan; Problem Analysis • Solution • Implementation • Experiment • Conclusion • Related Work

  16. Problem with OOO: Sequence Scan • SS: missing result • Example: EVENT SEQ(A, B, D) WITHIN 10 seconds • Arrival order: b1 a3 c5 b6 a7 d10 b11 f12 c13 d15 …, with a0 and d2 arriving out of order • Produced result: a3 b6 d10, a7 b11 d15 • Correct result: a0 b1 d2, a3 b6 d10, a7 b11 d15 • Missing: a0 b1 d2 • [Figure: NFA with states 0 to 3 and stacks () a3, () a7; (a3) b6, (a7) b11; (b6) d10, (b11) d15]

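  17. Problem with OOO: Negation • NG: incorrect result • Example: EVENT SEQ(A, B, !C, D) WITHIN 10 seconds • Arrival order: b1 a3 c5 b6 a7 d10 c9 (c9 arrives late) • Produced result: a3 b6 d10, a7 b11 d15 • Incorrect: a3 b6 d10 is output before the late c9 arrives, although c9 lies between b6 and d10 • [Figure: NFA with states 0 to 3 and stacks () a3; (a3) b6; (b6) d10]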

  18. Problem with OOO: Purge in SSC • Purge in SS: on seeing d15, a3 and other expired events are purged • If an out-of-order d8 arrives after that ⇒ missing result (a3 b6 d8 can no longer be formed) • Similarly, purging can lead to incorrect results ⇒ (1) no data can be purged if missing or spurious results are to be avoided; (2) the buffer requirement is unbounded in that case • Example: EVENT SEQ(A, B, D) WITHIN 10 seconds • [Figure: NFA with states 0 to 3; stacks S1 = {() a3, () a7}, S2 = {(a3) b6, (a7) b11}, S3 = {(b6) d10, (b11) d15}; sequences a3 b6 d10, a3 b6 d15, a3 b11 d15, a7 b11 d15; event stream b1 a3 c5 b6 a7 d8 d10 b11 f12 c13 d15 a16 c17 a18 … by timestamp] • If precise query results are required and memory resources are limited, WD in SS is not sufficient for handling out-of-order event arrival
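
The purge problem can be reproduced in a few lines. This is a minimal sketch using only timestamps from the slide's example; `purge_expired` and `sequences_for_late_d` are hypothetical helpers, and the purge rule (drop anything 10 or more seconds older than the newest event) is an assumed reading of the window-based purge.

```python
def purge_expired(stack, newest_ts, window):
    """Window-based purge: keep only timestamps less than `window` older than newest_ts."""
    return [ts for ts in stack if newest_ts - ts < window]


def sequences_for_late_d(d_ts, a_stack, b_stack, window):
    """All (a, b, d) sequences a late D event could complete within the window."""
    return [(a, b, d_ts) for a in a_stack for b in b_stack
            if a < b < d_ts and d_ts - a < window]


a_stack, b_stack = [3, 7], [6, 11]

# Without purging, the late d8 still finds a3 and b6.
expected = sequences_for_late_d(8, a_stack, b_stack, window=10)

# Purge on seeing d15 (a3 is 12 seconds old and is dropped), then d8 arrives.
purged_a = purge_expired(a_stack, newest_ts=15, window=10)
purged_b = purge_expired(b_stack, newest_ts=15, window=10)
produced = sequences_for_late_d(8, purged_a, purged_b, window=10)
```

`expected` contains a3 b6 d8, while `produced` is empty: the result is missed purely because a3 was purged before the late event arrived.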

  19. Problem with OOO: Purge in NG

  20. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival • Solution: Sequence Scan; Window in Sequence Scan; Negation (skipped) • Implementation • Experiment • Conclusion • Related Work

  21. Solution in SS: Using Sort Semantics • Initially, every stack is active • Search for the proper place in the stack for each new event • RIP pointers might be reset • Example: EVENT SEQ(A, B, D) WITHIN 10 seconds • Arrival order: b1 a3 c5 b6 a7 d10 b11 f12 c13 d15 …, with a0 and d2 both arriving after d15 • The out-of-order a0 and d2 are inserted at the right positions and the affected RIPs are reset • [Figure: the stacks before and after inserting a0 and d2, with b1 kept in the B stack]
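
One way to picture the sort semantics is the sketch below. It is an approximation under stated assumptions rather than the authors' algorithm: every stack accepts events from the start, stacks are kept sorted by timestamp with `bisect.insort`, and instead of resetting RIP pointers the sketch compares timestamps directly when it builds sequences. `sorted_scan`, `prefixes`, and `suffixes` are hypothetical names, and each event type is assumed to occur once in the pattern.

```python
import bisect


def prefixes(stacks, i, before):
    """Timestamp-increasing tuples over positions 0..i whose last event is before `before`."""
    if i < 0:
        yield ()
        return
    for ts, etype in stacks[i]:
        if ts >= before:
            break  # stack is sorted, nothing later can qualify
        for p in prefixes(stacks, i - 1, ts):
            yield p + ((etype, ts),)


def suffixes(stacks, i, after):
    """Timestamp-increasing tuples over positions i..end whose first event is after `after`."""
    if i >= len(stacks):
        yield ()
        return
    for ts, etype in stacks[i]:
        if ts <= after:
            continue
        for s in suffixes(stacks, i + 1, ts):
            yield ((etype, ts),) + s


def sorted_scan(arrivals, pattern, window):
    stacks = [[] for _ in pattern]  # every stack is active from the start
    results = []
    for etype, ts in arrivals:
        if etype not in pattern:
            continue
        i = pattern.index(etype)
        bisect.insort(stacks[i], (ts, etype))  # insert at the sorted position
        # Emit each sequence once, when its last-arriving member shows up.
        for p in prefixes(stacks, i - 1, ts):
            for s in suffixes(stacks, i + 1, ts):
                seq = p + ((etype, ts),) + s
                if seq[-1][1] - seq[0][1] < window:
                    results.append(seq)
    return results


arrivals = [("b", 1), ("a", 3), ("c", 5), ("b", 6), ("a", 7), ("d", 10),
            ("b", 11), ("f", 12), ("c", 13), ("d", 15), ("a", 0), ("d", 2)]
found = sorted_scan(arrivals, ["a", "b", "d"], window=10)
```

With a0 and d2 arriving after d15, the sketch still reports a0 b1 d2 alongside a3 b6 d10 and a7 b11 d15. Nothing is ever purged here, so memory grows without bound, which is the issue the punctuation-based purging slides address.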

  22. Solution in NG / PSSC / PNG:Possible Solutions • Using K-Slack • Pros: simple • Cons: big assumption about the input stream • Punctuation • Pros: general and more optimization opportunities • Cons: might have overhead

  23. Solution in NG / PSSC / PNG by K-Slack
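
As a rough illustration of the K-slack idea, the sketch below buffers events in a min-heap and releases an event only once the largest timestamp seen so far is at least K ahead of it. The slide itself gives no details, so this is the commonly described form of the technique and not necessarily the variant used in the project; `k_slack` is a hypothetical name, and the assumption that no event is delayed by more than K is the "big assumption about the input stream" noted among the cons.

```python
import heapq


def k_slack(arrivals, k):
    """Re-order a stream of (type, timestamp) events, assuming disorder is bounded by k."""
    heap = []
    clock = float("-inf")  # largest timestamp seen so far
    for seq_no, (etype, ts) in enumerate(arrivals):
        clock = max(clock, ts)
        heapq.heappush(heap, (ts, seq_no, etype))  # seq_no breaks timestamp ties
        # Anything at least k older than the clock can no longer be overtaken.
        while heap and heap[0][0] <= clock - k:
            ts0, _, etype0 = heapq.heappop(heap)
            yield (etype0, ts0)
    while heap:  # end of stream: flush whatever is still buffered
        ts0, _, etype0 = heapq.heappop(heap)
        yield (etype0, ts0)


# d8 arrives after d10, a delay of 2 time units, which k = 3 tolerates.
arrivals = [("a", 3), ("b", 6), ("d", 10), ("d", 8), ("b", 11), ("d", 15)]
released = list(k_slack(arrivals, k=3))
```

The released stream is in timestamp order, so downstream operators can keep the total order assumption, at the price of delaying every event by up to K.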

  24. Solution in NG / PSSC / PNG by Punctuation

  25. Punctuation: Range-Out-of-Order-Free Punctuation • Range-Out-of-Order-Free (ROOF) punctuation P<E, t> • Timestamp t • Event type E • Property • Total order for in-order events (we can simply use the timestamp and ignore the received time) • No contradiction among punctuations: they get stronger and stronger • Example: after D_p, no more out-dated D events will come • [Figure: timeline of a and d events by timestamp, with punctuation D_p]

  26. Punct in PSSC: Back2Front Singleton Purge • Example: EVENT SEQ(A, B, D, E) WITHIN 10 seconds • Is distance(a, D_p) > w, that is, D_p.Timestamp – a.Timestamp > w? • If yes, are there any d’s inside the back window? • [Figure: timeline with a, d, and D_p, and the window w]

  27. Punct in PSSC: Front2Back Singleton Purge • Example: EVENT SEQ(A, B, D, E) WITHIN 10 seconds • Does d appear in front of A_p, that is, d.Timestamp < A_p.Timestamp? • If yes, are there any a’s inside the front window? • [Figure: timeline with a, d, and A_p, and the window w]
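
The two singleton purge checks can be read as the predicates sketched below. This is one interpretation of the slides, not a confirmed specification: a ROOF punctuation P<E, t> is assumed to mean that no further event of type E with a timestamp below t will arrive, the window test is assumed to be strict (distance < w), and both function names are hypothetical.

```python
def back2front_purgeable(a_ts, stored_d_ts, d_punct_ts, w):
    """Can event a be purged, given punctuation D_p on the later type D?

    Step 1: D_p.Timestamp - a.Timestamp > w, so no future d can fall in a's window.
    Step 2: no stored d lies inside a's window (a_ts, a_ts + w).
    """
    if d_punct_ts - a_ts <= w:
        return False
    return not any(a_ts < d < a_ts + w for d in stored_d_ts)


def front2back_purgeable(d_ts, stored_a_ts, a_punct_ts, w):
    """Can event d be purged, given punctuation A_p on the earlier type A?

    Step 1: d.Timestamp < A_p.Timestamp, so no future a can precede d.
    Step 2: no stored a lies inside d's front window (d_ts - w, d_ts).
    """
    if d_ts >= a_punct_ts:
        return False
    return not any(d_ts - w < a < d_ts for a in stored_a_ts)


# a3 with D_p at 14 and only d15 stored: no d can ever complete a3, so purge it.
purge_a3 = back2front_purgeable(3, stored_d_ts=[15], d_punct_ts=14, w=10)
# Same a3, but d10 is stored inside its window: a3 must be kept.
keep_a3 = back2front_purgeable(3, stored_d_ts=[10], d_punct_ts=14, w=10)
```

In both directions the punctuation rules out future partners and the scan of stored events rules out existing ones; only when both hold is the event safe to drop.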

  28. Punct in PSSC: Lazy Purge Algorithm • Example: EVENT SEQ (E1, E2, …, En) WITHIN 10 seconds • [Figure: stored events e and punctuations P on a timeline; the whole event sequence is purged] • Algorithm: Lazy Purging • On receiving an event e or a ROOF punctuation rp: • (1) Event e: update the stored event sequence and periodically run ALG purging_event_seq (ROOF_Set, stored event sequence) • (2) ROOF rp: update the ROOF_Set and periodically run ALG purging_event_seq (ROOF_Set, stored event sequence)

  29. Punct in PSSC: Lazy Purge Algorithm (Cont.) • Example: EVENT SEQ (E1, E2, …, En) WITHIN 10 seconds • [Figure: a single event e and punctuations P on a timeline] • Algorithm: purging_single_event • Input: a single event e and a ROOF_Set • ALG purging_single_event (ROOF_Set, stored event sequence) // sequential checking + dependency checking

  30. Punct in PSSC: Lazy Purge Algorithm (Cont.) • Example: EVENT SEQ (E1, E2, …, En) WITHIN 10 seconds • [Figure: stored events e and punctuations P on a timeline] • Algorithm: purging_event_sequence • Input: the event sequence and ROOF punctuation rp • ALG purging_event_seq (ROOF_Set, stored event sequence) // in event order, apply purging_single_event to each event

  31. Punct in PSSC: Aggressive Purge Algorithm • Example: EVENT SEQ (E1, E2, …, En) WITHIN 10 seconds • [Figure: stored events e and punctuations P on a timeline; an arriving event can be dropped directly, and old sequences are purged] • Algorithm: Aggressive Purging • On receiving an event e or a ROOF punctuation rp: • (1) Event e: update the stored event sequence and periodically run ALG purging_single_event (ROOF_Set, stored event sequence) • (2) ROOF rp: update the ROOF_Set and periodically run ALG purging_single_event (ROOF_Set, stored event sequence)

  32. Punct in PSSC:Optimization (Cont.) • Keeping the purging complete, but smarter • Under construction • Making the purging “incomplete” • Singleton purging • Total purging • Density-based purging

  33. Punct in PSSC: Optimization, Singleton Batch Purging 1 • Example: EVENT SEQ(A, B, D, E) WITHIN 10 seconds • Every A and B event falling in this range can be purged • [Figure: timeline marking the window w, D_p, and d, the farthest D event outside the window]

  34. Punct in PSSC: Optimization, Singleton Batch Purging 2 • Example: EVENT SEQ(A, B, D, E) WITHIN 10 seconds • Every D and E event falling in this range can be purged • [Figure: timeline marking B_p and b, the farthest B event outside the window]

  35. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival • Solution • Implementation: Sequence Scan; Window in Sequence Scan • Experiment • Conclusion • Related Work

  36. Implementation: Design • Basic event processing • Event and event generator, query plan and plan generator, basic operators • Out-of-order handler • New functionality in SS: two modes, append and sort (leaving room for later use of punctuation) • New functionality in the NG and WD operators

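  37. Implementation: Design (Cont.) • [Figure: system design with the components Query, Query Plan Generator, Query Plan (Window above SSC, with tuples passed between them), SSC with its NFA and stacks that maintain state and pointers, and the Event stream generator]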

  38. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival • Solution • Implementation • Experiment: Design and Setup; Result Analysis • Conclusion • Related Work

  39. Experiment: Design and Setup

  40. Experiment: Result Analysis

  41. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival • Solution • Implementation • Experiment • Conclusion • Related Work

  42. Conclusion • We study the problems caused by OOO event arrival • We propose a solution framework for sequence query processing with out-of-order data arrival

  43. Outline • Introduction • Preliminary • Problem with Out-of-Order Event Arrival • Solution • Implementation • Experiment • Conclusion • Related Work

  44. Related Work • Event stream processing (SASE system) • Regular stream processing systems (TelegraphCQ, Eddy, etc.) • Basic event processing (Amit system) • Luping Ding’s comprehensive exam talk • K-slack and punctuation

  45. Happy Holiday Season!

  46. TF: sequence to composite event NG:!C(B.time<C.time<D.time) ( ts:timestamp ) WD: D.ts – A.ts < 10 secs SC: (A, B, D) SSC SS: (A, B, D) PSSC: W = 10 secs Input Event Stream Q: EVENT SEQ (A, B,!C, D) WITHIN 10 seconds

  47. * * [b6] d10 [ ] a3 [a3] b6 B D A 0 1 2 3 [b11] d15 [ ] a7 [a7] b11 S1 S2 S3 (a) SSC using Active Instance Stacks (b) SSC using Active Instance Stacks b a c b a d f c d f f b … Receiving Order 11 3 5 6 7 10 12 13 15 1 17 16 … (c) Input Event Stream SSC TF WD Input Event Stream <a3 b6 d10> <a7 b11 d15> a3 b6 d10 a3 b6 d15 a3 b11 d15 a7 b11 d15 a3 b6 d10 a7 b11 d15 Tuples Holding Event Sequences (d) Producing Result Tuples

  48. [Figure: query plan annotated with three problems; plan from bottom to top: Input Event Stream; SSC with SS: (E1,E2,…,Em), PSSC: window W over the Active Instance Stacks (AIS), and SC: (E1,E2,…,Em); WD: Em.ts – E1.ts < W; PNG: window W; NG: !C(B.time<C.time<D.time); TF: sequence to composite event] • (1) NG produces spurious results: spurious results are output • (2) Unauthorized Negation Buffer purge: PNG mistakenly purges events from the Negation Buffer (events might be used to form out-of-order sequences in the future) • (3) Unauthorized AIS purge: PSSC mistakenly purges events from the AIS (events might be used to form out-of-order sequences in the future)

  49. [Figure: two AIS snapshots, each with S1 = {[] a3, [] a7}, S2 = {[a3] b6, [a7] b11}, S3 = {[b6] d10, [b11] d15}; (a) Incorrect AIS Appending Example 1, where the late d8 is appended as [b11] d8; (b) Incorrect AIS Appending Example 2, where the late b8 is appended as [a7] b8]

  50. b a c b a d f c d f c f b 11 3 5 6 7 10 12 13 15 1 17 9 16 Received Order (a) Out-of-Order Event Arrival Example 1 b a c b a d f c d a f b f d 11 3 5 6 7 10 12 13 15 1 0 16 17 2 Received Order (b) Out-of-Order Event Arrival Example 2 b f f a c b a d f c d d (or b) b 11 3 5 6 7 10 12 13 15 1 8 17 16 Received Order (c) Out-of-Order Event Arrival Example 3
