1 / 33

The Design of the Borealis Stream Processing Engine

The Design of the Borealis Stream Processing Engine. CIDR 2005 Brandeis University, Brown University, MIT. Kang, Seungwoo 2005.3.15. Ref. http://www-db.cs.wisc.edu/cidr/presentations/26 Borealis.ppt. One-line comment.

kamal
Download Presentation

The Design of the Borealis Stream Processing Engine

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Design of the Borealis Stream Processing Engine CIDR 2005 Brandeis University, Brown University, MIT Kang, Seungwoo 2005.3.15 Ref. http://www-db.cs.wisc.edu/cidr/presentations/26 Borealis.ppt

  2. One-line comment • This paper presents an overview of the design for Borealis distributed stream processing engine with new features and mechanisms

  3. Outline • Motivation and Goal • Requirements • Technical issues • Dynamic revision of query results • Dynamic query modification • Flexible and scalable optimization • Fault tolerance • Conclusion

  4. Motivation and Goal • Envision the second-generation SPE • Fundamental requirements of many streaming applications not supported by first-generation SPE • Three fundamental requirements • Dynamic revision of query results • Dynamic query modification • Flexible and scalable optimization • Present the functionality and preliminary design of Borealis

  5. Requirements • Dynamic revision of query results • Dynamic query modification • Flexible and highly scalable optimization

  6. Changes in the models • Data model • Revision support • Three types of messages • Query model • Support revision processing • Time travel, CP views • Support modification of Op box semantic • Control lines • QoS model • Introduce QoS metrics in tuple basis • Each tuple (message) holds VM (Vector of Metrics) • Score function

  7. Architecture of Borealis Modify and extend both systems, Aurora and Medusa, with a set of features and mechanisms

  8. Dynamic revision of query results • Problem • Corrections of previously reported data from data source • Highly volatile and unpredictable characteristics of data sources like sensors • Late arrival, out-of-order data • Just accept approximate or imperfect results • Goal • Support to process revisions and correct the previous output result

  9. (3pm,$11) (2pm,$12) (2:25,$9) (4:15,$10) (4:05,$11) (1pm,$10) Average price 1 hour Causes for Tuple Revisions • Data sources revise input streams: “On occasion, data feeds put out a faulty price [...] and send a correction within a few hours” [MarketBrowser] • Temporary overloads cause tuple drops • Data arrives late, and miss its processing wnd

  10. New Data Model for Revisions header data • time: tuple timestamp • type: tuple type • insertion, deletion, replacement • id: unique identifier of tuple on stream (time,type,id,a1,...,an)

  11. (3pm,$11) (2pm,$11) (2pm,$12) (3pm,$11) (2:25,$9) (1pm,$10) (2pm,$12) Average price Average price 1 hour 1 hour Revision Processing in Borealis • Closed model: revisions produce revisions

  12. Revision Processing in Borealis • Connection points (CPs) store history (diagram history with history bound) • Operators pull the history they need Input queue Operator Stream S CP Most recent tuple in history Oldest tuple in history Revision

  13. Discussion • Runtime overhead • Assumption • Input revision msg < 1% • Revision msg processing can be deferred • Processing cost • Storage cost • Application’s needs / requirements • Whether it is interested in revision msg or not • What to do with revision msg • Rollback, compensation ?? • Application’s QoS requirement • Delay bound

  14. Dynamic query modification • Problem • Change certain attributes of the query at runtime • E.g. network monitoring • Manual query substitutions – high overhead, slow to take effect (starting with an empty state) • Goal • Low overhead, fast, and automatic modifications • Dynamic query modification • Control lines • Provided to each operator box • Carry messages with revised box parameters (window size, filter predicate) and new box functions • Timing problem • Specify precisely what data is to be processed according to what control parameters • Data is ready for processing too late or too early • old data vs. new parameter -> buffered old parameters • new data vs. old parameter -> revision message and time travel

  15. Time travel • Motivation • Rewind history and then repeat it • Move forward into the future • Connection Point (CP) • CP View • Independent view of CP • View range • Start_time • Max_time • Operations • Undo: deletion message to roll back the state to time t • Replay • Prediction function for future travel 1 2 3 4 5 6 2 3 4 3 4 5 6

  16. Discussion • How to determine whether timing problem occurs or not • Too early: Lookup data buffered in CP • Too late: Checking timestamp of the tuple • Checking overhead • “Performed on a copy of some portion of the running query diagram, so as not to interfere with processing of the running query diagram” • Who makes a copy, when, how to make? How to manage? • Future time travel • When is it useful? • E.g alarming ?

  17. Optimization in a Distributed SPE • Goal: Optimized resource allocation • Challenges: • Wide variation in resources • High-end servers vs. tiny sensors • Multiple resources involved • CPU, memory, I/O, bandwidth, power • Dynamic environment • Changing input load and resource availability • Scalability • Query network size, number of nodes

  18. Quality of Service • A mechanism to drive resource allocation • Aurora model • QoS functions at query end-points • Problem: need to infer QoS at upstream nodes • An alternative model • Vector of Metrics (VM) carried in tuples • Content: tuple’s importance, or its age • Performance: arrival time, total resource consumed, .. • Operators can change VM • A Score Function to rank tuples based on VM • Optimizers can keep and use statistics on VM

  19. State Confidence Area Thermal Hydration Cognitive Life Signs Wound Detection 90% 60% 100% 90% 80% Physiologic Models S S S Example Application: Warfighter Physiologic Status Monitoring (WPSM)

  20. VM ([age, confidence], value) HRate Model1 RRate Merge Model2 Temp Sensors Models may change confidence. SF(VM) = VM.confidence x ADF(VM.age) Score Function age decay function

  21. Borealis Optimizer Hierarchy End-point Monitor(s) Global Optimizer Neighborhood Optimizer Local Monitor Local Optimizer Borealis Node statistics trigger decision

  22. Optimization Tactics • Priority scheduling - Local • Modification of query plans - Local • Changing order of commuting operators • Using alternate operator implementations • Allocation of query fragments to nodes - Neighborhood, Global • Load shedding - Local, Neighborhood, Global

  23. Priority scheduling • Highest QoS Gradient box • Evaluate the predicted-QoS score function on each message with values in VM • average QoS Gradient for each box • By comparing average QoS-impact scores between the inputs and outputs of each box • Highest QoS-impact message from the input queue • Out-of-order processing • Inherent load shedding behavior

  24. A C B D S1 S2 A B S1 Connected Plan Cut Plan C D r r S2 2cr 2r 2r 4cr 3cr 3cr Correlation-based Load Distribution • Under the situation that network BW is abundant and network transfer delays are negligible • Goal: Minimize end-to-end latency • Key ideas: • Balance load across nodes to avoid overload • Group boxes with small load correlation together • Maximize load correlation among nodes

  25. Load Shedding • Goal: Remove excess load at all nodes and links • Shedding at node A relieves its descendants • Distributed load shedding • Neighbors exchange load statistics • Parent nodes shed load on behalf of children • Uniform treatment of CPU and bandwidth problem • Load balancing or Load shedding?

  26. Node A Node B 4C C r1 r2 App1 3C C r1 r2 App2 Local Load Shedding Goal: Minimize total loss r2’ 25% 58% CPU Load = 5Cr1, Cap = 4Cr1 CPU Load = 4Cr2, Cap = 2Cr2 CPU Load = 3.75Cr2, Cap = 2Cr2 No overload No overload A must shed 20% B must shed 50% A must shed 20% A must shed 20% B must shed 47%

  27. Node A Node B 4C C r1 r2 App1 3C C r1 r2 App2 Distributed Load Shedding Goal: Minimize total loss r2’ 9% r2’’ 64% CPU Load = 5Cr1, Cap = 4Cr1 CPU Load = 4Cr2, Cap = 2Cr2 No overload No overload A must shed 20% B must shed 50% A must shed 20% A must shed 20% smaller total loss!

  28. data Proxy control Sensors control message from the optimizer Extending Optimization to Sensor Nets • Sensor proxy as an interface • Moving operators in and out of the sensor net • Adjusting sensor sampling rates

  29. Discussion • QoS-based optimization • QoS score calculations, statistics maintaining • tuple basis • Fine-grained but lots of processing • Application’s burden • QoS specifications • How to facilitate the task?

  30. Node 3 U m U S m Node 3’ Node 1’ Node 2’ U m s3’ Fault-Tolerance through Replication • Goal: Tolerate node and network failures s3 Node 1 Node 2 U S m s1 s2 s4

  31. Fault-Tolerance Approach • If an input stream fails, find another replica • No replica available, produce tentative tuples • Correct tentative results after failures STABLE UPSTREAM FAILURE Missing or tentative inputs Failure heals Another upstream failure in progress Reconcile state Corrected output STABILIZING

  32. Conclusions • Next generation streaming applications require a flexible processing model • Distributed operation • Dynamic result and query modification • Dynamic and scalable optimization • Server and sensor network integration • Tolerance to node and network failures • http://nms.lcs.mit.edu/projects/borealis

  33. Discussion • Distributed environment • Load management • Fault tolerance • How about data routing / forwarding issue ?? • Other problems not tackled?

More Related