Operator Scheduling in a Data Stream Manager

D. Carney, U. Çetintemel, A. Rasin, S. Zdonik - Brown University; M. Cherniack - Brandeis University; M. Stonebraker - MIT. Proceedings of the 29th VLDB Conference, Berlin, Germany. Presenter: Sriram Krishnan. Date: 3/30/05

Agenda • Aurora DSMS Architecture • Scheduling Algorithms • Tuple Batching • Experimental Evaluation • QoS Aware Scheduling • Conclusion WPI – CS 561

Overview of Stream Processing • Many applications and devices create data streams. • Examples: sensor networks, position tracking, network management, health monitoring, etc. • These applications require timely processing of a large number of continuous, potentially rapid and asynchronous data streams.

Aurora data stream manager • Addresses the performance and processing requirements of stream-based applications. • Supports multiple concurrent continuous queries on one or more application data streams. • A continuous query consists of a directed acyclic graph of a well-defined set of operators (boxes in Aurora). • Applications define their service expectations using Quality-of-Service (QoS) specifications.

Operator Scheduler • A key component of any data stream management system. • Multiplexes processor usage across multiple continuous queries according to application-specified QoS. • Simple processor allocation can be achieved by assigning a thread per operator. • Not good (why?)

Paper overview • This paper shows that finer-grained control of processor allocation can make a significant difference to overall system performance. • The paper describes the design and implementation of the Aurora scheduler.

Motivation: Cost components of a continuous query • Random and round-robin scheduling. • Inference? • The actual time spent on processing is less than 5% of the overall execution time in both cases.

Aurora scheduler • Performs the following tasks: • Constructs a dynamic scheduling plan that specifies: • Which boxes to schedule • In which order to schedule the boxes • How many tuples to process at each box execution. • Schedules based on the QoS: • Strives to maximize the overall QoS delivered to the client applications.

Aurora System Model (High Level) • Fundamentally a data-flow system. • Tuples flow through a loop-free, directed graph of processing operations (a.k.a. boxes).

Aurora System Model • Tuples generated by data sources arrive at the input and are queued for processing. • The scheduler selects boxes with waiting tuples and executes them on one or more of their input tuples. • The output tuples of a box are queued at the input of the next box in sequence. • The QoS is specified primarily in terms of the latency (i.e., delay) of output tuples. • Output tuples should be produced in a timely fashion; otherwise, QoS degrades as latencies get longer.

Aurora Architecture • Conceptually, the scheduler: • Picks a box for execution. • Ascertains how many tuples to process from its input. • Passes the information to the multi-threaded box processor. • The box processor executes the appropriate operation and then forwards the output tuples to the router. • Question: Why should we monitor QoS?

Execution Model • Thread-based execution • Each operator/query is processed in its own thread • The operating system manages resource allocation • Advantages • Easy to program • Efficient operating system algorithms • Disadvantages • Overhead due to cache misses, lock contention and context switching. • Software has limited control of resource management.

Aurora - Execution Model • Aurora uses a state-based scheduling execution model. • There is a single scheduler thread that tracks system state and maintains the execution queue. • The execution queue is shared among a small number of worker threads. • This model: • Enables fine-grained allocation of resources according to application specifications. • Enables effective batching of operators and tuples (why is this not possible with thread-based execution?).
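The state-based model above can be sketched roughly as follows. This is a minimal illustration, not Aurora's implementation: the function name `run_scheduler`, the plan format `(box_name, tuples)`, and the doubling "operator" are all hypothetical stand-ins. The calling thread plays the single scheduler thread, and a small pool of workers shares one execution queue.

```python
import queue
import threading


def run_scheduler(plan, num_workers=2):
    """Execute a scheduling plan, a list of (box_name, tuples), on a worker pool."""
    execution_queue = queue.Queue()  # the one queue shared by all workers
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = execution_queue.get()
            if task is None:  # sentinel: the scheduler has no more work
                break
            box_name, tuples = task
            # Hypothetical stand-in for the box's real operator.
            output = [t * 2 for t in tuples]
            with results_lock:
                results.append((box_name, output))

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # The calling thread acts as the single scheduler thread: it alone decides
    # which box runs next and how many tuples each execution receives.
    for task in plan:
        execution_queue.put(task)
    for _ in workers:
        execution_queue.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return results
```

Because the scheduler alone fills the queue, it can group several tuples (or several boxes) into one task, which a thread-per-operator design leaves to the operating system.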

Execution Model - Comparison • As system workload increases, performance degrades almost linearly in Aurora and exponentially in thread-per-box. • What does it mean?

Two-Level Scheduling • The first level decides which continuous (sub-)query to process. • Used for dynamically assigning priorities to operators. • The second level decides how precisely the selected query should be processed. • Used for choosing the order in which the component operators will be executed. • The outcome of these decisions is a sequence of operators, referred to as a scheduling plan.

Sample Query Tree • The tree is rooted at box b1 (Aurora constraint) • We will refer to this tree in subsequent slides

Superbox - Operator Batching • A tree of boxes rooted at an output box. • A sequence of boxes that is scheduled as an atomic group. • Superboxes decrease the overall execution costs and improve scalability: • They reduce the scheduling overhead by scheduling multiple boxes as a single unit. • They eliminate the need to access the storage manager for each individual box execution.

Scheduling • First-level scheduling - superbox selection • Static and dynamic scheduling approaches • Static approaches to scheduling are defined prior to runtime. • Aurora implements a static superbox selection, application-at-a-time: one superbox per query. • Dynamic approaches use runtime information and statistics to adjust and prioritize scheduling order. • Second-level scheduling - superbox traversal • Specifies the ordering of the boxes in the scheduling plan. • Accomplished by traversing the superbox.

Superbox Traversal • Superbox traversal refers to how the operators within a superbox should be executed. • Three traversal algorithms: • Min-Cost (MC) • Min-Latency (ML) • Min-Memory (MM)

Superbox Traversal – Min Cost • Min-Cost (MC) – Attempts to optimize throughput by minimizing the number of box calls per output tuple. • Accomplished by traversing the superbox in post-order. • A box is scheduled for execution only after all the boxes in its sub-tree are scheduled.

Superbox Traversal – Min Cost (Contd.) • Assume each box has: • A processing cost per tuple of p • A box call overhead of o • A selectivity equal to one (what is this?) • Exactly one non-empty input queue that contains a single tuple. • MC traversal executes each box only once: • In which order are the boxes traversed? • b4 → b5 → b3 → b2 → b6 → b1 • Execution cost: 15p + 6o (why?) • Average output tuple latency: 12.5p + 6o
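The "why?" behind 15p + 6o can be checked with a short sketch. The tree shape below is an assumption, reconstructed from the MC and ML traversal orders given on the slides (the sample query tree figure itself is not in the transcript); the function names are hypothetical. Each box holds one queued tuple, and with selectivity one that tuple is processed once by every box on its path to the root.

```python
# Assumed tree (children listed in traversal order), rooted at the output box b1.
CHILDREN = {
    "b1": ["b2", "b6"],
    "b2": ["b4", "b3"],
    "b3": ["b5"],
    "b4": [],
    "b5": [],
    "b6": [],
}


def min_cost_order(root, children=CHILDREN):
    """Post-order traversal: a box runs only after its whole sub-tree has run."""
    order = []

    def visit(box):
        for child in children[box]:
            visit(child)
        order.append(box)

    visit(root)
    return order


def min_cost_execution_cost(root, children=CHILDREN):
    """Return (processing_units, box_calls); total cost = units * p + calls * o."""

    def units(box, depth):
        # The tuple queued at this box is processed by `depth` boxes in total:
        # this box plus every box between it and the root.
        return depth + sum(units(c, depth + 1) for c in children[box])

    return units(root, 1), len(min_cost_order(root, children))
```

Under these assumptions the six queued tuples need 1 + 2 + 2 + 3 + 3 + 4 = 15 processing steps, while each of the six boxes is called exactly once.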

Superbox Traversal – Min Latency • Min-Latency (ML) – Reduces the average latency of the output tuples by producing initial output tuples as fast as possible. • Defines a value called output cost for each box. • An estimate of the latency incurred in producing one output tuple. • Output selectivity • How many tuples must be processed from the input to produce one tuple at the output. • Product of the selectivities of all boxes downstream, including the current box. • Relation between output selectivity and output cost? • Approximately inversely proportional (depends on the cost of the boxes involved).
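As a rough illustration of the output selectivity definition above, the sketch below multiplies selectivities along the path from a box to the output. The tree shape (taken from the traversal orders on the slides) and the selectivity values are assumptions chosen for the example, and `output_selectivity` is a hypothetical name.

```python
# Assumed tree: each box maps to the next box downstream (None at the output).
PARENT = {"b1": None, "b2": "b1", "b6": "b1", "b4": "b2", "b3": "b2", "b5": "b3"}

# Hypothetical selectivities, for illustration only.
SELECTIVITY = {"b1": 0.5, "b2": 0.5, "b3": 1.0, "b4": 1.0, "b5": 0.5, "b6": 1.0}


def output_selectivity(box, selectivity=SELECTIVITY, parent=PARENT):
    """Product of the selectivities of `box` and every box downstream of it."""
    result = 1.0
    while box is not None:
        result *= selectivity[box]
        box = parent[box]
    return result


# Input tuples needed at b5 to yield one output tuple: 1 / 0.125 = 8.
tuples_needed_at_b5 = 1 / output_selectivity("b5")
```

A lower output selectivity means more input tuples must be processed per output tuple, which is why output cost rises as output selectivity falls.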

Superbox Traversal – Min Latency (Contd.) • Traversal? • b1 → b2 → b1 → b6 → b1 → b4 → b2 → b1 → b3 → b2 → b1 → b5 → b3 → b2 → b1 • The ML traversal incurs nine extra box calls over an MC traversal. • Note: MC incurred six box calls. • Total execution cost is 15p + 15o. • Which one has lower execution time, ML or MC? • MC “always” has a lower execution time.
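The ML traversal above can be reproduced with a small sketch. It assumes the tree shape implied by the traversal orders on the slides, uniform box costs and selectivity one, so that output cost simply grows with distance from the root; ties are broken by child order, which is also an assumption. `min_latency_traversal` is a hypothetical name.

```python
from collections import deque

# Assumed tree (children listed in tie-breaking order), rooted at output box b1.
CHILDREN = {
    "b1": ["b2", "b6"],
    "b2": ["b4", "b3"],
    "b3": ["b5"],
    "b4": [],
    "b5": [],
    "b6": [],
}


def min_latency_traversal(root, children=CHILDREN):
    """Visit boxes by increasing output cost and push each tuple to the output."""
    parent = {root: None}
    by_output_cost = []
    pending = deque([root])
    while pending:  # breadth-first: boxes nearest the output come first
        box = pending.popleft()
        by_output_cost.append(box)
        for child in children[box]:
            parent[child] = box
            pending.append(child)

    plan = []
    for box in by_output_cost:
        # Run the box, then every box downstream of it, until an output appears.
        while box is not None:
            plan.append(box)
            box = parent[box]
    return plan
```

The resulting plan has 15 box calls against MC's six, which accounts for the nine extra calls and the 15o term in the ML execution cost.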

Superbox Traversal – Min Memory • Min-Memory (MM) – Attempts to minimize memory usage. • Schedules boxes in an order that yields the maximum increase in available memory. • Defines an expected memory reduction rate (EMRR) for each box. • EMRR = function (current queue size, selectivity, cost)

Superbox Traversal – Min Memory (Contd.) • Assume the following box (selectivity, cost) pairs: • b1 = (0.9, 2), b2 = (0.4, 2), b3 = (0.5, 1), b4 = (1.0, 2), b5 = (0.4, 3), b6 = (0.6, 1) • Assuming an initial queue size of 1, the computed EMRR values for the boxes are: • b1 = 0.05, b2 = 0.3, b3 = 0.5, b4 = 0, b5 = 0.2, b6 = 0.4 • What will be the scheduling plan? • b3 → b6 → b2 → b5 → b3 → b2 → b1 → b4 → b2 → b1
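The EMRR values listed above are consistent with (1 - selectivity) / cost for a queue holding one tuple. The sketch below uses that formula as an assumption (the slides only say EMRR is a function of queue size, selectivity and cost), and scaling by queue size is likewise assumed. It ranks boxes by EMRR only; it does not model how tuples then move downstream, so it does not reproduce the full scheduling plan.

```python
# (selectivity, cost) per box, as given on the slide.
BOXES = {
    "b1": (0.9, 2),
    "b2": (0.4, 2),
    "b3": (0.5, 1),
    "b4": (1.0, 2),
    "b5": (0.4, 3),
    "b6": (0.6, 1),
}


def emrr(selectivity, cost, queue_size=1):
    """Assumed formula: tuples removed from memory per unit of processing time."""
    return queue_size * (1 - selectivity) / cost


def rank_by_emrr(boxes=BOXES):
    """Boxes ordered from the largest to the smallest expected memory reduction."""
    return sorted(boxes, key=lambda name: emrr(*boxes[name]), reverse=True)


emrr_values = {name: round(emrr(*params), 2) for name, params in BOXES.items()}
```

The ranking starts with b3, b6, b2, b5, matching the first four steps of the plan on the slide; b4 ranks last because a selectivity of one frees no memory.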

Tuple Batching (Train Processing) • Train processing is the execution of a batch of tuples (a tuple train) within a single operator call. • The goal of tuple train processing is to reduce overall processing cost. How? • Decreases the total number of box calls. • Cuts down on low-level overhead such as context switching, scheduling, and execution queue maintenance. • Improves memory utilization when memory is low. • Reduces the shuttling of tuples back and forth between memory and disk. • Some operators execute faster with a larger number of tuples available in their queues.

Tuple Batching • The Aurora scheduler implements train processing by telling each box when to execute and how many queued tuples to process. • Aurora allows an arbitrary number of tuples to be contained within a train. • What variables dictate the size of a train? • Variance in latencies • Total memory footprint
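The effect of train size on cost can be illustrated with the same p (processing cost per tuple) and o (box call overhead) used on the Min-Cost slide. This is a simplified cost model assumed for illustration, not a formula from the paper: processing work is unchanged, while the call overhead is paid once per train.

```python
import math


def execution_cost(num_tuples, train_size, p, o):
    """Total cost of one box processing num_tuples in trains of train_size."""
    box_calls = math.ceil(num_tuples / train_size)  # one call per train
    return num_tuples * p + box_calls * o


# 100 tuples, p = 1, o = 5 (hypothetical units):
tuple_at_a_time = execution_cost(100, 1, 1, 5)
trains_of_25 = execution_cost(100, 25, 1, 5)
whole_queue = execution_cost(100, 100, 1, 5)
```

In this model the overhead term shrinks from 500 to 20 to 5 as the train grows, while larger trains also hold more tuples in memory and delay the first output, which is where latency variance and memory footprint come in.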

Operator Batching: Evaluation • RR_BAAT - Round Robin, Box At A Time. • MC_AAAT - Minimum Cost, Application At A Time. • What can we infer? • The scheduling overhead of the box-at-a-time approach is very evident. • Capacity: percent of system resources used.

Latency: Min-Cost vs. Min-Latency • What can we infer? • For larger box processing costs, ML wins as it optimizes the traversal by minimizing output latency. • For smaller box processing costs, box call overheads dominate overall costs and MC wins.

Memory requirements: Evaluation • Inference? • ML is the least memory-efficient traversal; MC falls between ML and MM. • The crossover towards the end of the time period is a consequence of the fact that different traversals take different times to finish. • The curves are normalized with respect to the MM values.

Tuple Batching - Evaluation • Inference? • For a burst size of 4, the overhead quadruples. • When the train size is equal to one (the entire queue), the average overhead approaches the overhead for the non-bursty case. • Train size (x-axis) is given as a fraction of the queue size. • Overhead: total execution time minus processing time. • To isolate the effects of operator scheduling, round-robin BAAT was used for this experiment.

Comparison of Execution Times • TAAT (tuple-at-a-time) • BAAT (tuple trains) • MC (superbox) • The number at the top shows the actual time for processing 100k tuples in the system. • TAAT is significantly worse than the other methods. • Superbox scheduling decreases the overall execution time of the system running tuple trains by almost 50%. • As we go from left to right, the scheduler algorithms become increasingly intelligent and sophisticated, taking more time to generate the scheduling plans.

QoS-Driven Scheduling • Keep track of the latency of tuples that reside in the queues. • Pick the tuples whose execution will provide the largest expected increase in the aggregate QoS. • This approach is not scalable (why?) • Tuple batching will be difficult. • High scheduling overhead. • The Aurora scheduler maintains latency information at the granularity of individual boxes. • The latency of a box is the average latency of the tuples in its input queue.

QoS-Driven Scheduling - Algorithm • Expected output latency: eol(b) = latency(b) + cost(D(b)), where D(b) denotes the boxes downstream of b. • Utility: utility(b) = gradient(eol(b)) • Expected slack time: est(b) is an indication of how close a box is to a critical point. • Critical point: a point where the QoS changes sharply. • Priority is assigned so as to order the boxes in terms of their utility and urgency. • Utility - an estimate of where a box’s tuples currently are on the QoS-latency curve at the corresponding output. When is the utility lowest and highest? • Urgency - given by the expected slack time.
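The three quantities above can be sketched for a piecewise-linear QoS-latency curve. Everything concrete here is assumed for illustration: the curve points, taking utility as the magnitude of the slope at eol(b), and measuring slack as the distance from eol(b) to the next critical point (the slides only describe slack as an indication of closeness).

```python
# Hypothetical QoS curve as (latency, QoS) points: full QoS up to latency 10,
# then a linear drop to zero at latency 20. Interior points are critical points.
QOS_POINTS = [(0.0, 1.0), (10.0, 1.0), (20.0, 0.0)]


def expected_output_latency(latency, downstream_costs):
    """eol(b) = latency(b) + total cost of the boxes downstream of b."""
    return latency + sum(downstream_costs)


def utility(eol, points=QOS_POINTS):
    """Assumed: magnitude of the QoS curve's slope on the segment containing eol."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= eol < x1:
            return abs((y1 - y0) / (x1 - x0))
    return 0.0  # past the last point the curve is flat


def expected_slack(eol, points=QOS_POINTS):
    """Assumed: latency remaining before the next critical point is reached."""
    for x, _ in points:
        if x > eol:
            return x - eol
    return 0.0


eol_b = expected_output_latency(4.0, [2.0, 1.0])
```

With this curve, a box whose eol is 7 sits on the flat segment (utility 0) with 3 time units of slack, while a box at eol 12 is on the falling segment, where each unit of delay costs QoS.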

QoS-Driven Scheduling • Scheduler algorithm: • First, choose for execution the boxes that have the highest utility. • Among boxes with the same utility, choose the ones that have the minimum slack time.
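The two-step rule above amounts to a sort on a compound key. A minimal sketch, assuming each box is described by a hypothetical (name, utility, slack) triple:

```python
def prioritize(boxes):
    """Order boxes by highest utility first, then by smallest slack time."""
    ranked = sorted(boxes, key=lambda box: (-box[1], box[2]))
    return [name for name, _utility, _slack in ranked]


# Hypothetical values: b2 and b3 tie on utility, so slack decides between them.
order = prioritize([("b1", 0.1, 8.0), ("b2", 0.5, 3.0), ("b3", 0.5, 1.0)])
```

Here b3 is scheduled before b2 because it is closer to its critical point, and b1 comes last because its utility is lowest.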

QoS-Driven Scheduling - Evaluation • Inference?

CONCLUSION • The paper presents an experimental investigation of scheduling algorithms for stream data management systems. • The authors: • showed that a naïve scheduling approach of using a thread per box does not scale. • showed that train scheduling and superbox scheduling substantially reduce system overheads. • addressed QoS issues and extended their basic algorithms to address application-specific QoS expectations.
