Operator Scheduling in a Data Stream Manager D. Carney, U. Çetintemel, A. Rasin, S. Zdonik - Brown University M. Cherniack - Brandeis University M. Stonebraker - MIT Proceedings of the 29th VLDB Conference, Berlin, Germany Presenter: Sriram Krishnan Date: 3/30/05
Agenda • Aurora DSMS Architecture • Scheduling Algorithms • Tuple Batching • Experimental Evaluation • QoS Aware Scheduling • Conclusion WPI – CS 561
Overview of Stream Processing • Many applications and devices create data streams. • Examples: sensor networks, position tracking, network management, health monitoring, etc. • These applications require timely processing of large numbers of continuous, potentially rapid and asynchronous data streams.
Aurora Data Stream Manager • Addresses the performance and processing requirements of stream-based applications. • Supports multiple concurrent continuous queries on one or more application data streams. • A continuous query consists of a directed acyclic graph of operators drawn from a well-defined set (called boxes in Aurora). • Applications define their service expectations using Quality-of-Service (QoS) specifications.
Operator Scheduler • A key component of any data stream management system. • Multiplexes the processor among multiple continuous queries according to application-specified QoS. • Simple processor allocation can be achieved by assigning a thread per operator. • This is not a good approach (why?)
Paper Overview • This paper shows that having finer-grained control of processor allocation can make a significant difference to overall system performance. • The paper describes the design and implementation of the Aurora scheduler.
Motivation: Cost Components of a Continuous Query • Compares random and round-robin scheduling. • Inference? • The time actually spent processing is less than 5% of the overall execution time in both cases.
Aurora Scheduler • Performs the following tasks: • Constructs a dynamic scheduling plan that specifies: • Which boxes to schedule • In which order to schedule the boxes • How many tuples to process at each box execution • Schedules based on QoS: • Strives to maximize the overall QoS delivered to the client applications.
Aurora System Model (High Level) • Fundamentally a data-flow system. • Tuples flow through a loop-free, directed graph of processing operations (a.k.a. boxes).
Aurora System Model • Tuples generated by data sources arrive at the input and are queued for processing. • The scheduler selects boxes with waiting tuples and executes them on one or more of their input tuples. • The output tuples of a box are queued at the input of the next box in sequence. • QoS is specified primarily in terms of the latency (i.e., delay) of output tuples. • Output tuples should be produced in a timely fashion; otherwise, QoS degrades as latencies grow.
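The queue-and-box model on this slide can be sketched in a few lines. This is a minimal illustration, not Aurora's implementation: the `Box` class, the `run` loop, and the "first ready box" policy are all hypothetical names and choices made for the example.

```python
from collections import deque


class Box:
    """One operator ("box"): an input queue, a per-tuple function, a downstream box."""

    def __init__(self, name, fn, downstream=None):
        self.name = name
        self.fn = fn                  # maps one input tuple to a list of output tuples
        self.downstream = downstream  # None means the box feeds an application output
        self.queue = deque()


def run(boxes, outputs):
    """Repeatedly pick a box with waiting tuples and execute it on one tuple."""
    while True:
        ready = [b for b in boxes if b.queue]
        if not ready:
            break
        box = ready[0]  # placeholder policy (assumption): first ready box wins
        for out in box.fn(box.queue.popleft()):
            if box.downstream is None:
                outputs.append(out)                # delivered to the application
            else:
                box.downstream.queue.append(out)   # queued at the next box


# A two-box chain: a filter followed by a map.
double = Box("map", lambda t: [t * 2])
keep_even = Box("filter", lambda t: [t] if t % 2 == 0 else [], downstream=double)
keep_even.queue.extend([1, 2, 3, 4])

results = []
run([keep_even, double], results)
print(results)  # [4, 8]
```

The interesting part of the paper is what replaces the placeholder policy: which ready box to pick, and how many tuples to give it.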
Aurora Architecture • Conceptually, the scheduler: • Picks a box for execution. • Ascertains how many tuples to process from its input. • Passes this information to the multi-threaded box processor. • The box processor executes the appropriate operation and then forwards the output tuples to the router. • Question: Why should we monitor QoS?
Execution Model • Thread-based execution • Each operator/query is processed in its own thread. • The operating system manages resource allocation. • Advantages • Easy to program • Relies on efficient operating system algorithms • Disadvantages • Overhead due to cache misses, lock contention, and context switching • The software has limited control over resource management.
Aurora - Execution Model • Aurora uses a state-based execution model. • A single scheduler thread tracks system state and maintains the execution queue. • The execution queue is shared among a small number of worker threads. • This model • Enables fine-grained allocation of resources according to application specifications • Enables effective batching of operators and tuples (why is this not possible with the thread-based model?)
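A rough sketch of the "one scheduler, shared execution queue, few workers" shape, using Python's standard `queue` and `threading` modules. Everything here is illustrative: the entry format `(box name, batch)`, the sentinel shutdown, and the fact that the main thread plays the scheduler are assumptions, and the sketch ignores ordering dependencies between boxes.

```python
import queue
import threading

execution_queue = queue.Queue()   # filled by the scheduler, drained by workers
processed = []                    # (box name, number of tuples processed) records
processed_lock = threading.Lock()


def worker():
    """Worker thread: execute whatever the scheduler placed on the queue."""
    while True:
        item = execution_queue.get()
        if item is None:          # sentinel: the scheduler has no more work
            return
        box_name, batch = item
        with processed_lock:
            processed.append((box_name, len(batch)))


def scheduler(plan, num_workers):
    """Single scheduler: decides what runs and with how many tuples."""
    for entry in plan:
        execution_queue.put(entry)
    for _ in range(num_workers):
        execution_queue.put(None)  # one shutdown sentinel per worker


NUM_WORKERS = 2
workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()
scheduler([("b4", [1, 2, 3]), ("b2", [4]), ("b1", [5, 6])], NUM_WORKERS)
for w in workers:
    w.join()
```

Because the scheduler alone decides each queue entry, it can hand a worker several tuples (or several boxes) at once, which a thread-per-box design leaves to the operating system.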
Execution Model - Comparison • As system workload increases, performance degrades almost linearly in Aurora and exponentially in thread-per-box. • What does this mean?
Two-Level Scheduling • The first level decides which continuous (sub-)query to process. • Used for dynamically assigning priorities to operators • The second level decides how precisely the selected query should be processed. • Used for choosing the order in which the component operators will be executed • The outcome of these decisions is a sequence of operators, referred to as a scheduling plan.
Sample Query Tree • The tree is rooted at box b1 (Aurora constraint) • We will refer to this tree in subsequent slides
Superbox - Operator Batching • A tree of boxes rooted at an output box • A sequence of boxes that is scheduled as an atomic group • Superboxes decrease the overall execution cost and improve scalability: • They reduce scheduling overhead by scheduling multiple boxes as a single unit. • They eliminate the need to access the storage manager for each individual box execution.
Scheduling • First-level scheduling: superbox selection • Static and dynamic scheduling approaches • Static approaches are defined prior to runtime. • Aurora implements a static superbox selection, application-at-a-time, with one superbox per query. • Dynamic approaches use runtime information and statistics to adjust and prioritize the scheduling order. • Second-level scheduling: superbox traversal • Specifies the ordering of the boxes in the scheduling plan. • Accomplished by traversing the superbox.
Superbox Traversal • Superbox traversal refers to how the operators within a superbox should be executed. • Three traversal algorithms: • Min-Cost (MC) • Min-Latency (ML) • Min-Memory (MM)
Superbox Traversal - Min-Cost • Min-Cost (MC) attempts to optimize throughput by minimizing the number of box calls per output tuple. • Accomplished by traversing the superbox in post-order. • A box is scheduled for execution only after all the boxes in its sub-tree are scheduled.
Superbox Traversal - Min-Cost (Contd.) • Assume each box has • A processing cost per tuple of p • A box call overhead of o • A selectivity equal to one (what is this?) • Exactly one non-empty input queue that contains a single tuple • MC traversal executes each box only once: • In which order are the boxes traversed? • b4 b5 b3 b2 b6 b1 • Execution cost: 15p + 6o (why?) • Average output tuple latency: 12.5p + 6o
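The 15p + 6o and 12.5p + 6o figures can be reproduced with a small simulation. This is a sketch under two assumptions that are not stated on the slide: the tree shape (b4 and b3 feed b2, b5 feeds b3, b2 and b6 feed b1) is inferred from the traversal orders given in this deck, and latency is measured from the moment the plan starts executing.

```python
from fractions import Fraction

# Assumed tree shape, child -> parent (inferred from the traversal orders).
PARENT = {"b1": None, "b2": "b1", "b6": "b1", "b3": "b2", "b4": "b2", "b5": "b3"}


def simulate(plan):
    """Run boxes in plan order; each call drains the box's whole input queue.

    Time is tracked as a pair of coefficients (units of p, units of o).
    Returns (total cost, mean output latency), each as such a pair.
    """
    waiting = {box: 1 for box in PARENT}   # one queued tuple per box
    p_units = o_units = 0
    finish_times = []                      # completion time of each output tuple
    for box in plan:
        o_units += 1                       # one box-call overhead per call
        n, waiting[box] = waiting[box], 0
        for _ in range(n):
            p_units += 1                   # selectivity 1: one output per input
            if PARENT[box] is None:
                finish_times.append((p_units, o_units))
            else:
                waiting[PARENT[box]] += 1
    count = len(finish_times)
    mean = (Fraction(sum(t[0] for t in finish_times), count),
            Fraction(sum(t[1] for t in finish_times), count))
    return (p_units, o_units), mean


mc_cost, mc_latency = simulate(["b4", "b5", "b3", "b2", "b6", "b1"])
print(mc_cost)                               # (15, 6)
print(tuple(float(x) for x in mc_latency))   # (12.5, 6.0)
```

The 15p arises because tuples accumulate on the way to the root: the boxes process 1, 1, 2, 4, 1 and 6 tuples respectively. All six outputs appear only when b1 finally runs, which is why the average latency is high.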
Superbox Traversal - Min-Latency • Min-Latency (ML): the average latency of the output tuples can be reduced by producing initial output tuples as fast as possible. • Defines a value called output cost for each box: • An estimate of the latency incurred in producing one output tuple. • Output selectivity • How many tuples must be processed from the input to produce one tuple at the output. • Product of the selectivities of all boxes downstream, including the current box. • Relation between output selectivity and output cost? • Approximately inversely proportional (depends on the costs of the boxes involved).
Superbox Traversal - Min-Latency (Contd.) • Traversal? • b1 b2 b1 b6 b1 b4 b2 b1 b3 b2 b1 b5 b3 b2 b1 • The ML traversal incurs nine extra box calls over an MC traversal. • Note: MC incurred six box calls. • Total execution cost is 15p + 15o. • Which one has the lower execution time, ML or MC? • MC “always” has the lower execution time.
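A quick check of the box-call arithmetic, under the same assumptions as the Min-Cost example (cost p per tuple, overhead o per call, selectivity one, one tuple initially queued at each box). With this ML plan every call handles exactly one tuple, so call number k completes at time k(p + o), and an output tuple appears each time the root b1 runs. The mean latency computed at the end is a result of this sketch, not a number taken from the slide.

```python
from fractions import Fraction

MC_PLAN = "b4 b5 b3 b2 b6 b1".split()
ML_PLAN = "b1 b2 b1 b6 b1 b4 b2 b1 b3 b2 b1 b5 b3 b2 b1".split()

extra_calls = len(ML_PLAN) - len(MC_PLAN)
print(extra_calls)  # 9

# Total cost as (units of p, units of o): one tuple and one overhead per call.
ml_cost = (len(ML_PLAN), len(ML_PLAN))
print(ml_cost)      # (15, 15)

# An output tuple is produced at every call to the root box b1.
output_call_numbers = [k for k, box in enumerate(ML_PLAN, start=1) if box == "b1"]
print(output_call_numbers)  # [1, 3, 5, 8, 11, 15]

# Mean output latency, expressed in units of (p + o).
mean_latency_units = Fraction(sum(output_call_numbers), len(output_call_numbers))
print(mean_latency_units)   # 43/6
```

ML pays nine more call overheads than MC in total, but its first output is ready after a single call instead of after the whole tree has been processed.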
Superbox Traversal - Min-Memory • Min-Memory (MM) attempts to minimize memory usage. • Schedules boxes in an order that yields the maximum increase in available memory. • Defines an expected memory reduction rate (EMRR) for each box. • EMRR = function(current queue size, selectivity, cost)
Superbox Traversal - Min-Memory (Contd.) • Assume the following boxes, given as (selectivity, cost): • b1 = (0.9, 2), b2 = (0.4, 2), b3 = (0.5, 1), b4 = (1.0, 2), b5 = (0.4, 3), b6 = (0.6, 1) • Assume an initial queue size of 1. • The computed EMRR values for the boxes are • b1 = 0.05, b2 = 0.3, b3 = 0.5, b4 = 0, b5 = 0.2, b6 = 0.4 • What will be the scheduling plan? • b3 b6 b2 b5 b3 b2 b1 b4 b2 b1
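The listed EMRR values are consistent with EMRR = queue size × (1 − selectivity) / cost. The slides only say that EMRR is a function of these three quantities, so the exact form below is an assumption that happens to reproduce the numbers. The sketch computes the rates and ranks the boxes by them; it does not model how queues change as boxes execute, so it does not regenerate the full scheduling plan.

```python
# (selectivity, cost per tuple) for each box, as given on the slide.
BOXES = {
    "b1": (0.9, 2), "b2": (0.4, 2), "b3": (0.5, 1),
    "b4": (1.0, 2), "b5": (0.4, 3), "b6": (0.6, 1),
}


def emrr(queue_size, selectivity, cost):
    """Assumed form: tuples removed from memory per unit of processing time."""
    return queue_size * (1 - selectivity) / cost


# Round to two decimals to hide floating-point noise such as 0.0499999...
rates = {name: round(emrr(1, sel, cost), 2) for name, (sel, cost) in BOXES.items()}
print(rates)
# {'b1': 0.05, 'b2': 0.3, 'b3': 0.5, 'b4': 0.0, 'b5': 0.2, 'b6': 0.4}

ranking = sorted(rates, key=rates.get, reverse=True)
print(ranking)  # ['b3', 'b6', 'b2', 'b5', 'b1', 'b4']
```

The top of the ranking (b3, b6, b2, b5) matches the first four entries of the plan on the slide. Later entries repeat boxes, presumably because upstream executions refill downstream queues.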
Tuple Batching (Train Processing) • Train processing is the execution of a batch of tuples (a tuple train) within a single operator call. • The goal of train processing is to reduce overall processing cost. How? • Decreases the total number of box calls. • Cuts down on low-level overhead such as context switching, scheduling, and execution queue maintenance. • Improves memory utilization under low-memory conditions: • Reduces the shuttling of tuples back and forth between memory and disk. • Some operators execute faster with a larger number of tuples available in their queues.
Tuple Batching • The Aurora scheduler implements train processing by telling each box when to execute and how many queued tuples to process. • Aurora allows an arbitrary number of tuples to be contained within a train. • What variables dictate the size of a train? • Variance in latencies • Total memory footprint
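The saving from trains is easy to see with the simple cost model used in the traversal slides (cost p per tuple, overhead o per box call). The function names and the numbers below are hypothetical, chosen only to make the arithmetic visible.

```python
import math


def box_calls(queued_tuples, train_size):
    """Number of box calls needed to drain the queue with a given train size."""
    return math.ceil(queued_tuples / train_size)


def execution_cost(queued_tuples, train_size, p, o):
    """Per-tuple processing cost plus one call overhead per train."""
    return queued_tuples * p + box_calls(queued_tuples, train_size) * o


# 100 queued tuples, p = 1, o = 5 (illustrative values only).
print(execution_cost(100, 1, p=1, o=5))    # 600  (tuple-at-a-time)
print(execution_cost(100, 25, p=1, o=5))   # 120  (four trains)
print(execution_cost(100, 100, p=1, o=5))  # 105  (one train)
```

The processing term is fixed; only the overhead term shrinks as the train grows. The trade-off is that tuples at the front of a long train wait for the rest, which is why latency variance and memory footprint limit the train size.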
Operator Batching: Evaluation • RR_BAAT: Round-Robin, Box-At-A-Time. • MC_AAAT: Min-Cost, Application-At-A-Time. • What can we infer? • The scheduling overhead of the box-at-a-time approach is very evident. • Capacity: percent of system resources used.
Latency: Min-Cost vs. Min-Latency • What can we infer? • For larger box processing costs, ML wins because it orders the traversal to minimize output latency. • For smaller box processing costs, box call overheads dominate overall costs and MC wins.
Memory Requirements: Evaluation • Inference? • ML is the least efficient in its use of memory, with MC second. • The crossover towards the end of the time period is a consequence of the fact that different traversals take different times to finish. • The curves are normalized with respect to the MM values.
Tuple Batching - Evaluation • Inference? • For a burst size of 4, the overhead quadruples. • When the train size is equal to one (the entire queue), the average overhead approaches the overhead for the non-bursty case. • Train size (x-axis) is given as a fraction of the queue size. • Overhead: total execution time less processing time. • To isolate the effects of operator scheduling, round-robin BAAT was used for this experiment.
Comparison of Execution Times • TAAT (tuple-at-a-time) • BAAT (tuple trains) • MC (superbox) • The number at the top shows the actual time for processing 100k tuples in the system. • TAAT is significantly worse than the other methods. • Superbox scheduling decreases the overall execution time of the system running tuple trains by almost 50%. • From left to right, the scheduling algorithms become increasingly intelligent and sophisticated, taking more time to generate the scheduling plans.
QoS-Driven Scheduling • Keep track of the latency of tuples that reside in the queues. • Pick the tuples whose execution will provide the greatest expected increase in the aggregate QoS. • This approach is not scalable (why?) • Tuple batching will be difficult. • High scheduling overhead. • The Aurora scheduler maintains latency information at the granularity of individual boxes. • The latency of a box is the average latency of the tuples in its input queue.
QoS-Driven Scheduling - Algorithm • Expected output latency: eol(b) = latency(b) + cost(D(b)), where D(b) denotes the boxes downstream of b. • Utility: utility(b) = gradient(eol(b)) • Expected slack time, est(b): an indication of how close a box is to a critical point. • Critical point: a point where the QoS changes sharply. • Priority is assigned so as to order the boxes in terms of their utility and urgency. • Utility: an estimate of where a box’s tuples currently are on the QoS-latency curve at the corresponding output. When is the utility lowest and highest? • Urgency: given by the expected slack time.
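To make these quantities concrete, here is a sketch over a hypothetical piecewise-linear QoS graph. Several choices are assumptions of the example rather than statements from the slide: the breakpoints themselves, treating every breakpoint as a critical point, taking utility as the magnitude of the slope (so a steeper QoS drop means higher utility), and taking slack as the distance to the next breakpoint.

```python
# Hypothetical QoS graph as (latency, QoS) breakpoints:
# full QoS up to latency 10, then a linear drop to zero at latency 30.
QOS_POINTS = [(0.0, 1.0), (10.0, 1.0), (30.0, 0.0)]


def eol(box_latency, downstream_cost):
    """Expected output latency: time already waited plus time still needed."""
    return box_latency + downstream_cost


def segment(latency):
    """Return the (start, end) breakpoints of the segment containing latency."""
    for start, end in zip(QOS_POINTS, QOS_POINTS[1:]):
        if latency < end[0]:
            return start, end
    return QOS_POINTS[-1], None   # past the last breakpoint


def utility(latency):
    """Magnitude of the QoS slope at this latency (0 where the curve is flat)."""
    start, end = segment(latency)
    if end is None:
        return 0.0
    return abs((end[1] - start[1]) / (end[0] - start[0]))


def slack(latency):
    """Time left before the next breakpoint (critical point) is reached."""
    start, end = segment(latency)
    return 0.0 if end is None else end[0] - latency


early = eol(box_latency=4.0, downstream_cost=3.0)    # 7.0, still on the flat part
late = eol(box_latency=12.0, downstream_cost=3.0)    # 15.0, on the falling part
print(utility(early), slack(early))  # 0.0 3.0
print(utility(late), slack(late))    # 0.05 15.0
```

Under these assumptions utility is lowest on flat parts of the curve, where extra delay costs nothing yet, and highest on the steepest falling segment.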
QoS-Driven Scheduling - Algorithm (Contd.) • Scheduler algorithm: • First, choose for execution the boxes that have the highest utility. • Among boxes with the same utility, choose the ones with the minimum slack time.
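The two-step rule amounts to a sort with a compound key. A minimal sketch, assuming each box already carries a utility and a slack value; the `BoxState` type and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BoxState:
    name: str
    utility: float   # larger means more QoS is at stake
    slack: float     # expected time left before a critical point


def prioritize(boxes):
    """Highest utility first; ties broken by the smallest slack time."""
    return sorted(boxes, key=lambda b: (-b.utility, b.slack))


candidates = [
    BoxState("b2", utility=0.05, slack=15.0),
    BoxState("b5", utility=0.10, slack=8.0),
    BoxState("b3", utility=0.05, slack=2.0),
    BoxState("b6", utility=0.00, slack=3.0),
]
order = [b.name for b in prioritize(candidates)]
print(order)  # ['b5', 'b3', 'b2', 'b6']
```

b3 and b2 have equal utility, so the one closer to its critical point (b3) goes first.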
QoS-Driven Scheduling - Evaluation • Inference?
Conclusion • The paper presents an experimental investigation of scheduling algorithms for stream data management systems. • The authors • showed that a naïve scheduling approach of using a thread per box does not scale. • showed that train scheduling and superbox scheduling substantially reduce system overheads. • addressed QoS issues and extended their basic algorithms to address application-specific QoS expectations.