Budget-based Control for Interactive Services with Partial Execution Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety Microsoft Research
Motivation • Interactive services specify stringent SLAs on response time • Long response times cause user dissatisfaction and revenue loss • Important to bound response time (e.g. mean, 95th percentile) • Address two challenges • Adapt to a dynamic, changing environment • Achieve high response quality Goal: Develop a self-managed scheduling system that meets the response time target while achieving high quality.
Existing Techniques (1) • Static admission control approach • Define a fixed queue length limit; drop requests when the queue is full. • Issues • Only works under a static system. • Determining an appropriate queue length for every setting and load is challenging. • Small queue length => underutilized resources • Large queue length => long response time • Cannot adapt to a dynamic, changing environment.
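A minimal sketch of the static admission control described above, to make the trade-off concrete. The class name `BoundedQueue` and its fields are hypothetical and are not taken from the talk; the only behavior modeled is "drop when the queue is full".

```python
from collections import deque


class BoundedQueue:
    """Admission control with a fixed queue-length limit (hypothetical helper)."""

    def __init__(self, limit):
        self.limit = limit        # fixed limit, chosen once and never adapted
        self.pending = deque()    # requests waiting to be served
        self.dropped = 0          # count of rejected requests

    def admit(self, request):
        """Enqueue the request, or drop it when the queue is already full."""
        if len(self.pending) >= self.limit:
            self.dropped += 1
            return False
        self.pending.append(request)
        return True
```

Because `limit` never changes, a value tuned for one load level is either too small (idle resources) or too large (long waits) at another, which is the issue the slide raises.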
Existing Techniques (2) • Classic feedback control approach: • Feedback control on queue length • Decrease queue length when response time is above target • Issue • Dropping requests results in degraded quality • Does not consider partial execution of requests
Partial Execution & Response Quality • Incomplete execution of requests may still return meaningful partial results • Many interactive services support partial execution • Web search, web server, video streaming, finance server • Quality profile • A function that maps request execution time to response quality
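As an illustration of a quality profile, the sketch below maps execution time to a quality value in [0, 1]. The exponential shape, the function name `quality`, and the parameter `k` are assumptions chosen for illustration; the talk does not give a formula.

```python
import math


def quality(exec_time, demand, k=3.0):
    """Hypothetical concave quality profile.

    Returns 0.0 for no execution and 1.0 once the request has run for its
    full service demand; partial execution yields a value in between.
    """
    # Clamp so that running longer than the full demand adds no quality.
    t = min(max(exec_time, 0.0), demand)
    return (1.0 - math.exp(-k * t / demand)) / (1.0 - math.exp(-k))
```

With this shape, running a request for half of its demand already yields well over half of the full quality, which is what makes partial execution attractive under a tight time budget.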
Our Contributions • Propose a budget-based control model for interactive services with partial execution • Use feedback control to meet response time target • Apply optimization procedure to improve response quality • Exploit partial execution and request quality profile • Evaluation • Implementation at Bing search server • Simulation on finance server
Budget-based Control Model • Control Variable • Budget: amount of computation time for all pending requests • Control mechanism • Determine the budget based on response time feedback • Control budget to meet response time • Optimization procedure • Given a budget, assign processing time to requests • Exploit partial results of a query • Scheduling to improve quality
Control Mechanism • Basic idea • If response time is above the target, decrease the budget • If response time is below the target, increase the budget • Criteria • Meet the response time target accurately and quickly • Incur little runtime overhead.
Control Mechanism: Background • Integral control • Adjust budget based on the difference between the observed and target response time • Advantage: eliminates steady-state error • Limitation: response is slow (long settling time) • Adaptive control • Model estimator + linear quadratic optimal controller • Advantage: quick adaptation, fast response • Limitation: computationally expensive, steady-state error
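The integral control rule above can be sketched as a one-line update: the budget moves by a fixed gain times the response time error. The class name, the gain value, and the lower clamp are assumptions for illustration, not the controller used in the talk.

```python
class IntegralBudgetController:
    """Hypothetical integral controller on the computation-time budget."""

    def __init__(self, target_ms, gain, initial_budget_ms, min_budget_ms=0.0):
        self.target_ms = target_ms
        self.gain = gain                  # assumed tuning constant
        self.budget_ms = initial_budget_ms
        self.min_budget_ms = min_budget_ms

    def update(self, observed_ms):
        """Shrink the budget when responses are slow, grow it when they are fast."""
        error = self.target_ms - observed_ms
        self.budget_ms = max(self.min_budget_ms,
                             self.budget_ms + self.gain * error)
        return self.budget_ms
```

A small gain keeps the update cheap and stable but takes many observations to converge, which matches the long settling time listed as the limitation.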
Control Mechanism: Hybrid Control • Combine the integral and adaptive control • Run adaptive control periodically in a coarse-grain time interval • Use integral control for execution of each request for fine-grain adjustment • Meet our goal • Quick and accurate adaptation • Little runtime overhead.
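The two-timescale structure of the hybrid scheme can be sketched as follows. This is only an outline under strong assumptions: the coarse-grain step here is a simple least-squares fit of response time against budget, used as a stand-in for the model estimator and linear quadratic controller, which the slides do not specify. All names and parameters are hypothetical.

```python
class HybridBudgetController:
    """Hypothetical hybrid controller: cheap per-request step, periodic model step."""

    def __init__(self, target_ms, budget_ms, gain=0.1, period=4):
        self.target_ms = target_ms
        self.budget_ms = budget_ms
        self.gain = gain          # integral gain for the fine-grain step
        self.period = period      # requests between coarse-grain steps
        self.samples = []         # (budget, observed response time) pairs

    def _coarse_step(self):
        # Stand-in for adaptive control (NOT an LQ controller): fit
        # response_time ~ slope * budget, then solve for the target.
        num = sum(b * rt for b, rt in self.samples)
        den = sum(b * b for b, _ in self.samples)
        if num > 0 and den > 0:
            self.budget_ms = self.target_ms * den / num
        self.samples.clear()

    def on_request_done(self, observed_ms):
        """Call once per completed request with its observed response time."""
        self.samples.append((self.budget_ms, observed_ms))
        if len(self.samples) >= self.period:
            self._coarse_step()   # infrequent, more expensive adjustment
        else:
            error = self.target_ms - observed_ms
            self.budget_ms = max(0.0, self.budget_ms + self.gain * error)
        return self.budget_ms
```

The design point being illustrated is the split: the expensive model-based step runs only once per `period` requests, while every other request pays for a single multiply-add.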
Optimization Procedure • Objective: maximize total response quality • Input: budget, pending requests • Output: assigned processing time to requests • Optimization procedure depends on applications
Bing index server • Core part of Bing search • For a user query, match and rank docs, return top results • Concave quality profile • The first half of request execution yields a higher quality gain than the second half.
Optimization Procedure for Index Server • Run the portion of requests with higher gain • Prevent long requests from starving short ones • Combine two techniques • Reservation at light load: • Reserve time for later requests in the queue based on mean service demand • Equal sharing at heavy load: • Allocate resource equally among requests
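One plausible reading of the two techniques above is sketched below: given a budget, the head request may use whatever is left after reserving the mean service demand for each later request, and when the budget cannot cover those reservations it is split equally. The switching rule and the function name are assumptions; the slides do not define the exact threshold between light and heavy load.

```python
def assign_processing_time(budget_ms, num_pending, mean_demand_ms):
    """Hypothetical split of a time budget across pending requests.

    Returns a list of per-request processing times, head of queue first.
    """
    if num_pending == 0:
        return []
    reserved = (num_pending - 1) * mean_demand_ms
    if budget_ms >= reserved + mean_demand_ms:
        # Light load: reserve the mean demand for each later request and
        # let the head request use the remainder.
        return [budget_ms - reserved] + [mean_demand_ms] * (num_pending - 1)
    # Heavy load: share the budget equally so that long requests cannot
    # starve short ones.
    return [budget_ms / num_pending] * num_pending
```

In both branches the assigned times sum to the budget, so the control variable set by the feedback loop is respected regardless of load.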
Evaluation • Implemented and evaluated at Bing index server • Meet response time target and achieve high quality • Simulation study on finance server • Double system throughput at desired quality
Bing Index Server • Implementations • BudgetIS • Feedback control on budget • Hybrid control + optimization procedure • QueueIS • Feedback control on queue length • Evaluation • Production trace
Compare Queue vs. Budget Approach Mean response time = 35 ms • Budget approach • Meets response time accurately • Achieves high quality
Conclusion • Propose a budget-based control and optimization model for interactive services with partial execution • Hybrid control mechanism to meet response time target • Optimization procedure to improve response quality • Evaluation • Implemented and evaluated at Bing index server • Meet response time target and achieve high quality • Simulation study on finance server • Double system throughput at desired quality