Adaptive Overload Control for Busy Internet Servers
Matt Welsh and David Culler, USITS 2003
Presented by: Bhuvan Urgaonkar
Internet Services Today
• Massive concurrency demands
  • Yahoo: 1.2 billion+ pageviews/day
  • AOL web caches: 10 billion hits/day
• Load spikes are inevitable
  • Peak load is orders of magnitude greater than average
  • Traffic on September 11, 2001 overloaded many news sites
  • Load spikes occur exactly when the service is most valuable!
  • In this regime, overprovisioning is infeasible
• Increasingly dynamic
  • Days of the “static” web are over
  • Majority of services based on dynamic content
  • E-commerce, stock trading, driving directions, etc.
Problem Statement
• Supporting massive concurrency is hard
  • Threads/processes don’t scale very well
• Static resource containment is inflexible
  • How to set a priori resource limits for widely varying loads?
  • Load management demands a feedback loop
• Replication alone does not solve the load management problem
  • Individual nodes may still face huge variations in demand
Proposal: The Staged Event-Driven Architecture
• SEDA: a new architecture for Internet services
  • A general-purpose framework for high concurrency and load conditioning
  • Decomposes applications into stages separated by queues
• Enable load conditioning
  • Event queues allow inspection of request streams
  • Can perform prioritization or filtering during heavy load
• Apply control for graceful degradation
  • Perform load shedding or degrade service under overload
Staged Event-Driven Architecture
• Decompose the service into stages separated by queues
  • Each stage performs a subset of request processing
  • Stages are internally event-driven, typically nonblocking
• Queues introduce an execution boundary for isolation
• Each stage contains a thread pool to drive stage execution
  • Dynamic control grows/shrinks thread pools with demand
  • (A minimal stage sketch follows this list)
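To make the stage structure concrete, here is a minimal sketch in Java (SEDA's prototype, Sandstorm, was written in Java): a bounded event queue, a per-stage thread pool that drains it, and an enqueue operation that fails when the queue is full. The Event and EventHandler types are assumptions for illustration, not SEDA's actual API.

```java
// Minimal sketch of a SEDA stage, assuming hypothetical Event and
// EventHandler types; the real Sandstorm API differs in detail.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

interface Event {}

interface EventHandler {
    void handle(Event e);   // nonblocking, event-driven application logic
}

class Stage {
    private final BlockingQueue<Event> queue;   // bounded: the execution boundary
    private final ExecutorService threads;      // per-stage thread pool
    private final EventHandler handler;

    Stage(EventHandler handler, int queueCapacity, int poolSize) {
        this.handler = handler;
        this.queue = new ArrayBlockingQueue<>(queueCapacity);
        this.threads = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < poolSize; i++) {
            threads.submit(this::drain);        // threads drive stage execution
        }
    }

    // Returns false when the queue is full, exerting backpressure on callers.
    boolean enqueue(Event e) {
        return queue.offer(e);
    }

    private void drain() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                handler.handle(queue.take());   // process events one at a time
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A dynamic thread-pool controller would additionally observe queue lengths and resize each pool; the fixed pool here keeps the sketch short.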
Per-Stage Admission Control
• Admission control is done at each stage
  • Failure to enqueue a request exerts backpressure on preceding stages (sketched below)
  • The application has flexibility to respond as appropriate
• Less conservative than a single, service-wide admission controller
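A minimal sketch of how a preceding stage might react to a failed enqueue, reusing the hypothetical Stage class above; sendErrorResponse is an invented placeholder for whatever policy the application chooses (e.g., an HTTP 503 reply).

```java
// Sketch of reacting to backpressure; the policy in the branch is the
// application's choice: degrade, drop, block, or report an error.
class Forwarder {
    void forward(Stage next, Event e) {
        if (!next.enqueue(e)) {
            // The downstream stage's admission controller rejected the event
            // instead of queueing it unboundedly.
            sendErrorResponse(e);
        }
    }

    private void sendErrorResponse(Event e) {
        // Placeholder: a real service might reply "503 Service Unavailable".
        System.err.println("rejected under overload: " + e);
    }
}
```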
Response Time Controller
• The 90th-percentile response time over some interval is passed to the controller
• An AIMD heuristic determines the token-bucket rate (sketched below)
• Exact scheduling mechanisms are left unspecified
• Future work: automatic tuning of controller parameters
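The AIMD rule itself is simple enough to sketch. The controller below uses invented constants and parameter names; the paper leaves the exact values and the underlying scheduling mechanism unspecified.

```java
// Sketch of an AIMD response-time controller driving a token-bucket rate.
// All constants are illustrative assumptions, not the paper's values.
class ResponseTimeController {
    private final double targetMillis;          // administrator-supplied 90th-percentile target
    private final double additiveStep = 2.0;    // assumed rate increase per interval (req/s)
    private final double decreaseFactor = 0.9;  // assumed multiplicative cut on violation
    private double tokenRate;                   // current token-bucket fill rate (req/s)

    ResponseTimeController(double targetMillis, double initialRate) {
        this.targetMillis = targetMillis;
        this.tokenRate = initialRate;
    }

    // Called once per measurement interval with the observed 90th-percentile
    // response time over that interval; returns the new admission rate.
    double adjust(double observed90thMillis) {
        if (observed90thMillis > targetMillis) {
            tokenRate *= decreaseFactor;        // multiplicative decrease under overload
        } else {
            tokenRate += additiveStep;          // additive increase when the target is met
        }
        return tokenRate;
    }
}
```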
Overload Management
• Class-based differentiation
  • Segregate request processing for each class into its own set of stages
  • Or share a common set of stages but make the admission controller aware of the classes (sketched below)
• Service degradation
  • SEDA signals the occurrence of overload to applications
  • The application may then choose to degrade its service
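One possible realization of a class-aware admission controller, sketched below with per-class concurrency limits (a Semaphore per class) rather than token buckets, so lower-value classes are shed first under overload; the class names and limits are illustrative, not from the paper.

```java
// Sketch of class-aware admission control with per-class concurrency limits.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

class ClassAwareAdmission {
    private final Map<String, Semaphore> slotsPerClass = new ConcurrentHashMap<>();

    // e.g. setLimit("gold", 200); setLimit("basic", 20);  (hypothetical classes)
    void setLimit(String requestClass, int concurrentSlots) {
        slotsPerClass.put(requestClass, new Semaphore(concurrentSlots));
    }

    // Reject immediately when the request's class is saturated or unknown.
    boolean admit(String requestClass) {
        Semaphore s = slotsPerClass.get(requestClass);
        return s != null && s.tryAcquire();
    }

    void release(String requestClass) {
        slotsPerClass.get(requestClass).release();
    }
}
```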
Arashi: A SEDA-Based Email Service
• A web-based email service
  • Managing folders, deleting/refiling messages, search, etc.
• The client workload emulates several simultaneous users; user behavior is derived from traces of the UCB CS IMAP server
Advantages of SEDA
• Exposure of the request stream
  • Request-level performance made available to the application
• Focused, application-specific admission control
  • Fine-grained admission control at each stage
  • The application can provide its own admission control policy
• Modularity and performance isolation
  • Inter-stage communication via event passing enables code modularity
Shortcomings
• Biggest shortcoming: heuristic-based control
  • May work for some applications, fail for others
• Not completely self-managed
  • Response-time targets are supplied by the administrator
  • Controller parameters are set manually
• Limited to applications built on the SEDA approach
• Evaluation of overheads missing
• Exact scheduling mechanisms missing
Some Thoughts/Directions…
• Formal ways to reason about the goodness of resource management policies
  • Also, the distinction between transient and drastic/persistent overloads
• Policy issues: revenue maximization and predictable application performance
  • Designing service-level agreements
  • Mechanisms to implement them
• Application modeling and workload prediction
Overload Control: A Big Picture
[Diagram: spectrum of operating regimes, from underload through avoidable overload (AO) to unavoidable overload (UO)]
• Detection of overloads
• Formal and rigorous ways of defining the goodness of “self-managing” techniques
• UO and AO involve different actions (e.g., admission control versus reallocation). Are they fundamentally different?
Knowing Where You Are!
• Distinguish avoidable overloads from unavoidable overloads
• Need accurate application models and workload predictors
  • Challenges: multi-tiered applications, multiple resources, dynamically changing application behavior
• Simple models based on networks of queues? (a toy instance follows this list)
  • How good would they prove?
[Diagram: a model takes the predicted workload and a performance goal as inputs and produces resource allocations]
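As a toy instance of the networks-of-queues idea, the sketch below models a single resource as an M/M/1 queue: mean response time is R = 1/(mu − lambda), so a response-time goal can be inverted into a required service rate. This is an illustration of the direction, not a model from the talk.

```java
// Toy M/M/1 queueing model relating arrival rate, capacity, and response time.
class MM1Model {
    // Mean response time of an M/M/1 queue: R = 1 / (mu - lambda), lambda < mu.
    static double responseTime(double lambda, double mu) {
        if (lambda >= mu) return Double.POSITIVE_INFINITY; // overloaded regime
        return 1.0 / (mu - lambda);
    }

    // Minimum service rate needed to meet a mean response-time goal:
    // R <= goal  =>  mu >= lambda + 1/goal.
    static double requiredCapacity(double lambda, double goalSeconds) {
        return lambda + 1.0 / goalSeconds;
    }
}
```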
Workload Prediction: A Simple Example
• A static application model
  • Find CPU and network usage distributions by offline profiling
  • Use the 99th percentiles as the CPU and network requirements
• When the application runs “for real”
  • We don’t get to see what the tail would have been
  • So… resort to some prediction technique
• E.g., a web server:
  • Record the number of arriving requests, N
  • Record the number of requests serviced, M
  • Extrapolate to predict the CPU and network requirements of all N requests (sketched below)
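A sketch of that extrapolation, with invented names: if only M of the N arriving requests were serviced, scale the measured usage linearly to estimate what all N would have required. This assumes the serviced requests are representative of the full stream.

```java
// Sketch of linear extrapolation from serviced to arriving requests.
class UsageExtrapolator {
    // cpuUsedByM: measured CPU consumed by the M serviced requests.
    double predictCpu(double cpuUsedByM, long serviced, long arrived) {
        return cpuUsedByM * ((double) arrived / serviced);
    }

    // bytesUsedByM: measured network bytes consumed by the M serviced requests.
    double predictNetwork(double bytesUsedByM, long serviced, long arrived) {
        return bytesUsedByM * ((double) arrived / serviced);
    }
}
```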
Service-Level Agreements
• We may want a mapping from workload to response time:

  Workload | Response time
  w1       | r1
  w2       | r2
  …        | …
  wN       | rN

• Is this possible to achieve? Maybe not.
• How about a mapping from response time to revenue per request (sketched below):

  Response time | Revenue/request
  r1            | $$1
  …             | …
  rN            | $$N
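One way to encode the second mapping is a bracketed table from response-time ceilings to revenue per request, sketched below; the structure and any values a caller would supply are illustrative assumptions, not part of the talk.

```java
// Sketch of an SLA as a map from response-time ceilings to per-request revenue.
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class RevenueSla {
    // Key: response-time ceiling (ms); value: revenue earned per request
    // served within that ceiling. Slower responses fall into higher brackets.
    private final NavigableMap<Double, Double> brackets = new TreeMap<>();

    void addBracket(double responseTimeMillis, double revenuePerRequest) {
        brackets.put(responseTimeMillis, revenuePerRequest);
    }

    // Revenue for one request given its observed response time; responses
    // slower than every bracket earn nothing.
    double revenueFor(double observedMillis) {
        Map.Entry<Double, Double> entry = brackets.ceilingEntry(observedMillis);
        return entry == null ? 0.0 : entry.getValue();
    }
}
```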