Integrated Resource Management for Cluster-based Internet Services
Kai Shen, Dept. of Computer Science, Univ. of Rochester
Hong Tang, Tao Yang*, Lingkun Chu, Dept. of Computer Science, Univ. of California, Santa Barbara
*: Ask Jeeves, Inc.
Background
• Large-scale resource-intensive Internet services hosted on server clusters.
  • Yahoo, MSN, Google, Teoma/Ask Jeeves …
• Challenges/requirements for resource management:
  • Scalability and robustness;
  • Online users require interactive responses;
  • Resource (CPU, I/O)-hungry service processing and large user traffic require efficient resource utilization;
  • Fluctuating user traffic requires adaptive management;
  • Supporting differentiated services for different types of user requests.
OSDI 2002
Architecture of Targeted Services: Document Search Engine
[Figure: components connected by a local-area network: firewall/Web switch, Web server/query handlers, query caches, index servers (partitions 1, 2, and 3), and doc servers.]
“Neptune” Project Overview
• Programming and runtime support to aggregate and replicate stand-alone service components.
• Building blocks for scalable and robust service construction:
  • Functionally-symmetric clustering architecture;
  • Integrated resource management – quality, efficiency, and differentiation;
  • Replication management.
Architecture of Targeted Services: Document Search Engine
[Figure: firewall/Web switch, Web server/query handlers, query cache, index servers (partitions 1, 2, and 3), and doc servers connected by a local-area network, annotated with "Neptune runtime" and "SAP" labels.]
Neptune Deployments
• Service deployments:
  • Web document searching;
  • BLAST – protein sequence similarity matching;
  • Prototype database services – online discussion group, auction.
• Production system at the search engines Teoma/Ask Jeeves since 2000:
  • search indexes of more than 450M Web documents;
  • over 800 multiprocessor servers;
  • tens of millions of search queries per day.
Outline
• Project Overview
• Integrated Resource Management
  • Multiple Resource Management Objectives
  • Two-level Mechanism
• Trace-driven Performance Evaluation on a Linux Cluster
• Related Work and Conclusion
Quality-aware Resource Utilization Efficiency
• Throughput: measures resource utilization efficiency.
• Service response time: measures client-perceived service quality.
• Aggregate service yield: measures quality-aware resource utilization efficiency.
  • Fulfillment of each service request generates a quality-aware service yield – a function of service response time.
  • Service yield function – specified by service providers (flexibility).
  • System goal – maximizing aggregate service yield, i.e., the total yield over all service requests.
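Written out, the system goal can be expressed as follows. The symbols are assumptions chosen for illustration rather than notation taken from the slides: $n$ is the number of service requests, $r_i$ is the response time of request $i$, and $Y_i$ is the yield function that applies to it.

```latex
\text{maximize} \quad \sum_{i=1}^{n} Y_i(r_i)
```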
Sample Service Yield Functions
[Figure: three plots of service yield (QoS yield) against response time.]
• <A> Maximizing throughput (with a deadline): marked with a constant yield and a deadline.
• <B> Minimizing mean response time (with a deadline): marked with a full yield and a deadline.
• <C> A hybrid metric: marked with a full yield, a full-yield deadline, a drop penalty, and a deadline.
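As a rough illustration, the three sample yield functions might be written as below. The exact curve shapes are not recoverable from the slide text, so this sketch assumes the simplest shapes consistent with the labels: a step function for <A>, a linear decay to zero at the deadline for <B>, and a plateau followed by a linear decay for <C>. All function and parameter names are hypothetical, and reading the "drop penalty" as the yield still remaining just before the deadline (`yield_at_deadline`) is also an assumption.

```python
def yield_throughput(r, deadline, full_yield=1.0):
    """<A>: constant yield if the response arrives by the deadline, else zero."""
    return full_yield if r <= deadline else 0.0


def yield_response_time(r, deadline, full_yield=1.0):
    """<B>: assumed linear decay from full yield at r=0 to zero at the deadline."""
    if r > deadline:
        return 0.0
    return full_yield * (1.0 - r / deadline)


def yield_hybrid(r, full_yield_deadline, deadline,
                 full_yield=1.0, yield_at_deadline=0.2):
    """<C>: full yield up to the full-yield deadline, then an assumed linear
    decay down to yield_at_deadline at the deadline, and zero afterwards."""
    if r <= full_yield_deadline:
        return full_yield
    if r > deadline:
        return 0.0
    frac = (r - full_yield_deadline) / (deadline - full_yield_deadline)
    return full_yield - frac * (full_yield - yield_at_deadline)


def aggregate_yield(response_times, yield_fn):
    """Sum of per-request yields for a list of response times."""
    return sum(yield_fn(r) for r in response_times)
```

For example, `aggregate_yield([0.5, 1.5, 3.0], lambda r: yield_throughput(r, 2.0))` counts the two requests that met a 2-second deadline.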
Service Differentiation
• Service class – a category of service accesses that enjoy the same level of QoS support.
  • Client identities: paid vs. unpaid, consumers vs. corporate partners.
  • Service types or data partitions: order placement vs. catalog browsing.
• Service differentiation in Neptune:
  • Differentiated service yield functions.
  • Proportional resource allocation guarantee.
Two-level Resource Management
Cluster-level: Partitioning or Not?
• Periodic server partitioning [Zhu2001]:
  • Determine resource allocation at each epoch.
  • Partition the server pool among service classes.
• Neptune – does not partition servers at the cluster level:
  • Random polling-based load balancing evenly distributes requests for each service class to all nodes; service differentiation is then performed inside each node.
• Advantages:
  • Functional symmetry and decentralization, which provide robustness and scalability.
  • Better handling of system state changes: demand spikes and node failures.
• Disadvantage:
  • Less isolation for misbehaved service classes.
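A minimal sketch of random polling-based load balancing, under stated assumptions: the dispatcher polls a small random subset of nodes for their current load and sends the request to the least loaded one. The function name, the poll size of 3, and the use of a per-node load number are all hypothetical; the slides do not give these details.

```python
import random


def pick_node(loads, poll_size=3, rng=random):
    """Poll `poll_size` random nodes and return the least loaded one.

    loads: dict mapping node id -> current load (e.g., queued requests).
    rng:   any object with a `sample` method (the random module by default).
    """
    nodes = list(loads)
    polled = rng.sample(nodes, min(poll_size, len(nodes)))
    return min(polled, key=lambda n: loads[n])


def dispatch(loads, num_requests, poll_size=3, rng=random):
    """Dispatch requests one by one, bumping the chosen node's load each time."""
    for _ in range(num_requests):
        loads[pick_node(loads, poll_size, rng)] += 1
    return loads
```

No node holds global state in this scheme, which fits the functionally-symmetric, decentralized design described above.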
Node-level Request Scheduling
[Flowchart, in order:]
• Drop requests likely to generate zero yield.
• Search for an under-allocated service class.
• Found? Yes: schedule the under-allocated service class. No: schedule for high aggregate yield.
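The flow above could be sketched as follows. This is an illustration only: the `Request` fields, the test for "likely zero yield" (the request cannot finish before its deadline), the way under-allocation is measured (guaranteed share minus recent usage share), and the caller-supplied `fallback_key` are all assumptions, not details from the slides.

```python
from dataclasses import dataclass


@dataclass
class Request:
    service_class: str
    arrival: float       # arrival time in seconds
    deadline: float      # relative deadline after which yield is zero (assumed)
    est_service: float   # estimated processing time in seconds (assumed)


def pick_next(queue, now, guarantees, usage, fallback_key):
    """Return (chosen_request_or_None, dropped_requests)."""
    # Step 1: drop requests that cannot finish before their deadline.
    live, dropped = [], []
    for r in queue:
        finishes_in_time = now + r.est_service <= r.arrival + r.deadline
        (live if finishes_in_time else dropped).append(r)

    # Step 2: look for a class with pending work whose recent resource
    # share is below its guaranteed share.
    pending = {r.service_class for r in live}
    deficits = {c: guarantees[c] - usage.get(c, 0.0)
                for c in pending if c in guarantees}
    under = [c for c, d in deficits.items() if d > 0]

    if under:
        # Step 3a: serve the oldest request of the most under-allocated class.
        target = max(under, key=lambda c: deficits[c])
        chosen = min((r for r in live if r.service_class == target),
                     key=lambda r: r.arrival)
    elif live:
        # Step 3b: otherwise schedule for high aggregate yield.
        chosen = min(live, key=fallback_key)
    else:
        chosen = None
    return chosen, dropped
```

Passing `fallback_key=lambda r: r.arrival + r.deadline` would give earliest-deadline-first ordering in the fallback step.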
Scheduling for High Aggregate Yield
• Offline optimal scheduling is NP-complete.
Evaluation Settings
• Evaluation platform:
  • A cluster of Linux servers connected by switched Ethernet.
• Workload I: trace-driven
  • Document search on a 2.5GB memory-mapped search index.
  • Based on 1.5M search queries selected from a one-week access trace at Ask Jeeves search in January 2002.
  • “Service yield”-based priority order: Gold > Silver > Bronze.
• Workload II:
  • CPU-spinning micro-benchmark.
  • Poisson process arrival; exponentially distributed service processing time.
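A synthetic workload in the style of Workload II can be generated as sketched below. Drawing exponentially distributed gaps between arrivals yields a Poisson arrival process. The function names, the seed handling, and the helper that converts a target demand level into an arrival rate (demand = arrival rate × mean service time ÷ number of nodes) are assumptions for illustration; the slides do not specify the generator.

```python
import random


def generate_workload(n, arrival_rate, mean_service, seed=0):
    """Return n (arrival_time, service_time) pairs.

    Exponential inter-arrival gaps with rate `arrival_rate` give Poisson
    arrivals; service times are exponential with mean `mean_service`.
    """
    rng = random.Random(seed)
    t = 0.0
    requests = []
    for _ in range(n):
        t += rng.expovariate(arrival_rate)
        requests.append((t, rng.expovariate(1.0 / mean_service)))
    return requests


def arrival_rate_for_demand(demand, mean_service, nodes=1):
    """Arrival rate (requests/second) that loads `nodes` single-CPU nodes
    to the given demand level, e.g. 0.75 for 75% (assumed definition)."""
    return demand * nodes / mean_service
```

For instance, `generate_workload(1000, arrival_rate_for_demand(1.25, 0.05, nodes=16), 0.05)` would produce an overload trace at 125% demand for 16 nodes under these assumptions.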
Evaluation on Scheduling Policies (16 nodes aggregate)
[Figure: aggregated yield (normalized) and loss percent versus arrival demand for the EDF, YID, Greedy, and Adaptive policies. Panel (A) Underload: arrival demand from 0% to 100%. Panel (B) Overload: arrival demand from 100% to 200%.]
• EDF and YID perform better than Greedy during system underload; Greedy performs better during system overload.
• Adaptive dynamically switches between YID and Greedy to achieve good performance in both situations.
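The switching idea behind Adaptive can be sketched in a few lines. This is only an illustration: how the system detects overload, the threshold of 1.0, and the two ordering functions are assumptions, and the stand-in keys in the usage note are not the actual YID or Greedy definitions, which the slide text does not give.

```python
def adaptive_pick(queue, demand_ratio, underload_key, overload_key,
                  threshold=1.0):
    """Pick the next request with one of two orderings, depending on load.

    demand_ratio:  recent arrival demand divided by capacity (assumed signal).
    underload_key: ordering used when demand_ratio <= threshold.
    overload_key:  ordering used when demand_ratio > threshold.
    The request with the smallest key value is scheduled first.
    """
    if not queue:
        return None
    key = overload_key if demand_ratio > threshold else underload_key
    return min(queue, key=key)
```

With requests represented as `(deadline, yield_per_cost)` tuples, a deadline-ordered key such as `lambda r: r[0]` could stand in for the underload policy and `lambda r: -r[1]` for the overload policy.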
Service Differentiation during a Demand Spike and a Node Failure (8 nodes)
[Figure: CPU demand and acquisition, as a percentage of total system resource (0% to 100%), over a timeline of 0 to 300 seconds. Series: Gold, Silver, and Bronze demand; Gold, Silver, and Bronze acquisition.]
• “Service yield”-based priority order: Gold > Silver > Bronze.
• 20% proportional resource guarantee for the low-priority Bronze class.
• Demand spike for the Silver class between time 50 and 150.
• One node fails at time 200 and recovers at 250.
Performance Scalability
[Figure: aggregate yield (normalized, 0 to 20) versus number of service nodes (0 to 20) at demand levels of 75%, 125%, and 200%. Panel <A>: Differentiated Search. Panel <B>: Micro-benchmark.]
Related Work
• Software infrastructure for cluster-based Internet services – TACC [Fox1997], MultiSpace [Gribble1999], Porcupine [Saito1999], Ninja [von Behren2002].
• QoS and service differentiation in computer networks – Weighted Fair Queuing [Demers1990; Parekh1993], Leaky Bucket, LIRA [Stoica1998], [Dovrolis1999].
• QoS or real-time scheduling at the single host level – [Huang1989], [Haritsa1993], [Waldspurger1994], [Mogul1996], LRP [Druschel1996], [Jones1997], Eclipse [Bruno1998], Resource Container [Banga1999], [Steere1999].
• Resource management and QoS for Web servers – [Almeida1998], [Pandey1998], [Abdelzaher1999], [Bhatti1999], [Chandra2000], [Li2000], [Voigt2001].
• Resource management for clustered servers – LARD [Pai1998], Cluster Reserves [Aron2000], [Sullivan2000], DDSD [Zhu2001], [Chase2001].
Conclusion
• Multiple resource management objectives:
  • quality-aware resource utilization efficiency
  • service differentiation
• Two-level resource management mechanism:
  • non-partitioning at the cluster level
  • adaptive scheduling at the node level
• Trace-driven evaluations.
• Future work – other types of service qualities.