Resource Overbooking and Application Profiling in Shared Hosting Platforms • Bhuvan Urgaonkar, Prashant Shenoy, Timothy Roscoe† • UMASS Amherst and Intel Research†
Motivation • Proliferation of Internet applications • Electronic commerce, streaming media, online games, online trading, … • Commonly hosted on clusters of servers • Cheaper alternative to large multiprocessors • [Figure: clients reach e-commerce, streaming, and game applications hosted on a cluster over the Internet]
Hosting Platforms • Hosting platform: server cluster that runs third-party applications • Application providers pay for server resources • CPU, disk, network bandwidth, memory • Platform provider guarantees resource availability • Performance guarantees provided to applications • Central challenge: Maximize revenue while providing resource guarantees
Design Challenges • How to determine an application’s resource needs? • How to provision resources to meet these needs? • How to map applications to nodes in the platform? • How to handle dynamic variations in the load?
Talk Outline • Introduction • Inferring Resource Requirements • Provisioning Resources • Handling Dynamic Load Variations • Experimental Evaluation • Related Work
Hosting Platform Model • Hosting Platforms: Dedicated vs Shared • Dedicated: Applications get integral # nodes • Shared: Applications may get fractional # nodes • Our focus: Shared Hosting Platforms • Nodes may have competing applications • Capsule: component of an application running on a node • Example: e-commerce application: HTTP server, app server, database server
Provisioning By Overbooking • How should the platform allocate resources? • Provision resources based on worst-case needs • Worst-case provisioning is wasteful • Low platform utilization • Applications may be tolerant to occasional violations • E.g., CPU guarantees should be met 99% of the time • Possible to provide useful guarantees even after provisioning less than worst-case needs • Idea: Improve utilization by overbooking resources
Application Profiling • Profiling: process of determining resource usage • Run the application on an isolated set of nodes • Subject the application to a real workload • Model CPU and network usage as ON-OFF processes • Use the Linux Trace Toolkit • [Figure: timeline of alternating ON and OFF periods, delimited by the begin and end of each CPU quantum]
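Resource Usage Distribution • [Figure: usage over time divided into measurement intervals; probability distribution of fractional usage (0 to 1); cumulative probability of fractional usage, marking r(99) at cumulative probability 0.99 and r(100) at 1]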
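The step from an ON-OFF trace to a usage distribution can be sketched as follows. This is a minimal illustration, not the profiler used in the talk: the function names, the `(start, end)` trace format, and the nearest-rank percentile rule are all assumptions made for the example.

```python
import math


def fractional_usage(on_periods, interval, horizon):
    """Fraction of each measurement interval spent in ON periods.

    on_periods: list of (start, end) times when the resource was in use
                (hypothetical trace format).
    interval:   length of one measurement interval.
    horizon:    total profiled time, assumed to be a multiple of `interval`.
    """
    n = int(round(horizon / interval))
    busy = [0.0] * n
    for start, end in on_periods:
        for i in range(n):
            lo, hi = i * interval, (i + 1) * interval
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                busy[i] += overlap
    return [b / interval for b in busy]


def usage_percentile(samples, p):
    """r(p): smallest observed usage such that at least p% of samples are <= it.

    Uses the nearest-rank rule, which is an assumption of this sketch.
    """
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]
```

With this, `usage_percentile(samples, 100)` plays the role of r(100), the worst-case need, and `usage_percentile(samples, 99)` the role of r(99).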
Capturing Burstiness: Token Bucket • Token bucket (σ, ρ) • Resource usage over an interval of length t ≤ σ·t + ρ • Additional parameter T • Satisfy token bucket guarantees only for t ≥ T • [Figure: usage vs. time with two bounding lines, σ1·t + ρ1 and σ2·t + ρ2; algorithm by Tang et al.]
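To make the (σ, ρ, T) bound concrete, the sketch below checks whether a measured usage trace conforms to it. It assumes usage is sampled in unit-length slots and that T is given as a whole number of slots; the function name and the brute-force window scan are illustrative choices, not the algorithm by Tang et al.

```python
def conforms(usage, sigma, rho, T=1):
    """Return True if every window of t >= T slots uses at most sigma*t + rho.

    usage: amount of resource consumed in each unit-length slot
           (hypothetical trace format).
    sigma: rate term of the bound; rho: burst term, as in sigma*t + rho.
    """
    n = len(usage)
    # prefix[i] holds the total usage of the first i slots
    prefix = [0.0]
    for u in usage:
        prefix.append(prefix[-1] + u)
    eps = 1e-12  # slack for floating-point rounding
    for start in range(n):
        for end in range(start + T, n + 1):
            t = end - start
            if prefix[end] - prefix[start] > sigma * t + rho + eps:
                return False
    return True
```

For the trace `[0.25, 1.0, 0.0, 0.25]` with σ = 0.5 and ρ = 0.25, the single-slot burst of 1.0 violates the bound when T = 1 but is tolerated when T = 2, which is the effect of the additional parameter T.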
Profiles of Server Applications • [Figure: three usage distributions (probability vs. fraction of CPU or fraction of NW bandwidth); panels: Streaming Media Server, 20 clients; Postgres Server, 10 clients; Apache Web Server, 50% cgi-bin] • Applications exhibit different degrees of burstiness • May have a long tail • Insight: Choose (σ, ρ) based on a high percentile
Resource Overbooking • Applications specify overbooking tolerance O • Probability with which capsule needs may be violated • Controlled overbooking via admission control: • $\sum_{k=1}^{K} (\sigma_k \cdot T_{\min} + \rho_k) \cdot (1 - O_k) \le C \cdot T_{\min}$ • $\Pr\left(\sum_{k=1}^{K} U_k > C\right) \le \min(O_1, \ldots, O_K)$ • A node that has sufficient resources for a capsule is feasible for it
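The first admission-control condition can be evaluated directly from each capsule's token bucket parameters and overbooking tolerance. The sketch below does only that; it assumes C is the node's capacity and that Tmin is supplied by the caller, and it does not model the second, probabilistic condition, which needs the capsules' usage distributions. The function name and tuple layout are hypothetical.

```python
def passes_token_bucket_test(capsules, capacity, t_min):
    """Check sum_k (sigma_k*t_min + rho_k) * (1 - O_k) <= capacity * t_min.

    capsules: list of (sigma, rho, overbooking_tolerance) tuples, one per
              capsule on the node, including the candidate capsule.
    capacity: node capacity C (assumed interpretation of the symbol).
    t_min:    the Tmin of the condition, passed in by the caller.
    """
    demand = sum((sigma * t_min + rho) * (1 - o) for sigma, rho, o in capsules)
    return demand <= capacity * t_min
```

Two capsules with (σ, ρ) = (0.5, 0.25) do not fit on a unit-capacity node when O = 0, since their combined demand is 1.5, but they do fit when each tolerates O = 0.5, since the discounted demand drops to 0.75.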
Mapping Capsules to Nodes • A bipartite graph of capsules and feasible nodes • Greedy mapping: consider capsules in non-decreasing order of degree: O(c·log c) • Guaranteed to find a placement if one exists! • Multiple feasible nodes => best fit, worst fit, random, … • [Figure: bipartite graph linking capsules to their feasible nodes, and the final mapping]
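A minimal sketch of the greedy mapping, under two assumptions that the slide does not state: each node receives at most one capsule of the application, and ties among feasible nodes are broken by picking the first node in sorted order (standing in for best fit, worst fit, or random). The function name and input format are hypothetical.

```python
def greedy_place(feasible):
    """Map each capsule to a feasible node, most constrained capsules first.

    feasible: dict mapping capsule name -> set of feasible node names,
              i.e. the edges of the bipartite graph.
    Returns a dict capsule -> node, or None if some capsule cannot be placed.
    """
    placement = {}
    used = set()
    # Non-decreasing order of degree (number of feasible nodes)
    for capsule in sorted(feasible, key=lambda c: len(feasible[c])):
        candidates = sorted(feasible[capsule] - used)
        if not candidates:
            return None
        node = candidates[0]  # arbitrary tie-break chosen for this sketch
        placement[capsule] = node
        used.add(node)
    return placement
```

For `{"c1": {"n1", "n2"}, "c2": {"n1"}, "c3": {"n2", "n3"}}`, capsule `c2` is placed first because it has only one feasible node, which leaves `n2` for `c1` and `n3` for `c3`.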
Handling Flash Crowds • [Figure: three CPU usage distributions for the Apache Web Server (probability vs. fraction of CPU); panels: Offline Profile, Expected Workload, Overload] • Detect overloads by online profiling • Reacting to overloads (ongoing work) • Compute new allocations • Change allocations, move capsules, add servers
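The slide does not say how the online profile is compared against the allocation, so the following is only one plausible rule, stated as an assumption: flag an overload when the fraction of measurement intervals whose usage exceeds the capsule's allocation is larger than its overbooking tolerance. The function name and parameters are hypothetical.

```python
def overloaded(online_usage, allocation, tolerance):
    """Return True if usage exceeds `allocation` more often than `tolerance`.

    online_usage: fractional usage per measurement interval, from online
                  profiling (hypothetical input format).
    allocation:   fraction of the resource currently reserved for the capsule.
    tolerance:    overbooking tolerance O, the allowed violation probability.
    This detection rule is an assumption of the sketch, not the talk's method.
    """
    if not online_usage:
        return False
    violations = sum(1 for u in online_usage if u > allocation)
    return violations / len(online_usage) > tolerance
```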
The SHARC Prototype • A Linux-based Shared Hosting Platform • 6 Dell PowerEdge 1550 servers • Gigabit Ethernet link • Software Components • Profiling: vanilla Linux + Linux Trace Toolkit • Control plane: overbooking, placement • QoS-enhanced Linux kernel: HSFQ schedulers
Experimental Setup • Prototype running on a 5-node cluster • Each server: 1 GHz PIII with 512 MB RAM and Gigabit Ethernet • Control plane runs on a dedicated node • Applications run on the other four nodes • Workload: mix of server applications • PostgreSQL database server with pgbench (TPC-B) benchmark • Apache web server with SPECWeb99 (static & dynamic HTTP) • MPEG streaming server with 1.5 Mb/s VBR MPEG-1 clients • Quake I game server with “terminator” bots
Resource Overbooking Benefits • [Figure: Placement of Apache Web Servers] • Small amounts of overbooking can yield large gains • Bursty applications yield larger benefits
Capsule Placement Algorithms • Diverse requirements: worst-fit outperforms others • Similar requirements: all perform similarly
Performance with Overbooking • Performance degradation is within specified overbooking tolerance
Related Work • Single node resource management • Proportional share schedulers: WFQ, SFQ, BVT, … • Reservation based schedulers: Nemesis, Rialto, … • Cluster-based resource management • Cluster Reserves [Aron00], Aron thesis [Aron00] • MUSE [Chase01]: economic approach • Oceano [IBM], Planetary computing [HP] • Clusters for high availability: Porcupine [Saito99] • Grid computing
Concluding Remarks • Resource management in shared hosting platforms • Application profiling to determine resource usage • Revenue maximization using controlled overbooking • Ability to handle dynamic workloads (ongoing work) • URL: http://lass.cs.umass.edu