Public Clouds (EC2, Azure, Rackspace, …)

Presentation Transcript


  1. Public Clouds (EC2, Azure, Rackspace, …) • Multi-tenancy: different customers’ virtual machines (VMs) share the same server • Tenant: Why Cloud? • Pay-as-you-go • Infinite Resources • Cheaper Resources • Provider: Why multi-tenancy? • Improved resource utilization • Benefits of economies of scale

  2. Implications of Multi-tenancy • VMs share many resources • CPU, cache, memory, disk, network, etc. • Virtual Machine Managers (VMM) • Goal: Provide isolation • Deployed VMMs don’t perfectly isolate VMs • Side-channels [Ristenpart et al. ’09, Zhang et al. ’12]

  3. Lies Told by the Cloud • Infinite resources • All VMs are created equally • Perfect isolation

  4. This Talk Taking control of where your instances run • Are all VMs created equally? • How much variation exists and why? • Can we take advantage of the variation to improve performance? Gaining performance at any cost • Can users impact each other’s performance? • Is there a way to maliciously steal another user’s resources? • Is there …

  5. Heterogeneity in EC2 • Causes of heterogeneity: • Contention for resources: you are sharing! • CPU Variation: • Upgrades over time • Replacement of failed machines • Network Variation: • Different path lengths • Different levels of oversubscription

  6. Are All VMs Created Equally? • Inter-architecture: • Are there differences between architectures? • Can this be used to predict performance a priori? • Intra-architecture: • Within an architecture • If variation is large, then you can’t predict performance • Temporal: • On the same VM over time? • If this varies too, there is no hope!

  7. Benchmark Suite & Methodology • Methodology: • 6 Workloads • 20 VMs (small instances) for 1 week • Each VM runs micro-benchmarks every hour

  8. Inter-Architecture

  9. Intra-Architecture • CPU is predictable: less than 15% variation • Storage is unpredictable: as high as 250% variation
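The slide does not say how these percentages were computed. One plausible reading, shown here purely as an assumption, is the gap between the best and worst instance of the same CPU type, expressed relative to the worst. The function name `percent_spread` and the sample readings are made up for illustration and are not measurements from the talk.

```python
def percent_spread(samples):
    """Gap between the best and worst sample, as a percentage of the worst."""
    lo, hi = min(samples), max(samples)
    if lo <= 0:
        raise ValueError("samples must be positive")
    return (hi - lo) / lo * 100.0


# Hypothetical benchmark scores from VMs of one CPU type (not real data).
cpu_scores = [100.0, 104.0, 110.0, 112.0]
disk_scores = [20.0, 35.0, 55.0, 70.0]

cpu_spread = percent_spread(cpu_scores)    # small spread: CPU looks predictable
disk_spread = percent_spread(disk_scores)  # large spread: storage does not
```

Under this definition, a spread of 250% means the best instance delivers 3.5 times the worst one, which is why storage-bound jobs cannot be planned from the CPU type alone.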

  10. Temporal

  11. Overall • CPU type can only be used to predict CPU performance • For memory- or I/O-bound jobs, you need to learn empirically how good an instance is

  12. What Can We Do About It? • Goal: Run VMs on the best instances • Constraints: • Can’t control placement – can’t control which instance the cloud gives us • Can’t migrate • Placement gaming: • Try to find the best instances simply by starting and stopping VMs

  13. Measurement Methodology • Deploy on Amazon EC2 • A=10 instances • 12 hours • Compare against no strategy: • Run initial machines with no strategy • Baseline varies for each run • Re-use machines for strategy

  14. EC2 results [Figure: results for Apache runs and NER runs; axes labeled Records/sec and MB/sec; annotation: 16 migrations]

  15. Placement Gaming • Approach: • Start a bunch of extra instances • Rank them based on performance • Kill the underperforming instances • Those performing worse than average • Start new instances • Interesting Questions: • How many instances should be killed in each round? • How frequently should you evaluate the performance of instances?
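The approach on this slide can be sketched as a small loop. This is a minimal simulation under stated assumptions, not the authors' implementation: `launch` and `measure` are hypothetical callbacks standing in for a cloud API call and a benchmark run, and each instance is given a fixed random "quality" score so the sketch runs without any cloud account.

```python
import random


def placement_gaming(launch, measure, target, extra, rounds):
    """Return `target` instances after repeatedly replacing below-average ones.

    launch()   -> new instance handle (hypothetical provider call)
    measure(i) -> performance score of instance i, higher is better
    """
    pool = [launch() for _ in range(target + extra)]
    for _ in range(rounds):
        scores = {inst: measure(inst) for inst in pool}
        average = sum(scores.values()) / len(scores)
        # Keep instances at or above the average, kill the rest.
        keep = [inst for inst in pool if scores[inst] >= average]
        killed = len(pool) - len(keep)
        # Start one new instance for every instance that was killed.
        pool = keep + [launch() for _ in range(killed)]
    # Rank what is left and keep only the best `target` instances.
    return sorted(pool, key=measure, reverse=True)[:target]


# Simulated cloud: every launch gets a fixed, random quality score.
rng = random.Random(0)
quality = {}


def launch():
    inst = len(quality)
    quality[inst] = rng.uniform(50.0, 150.0)
    return inst


def measure(inst):
    return quality[inst]


best = placement_gaming(launch, measure, target=10, extra=5, rounds=4)
```

The two open questions on the slide map onto the parameters: the kill rule (here, everything below average) decides how many instances die per round, and `rounds` stands in for how often performance is re-evaluated. In a real deployment every launch also costs money, which this sketch ignores.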

  16. Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense) • (Venkat)anathan Varadarajan, Thawan Kooburat, Benjamin Farley, Thomas Ristenpart, and Michael Swift • Department of Computer Sciences

  17. Contention in Xen • Same Core • Same core, same L1 cache, and same memory • Same Package • Different cores but shared L1 cache and memory • Different Package • Different cores and different caches but shared memory

  18. I/O Contends with Itself • VMs contend for the same resource • Network with Network: • More VMs → smaller fair share • Disk I/O with Disk I/O: • More disk accesses → longer seek times • Xen batches network I/O to improve performance • BUT: this adds jitter and delay • ALSO: you can get more than your fair share because of the batching

  20. Everyone Contends with Cache • No contention on the same core • VMs run serially, so access to the cache is serial • No contention on different packages • VMs use different caches • Lots of contention on the same package • VMs run in parallel but share the same cache

  21. Contention in Xen • 3x-6x performance loss → higher cost [Figure labels: work-conserving scheduling; non-work-conserving CPU scheduling]
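The two scheduling modes named on this slide differ in what happens to capacity a VM leaves unused. The sketch below is a generic toy model, not Xen's scheduler: non-work-conserving sharing caps every VM at an equal slice, while work-conserving sharing hands spare capacity to VMs that still want more (modeled here as max-min fairness, which is an assumption). The function names and demand figures are illustrative.

```python
def non_work_conserving(capacity, demands):
    """Each VM is capped at an equal slice, even if others are idle."""
    cap = capacity / len(demands)
    return [min(d, cap) for d in demands]


def work_conserving(capacity, demands):
    """Max-min fair split: unused capacity goes to VMs that want more."""
    alloc = [0.0] * len(demands)
    active = [i for i, d in enumerate(demands) if d > 0]
    remaining = float(capacity)
    while active and remaining > 1e-12:
        share = remaining / len(active)
        # VMs whose outstanding demand fits inside an equal share.
        satisfied = [i for i in active if demands[i] - alloc[i] <= share]
        if not satisfied:
            # Everyone wants more than a share: split evenly and stop.
            for i in active:
                alloc[i] += share
            break
        for i in satisfied:
            remaining -= demands[i] - alloc[i]
            alloc[i] = float(demands[i])
        active = [i for i in active if i not in satisfied]
    return alloc


# One light VM (20 units) and one heavy VM (90 units) on a 100-unit resource.
capped = non_work_conserving(100, [20, 90])
shared = work_conserving(100, [20, 90])
```

With these numbers the heavy VM gets 50 units when capped but 80 units under work-conserving sharing. That dependence on a neighbor's demand is what makes performance unpredictable for a tenant.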

  22. What Can a Tenant Do? • Ask provider for better isolation … requires an overhaul of the cloud • Pack up VM and move (see our SOCC 2012 paper) … but not all workloads are cheap to move • This work: a greedy customer can recover performance by interfering with other tenants: Resource-Freeing Attack

  23. Resource-freeing attacks (RFAs) • What is an RFA? • RFA case studies • Two highly loaded web server VMs • Last Level Cache (LLC) bound VM and highly loaded web server VM • Demonstration on Amazon EC2

  24. The Setting • Victim: • One or more VMs • Public interface (e.g., HTTP) • Beneficiary: • VM whose performance we want to improve • Helper: • Mounts the attack • Beneficiary and victim are fighting over a target resource

  25. Example: Network Contention • Beneficiary & Victim • Apache webservers hosting static and dynamic (CGI) web pages • Target Resource: Network Bandwidth • Work-conserving scheduler (network bandwidth) • What can you do? [Figure: local Xen testbed; clients reach the beneficiary and the victim over a shared network]

  26. Recipe for a Successful RFA • Shift the victim’s resource usage away from the target resource and towards a bottleneck resource • Shift resource usage via the public interface • CPU-intensive dynamic pages push the victim towards a CPU bottleneck • The CPU bottleneck limits static pages, which reduces the victim’s target resource usage [Figure: proportion of CPU usage vs. proportion of network usage]
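The recipe can be illustrated with a toy model of the network example. Everything here is an assumption made for illustration: the victim is modeled as a single CPU budget spent on CGI requests first and static pages second, the link is shared between two VMs in a work-conserving way, and all costs and rates are invented rather than taken from the talk.

```python
def victim_net_demand(client_demand, cpu_capacity, helper_cgi_rate,
                      cgi_cpu_cost, static_cpu_per_mbit):
    """Bandwidth the victim can still push once CGI work has eaten its CPU."""
    cpu_left = max(0.0, cpu_capacity - helper_cgi_rate * cgi_cpu_cost)
    return min(client_demand, cpu_left / static_cpu_per_mbit)


def beneficiary_bandwidth(link, victim_demand, beneficiary_demand):
    """Two VMs on one link; bandwidth the victim leaves unused is reclaimed."""
    fair = link / 2
    victim = min(victim_demand, max(fair, link - beneficiary_demand))
    return min(beneficiary_demand, link - victim)


LINK = 1000.0  # Mbit/s, hypothetical

# No RFA: both web servers saturate the link and split it evenly.
before = beneficiary_bandwidth(
    LINK,
    victim_net_demand(1000.0, 1.0, 0.0, 0.01, 0.0005),
    1000.0,
)

# RFA: the helper sends 85 CGI requests/sec, leaving the victim 15% of its CPU
# for static pages, so it can no longer fill its half of the link.
after = beneficiary_bandwidth(
    LINK,
    victim_net_demand(1000.0, 1.0, 85.0, 0.01, 0.0005),
    1000.0,
)
```

In this model the beneficiary's bandwidth rises from 500 to about 700 Mbit/s without the helper consuming any meaningful bandwidth itself. The attack works only because the scheduler redistributes what the victim stops using.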

  27. An RFA in Our Example • Result in our testbed: increases beneficiary’s share of bandwidth • No RFA: 1800 page requests/sec • W/ RFA: 3026 page requests/sec • Share of bandwidth: 50% → 85% [Figure: helper sends CGI requests to the victim; panels for CPU utilization and network]
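As a quick check on what the two request rates on this slide imply, the relative gain can be computed directly. Only the two rates come from the slide; the variable names are illustrative.

```python
no_rfa = 1800    # page requests/sec without the attack (from the slide)
with_rfa = 3026  # page requests/sec with the attack (from the slide)

speedup = with_rfa / no_rfa
gain_percent = (speedup - 1) * 100

summary = f"{speedup:.2f}x, +{gain_percent:.1f}%"
print(summary)  # prints "1.68x, +68.1%"
```

So the beneficiary serves roughly two thirds more requests per second in the testbed result.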

  28. Resource-freeing attacks • 1) Send targeted requests to the victim • 2) Shift resource use from the target to a bottleneck • Shared CPU Cache: • Ubiquitous: Almost all workloads need cache • Hardware controlled: Not easily isolated via software • Performance sensitive: High performance cost! • Can we mount RFAs when the target resource is the CPU cache?

  29. Cache Contention RFA Goal

  30. Case Study: Cache vs. Network • Victim: Apache webserver hosting static and dynamic (CGI) web pages • Beneficiary: Synthetic cache-bound workload (LLCProbe) • Target Resource: Cache • No cache isolation: • ~3x slower when sharing cache with webserver [Figure: local Xen testbed with clients, network, two cores, and a cache]

  31. Cache vs. Network • Victim webserver frequently interrupts and pollutes the cache • Reason: Xen gives higher priority to the VM consuming less CPU time [Figure: cache state timeline for a heavily loaded web server; labels: webserver receives a request, beneficiary starts to run, decreased cache efficiency]

  32. Cache vs. Network w/ RFA • RFA helps in two ways: • Webserver loses its priority • Reduces the capacity of the webserver [Figure: cache state timeline for heavily loaded webserver requests under RFA; labels: helper, CGI request, webserver receives a request, beneficiary starts to run]

  33. RFA: Performance Improvement • 60% performance improvement [Figure: RFA intensities, time in ms per second; annotations: 196% slowdown, 86% slowdown]

  34. Discussion: Practical Aspects • RFA case studies used CPU-intensive CGI requests • Alternative: DoS vulnerabilities (e.g., hash-collision attacks) • Identifying co-resident victims • Easy on most clouds (co-resident VMs have predictable internal IP addresses) • No public interface? • Paper discusses possibilities for RFAs

  35. Limitations • Experimental setup: • Only 1 VM in each experiment • Don’t vary the number of each type of job
