Seeding Cloud-based services: Distributed Rate Limiting (DRL)
Kevin Webb, Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren
Seeding the Cloud
Technologies to deliver on the promise of cloud computing
Previously: Process data in the cloud (Mortar)
• Produced/stored across providers
• Find Ken Yocum or Dennis Logothetis for more info
Today: Control resource usage: "cloud control" with DRL
• Use resources at multiple sites (e.g., a CDN)
• Complicates resource accounting and control
• Provide cost control
DRL Overview
• Example: Cost control in a Content Distribution Network
• Abstraction: Enforce a global rate limit across multiple sites
• Simple example: 10 flows, limited as if they all passed through a single, central limiter
[Diagram: a single central limiter caps 10 flows at 100 KB/s; with DRL, a site with 2 flows is limited to 20 KB/s and a site with 8 flows to 80 KB/s, preserving the 100 KB/s aggregate]
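The 20/80 split in the diagram is what a flow-proportional allocation of the 100 KB/s global limit produces. Below is a minimal sketch of that arithmetic; the function and site names are made up for illustration and are not DRL's actual code.

```python
# Sketch: split a global limit across sites in proportion to their flow counts.
def proportional_share(global_limit_kbps, flows_per_site):
    """Give each site a share of the global limit proportional to its demand,
    using its active flow count as a proxy for demand."""
    total_flows = sum(flows_per_site.values())
    if total_flows == 0:
        # No demand anywhere: split the limit evenly as a fallback.
        even = global_limit_kbps / len(flows_per_site)
        return {site: even for site in flows_per_site}
    return {site: global_limit_kbps * count / total_flows
            for site, count in flows_per_site.items()}

# The example from the slide: 100 KB/s shared by a 2-flow site and an 8-flow site.
print(proportional_share(100, {"site_A": 2, "site_B": 8}))
# -> {'site_A': 20.0, 'site_B': 80.0}
```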
Goals & Challenges
• Up to now
  • Developed architecture and protocols for distributed rate limiting (SIGCOMM 2007)
  • Showed that one approach (FPS) is practical in the wide area
• Current goals:
  • Move DRL out of the lab and impact real services
  • Validate the SIGCOMM results in real-world conditions
  • Provide an Internet testbed with the ability to manage bandwidth in a distributed fashion
  • Improve usability of PlanetLab
• Challenges
  • Run-time overheads: CPU, memory, communication
  • Environment: link/node failures, software quirks
PlanetLab
• World-wide test bed for networking and systems research
• Resources donated by universities, labs, etc.
• Experiments divided into VMs called "slices" (Vservers)
[Diagram: a central controller (web server, PLC API, PostgreSQL on Linux 2.6) manages nodes across the Internet; each node runs Linux 2.6 and hosts Vservers for slices 1 through N]
PlanetLab Use Cases
• PlanetLab needs DRL!
  • Donated bandwidth
  • Ease of administration
• Machine room
  • Limit local-area nodes to a single rate
• Per slice
  • Limit experiments in the wide area
• Per organization
  • Limit all slices belonging to an organization
PlanetLab Use Cases
• Machine room
  • Limit local-area nodes with a single rate
[Diagram: machine-room nodes each running a DRL limiter at 1 MBps, collectively held to a 5 MBps aggregate limit]
DRL Design
• Each limiter runs a main event loop:
  • Estimate: Observe and record outgoing demand
  • Allocate: Determine each node's share of the rate
  • Enforce: Drop packets to stay within the local share
• Two allocation approaches
  • GRD: Global Random Drop (packet granularity)
  • FPS: Flow Proportional Share (flow count as a proxy for demand)
[Diagram: input traffic feeds Estimate; FPS Allocate runs at a regular interval and exchanges updates with other limiters; Enforce shapes the output traffic]
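A minimal sketch of that loop follows, assuming a flow-count demand signal and placeholder stand-ins for the pieces described on later slides (ulogd flow accounting, limiter-to-limiter updates, and HTB enforcement). Names and values are illustrative, not the real implementation, which runs inside ulogd.

```python
import random
import time

# Hypothetical stand-ins for components covered on later slides.
def count_active_flows():
    return random.randint(0, 10)          # placeholder demand estimate (from flow accounting)

def peer_demands():
    return [random.randint(0, 10)]        # placeholder: demand reported by other limiters

def set_local_rate_limit(rate_kbps):
    print(f"enforcing local limit: {rate_kbps:.1f} KB/s")  # placeholder for the HTB update

GLOBAL_LIMIT_KBPS = 100
INTERVAL_SECONDS = 0.5                    # the estimate interval is a configurable parameter

def limiter_loop():
    """Estimate / Allocate / Enforce, repeated at a regular interval."""
    while True:
        local = count_active_flows()                      # Estimate: observe outgoing demand
        total = local + sum(peer_demands())               # Exchange demand with other limiters
        share = GLOBAL_LIMIT_KBPS * local / total if total else GLOBAL_LIMIT_KBPS
        set_local_rate_limit(share)                       # Enforce: install the new local limit
        time.sleep(INTERVAL_SECONDS)
```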
Implementation Architecture
• Abstractions
  • Limiter
    • Communication
    • Manages identities
  • Identity
    • Parameters (limit, interval, etc.)
    • Machines and subsets
• Built upon standard Linux tools
  • Userspace packet logging (ulogd)
  • Hierarchical Token Bucket (HTB)
  • Mesh & gossip update protocols
  • Integrated with PlanetLab software
[Diagram: input data flows through ulogd for estimation, FPS allocates at a regular interval, and HTB enforces the limit on output data]
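The talk only names the mesh and gossip update protocols. As a loose, hypothetical illustration of the kind of state such updates carry (each limiter's latest demand estimate, merged from peers), consider the sketch below; it is not DRL's actual protocol.

```python
import random

class GossipState:
    """Track the most recent demand estimate heard from each peer limiter.
    A simplified illustration; the real DRL update protocols are more involved."""

    def __init__(self, my_id):
        self.my_id = my_id
        self.demands = {my_id: 0.0}       # limiter id -> last known demand estimate

    def local_update(self, demand):
        self.demands[self.my_id] = demand

    def merge(self, remote_demands):
        # Adopt whatever the peer reports; last writer wins in this sketch.
        self.demands.update(remote_demands)

    def pick_gossip_partner(self, peer_ids):
        # Gossip variant: talk to one random peer per round (a mesh would send to all).
        return random.choice(peer_ids)

# Example round from one limiter's point of view.
a = GossipState("limiter-A")
a.local_update(2)                          # this site currently sees 2 flows
a.merge({"limiter-B": 8})                  # update heard from a peer
print(a.demands)                           # {'limiter-A': 2, 'limiter-B': 8}
```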
Estimation using ulogd
• Userspace logging daemon
  • Already used by PlanetLab for efficient abuse tracking
  • Packets are tagged with a slice ID by iptables
  • Receives outgoing packet headers via a netlink socket
• DRL is implemented as a ulogd plug-in
  • Gives us efficient flow accounting for estimation
  • Executes the Estimate, Allocate, Enforce loop
  • Communicates with other limiters
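The actual flow accounting lives inside the ulogd plug-in (in C). The hypothetical Python sketch below only illustrates what "efficient flow accounting" means here: bucketing outgoing packet headers by 5-tuple, per slice, within one estimate interval. Field names are invented for the example.

```python
from collections import defaultdict

def count_flows(packet_headers):
    """Count distinct flows (5-tuples) per slice from a batch of outgoing
    packet headers observed during one estimate interval."""
    flows_per_slice = defaultdict(set)
    for hdr in packet_headers:
        five_tuple = (hdr["src_ip"], hdr["src_port"],
                      hdr["dst_ip"], hdr["dst_port"], hdr["proto"])
        flows_per_slice[hdr["slice_id"]].add(five_tuple)
    return {slice_id: len(flows) for slice_id, flows in flows_per_slice.items()}

# Example: two packets from the same flow and one from another, all in slice 7.
headers = [
    {"slice_id": 7, "src_ip": "10.0.0.1", "src_port": 4000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "proto": "tcp"},
    {"slice_id": 7, "src_ip": "10.0.0.1", "src_port": 4000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "proto": "tcp"},
    {"slice_id": 7, "src_ip": "10.0.0.1", "src_port": 4001,
     "dst_ip": "10.0.0.3", "dst_port": 80, "proto": "tcp"},
]
print(count_flows(headers))  # -> {7: 2}
```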
Enforcement with Hierarchical Token Bucket
• Linux Advanced Routing & Traffic Control
• Hierarchy of rate limits
• Enforces DRL's rate limit
• Packets attributed to leaves (slices)
• Packets move up the tree, borrowing from parents
[Diagram: an HTB tree (Root with interior nodes and leaf classes) in which a 1500-byte packet at a leaf borrows tokens from its ancestors]
Enforcement with Hierarchical Token Bucket
• Uses the same tree structure as PlanetLab
  • Efficient control of sub-trees
• Limits are updated on every loop iteration
• The root limits the whole node
• Tokens are replenished at each level (a toy version of this borrowing is sketched below)
[Diagram: the same HTB tree, with the root limiting the whole node and sub-trees grouping slices]
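The borrowing behavior in the diagrams can be illustrated with a toy hierarchical token bucket. This is a simplified sketch, not Linux HTB itself, which adds ceil rates, quanta, and fair sharing among borrowers; the rates below are invented for the example.

```python
class TokenBucket:
    """A simplified hierarchical token bucket: each node accumulates tokens at
    its own rate, and when a child runs out it may borrow from its parent."""

    def __init__(self, rate_bytes_per_sec, parent=None):
        self.rate = rate_bytes_per_sec
        self.parent = parent
        self.tokens = 0.0

    def replenish(self, elapsed_seconds):
        # Called for every node at each interval (each level is replenished).
        self.tokens = min(self.tokens + self.rate * elapsed_seconds, self.rate)

    def try_send(self, packet_bytes):
        """Spend local tokens first; otherwise try to borrow from ancestors."""
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        if self.parent is not None and self.parent.try_send(packet_bytes):
            return True
        return False  # no tokens anywhere up the tree: drop or queue the packet

# Example: a root limiting the whole node, with two leaf classes (slices).
root = TokenBucket(rate_bytes_per_sec=125_000)                 # ~1 Mbit/s for the node
slice_a = TokenBucket(rate_bytes_per_sec=25_000, parent=root)
slice_b = TokenBucket(rate_bytes_per_sec=100_000, parent=root)
for bucket in (root, slice_a, slice_b):
    bucket.replenish(1.0)
print(slice_a.try_send(1500))  # True: the leaf has enough tokens of its own
```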
Citadel Site
• The Citadel (2 nodes)
  • Wanted a 1 Mbps traffic limit
  • Added a (horrible) traffic shaper with poor responsiveness (2–15 seconds)
• Running right now!
  • DRL cycles on and off every four minutes
  • Lets us observe DRL's impact without ground truth
[Diagram: the Citadel nodes run DRL in front of the site's traffic shaper]
Citadel Results – Outgoing Traffic
[Plot: outgoing traffic over time against the 1 Mbit/s limit, as DRL cycles off and on]
• Data logged from the running nodes
• Takeaways:
  • Without DRL, traffic is way over the limit
  • One node sends more than the other
Citadel Results – Flow Counts
[Plot: number of flows at each node over time]
• FPS uses flow count as a proxy for demand
Citadel Results – Limits and Weights
[Plot: each node's rate limit and FPS weight over time]
Lessons Learned
• Flow counting is not always the best proxy for demand
  • FPS state transitions were irregular
  • Added checks and dampening/hysteresis in problem cases (sketched below)
• Estimation can happen after enforcement
  • Ulogd only sees packets after HTB has shaped them
  • FPS is forgiving of such software limitations
• HTB is difficult
  • HYSTERESIS variable
  • TCP segmentation offloading
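The dampening code itself is not shown in the talk. The sketch below illustrates one hypothetical form such dampening and hysteresis could take (EWMA smoothing plus a dead band on reported changes); class name and parameter values are made up and are not DRL's actual settings.

```python
class DampedEstimate:
    """Smooth a noisy flow-count signal and ignore small fluctuations,
    so the allocator does not flap between states."""

    def __init__(self, alpha=0.3, deadband=0.1):
        self.alpha = alpha          # EWMA smoothing factor
        self.deadband = deadband    # relative change required before reporting a new value
        self.smoothed = None
        self.reported = None

    def update(self, raw_flow_count):
        # Exponentially weighted moving average of the raw estimate.
        if self.smoothed is None:
            self.smoothed = float(raw_flow_count)
        else:
            self.smoothed += self.alpha * (raw_flow_count - self.smoothed)

        # Hysteresis: only report a change if it exceeds the dead band.
        if (self.reported is None or
                abs(self.smoothed - self.reported) > self.deadband * max(self.reported, 1)):
            self.reported = self.smoothed
        return self.reported
```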
Ongoing work
• Other use cases
• Larger-scale tests
• Complete the PlanetLab administrative interface
• Standalone version
• Continue DRL rollout on PlanetLab
  • UCSD's PlanetLab nodes soon
Questions?
• Code is available from the PlanetLab svn repository:
• http://svn.planet-lab.org/svn/DistributedRateLimiting/