
Presentation Transcript


  1. PlanetLab: an open community test-bed for Planetary-Scale Services, a “work in progress” for USITS03. David Culler* UC Berkeley / Intel Research @ Berkeley • with Larry Peterson, Tom Anderson, Mic Bowman, Timothy Roscoe, Brent Chun, Frans Kaashoek, Mike Wawrzoniak, ....

  2. PlanetLab today • 121 nodes at 52 sites in 10 countries, 4 continents, ... • Universities, Internet2, co-los soon • Active and growing research community • Just beginning ... on the way to 1,000 http://www.planet-lab.org USITS PlanetLab

  3. Where did it come from? • Sense of wonder • what would be the next important thing to do in extreme networked systems post-cluster, post-Yahoo, post-Inktomi, post-Akamai, post-Gnutella, post-bubble? • Sense of angst • NRC: “Looking Over the Fence at Networks” • ossified internet (intellectually, infrastructure, system) • next internet likely to emerge as overlay on current one (again) • it will be defined by its services, not its transport • Sense of excitement • new class of services & applications that spread over much of the web • CDNs, P2Ps just the tip of the iceberg • architectural concepts emerging • scalable translation, dist. storage, dist. events, instrumentation, caching, management USITS PlanetLab

  4. key missing element – hands-on experience • Researchers had no vehicle to try out their next n great ideas in this space • Lots of simulations • Lots of emulation on large clusters • Emulab, Millennium, ModelNet • Lots of folks calling their 17 friends before the next deadline • RON testbed • but not the surprises and frustrations of experience at scale to drive innovation USITS PlanetLab

  5. Guidelines (1) • Thousand viewpoints on “the cloud” is what matters • not the thousand servers • not the routers, per se • not the pipes USITS PlanetLab

  6. Guidelines (2) • and you must have the vantage points of the crossroads • primarily co-location centers USITS PlanetLab

  7. Guidelines (3) • Each service needs an overlay covering many points • logically isolated • Many concurrent services and applications • must be able to slice nodes => VM per service • service has a slice across large subset • Must be able to run each service / app over long period to build meaningful workload • traffic capture/generator must be part of facility • Consensus on “a node” more important than “which node” USITS PlanetLab

  8. Guidelines (4) • Test-lab as a whole must be up a lot • global remote administration and management • mission control • redundancy within • Each service will require its own remote management capability • Testlab nodes cannot “bring down” their site • generally not on main forwarding path • proxy path • must be able to extend overlay out to user nodes? • Relationship to firewalls and proxies is key Management, Management, Management USITS PlanetLab

  9. Guidelines (5) • Storage has to be a part of it • edge nodes have significant capacity • Needs a basic well-managed capability • but growing to the seti@home model should be considered at some stage • may be essential for some services USITS PlanetLab

  10. Confluence of Technologies • Cluster-based scalable distribution, remote execution, management, monitoring tools • UCB Millennium, OSCAR, ..., Utah Emulab, ... • CDNs and P2Ps • Gnutella, Kazaa, ... • Proxies routine • Virtual machines & sandboxing • VMWare, Janos, Denali, ... web-host slices (Ensim) • Overlay networks becoming ubiquitous • xBone, RON, Detour ... Akamai, Digital Island, ... • Service composition frameworks • Yahoo, Ninja, .NET, WebSphere, Eliza • Established internet ‘crossroads’ – colos • Web services / utility computing • Authentication infrastructure (grid) • Packet processing (layer 7 switches, NATs, firewalls) • Internet instrumentation The Time is NOW USITS PlanetLab

  11. March 02 “Underground Meeting” • Intel Research • David Culler • Timothy Roscoe • Sylvia Ratnasamy • Gaetano Borriello • Satya (CMU) • Milan Milenkovic • Duke • Amin Vahdat • Jeff Chase • Princeton • Larry Peterson • Randy Wang • Vivek Pai • Rice • Peter Druschel • Utah • Jay Lepreau • CMU • Srini Seshan • Hui Zhang • UCSD • Stefan Savage • Columbia • Andrew Campbell • ICIR • Scott Shenker • Eddie Kohler • Washington • Tom Anderson • Steven Gribble • David Wetherall • MIT • Frans Kaashoek • Hari Balakrishnan • Robert Morris • David Anderson • Berkeley • Ion Stoica • Joe Hellerstein • Eric Brewer • Kubi USITS PlanetLab see http://www.cs.berkeley.edu/~culler/planetlab

  12. Outcome • “Mirror of Dreams” project • K.I.S.S. • Building blocks, not solutions • no big standards, OGSA-like, meta-hyper-supercomputer • Compromise • A basic working testbed in the hand is much better than “exactly my way” in the bush • “just give me a bunch of (virtual) machines spread around the planet ... I’ll take it from there” • small distr. arch team, builders, users USITS PlanetLab

  13. Tension of Dual Roles (design → deploy → measure cycle) • Research testbed • run fixed-scope experiments • large set of geographically distributed machines • diverse & realistic network conditions • Deployment platform for novel services • run continuously • develop a user community that provides realistic workload USITS PlanetLab

  14. Overlapping Phases (timeline 2003-2005, “you are here” near the start) • Build a working “sandbox” of significant scale quickly to catalyze the community • Phase 0: seed • Phase I: get API & interfaces right • Phase II: get underlying architecture and implementation right USITS PlanetLab

  15. Architecture principles • “Slices” as fundamental resource unit • distributed set of (virtual machine) resources • a service runs in a slice • resources allocated / limited per-slice (proc, bw, namespace) • Distributed Resource Control • host controls node, service producer, service consumers • Unbundled Management • provided by basic services (in slices) • instrumentation and monitoring a fundamental service • Application-Centric Interfaces • evolve from what people actually use • Self-obsolescence • everything we build should eventually be replaced by the community • initial centralized services only bootstrap distributed ones USITS PlanetLab

  16. Slice-ability • Each service runs in a slice of PlanetLab • distributed set of resources (network of virtual machines) • allows services to run continuously • VM monitor on each node enforces slices • limits fraction of node resources consumed • limits portion of name spaces consumed • Challenges • global resource discovery • allocation and management • enforcing virtualization • security USITS PlanetLab
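
As a concrete illustration of the slice abstraction on this slide, here is a minimal sketch (class and field names are hypothetical, not PlanetLab's actual API) of a slice as a set of per-node slivers carrying the caps a node's VM monitor would enforce:

```python
from dataclasses import dataclass, field

@dataclass
class Sliver:
    node: str          # hostname of the node hosting this sliver
    cpu_share: float   # fraction of the node's CPU this sliver may consume
    bw_kbps: int       # outbound bandwidth cap
    max_ports: int     # bound on the port-namespace footprint

@dataclass
class Slice:
    name: str
    slivers: dict = field(default_factory=dict)  # node -> Sliver

    def add_node(self, node, cpu_share=0.1, bw_kbps=1000, max_ports=32):
        """Grant this slice a sliver on `node` with the given caps."""
        self.slivers[node] = Sliver(node, cpu_share, bw_kbps, max_ports)

# a service acquires a slice across a subset of nodes
s = Slice("ucb_overlay")
for n in ["planetlab1.example.edu", "planetlab2.example.org"]:
    s.add_node(n)
```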

  17. Unbundled Management • Partition management into orthogonal services • resource discovery • monitoring system health • topology management • manage user accounts and credentials • software distribution and updates • Approach • management services run in their own slice • allow competing alternatives • engineer for innovation (define minimal interfaces) USITS PlanetLab

  18. Distributed Resource Control • At least two interested parties • service producers (researchers) • decide how their services are deployed over available nodes • service consumers (users) • decide what services run on their nodes • At least two contributing factors • fair slice allocation policy • both local and global components (see above) • knowledge about node state • freshest at the node itself USITS PlanetLab

  19. Application-Centric Interfaces • Inherent problems • stable platform versus research into platforms • writing applications for temporary testbeds • integrating testbeds with desktop machines • Approach • adopt popular API (Linux) and evolve implementation • eventually separate isolation and application interfaces • provide generic “shim” library for desktops USITS PlanetLab

  20. Service-Centric Virtualization USITS PlanetLab

  21. Changing VM landscape • VMs for complete desktop env. re-emerging • e.g., VMware • extremely complete, poor scaling • VM sandboxes widely used for web hosting • Ensim, BSD jails, Linux vservers (glunix, ufo, ...) • limited /bin, no /dev, many VMs per physical machine • limit the API for security • Scalable isolation kernels (VMMs) • host multiple OSes on a cleaner VM • Denali, Xen • simple enough to make secure • attack on hosted OS is isolated • Savage/Anderson view: security is the most critical requirement; there has never been a truly secure VM; it can only be secure if it has no bugs ... USITS PlanetLab

  22. How much to virtualize? • enough to deploy the next PlanetLab within a slice on the current one ... • enough network access to build network gateways for overlays • Phase 0: unix process as VM • SILK (Scout in Linux Kernel) to provide resource metering, allocation • Phase 1: sandbox • evolve a constrained, secure API (subset) • Phase 2: small isolation kernel with narrow API • some services built on it directly • host Linux / sandbox on top for legacy services USITS PlanetLab

  23. Slivers of a Slice: long-term plan (figure) • services 1 … n run over guest OSes (Linux, XP, BSD) through an Application Interface • the guest OSes sit on an Isolation Interface exported by an Isolation Kernel (e.g., Denali, Xenoserver, VMWare) running on the hardware USITS PlanetLab

  24. Kickoff to catalyze community • Seeded 100 machines in 42 sites July 02 • avoid machine configuration issues • huge set of administrative concerns • Intel Research, Development, and Operations • UCB Rootstock build distribution tools • boot once from floppy to build local cluster • periodic and manual update with local modification • UCB Ganglia remote monitoring facility • aggregate stats from each site, pulled into a common database • 10 slices (accounts) per site on all machines • authenticate principals (PIs), delegation of access • key pairs stored in PL Central, PIs control which get pushed out • PIs map users to slices • Discovery by web pages • Basic SSH and scripts ... grad students roll what they need USITS PlanetLab
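
The account model on this slide (key pairs held centrally, PIs deciding which keys reach which slices) boils down to a mapping like the following; the data layout and function are illustrative assumptions, not the real PlanetLab Central schema:

```python
# Hypothetical mapping: a PI approves which users' public keys
# belong to which slice.
slice_members = {
    "ucb_ping": ["ssh-rsa AAAA...alice", "ssh-rsa AAAA...bob"],
}

def authorized_keys_for(slice_name):
    """Render the authorized_keys file that would be installed in the
    slice's account on every node the slice spans."""
    return "\n".join(slice_members.get(slice_name, [])) + "\n"

# a node-side updater would write this to the slice account's
# ~/.ssh/authorized_keys, so `ssh ucb_ping@<node>` just works
print(authorized_keys_for("ucb_ping"))
```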

  25. the meta-testbed effect • Emulab / Netbed • boot-your-own-OS doesn’t scale to unaffiliated sites • architecture should permit it virtually • service lives in a slice • offers its own user mgmt, authentication, ... => need to offer virtual machines with a virtual chroot ASAP • RON • need access to raw sockets to build gateways • need safe (restricted) access to raw sockets early • need mount • Hard to put a machine in someone else’s site and give out root. • Architecturally, should not need to do it. => pushed the VServer and SILK agenda and ... federate without losing identity USITS PlanetLab

  26. Current Approach (on to phase I) (figure) • services 1 … n each run in their own Vserver • a combined Isolation and Application Interface sits below them • underneath: Linux + resource isolation + safe raw sockets + instrumentation, on the hardware • plus Ganglia, InforSpec, and ScoutMonitor for monitoring USITS PlanetLab

  27. vServer experience (Brent Chun) • New set of scaling issues: disk footprint • 1581 directories, 28959 files • VM-specific copy-on-write reduced this to 29 MB/VM • copied part: 5.6 MB /etc, 18.6 MB /var • 1000 VMs per disk • Current • 222+ per node • 30-40 secs to create, 10 secs to delete • developing VM preallocation & cache • slice login -> vserver root • Limitations • common OS for all VMs (few calls for multiple OSes) • user-level NFS mount (MIT’s on it) • incomplete self-virtualization • incomplete resource isolation (e.g., buffer cache) • imperfect (but unbroken) kernel security => raised the bar on isolation kernels USITS PlanetLab
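
A quick back-of-the-envelope check of the per-VM disk numbers quoted above:

```python
# back-of-the-envelope check of the disk-footprint numbers on this slide
copied_mb = 5.6 + 18.6              # private per-VM copies of /etc and /var
print(copied_mb)                    # ~24.2 MB of copied state per VM
print(1000 * copied_mb / 1024)      # ~23.6 GB of private state for 1000 VMs
```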

  28. SILK (Princeton) • key elements of the ANets NodeOS in Linux • familiar API • Safe raw sockets • enables network gateways, application overlays • Monitoring • traffic per slice, per node • 5-minute snapshots of bytes sent/recv per slice x node • Isolation and limits • bandwidth • memory soon USITS PlanetLab
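
A rough sketch of the per-slice accounting described above; the data layout is illustrative, not SILK's actual interface. Bytes sent and received are accumulated per (slice, node) and sampled every five minutes:

```python
import time
from collections import defaultdict

# (slice, node) -> byte counters; layout is illustrative only
counters = defaultdict(lambda: {"sent": 0, "recv": 0})

def account(slice_name, node, sent=0, recv=0):
    """Called on every send/receive attributed to a slice on a node."""
    c = counters[(slice_name, node)]
    c["sent"] += sent
    c["recv"] += recv

def snapshot():
    """Emit one sample per (slice, node) pair, e.g. every five minutes."""
    ts = int(time.time())
    return [(ts, s, n, c["sent"], c["recv"]) for (s, n), c in counters.items()]

account("ucb_ping", "planetlab1.example.edu", sent=1200, recv=800)
print(snapshot())
```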

  29. Dynamic Slice Creation (figure) • a Service Manager, an Agent, and a Broker mediate access to nodes N1 … Nm • the flow: submit a resource description, get back candidate nodes, acquire a ticket, then reserve at the nodes and receive a lease USITS PlanetLab
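
A minimal sketch of the exchange in this figure; the class and method names, and which party hands out candidates versus tickets, are assumptions made for illustration:

```python
class Agent:
    def candidates(self, description):
        # nodes that could satisfy the resource description
        return ["planetlab1.example.edu", "planetlab2.example.org"]

class Broker:
    def acquire(self, description):
        # a ticket is a (possibly signed) promise of resources
        return {"ticket": "resource-promise", "desc": description}

class Node:
    def reserve(self, ticket):
        # redeeming the ticket locally yields a time-limited lease
        return {"lease": "sliver-granted", "expires_s": 3600}

def create_slice(description, agent, broker, node_map):
    """Description in, candidates back, ticket acquired, ticket redeemed."""
    ticket = broker.acquire(description)
    wanted = agent.candidates(description)
    return {name: node.reserve(ticket)
            for name, node in node_map.items() if name in wanted}

nodes = {n: Node() for n in ["planetlab1.example.edu", "planetlab2.example.org"]}
print(create_slice({"cpu": 0.1, "bw_kbps": 1000}, Agent(), Broker(), nodes))
```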

  30. BootCD – enabling growth • Constrained Linux booted from CD with networking • knows how to phone home and get a signed script • check signature and run • install • chain boot • reboot with special sshd • register first ... • grow the testbed and use it too http://www.planet-lab.org/joining/ USITS PlanetLab
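
The phone-home step described above follows a familiar pattern: fetch a script and a detached signature, verify against a key trusted by the CD image, and only then execute. A hedged sketch, with placeholder URLs and filenames rather than the real PlanetLab boot protocol:

```python
import subprocess
import urllib.request

BOOT_SERVER = "https://boot.example.org"   # placeholder, not the real server

def fetch(name, dest):
    urllib.request.urlretrieve(f"{BOOT_SERVER}/{name}", dest)

def verify_and_run():
    fetch("bootscript.sh", "/tmp/bootscript.sh")
    fetch("bootscript.sh.sig", "/tmp/bootscript.sh.sig")
    # gpg exits non-zero unless the detached signature matches a key
    # already in the image's trusted keyring; check=True aborts in that case
    subprocess.run(
        ["gpg", "--verify", "/tmp/bootscript.sh.sig", "/tmp/bootscript.sh"],
        check=True,
    )
    subprocess.run(["sh", "/tmp/bootscript.sh"], check=True)
```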

  31. A typical day (1/28) USITS PlanetLab

  32. Run up to SIGCOMM USITS PlanetLab

  33. A Slice for a Month (Duke) (figures: bytes received per day by nodes, bytes sent per day by nodes) USITS PlanetLab

  34. So what are people doing? Ping! USITS PlanetLab

  35. Really... • Internet Instrumentation • DHT – scalable lookup, location • Distributed Storage • User-level Multicast • Distributed CDN, Search, ... • and all of them are doing a lot of pinging, copying, and timing • key aspect of an overlay network is to estimate performance characteristics of each virtual link USITS PlanetLab
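
A minimal sketch of the kind of virtual-link estimation this slide refers to; the probe method here (TCP connect time to a well-known port) is just one illustrative choice, not what any particular service actually does:

```python
import socket
import time

def rtt_estimate(host, port=22, samples=3):
    """Median TCP connect time to host:port, in milliseconds (None if down)."""
    times = []
    for _ in range(samples):
        t0 = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=2.0):
                times.append((time.monotonic() - t0) * 1000.0)
        except OSError:
            pass  # unreachable this round; drop the sample
    times.sort()
    return times[len(times) // 2] if times else None

# build a (partial) latency matrix over the other nodes in a slice
peers = ["planetlab1.example.edu", "planetlab2.example.org"]
latency = {p: rtt_estimate(p) for p in peers}
```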

  36. with the internet in the middle (measurement plots) • scp of 4 MB to MIT, Rice, CIT confirms Padhye SIGCOMM98; 83 machines, 11/1/02, Sean Rhea; basis for DHT comparison • synthetic coordinates over 143 RON+PlanetLab nodes, c/o Frans Kaashoek • i3 “weather service” over 110 machines, c/o Ion Stoica USITS PlanetLab

  37. Analysis of Tapestry (Ben Zhao) • 98 machines, 6-7 Tapestry nodes per machine, all node pairs • ratio of end-to-end routing latency to shortest ping time between nodes • ratio of object location to ping • 10,000 objects per node • figure annotations: median = 31.5, 90th percentile = 135; 90th percentile = 158 USITS PlanetLab

  38. Towards an instrumentation service • every overlay, DHT, and multicast is measuring the internet in the middle • they do it in different ways • they do different things with the data • Can this be abstracted into a customizable instrumentation service? • Share common underlying measurements • Reduce ping, scp load • Grow down into the infrastructure USITS PlanetLab

  39. Ossified or fragile? • One group forgot to turn off an experiment • after 2 weeks of a router being pinged every 2 seconds, the ISP contacted ISI and threatened to shut them down • One group failed to initialize destination addresses and ports (and had many virtual nodes on each of many physical nodes) • worked OK when tested on a LAN • trashed flow caches in routers • probably generated a lot of unreachable-destination traffic • triggered port-scan alarms at ISPs (port 0) • n^2 probe packets trigger other alarms USITS PlanetLab

  40. the Gaetano advice • for this to be successful, it will need the support of network and system administrators at all the sites... • it would be good to start by building tools that made their job easier USITS PlanetLab

  41. ScriptRoute (Spring, Wetherall, Anderson) • Traceroute provides a way to measure from you outward • 100s of traceroute servers have appeared to help debug connectivity problems • very limited functionality • => provide a simple instrumentation sandbox at many sites in the internet • TTL, MTU, BW, congestion, reordering • safe interpreter + network guardian to limit impact • individual and aggregate limits USITS PlanetLab
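
The "network guardian" limits above can be pictured as token buckets, one per script plus one shared aggregate bucket; the rates below are made-up examples, not ScriptRoute's real parameters:

```python
import time

class TokenBucket:
    def __init__(self, rate_bps, burst_bytes):
        self.rate, self.capacity = rate_bps, burst_bytes
        self.tokens, self.last = float(burst_bytes), time.monotonic()

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

aggregate = TokenBucket(rate_bps=50_000, burst_bytes=10_000)  # shared bucket

def guardian_permits(script_bucket, packet_len):
    # both the per-script and the aggregate limit must agree; a real
    # guardian would also refund tokens when only one of the two refuses
    return script_bucket.allow(packet_len) and aggregate.allow(packet_len)

mine = TokenBucket(rate_bps=5_000, burst_bytes=1_500)
print(guardian_permits(mine, 1_000))
```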

  42. Example: reverse trace (figure: tracing the return path between UW and Google) • underlying debate: open, unauthenticated, community measurement infrastructure vs. closed, engineered service • see also Princeton BGP multilateration USITS PlanetLab

  43. Ossified or brittle? • Scriptroute set off several alarms • Low-bandwidth traffic to lots of IP addresses brought routers to a crawl • Lots of small TTLs, but not exactly traceroute packets ... • an ISP installed a filter blocking a subnet at Harvard and sent notice to the network administrator without human intervention • Is innovation still allowed? USITS PlanetLab

  44. NetBait Serendipity • Brent Chun built a simple HTTP server on port 80 to explain what PlanetLab was about and to direct inquiries to planet-lab.org • It also logged requests • Sitting just outside the firewall of ~40 universities ... • the world’s largest honey pot • the number of worm probes from compromised machines was shocking • imagine the epidemiology • see netbait.planet-lab.org USITS PlanetLab

  45. One example • The monthly Code Red cycle in the large? • What happened a little over a week ago? USITS PlanetLab

  46. No, not Iraq • A new voracious worm appeared and displaced the older Code Red USITS PlanetLab

  47. Netbait view of March USITS PlanetLab

  48. DHT Bakeoff • Proliferation of distributed hash tables, content-addressable networks, and distributed object location was a primary driver for PlanetLab • Chord, CAN, Pastry, Tapestry, Kademlia, Viceroy, ... • map a large identifier (160 bits) to an object by routing (in the overlay) to the node responsible for that key • in the presence of concurrent inserts, joins, fails, leaves, ... • Natural for the community to try to resolve the many proposals • Common API to allow for benchmarking (Dabek et al., IPTPS) • Analytical comparisons • Ratnasamy says “rings are good” • Empirical comparisons USITS PlanetLab

  49. Rationalizing Structured P2P Overlays (layered figure) • Tier 2: applications such as CFS, PAST, i3, SplitStream, Bayeux, OceanStore • Tier 1: abstractions over the overlay: DHT (get, put, remove), CAST (join, leave, multicast, anycast), DOLR (publish, unpublish, sendToObj) • Tier 0: Key-based Routing: route(key, msg), plus upcalls and id management USITS PlanetLab
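
A sketch of this layering in code, with method names taken from the slide's labels rather than the exact signatures of the published common API: Tier 0 exposes key-based routing with upcalls, and a Tier 1 DHT implements get/put on top of it:

```python
from abc import ABC, abstractmethod

class KBR(ABC):                       # Tier 0: key-based routing
    @abstractmethod
    def route(self, key, msg):
        """Deliver msg to the node currently responsible for key."""

    @abstractmethod
    def set_upcall(self, handler):
        """Register handler(key, msg), invoked when messages arrive here."""

class DHT:                            # Tier 1: hash table over any KBR
    def __init__(self, kbr):
        self.kbr, self.store = kbr, {}
        kbr.set_upcall(self._deliver)

    def put(self, key, value):
        self.kbr.route(key, {"op": "put", "value": value})

    def get(self, key, reply_to):
        self.kbr.route(key, {"op": "get", "reply_to": reply_to})

    def _deliver(self, key, msg):     # runs on the node responsible for key
        if msg["op"] == "put":
            self.store[key] = msg["value"]
        elif msg["op"] == "get":
            self.kbr.route(msg["reply_to"],
                           {"op": "value", "value": self.store.get(key)})
```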

  50. Empirical Comparison (Rhea, Roscoe, Kubi) • 79 PlanetLab nodes, 400 IDs per node • Performed by the Tapestry side USITS PlanetLab
