PlanetLab: an open community test-bed for Planetary-Scale Services, a "work in progress" for USITS03
David Culler* UC Berkeley / Intel Research @ Berkeley
• with Larry Peterson, Tom Anderson, Mic Bowman, Timothy Roscoe, Brent Chun, Frans Kaashoek, Mike Wawrzoniak, ...
PlanetLab today
• 121 nodes at 52 sites in 10 countries, 4 continents, ...
• Universities, Internet2, co-lo's soon
• Active and growing research community
• Just beginning... on the way to 1,000
http://www.planet-lab.org
Where did it come from?
• Sense of wonder
  • what would be the next important thing to do in extreme networked systems post cluster, post yahoo, post inktomi, post akamai, post gnutella, post bubble?
• Sense of angst
  • NRC: "Looking Over the Fence at Networks"
  • ossified internet (intellectually, infrastructure, system)
  • next internet likely to emerge as overlay on current one (again)
  • it will be defined by its services, not its transport
• Sense of excitement
  • new class of services & applications that spread over much of the web
  • CDNs, P2Ps just the tip of the iceberg
  • architectural concepts emerging
    • scalable translation, dist. storage, dist. events, instrumentation, caching, management
Key missing element – hands-on experience
• Researchers had no vehicle to try out their next n great ideas in this space
• Lots of simulations
• Lots of emulation on large clusters
  • emulab, millennium, modelnet
• Lots of folks calling their 17 friends before the next deadline
  • RON testbed
• but not the surprises and frustrations of experience at scale to drive innovation
Guidelines (1)
• Thousand viewpoints on "the cloud" is what matters
  • not the thousand servers
  • not the routers, per se
  • not the pipes
Guidelines (2)
• and you must have the vantage points of the crossroads
  • primarily co-location centers
Guidelines (3)
• Each service needs an overlay covering many points
  • logically isolated
• Many concurrent services and applications
  • must be able to slice nodes => VM per service
  • service has a slice across a large subset
• Must be able to run each service / app over a long period to build meaningful workload
  • traffic capture/generator must be part of the facility
• Consensus on "a node" more important than "which node"
Guidelines (4)
• Test-lab as a whole must be up a lot
  • global remote administration and management
  • mission control
  • redundancy within
• Each service will require its own remote management capability
• Test-lab nodes cannot "bring down" their site
  • generally not on main forwarding path
  • proxy path
  • must be able to extend overlay out to user nodes?
• Relationship to firewalls and proxies is key
Management, Management, Management
Guidelines (5)
• Storage has to be a part of it
  • edge nodes have significant capacity
• Needs a basic well-managed capability
  • but growing to the seti@home model should be considered at some stage
  • may be essential for some services
Confluence of Technologies
• Cluster-based scalable distribution, remote execution, management, monitoring tools
  • UCB Millennium, OSCAR, ..., Utah Emulab, ...
• CDNs and P2Ps
  • Gnutella, Kazaa, ...
• Proxies routine
• Virtual machines & sandboxing
  • VMWare, Janos, Denali, ..., web-host slices (EnSim)
• Overlay networks becoming ubiquitous
  • xBone, RON, Detour, ..., Akamai, Digital Island, ...
• Service composition frameworks
  • yahoo, ninja, .net, websphere, Eliza
• Established internet 'crossroads' – colos
• Web Services / Utility Computing
• Authentication infrastructure (grid)
• Packet processing (layer 7 switches, NATs, firewalls)
• Internet instrumentation
The Time is NOW
March 02 "Underground Meeting"
• Intel Research: David Culler, Timothy Roscoe, Sylvia Ratnasamy, Gaetano Borriello, Satya (CMU), Milan Milenkovic
• Duke: Amin Vahdat, Jeff Chase
• Princeton: Larry Peterson, Randy Wang, Vivek Pai
• Rice: Peter Druschel
• Utah: Jay Lepreau
• CMU: Srini Seshan, Hui Zhang
• UCSD: Stefan Savage
• Columbia: Andrew Campbell
• ICIR: Scott Shenker, Eddie Kohler
• Washington: Tom Anderson, Steven Gribble, David Wetherall
• MIT: Frans Kaashoek, Hari Balakrishnan, Robert Morris, David Andersen
• Berkeley: Ion Stoica, Joe Hellerstein, Eric Brewer, Kubi
see http://www.cs.berkeley.edu/~culler/planetlab
Outcome
• "Mirror of Dreams" project
• K.I.S.S.
• Building blocks, not solutions
  • no big standards, OGSA-like, meta-hyper-supercomputer
• Compromise
  • a basic working testbed in the hand is much better than "exactly my way" in the bush
  • "just give me a bunch of (virtual) machines spread around the planet, ... I'll take it from there"
• small distr. arch team, builders, users
Tension of Dual Roles
[diagram: design / deploy / measure cycle]
• Research testbed
  • run fixed-scope experiments
  • large set of geographically distributed machines
  • diverse & realistic network conditions
• Deployment platform for novel services
  • run continuously
  • develop a user community that provides realistic workload
Overlapping Phases
[timeline: 2003, 2004, 2005, with a "YOU ARE HERE" marker near the start]
• 0. seed: build a working "sandbox" of significant scale quickly to catalyze the community
• I. get API & interfaces right
• II. get underlying arch. and impl. right
Architecture principles
• "Slices" as fundamental resource unit
  • distributed set of (virtual machine) resources
  • a service runs in a slice
  • resources allocated / limited per-slice (proc, bw, namespace)
• Distributed Resource Control
  • host controls node, service producer, service consumers
• Unbundled Management
  • provided by basic services (in slices)
  • instrumentation and monitoring a fundamental service
• Application-Centric Interfaces
  • evolve from what people actually use
• Self-obsolescence
  • everything we build should eventually be replaced by the community
  • initial centralized services only bootstrap distributed ones
Slice-ability
• Each service runs in a slice of PlanetLab
  • distributed set of resources (network of virtual machines)
  • allows services to run continuously
• VM monitor on each node enforces slices
  • limits fraction of node resources consumed
  • limits portion of name spaces consumed
• Challenges
  • global resource discovery
  • allocation and management
  • enforcing virtualization
  • security
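To make the "limits fraction of node resources" idea concrete, here is a minimal sketch of per-slice accounting on one node. The class names, limit fields, and admission rule are illustrative assumptions, not PlanetLab's actual node-manager interface.

```python
# Hypothetical sketch of per-slice resource accounting on a single node.
from dataclasses import dataclass

@dataclass
class SliceLimits:
    cpu_share: float      # fraction of node CPU (0..1)
    bandwidth_kbps: int   # outbound bandwidth cap
    max_ports: int        # slice's share of the port namespace

class NodeMonitor:
    def __init__(self, total_cpu: float = 1.0):
        self.total_cpu = total_cpu
        self.slices = {}

    def admit(self, name: str, limits: SliceLimits) -> bool:
        """Admit a sliver only if CPU shares still sum to <= 100%."""
        committed = sum(s.cpu_share for s in self.slices.values())
        if committed + limits.cpu_share > self.total_cpu:
            return False
        self.slices[name] = limits
        return True

mon = NodeMonitor()
assert mon.admit("ucb_ping", SliceLimits(0.1, 500, 64))
assert not mon.admit("greedy", SliceLimits(0.95, 10_000, 1024))
```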
Unbundled Management
• Partition management into orthogonal services
  • resource discovery
  • monitoring system health
  • topology management
  • manage user accounts and credentials
  • software distribution and updates
• Approach
  • management services run in their own slice
  • allow competing alternatives
  • engineer for innovation (define minimal interfaces)
Distributed Resource Control
• At least two interested parties
  • service producers (researchers)
    • decide how their services are deployed over available nodes
  • service consumers (users)
    • decide what services run on their nodes
• At least two contributing factors
  • fair slice allocation policy
    • both local and global components (see above)
  • knowledge about node state
    • freshest at the node itself
Application-Centric Interfaces
• Inherent problems
  • stable platform versus research into platforms
  • writing applications for temporary testbeds
  • integrating testbeds with desktop machines
• Approach
  • adopt popular API (Linux) and evolve implementation
  • eventually separate isolation and application interfaces
  • provide generic "shim" library for desktops
Service-Centric Virtualization
Changing VM landscape
• VMs for complete desktop environments re-emerging
  • e.g., VMware
  • extremely complete, poor scaling
• VM sandboxes widely used for web hosting
  • ensim, BSD Jail, linux vservers (glunix, ufo, ...)
  • limited /bin, no /dev, many VMs per physical machine
  • limit the API for security
• Scalable isolation kernels (VMMs)
  • host multiple OS's on a cleaner VM
  • Denali, Xen
  • simple enough to make secure
  • attack on hosted OS is isolated
Savage/Anderson view: security is the most critical requirement; there has never been a truly secure VM; it can only be secure if it has no bugs...
How much to virtualize?
• enough to deploy the next planet-lab within a slice on the current one...
• enough network access to build network gateways for overlays
• Phase 0: unix process as VM
  • SILK (Scout in Linux Kernel) to provide resource metering, allocation
• Phase 1: sandbox
  • evolve a constrained, secure API (subset)
• Phase 2: small isolation kernel with narrow API
  • some services built on it directly
  • host linux / sandbox on top for legacy services
Slivers of a Slice: long-term plan
[diagram: services 1..n run against an Application Interface on guest OSes (Linux, XP, BSD); those sit on an Isolation Interface provided by an isolation kernel (Denali, Xenoserver, VMWare) over the hardware]
Kickoff to catalyze community
• Seeded 100 machines in 42 sites July 02
  • avoid machine configuration issues
  • huge set of administrative concerns
  • Intel Research, Development, and Operations
• UCB Rootstock build distribution tools
  • boot once from floppy to build local cluster
  • periodic and manual update with local modification
• UCB Ganglia remote monitoring facility
  • aggregate stats from each site, pull into common database
• 10 slices (accounts) per site on all machines
  • authenticate principals (PIs), delegation of access
  • key pairs stored in PL Central, PIs control which get pushed out
  • PIs map users to slices
• Discovery by web pages
• Basic SSH and scripts ... grad students roll what they need
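A rough sketch of the flavor of this bootstrap model: a central database holds the public keys a PI has approved for each slice, and each node periodically pulls those keys into the slice account's authorized_keys so that "slice login" is just ssh. The paths, slice names, and key strings are hypothetical; this is not the actual PL Central mechanism.

```python
# Illustrative only: push a central slice -> approved-keys map out to
# per-slice Unix accounts on a node.
import os

# Hypothetical export from PL Central: slice name -> approved public keys
SLICE_KEYS = {
    "ucb_ping": ["ssh-rsa AAAA...pi_key", "ssh-rsa AAAA...grad_key"],
}

def refresh_authorized_keys(home_root: str = "/home") -> None:
    for slice_name, keys in SLICE_KEYS.items():
        ssh_dir = os.path.join(home_root, slice_name, ".ssh")
        os.makedirs(ssh_dir, exist_ok=True)
        with open(os.path.join(ssh_dir, "authorized_keys"), "w") as f:
            f.write("\n".join(keys) + "\n")

if __name__ == "__main__":
    refresh_authorized_keys()
```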
The meta-testbed effect
• Emulab / Netbed
  • boot-your-own-OS doesn't scale to unaffiliated sites
  • architecture should permit it virtually
    • service lives in a slice
    • offers its own user mgmt, authentication, ...
  => need to offer virtual machine with virtual chroot ASAP
• RON
  • need access to raw sockets to build gateways
    • need safe (restricted) access to raw sockets early
  • need mount
• Hard to put a machine in someone else's site and give out root.
  • Architecturally, should not need to do it.
=> pushed VServer and SILK agenda
and ... federate without losing identity
Current Approach (on to phase I)
[diagram: services 1..n each run in their own Vserver over a combined isolation and application interface: Linux + resource isolation + safe raw sockets + instrumentation (Ganglia, InfoSpect, ScoutMonitor) on the hardware]
vServer experience (Brent Chun)
• New set of scaling issues: disk footprint
  • 1581 directories, 28959 files
  • VM-specific copy-on-write reduced the footprint to ~29 MB/vm
    • copied part: 5.6 MB /etc, 18.6 MB /var
  • 1000 VMs per disk
• Current
  • 222+ per node
  • 30-40 secs create, 10 secs delete
    • developing VM preallocate & cache
  • slice login -> vserver root
• Limitations
  • common OS for all VMs (few calls for multiple OS's)
  • user-level NFS mount (MIT's on it)
  • incomplete self-virtualization
  • incomplete resource isolation (e.g. buffer cache)
  • imperfect (but unbroken) kernel security
=> raised the bar on isolation kernels
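The flavor of that copy-on-write trick: shared, unmodified files are hard-linked from a reference root filesystem, and only mutable trees such as /etc and /var are physically copied per VM, which is where the ~24 MB of copied data comes from. A simplified sketch under those assumptions (not the actual vserver unification tool):

```python
# Sketch of hard-link-based provisioning for a new VM root filesystem.
# Paths and the set of "copied" trees are illustrative assumptions.
import os
import shutil

COPIED_TREES = {"etc", "var"}   # mutable per-VM state gets a real copy

def make_vm_root(reference_root: str, vm_root: str) -> None:
    for dirpath, _dirnames, filenames in os.walk(reference_root):
        rel = os.path.relpath(dirpath, reference_root)
        top = rel.split(os.sep, 1)[0]
        dest_dir = vm_root if rel == "." else os.path.join(vm_root, rel)
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            if top in COPIED_TREES:
                shutil.copy2(src, dst)   # private copy: the few MB of /etc and /var
            else:
                os.link(src, dst)        # shared: costs a directory entry, not disk blocks

# make_vm_root("/vservers/reference", "/vservers/slice42")
```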
SILK (Princeton)
• key elements of the ANets NodeOS in linux
  • familiar API
• Safe raw sockets
  • enables network gateways, application overlays
• Monitoring
  • traffic per slice, per node
  • 5 min snapshots of bytes sent/recv per slice x node
• Isolation and limits
  • bandwidth
  • memory soon
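For context, this is what an overlay gateway wants to do with a raw socket. On stock Linux the snippet below needs root (CAP_NET_RAW); SILK's safe raw sockets give a slice a restricted, unprivileged equivalent limited to its own traffic. A minimal sketch:

```python
# Standard Linux raw ICMP socket: receives full IP packets, needs root.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
while True:
    packet, (src, _) = sock.recvfrom(2048)
    ip_header_len = (packet[0] & 0x0F) * 4   # IHL field, in 32-bit words
    icmp_type = packet[ip_header_len]        # first byte of the ICMP header
    print(f"ICMP type {icmp_type} from {src}")
```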
Dynamic Slice Creation
[diagram: a service manager gets candidate nodes from an agent (description -> candidates), acquires a ticket from a broker (description -> ticket), and redeems the ticket at nodes N1..Nm to reserve resources and obtain a lease]
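A rough sketch of that ticket-then-lease flow. The class and message names are invented for illustration and do not reflect the actual PlanetLab resource-control interfaces:

```python
# Illustrative ticket/lease flow: discover candidates, acquire a ticket
# from a broker, then redeem it at each node to reserve a sliver.
from dataclasses import dataclass
import time
import uuid

@dataclass
class Ticket:
    slice_name: str
    nodes: list
    expires: float

class Broker:
    def acquire(self, slice_name: str, candidates: list, ttl: int = 3600) -> Ticket:
        # A real broker would apply allocation policy; here every request is granted.
        return Ticket(slice_name, candidates, time.time() + ttl)

class Node:
    def __init__(self, name: str):
        self.name, self.leases = name, {}

    def reserve(self, ticket: Ticket) -> str:
        assert time.time() < ticket.expires and self.name in ticket.nodes
        lease_id = str(uuid.uuid4())
        self.leases[lease_id] = ticket.slice_name   # the sliver now exists locally
        return lease_id

nodes = [Node(f"N{i}") for i in range(1, 5)]
candidates = [n.name for n in nodes]                # the "agent" step, elided
ticket = Broker().acquire("ucb_ping", candidates)
leases = {n.name: n.reserve(ticket) for n in nodes}
```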
BootCD – enabling growth
• Constrained linux booted from CD with networking
• Knows how to phone home and get a signed script
  • check signature and run
  • install
  • chain boot
  • reboot with special sshd
• register first...
• grow the testbed and use it too
http://www.planet-lab.org/joining/
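The gist of the "phone home, check signature, run" path, sketched under the assumption of a GPG-style detached signature verified against a key baked into the CD image. The URL and filenames are placeholders, not the real boot-server layout:

```python
# Sketch: fetch a boot script plus detached signature, verify it, then run it.
import subprocess
import urllib.request

BOOT_SERVER = "https://boot.example.org"   # placeholder, not the real server

def fetch(name: str) -> str:
    path = "/tmp/" + name
    urllib.request.urlretrieve(f"{BOOT_SERVER}/{name}", path)
    return path

script = fetch("bootscript.sh")
sig = fetch("bootscript.sh.sig")

# gpg exits non-zero if the signature doesn't match a key in the CD's keyring,
# so check=True aborts the boot before anything untrusted is executed.
subprocess.run(["gpg", "--verify", sig, script], check=True)
subprocess.run(["/bin/sh", script], check=True)
```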
A typical day (1/28)
Run up to SIGCOMM
A Slice for a Month (Duke)
[plots: bytes received per day by nodes; bytes sent per day by nodes]
So what are people doing? Ping!
Really...
• Internet Instrumentation
• DHT – scalable lookup, location
• Distributed Storage
• User-level Multicast
• Distributed CDN, Search, ...
• and all of them are doing a lot of pinging, copying, and timing
  • key aspect of an overlay network is to estimate performance characteristics of each virtual link
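What "pinging and timing each virtual link" boils down to, as a minimal sketch: time a TCP connection setup to each overlay peer and treat it as a latency estimate for that virtual link. The peer list and port are illustrative placeholders standing in for the per-protocol probes each overlay actually uses.

```python
# Estimate per-virtual-link latency by timing TCP connect to each peer.
import socket
import time

PEERS = ["planetlab1.example.edu", "planetlab2.example.org"]  # hypothetical
PORT = 12345                                                  # hypothetical

def probe(host: str, port: int = PORT, timeout: float = 2.0):
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0   # RTT estimate in ms
    except OSError:
        return None                                       # link considered down

link_estimates = {peer: probe(peer) for peer in PEERS}
```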
With the internet in the middle
[plots: scp of 4 MB to MIT, Rice, CIT confirming Padhye SIGCOMM98; 83 machines, 11/1/02, Sean Rhea, basis for DHT comparison; synthetic coordinates over 143 RON+PlanetLab nodes, c/o Frans Kaashoek; i3 weather service over 110 machines, c/o Ion Stoica]
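For reference, the model those transfer measurements were checked against is the steady-state TCP throughput approximation from Padhye et al. (SIGCOMM '98). Stated here in its commonly quoted form (not taken from the slides), with $p$ the loss rate, $b$ the packets acknowledged per ACK, $T_0$ the retransmission timeout, and $W_{\max}$ the receiver window:

```latex
B(p) \;\approx\; \min\!\left(\frac{W_{\max}}{RTT},\;
  \frac{1}{RTT\sqrt{\tfrac{2bp}{3}} \;+\;
        T_0\,\min\!\left(1,\,3\sqrt{\tfrac{3bp}{8}}\right) p\,(1+32p^2)}\right)
```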
Analysis of Tapestry (Ben Zhao)
• 98 machines, 6-7 Tapestry nodes per machine, all node pairs
• Ratio of end-to-end routing latency to shortest ping time between nodes
• Ratio of object location latency to ping
  • 10,000 objects per node
[plots: median = 31.5, 90th percentile = 135; 90th percentile = 158]
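A sketch of how such a stretch metric is tallied from raw measurements: divide the overlay's end-to-end latency by the direct ping latency for the same node pair, then summarize. The input data structures here are toy placeholders; only the ratio-and-percentile arithmetic matches what the slide summarizes.

```python
# Routing stretch: overlay latency / direct (shortest-ping) latency per pair.
from statistics import median, quantiles

def stretch_summary(overlay_ms: dict, ping_ms: dict):
    ratios = [overlay_ms[pair] / ping_ms[pair]
              for pair in overlay_ms if ping_ms.get(pair)]
    p90 = quantiles(ratios, n=10)[-1]        # 90th percentile cut point
    return median(ratios), p90

# Toy input: (src, dst) -> milliseconds
overlay = {("A", "B"): 140.0, ("A", "C"): 90.0, ("B", "C"): 210.0}
direct  = {("A", "B"): 70.0,  ("A", "C"): 60.0, ("B", "C"): 70.0}
print(stretch_summary(overlay, direct))      # (median stretch, 90th-pct stretch)
```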
Towards an instrumentation service
• every overlay, DHT, and multicast is measuring the internet in the middle
  • they do it in different ways
  • they do different things with the data
• Can this be abstracted into a customizable instrumentation service?
  • share common underlying measurements
  • reduce ping, scp load
  • grow down into the infrastructure
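One way to read "share common underlying measurements" is a node-local cache that many slices query instead of each sending its own probes. A minimal sketch with invented names; the slide proposes the abstraction, not this particular design:

```python
# Shared measurement cache: a probe result is reused by any slice that
# asks for the same (kind, target) within a freshness window.
import time

class MeasurementService:
    def __init__(self, prober, max_age: float = 300.0):
        self.prober, self.max_age, self.cache = prober, max_age, {}

    def measure(self, kind: str, target: str):
        key = (kind, target)
        ts, value = self.cache.get(key, (0.0, None))
        if time.time() - ts > self.max_age:   # stale or missing: probe once
            value = self.prober(kind, target)
            self.cache[key] = (time.time(), value)
        return value

svc = MeasurementService(prober=lambda kind, tgt: 42.0)   # stub prober
svc.measure("rtt", "planetlab1.example.edu")   # triggers one probe
svc.measure("rtt", "planetlab1.example.edu")   # served from the cache
```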
Ossified or fragile?
• One group forgot to turn off an experiment
  • after 2 weeks of a router being pinged every 2 seconds, the ISP contacted ISI and threatened to shut them down
• One group failed to initialize destination addresses and ports (and had many virtual nodes on each of many physical nodes)
  • worked OK when tested on a LAN
  • trashed flow-caches in routers
  • probably generated a lot of unreachable-destination traffic
  • triggered port-scan alarms at ISPs (port 0)
  • n^2 probe packets trigger other alarms
The Gaetano advice
• for this to be successful, it will need the support of network and system administrators at all the sites...
• it would be good to start by building tools that made their job easier
ScriptRoute (Spring, Wetherall, Anderson)
• Traceroute provides a way to measure from you out
  • 100s of traceroute servers have appeared to help debug connectivity problems
  • very limited functionality
• => provide a simple instrumentation sandbox at many sites in the internet
  • TTL, MTU, BW, congestion, reordering
• safe interpreter + network guardian to limit impact
  • individual and aggregate limits
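The "network guardian" idea can be pictured as a token bucket that charges every probe against both a per-script and an aggregate node-wide budget. The rates below are invented for illustration, not ScriptRoute's actual limits:

```python
# Token-bucket guardian: each probe must fit both the script's budget and
# the node-wide budget before it is allowed onto the network.
import time

class TokenBucket:
    def __init__(self, rate_pps: float, burst: float):
        self.rate, self.capacity = rate_pps, burst
        self.tokens, self.last = burst, time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

node_budget = TokenBucket(rate_pps=100, burst=200)    # aggregate limit
script_budget = TokenBucket(rate_pps=10, burst=20)    # per-script limit

def guarded_send(send_probe):
    if script_budget.allow() and node_budget.allow():
        send_probe()
    # otherwise the probe is dropped (or the script throttled)
```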
Example: reverse trace
[diagram: measuring the reverse path between UW and Google]
• underlying debate: open, unauthenticated, community measurement infrastructure vs closed, engineered service
• see also Princeton BGP multilateration
Ossified or brittle?
• Scriptroute set off several alarms
  • low-bandwidth traffic to lots of IP addresses brought routers to a crawl
  • lots of small TTLs, but not exactly traceroute packets...
  • an ISP installed a filter blocking a subnet at Harvard and sent notice to the network administrator without human intervention
• Is innovation still allowed?
NetBait Serendipity
• Brent Chun built a simple http server on port 80 to explain what PlanetLab was about and to direct inquiries to planet-lab.org
  • it also logged requests
• Sitting just outside the firewall of ~40 universities...
  • the world's largest honey pot
• the number of worm probes from compromised machines was shocking
  • imagine the epidemiology
• see netbait.planet-lab.org
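The flavor of that accidental honeypot: a few lines of standard-library HTTP server that serve an explanation page while logging every request path, which is where worm signatures (e.g. Code Red's /default.ida probes) show up. A sketch, not NetBait's actual implementation; the page text and log path are placeholders.

```python
# Minimal explanation page on port 80 that doubles as a probe logger.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"This machine is part of the PlanetLab testbed: see planet-lab.org\n"

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Worm probes (e.g. requests for /default.ida...) land here too.
        with open("/var/log/netbait-requests.log", "a") as log:
            log.write(f"{self.client_address[0]} {self.path}\n")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(PAGE)

HTTPServer(("", 80), Handler).serve_forever()   # binding port 80 needs privilege
```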
One example
• The monthly Code Red cycle in the large?
• What happened a little over a week ago?
No, not Iraq
• A new, voracious worm appeared and displaced the older Code Red
Netbait view of March
DHT Bakeoff
• Proliferation of distributed hash tables, content-addressable networks, and distributed object location was a primary driver for PlanetLab
  • chord, can, pastry, tapestry, Kademlia, viceroy, ...
  • map a large identifier (160 bits) to an object by routing (in the overlay) to the node responsible for that key
  • in the presence of concurrent inserts, joins, fails, leaves, ...
• Natural for the community to try to resolve the many proposals
  • Common API to allow for benchmarking (Dabek et al., IPTPS)
  • Analytical comparisons
    • Ratnasamy says "rings are good"
  • Empirical comparisons
Rationalizing Structured P2P Overlays
• Tier 2: CFS, PAST, i3, SplitStream, Bayeux, OceanStore
• Tier 1: DHT (get, put, remove), CAST (join, leave, multicast, anycast), DOLR (publish, unpublish, sendToObj)
• Tier 0: Key-based Routing: route(key, msg) + upcalls and id mgmt
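Roughly what the Tier-0 interface looks like when written down. This sketch paraphrases the common-API idea from the slide (route plus upcalls) in Python rather than quoting the exact signatures from Dabek et al.:

```python
# Tier 0, key-based routing: an application hands a message and a 160-bit
# key to the overlay; the overlay routes it and invokes upcalls en route.
from abc import ABC, abstractmethod

class KBRApplication(ABC):
    @abstractmethod
    def deliver(self, key: int, msg: bytes) -> None:
        """Upcall at the node responsible for `key` (the root)."""

    @abstractmethod
    def forward(self, key: int, msg: bytes, next_hop: str) -> bool:
        """Upcall at each intermediate hop; return False to drop the message."""

class KBR(ABC):
    @abstractmethod
    def route(self, key: int, msg: bytes, hint: str = None) -> None:
        """Route `msg` toward the node currently responsible for `key`."""

    @abstractmethod
    def local_lookup(self, key: int, num: int) -> list:
        """Return up to `num` candidate next hops from the local routing table."""
```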
Empirical Comparison (Rhea, Roscoe, Kubi)
• 79 PlanetLab nodes, 400 ids per node
• Performed by the Tapestry side