OneLab - PlanetLab Europe: Presentation for Rescom 2007 • Serge Fdida • Thomas Bourgeau • Laboratoire LIP6 – CNRS • Université Pierre et Marie Curie – Paris 6 • http://www.lip6.fr/rp
The OneLab Project (www.one-lab.org) • A Europe-wide project • A STREP (Specific Targeted Research Project) funded by the European Commission under the FP6 funding program: IST-2006-034819 • Start: September 2006, duration: 2 years • Funding: 1.9 M€, total budget: 2.86 M€ • Aims • Extend, deepen, and federate PlanetLab
The OneLab Consortium • Project leader • Université Pierre et Marie Curie (France) • Technical direction • INRIA (France) • Partners • Universidad Carlos III de Madrid (Spain) • Université Catholique de Louvain (Belgium) • Università di Napoli (Italy) • Università di Pisa (Italy) • Alcatel Italia (Italy) • Quantavis (Italy) • Telekomunikacja Polska (Poland) • France Telecom (France)
OneLab Goals • OneLab is a concrete path towards Experimental Facilities • Based on PlanetLab • OneLab will help us better understand federation, which will be key to Experimental Facility success • OneLab will also make considerable progress in • Extending: extend PlanetLab into new environments, beyond the traditional wired internet • Deepening: deepen PlanetLab’s monitoring capabilities • Federating: provide a European administration for PlanetLab nodes in Europe
Outline • PlanetLab • OneLab • PlanetLab Europe • Practice
PlanetLab • An open platform for: • testing overlays, • deploying experimental services, • deploying commercial services, • developing the next generation of internet technologies. • A set of virtual machines • distributed virtualization • each of 350+ network services runs in its own slice
PlanetLab Nodes • Single PLC (PlanetLab Central) located at Princeton • 784 machines spanning 382 sites and 35+ countries • nodes within a LAN-hop of 2M+ users • Administration at Princeton University • Prof. Larry Peterson, six full-time systems administrators
Usage Stats • Slices: 350 - 425 • AS peers: 6000 • Users: 1028 • Bytes-per-day: 2 - 4 TB • Coral CDN represents about half of this • IP-flows-per-day: 190M • Unique IP-addrs-per-day: 1M • Experiments on PlanetLab figure in many papers at major networking conferences
User Opt-in [diagram: a client, behind a NAT, connects to a server running on a PlanetLab node]
Per-Node View [diagram: a Node Manager and a Local Admin VM alongside slice VMs (VM1, VM2, …, VMn), all running on the Virtual Machine Monitor (VMM) over the kernel and hardware] • VMM: currently Linux with Vserver extensions; could eventually be Xen
Per-Node Mechanisms [diagram: Node Mgr, Owner VM, and slice VMs (VM1 … VMn) on top of the Virtual Machine Monitor (VMM); node services include SliverMgr, Proper, pl_scs, pl_mom, PlanetFlow, and SliceStat] • VMM stack: Linux kernel (Fedora Core) + Vservers (namespace isolation) + schedulers (performance isolation) + VNET (network virtualization)
Architecture (1) • Node Operating System • isolate slices • audit behavior • PlanetLab Central (PLC) • remotely manage nodes • bootstrap service to instantiate and control slices • Third-party Infrastructure Services • monitor slice/node health • discover available resources • create and configure a slice • resource allocation
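To make PLC's management role concrete, the sketch below shows how a researcher-side script might talk to PlanetLab Central over its XML-RPC interface to discover nodes and attach a slice to them. It is a minimal illustration assuming the standard PLCAPI endpoint and its GetNodes/AddSliceToNodes calls; the credentials, slice name, and filter fields are placeholders, not values from this presentation.

```python
# Minimal sketch of driving PlanetLab Central (PLC) via its XML-RPC API (PLCAPI).
# Assumes the public PLCAPI endpoint and password-based authentication; the
# account, password, and slice name below are placeholders.
import xmlrpc.client

PLC_API = "https://www.planet-lab.org/PLCAPI/"   # assumed PLCAPI endpoint
auth = {
    "AuthMethod": "password",
    "Username": "researcher@example.org",        # placeholder account
    "AuthString": "secret",                      # placeholder password
}

plc = xmlrpc.client.ServerProxy(PLC_API, allow_none=True)

# Discover nodes currently registered as booted, fetching only a few fields.
nodes = plc.GetNodes(auth, {"boot_state": "boot"}, ["hostname", "site_id"])
hostnames = [n["hostname"] for n in nodes][:10]
print("first booted nodes:", hostnames)

# Ask PLC to bind the slice to those nodes; the per-node managers later pull
# this state from PLC and instantiate the corresponding VMs (slivers).
plc.AddSliceToNodes(auth, "example_myslice", hostnames)
```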
Architecture (2) [diagram: service developers and users interact with a Slice Authority (request a slice, create slices, return new slice ID, identify slice users to resolve abuse, access slices) and with a Management Authority (learn about nodes, auditing data, software updates), which together manage PlanetLab nodes hosted by owners 1 … N]
Architecture (3) [diagram: on each node, the Management Authority (MA), backed by a node database, manages the Owner VM and the NM + VMM for the Node Owner; the Slice Authority (SA), backed by a slice database, drives the slice creation service (SCS) to instantiate the Service Developer's VM]
Requirements 1) Global platform that supports both short-term experiments and long-running services • services must be isolated from each other • performance isolation • name space isolation • multiple services must run concurrently • Distributed Virtualization: each service runs in its own slice, a set of VMs
Requirements 2) Must convince sites to host nodes running code written by unknown researchers • protect the Internet from PlanetLab • Chain of Responsibility: explicit notion of responsibility; trace network activity to the responsible party
Requirements 3) Federation • universal agreement on minimal core (narrow waist) • allow independent pieces to evolve independently • identify principals and trust relationships among them
Trust Relationships [diagram: many sites (Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, …) host many slices (princeton_codeen, harvard_ice, hplabs_donutlab, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …)] • A trusted intermediary (PLC) replaces the N x N pairwise trust relationships that would otherwise be needed
Trust Relationships (cont) [diagram: PLC sits between the Node Owner and the Service Developer (User), with trust relationships 1-4] • 1) PLC expresses trust in a user by issuing it credentials to access a slice • 2) Users trust PLC to create slices on their behalf and to inspect credentials • 3) Owner trusts PLC to vet users and map network activity to the responsible user • 4) PLC trusts owner to keep nodes physically secure
Trust Relationships (cont) [diagram: the Slice Authority and the Management Authority sit between the Node Owner and the Service Developer (User), with trust relationships 1-6] • 1) PLC expresses trust in a user by issuing credentials to access a slice • 2) Users trust PLC to create slices on their behalf and to inspect credentials • 3) Owner trusts PLC to vet users and map network activity to the responsible user • 4) PLC trusts owner to keep nodes physically secure • 5) MA trusts SA to reliably map slices to users • 6) SA trusts MA to provide working VMs
VMM • Linux • significant mind-share • Vserver • scales to hundreds of VMs per node (12MB each) • Scheduling • CPU • fair share per slice (guarantees possible) • link bandwidth • fair share per slice • average rate limit: 1.5Mbps (24-hour bucket size) • peak rate limit: set by each site (100Mbps default) • disk • 5GB quota per slice (limit run-away log files) • memory • no limit
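As a back-of-the-envelope check on what the default link-bandwidth policy means in practice, the short computation below converts the 1.5 Mbps average rate limit with its 24-hour bucket into a daily byte budget; the numbers are the defaults quoted above, and the rest is plain arithmetic.

```python
# Rough daily traffic budget implied by the default per-slice rate limit:
# a 1.5 Mbps average enforced over a 24-hour token bucket.
avg_rate_bps = 1.5e6          # 1.5 Mbps average rate limit
bucket_seconds = 24 * 3600    # 24-hour bucket size

budget_bytes = avg_rate_bps * bucket_seconds / 8
print(f"daily budget ~ {budget_bytes / 1e9:.1f} GB per slice per node")
# -> roughly 16 GB/day, with bursts up to the site's peak rate (100 Mbps default)
```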
VMM (cont) • VNET • socket programs “just work” • including raw sockets • slices should be able to send only… • well-formed IP packets • to non-blacklisted hosts • slices should be able to receive only… • packets related to connections that they initiated (e.g., replies) • packets destined for bound ports (e.g., server requests) • essentially a switching firewall for sockets • leverages Linux's built-in connection tracking modules • also supports virtual devices • standard PF_PACKET behavior • used to connect to a “virtual ISP”
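Because VNET presents ordinary sockets to each slice, unmodified network code runs inside a sliver. The fragment below is a generic Python example (not PlanetLab-specific code) of the two cases VNET permits: receiving traffic on a port the slice has bound, and receiving replies to a connection the slice itself initiated.

```python
# Generic socket code of the kind that "just works" inside a VNET slice:
# packets are delivered only for ports the slice has bound or for
# connections the slice itself initiated.
import socket

# Case 1: bind a port; inbound packets destined for it reach this slice.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 8080))     # port number is arbitrary for illustration
server.listen(5)

# Case 2: initiate an outbound connection; the replies reach this slice.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("example.org", 80))
client.sendall(b"HEAD / HTTP/1.0\r\nHost: example.org\r\n\r\n")
print(client.recv(1024).decode(errors="replace"))
client.close()
```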
Long-Running Services • Content Distribution • CoDeeN: Princeton (serving > 1 TB of data per day) • Coral CDN: NYU • Cobweb: Cornell • Internet Measurement • ScriptRoute: Washington, Maryland • Anomaly Detection & Fault Diagnosis • PIER: Berkeley, Intel • PlanetSeer: Princeton • DHT • Bamboo (OpenDHT): Berkeley, Intel • Chord (DHash): MIT
Services (cont) • Routing • i3: Berkeley • Virtual ISP: Princeton • DNS • CoDNS: Princeton • CoDoNs: Cornell • Storage & Large File Transfer • LOCI: Tennessee • CoBlitz: Princeton • Shark: NYU • Multicast • End System Multicast: CMU • Tmesh: Michigan
Node Manager • SliverMgr • creates VM and sets resource allocations • interacts with… • bootstrap slice creation service (pl_scs) • third-party slice creation & brokerage services (using tickets) • Proper: PRivileged OPERations • grants unprivileged slices access to privileged info • effectively “pokes holes” in the namespace isolation • examples • files: open, get/set flags • directories: mount/unmount • sockets: create/bind • processes: fork/wait/kill
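Proper's real interface is not shown in these slides, so purely as an illustration of the "poke holes in the isolation" idea, the hypothetical sketch below shows a root-context broker that performs one whitelisted privileged operation (reading a root-only file) on behalf of an unprivileged requester over a Unix socket. The names, socket path, and protocol are invented for illustration and are not Proper's API.

```python
# Hypothetical illustration of the Proper idea (NOT Proper's actual API):
# a daemon in the privileged root context accepts requests over a Unix
# socket and performs only whitelisted operations for unprivileged slices.
import os
import socket

SOCK_PATH = "/tmp/privop.sock"          # illustrative path
WHITELIST = {"/var/log/messages"}       # files slices may ask to read

def serve():
    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(SOCK_PATH)
    os.chmod(SOCK_PATH, 0o666)          # let unprivileged contexts connect
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        path = conn.recv(4096).decode().strip()
        if path in WHITELIST:           # enforce the policy, then do the op
            with open(path, "rb") as f:
                conn.sendall(f.read(65536))
        else:
            conn.sendall(b"ERR: operation not permitted\n")
        conn.close()

if __name__ == "__main__":
    serve()
```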
Auditing & Monitoring • PlanetFlow • logs every outbound IP flow on every node • accesses ulogd via Proper • retrieves packet headers, timestamps, context ids (batched) • used to audit traffic • aggregated and archived at PLC • SliceStat • has access to kernel-level / system-wide information • accesses /proc via Proper • used by global monitoring services • used to performance debug services
Infrastructure Services • Brokerage Services • Sirius: Georgia • Bellagio: UCSD, Harvard, Intel • Tycoon: HP • Environment Services • Stork: Arizona • Application Manager: MIT • Monitoring/Discovery Services • CoMon: Princeton • PsEPR: Intel • SWORD: Berkeley • IrisLog: Intel
Outline • PlanetLab • OneLab • PlanetLab Europe • Practice
OneLab Goals • Extend • Extend PlanetLab into new environments, beyond the traditional wired internet. • Deepen • Deepen PlanetLab’s monitoring capabilities. • Federate • Provide a European administration for PlanetLab nodes in Europe.
OneLab Workpackages • WP0 Management (UPMC) • WP1 Operations (UPMC, with INRIA, FT, ALA, TP) • WP2 Integration (INRIA, with UPMC) • WP3 Monitoring (IRC, Intel Research Cambridge, lead) • WP3A Passive monitoring (IRC) • WP3B Topology monitoring (UPMC) • WP4 New Environments (FT lead) • WP4A WiMAX component (UCL) • WP4B UMTS component (UniNa, with ALA) • WP4C Multihomed component (UC3M, with IRC) • WP4D Wireless ad hoc component (FT, with TP) • WP4E Emulation component (UniPi, with UPMC, INRIA) • WP5 Validation (UPMC, with all partners) • WP6 Dissemination (UniNa, with all partners except TP)
PlanetLab Today - A set of end-hosts - A limited view of the underlying network - Built on the wired internet
OneLab Vision for PlanetLab - Reveal the underlying network - Extend into new wired and wireless environments
Why Extend PlanetLab? • Problem: PlanetLab nodes are connected to the traditional wired internet • They are mostly connected to high-performance research networks such as Abilene, GÉANT (DANTE), and the NRENs • These are not representative of the internet as a whole • PlanetLab does not provide access to emerging environments
OneLab’s New Environments • WiMAX (Université Catholique de Louvain) • Install two nodes connected via a commercial WiMAX provider • Nodes on trucks (constrained mobility) • UMTS (Università di Napoli, Alcatel Italia) • Nodes on a UMTS micro-cell run by Alcatel Italia • Wireless ad hoc networks (France Telecom at Lannion) • Nodes in a Wi-Fi mesh network (like ORBIT)
OneLab’s New Environments • Emulated (Università di Pisa) • For emerging wireless technologies • Based on dummynet • Multihomed (Universidad Carlos III de Madrid)
Progress on Extension • Added wireless capabilities to the kernel • Will enable nodes to attach via: WiMAX, UMTS, Wi-Fi • Implementing SHIM-6 multihoming • Nodes connected via IPv6 will be able to choose their paths • Incorporating Wi-Fi emulation into dummynet • Will allow experimentation in scenarios where deployment is difficult (other wireless technologies to follow)
Goal: Deepen Expose the underlying network
Why Deepen PlanetLab? • Problem: PlanetLab provides limited facilities to make applications aware of the underlying network • PlanetLab consists of end-hosts • Routing between nodes is controlled by the internet (This will change with GENI) • Applications must currently make their own measurements
OneLab Monitoring Components • Passive monitoring (Intel Research Cambridge) • Track packets at the routers • Use CoMo boxes placed within DANTE • Active monitoring (Université Pierre et Marie Curie) • Provide a view of the route structure • Increase the scalability of widely distributed traceroute measurement (traceroute@home) • Reduce traceroute deficiencies on load-balanced paths (Paris traceroute) • BGP-guided probing
Progress on Deepening • CoMo is now OneLab-aware and has improved scripting support • CoMo allows one to write scripts to track one’s own packets as they pass measurement boxes within the network • Deploying traceroute@home, a distributed topology-tracing system • Made fundamental improvements to traceroute to correct errors introduced by network load balancing (new tool: Paris traceroute)
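The essence of the load-balancing fix is that probe packets must keep the flow identifier (source/destination addresses and ports) constant, so per-flow load balancers forward every probe along the same path; classic traceroute varies the destination port per probe and therefore straddles paths. The Scapy sketch below illustrates only this fixed-flow probing idea; it is not the Paris traceroute implementation, and the ports and timeouts are arbitrary.

```python
# Fixed-flow UDP probing in the spirit of Paris traceroute (illustration only,
# not the real tool): keep the 5-tuple constant across probes so per-flow
# load balancers keep all probes on a single path, and raise the TTL to
# elicit ICMP time-exceeded messages from successive routers.
# Requires scapy and raw-socket privileges (run as root).
from scapy.all import IP, UDP, ICMP, sr1

def fixed_flow_trace(dst, max_ttl=20, sport=33456, dport=33457):
    for ttl in range(1, max_ttl + 1):
        probe = IP(dst=dst, ttl=ttl) / UDP(sport=sport, dport=dport)
        reply = sr1(probe, timeout=2, verbose=0)
        if reply is None:
            print(f"{ttl:2d}  *")
        elif reply.haslayer(ICMP) and reply[ICMP].type == 11:
            print(f"{ttl:2d}  {reply.src}")          # time-exceeded: a hop
        else:
            print(f"{ttl:2d}  {reply.src}  (done)")  # reached the destination
            break

if __name__ == "__main__":
    fixed_flow_trace("example.org")
```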
Goal: Federate Before: a homogeneous system
Goal: Federate After: a heterogeneous set of systems