PlanetLab Architecture Larry Peterson Princeton University
Need to define the PlanetLab Architecture

Issues
• Multiple VM Types
  – Linux vservers, Xen domains
• Federation
  – EU, Japan, China
• Resource Allocation
  – Policy, markets
• Infrastructure Services
  – Delegation
Key Architectural Ideas
• Distributed virtualization
  – slice = set of virtual machines
• Unbundled management
  – infrastructure services run in their own slice
• Chain of responsibility
  – account for behavior of third-party software
  – manage trust relationships
Trust Relationships
• Sites: Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …
• Slices: princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …
• PLC acts as a trusted intermediary between the N sites and the N slices, so each party maintains one relationship with PLC rather than N x N pairwise relationships
Principals
• Node Owners
  – host one or more nodes (retain ultimate control)
  – select an MA and approve one or more SAs
• Service Providers (Developers)
  – implement and deploy network services
  – responsible for the service's behavior
• Management Authority (MA)
  – installs and maintains software on nodes
  – creates VMs and monitors their behavior
• Slice Authority (SA)
  – registers service providers
  – creates slices and binds them to a responsible provider
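The four principals and their bindings can be sketched as a small data model. This is purely illustrative: the class and field names below are assumptions, not part of any real PlanetLab API.

```python
# Hypothetical sketch of the four PlanetLab principals and their bindings.
# All names here are illustrative, not a real PlanetLab schema.
from dataclasses import dataclass, field

@dataclass
class ManagementAuthority:      # installs/maintains node software, creates VMs
    name: str

@dataclass
class SliceAuthority:           # registers providers, creates slices
    name: str
    providers: list = field(default_factory=list)

@dataclass
class NodeOwner:                # retains ultimate control of its nodes
    site: str
    ma: ManagementAuthority     # owner selects an MA ...
    approved_sas: list = field(default_factory=list)   # ... and approves SAs

@dataclass
class ServiceProvider:          # deploys services, answers for their behavior
    name: str

plc_ma = ManagementAuthority("PLC")
plc_sa = SliceAuthority("PLC")
owner = NodeOwner("princeton", ma=plc_ma, approved_sas=[plc_sa])
provider = ServiceProvider("princeton_codeen")
plc_sa.providers.append(provider)
```

In the classic deployment, PLC plays both the MA and SA roles, which is why a single entity appears twice above.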
Trust Relationships
(1) Owner trusts MA to map network activity to a responsible slice
(2) Owner trusts SA to map each slice to its responsible providers
(3) Provider trusts SA to create VMs on its behalf
(4) Provider trusts MA to provide working VMs and not falsely accuse it
(5) SA trusts provider to deploy responsible services
(6) MA trusts owner to keep nodes physically secure
Architectural Elements
[Diagram: the MA maintains a node database and supplies the NM + VMM running on the owner's node; the SA maintains a slice database and an SCS; the service provider's VMs run on the node alongside the SCS VM]
Narrow Waist
• Name space for slices
  <slice_authority, slice_name>
• Node Manager Interface
  rspec = <vm_type = linux_vserver, cpu_share = 32, mem_limit = 128MB, disk_quota = 5GB, base_rate = 1Kbps, burst_rate = 100Mbps, sustained_rate = 1.5Mbps>
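The two halves of the narrow waist can be written out concretely. The field names below follow the slide; the tuple/dict representation is an assumption for illustration, not the actual wire format.

```python
# Illustrative encoding of the "narrow waist": a globally unique slice name
# plus an rspec handed to the Node Manager. Representation is assumed.
slice_name = ("plc", "princeton_codeen")   # <slice_authority, slice_name>

rspec = {
    "vm_type": "linux_vserver",
    "cpu_share": 32,            # proportional scheduler shares
    "mem_limit": "128MB",
    "disk_quota": "5GB",
    "base_rate": "1Kbps",       # guaranteed bandwidth floor
    "burst_rate": "100Mbps",    # short-term peak
    "sustained_rate": "1.5Mbps" # long-term average cap
}
```

Keeping the interface this small is what lets multiple VM types (vservers, Xen domains) and multiple slice authorities coexist above and below it.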
Node Boot/Install Process (Node / Boot Manager / PLC Boot Server)
1. Node boots from BootCD (Linux loaded)
2. Hardware initialized
3. Network config read from floppy
4. Node contacts PLC (MA)
5. PLC Boot Server sends boot manager
6. Node executes boot manager
7. Node key read into memory from floppy
8. Boot manager invokes Boot API
9. PLC verifies node key, sends current node state
10. If state = "install", run installer
11. Installer updates node state via Boot API
12. PLC verifies node key, changes state to "boot"
13. Chain-boot node (no restart)
14. Node booted
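The PLC side of steps 9-13 is essentially a small state machine keyed on the node's recorded state. The state names ("install", "boot") come from the slide; the function and transition table below are an illustrative assumption.

```python
# Minimal sketch of the boot state machine implied by the steps above.
# "install" and "boot" are from the slide; everything else is assumed.
def next_action(node_state: str) -> str:
    """What the boot manager does after PLC verifies the node key."""
    if node_state == "install":
        # Fresh or reinstalled node: run the installer, then the installer
        # reports back via the Boot API and PLC flips the state to "boot".
        return "run installer, then set state to 'boot'"
    if node_state == "boot":
        # Already-installed node: chain-boot into production, no restart.
        return "chain-boot node without restart"
    raise ValueError(f"unknown node state: {node_state!r}")
```

Verifying the per-node key on every Boot API call is what keeps a stolen BootCD from impersonating another site's node.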
PlanetFlow
• Logs every outbound IP flow on every node
  – accesses ulogd via Proper
  – retrieves packet headers, timestamps, context ids (batched)
  – used to audit traffic
• Aggregated and archived at PLC
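A PlanetFlow record pairs each outbound flow with the context (slice) id that generated it, which is what makes auditing possible. The record fields and aggregation below are an illustrative sketch, not ulogd's actual schema.

```python
# Hedged sketch of PlanetFlow-style records: one entry per outbound flow,
# tagged with the slice's context id. Field names are assumptions.
from collections import Counter

flows = [
    {"ts": 1096588800, "src": "128.112.139.1", "dst": "18.7.22.69",
     "proto": "tcp", "context_id": "princeton_codeen"},
    {"ts": 1096588801, "src": "128.112.139.1", "dst": "128.2.35.50",
     "proto": "udp", "context_id": "cmu_esm"},
    {"ts": 1096588802, "src": "128.112.139.1", "dst": "18.7.22.69",
     "proto": "tcp", "context_id": "princeton_codeen"},
]

# The kind of aggregation PLC might do centrally: flows per slice.
per_slice = Counter(f["context_id"] for f in flows)
```

When a complaint about some traffic arrives, this per-slice index is the first hop in the chain of responsibility described next.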
Chain of Responsibility (network activity → slice → responsible users & PI)
1. Join Request: PI submits consortium paperwork and requests to join
2. PI Activated: PLC verifies PI, activates account, enables site (logged)
3. User Activated: users create accounts with keys, PI activates accounts (logged)
4. Slice Created: PI creates slice and assigns users to it (logged)
5. Nodes Added to Slices: users add nodes to their slice (logged)
6. Slice Traffic Logged: experiments run on nodes and generate traffic (logged by Netflow)
7. Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes
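Because every step above is logged, a complaint about observed traffic can be walked back mechanically: flow logs map the packet to a slice, and the slice database maps the slice to its users and PI. The lookup tables and function below are hypothetical stand-ins for PlanetFlow and the PLC database.

```python
# Sketch of the audit walk: packet -> slice -> responsible users & PI.
# flow_log and slice_db are toy stand-ins for PlanetFlow and PLC's database.
flow_log = {("128.112.139.1", 8080): "princeton_codeen"}   # (src ip, port) -> slice
slice_db = {"princeton_codeen": {"pi": "peterson", "users": ["alice", "bob"]}}

def responsible_parties(src_ip: str, src_port: int):
    """Map observed traffic to the slice's PI and users."""
    slice_name = flow_log[(src_ip, src_port)]
    record = slice_db[slice_name]
    return slice_name, record["pi"], record["users"]
```

This is the "chain of responsibility" idea from the opening slide made operational: third-party software is accountable because every binding along the chain was recorded when it was created.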
Slice Creation
• PI calls SliceCreate( ), SliceUsersAdd( ) at PLC (SA)
• User calls SliceNodesAdd( ), SliceAttributeSet( ), SliceInstantiate( )
• Each Node Manager (NM) periodically calls SliceGetAll( ) to fetch slices.xml and directs the VMM to create the slice's VMs
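The centralized sequence above can be sketched with a toy in-memory PLC. The call names match the slide; the class below is a stand-in for the real slice database and API, not its actual implementation.

```python
# Toy model of PLC-driven slice creation. Method names follow the slide;
# the in-memory dict stands in for the real slice database / slices.xml.
class PLC:
    def __init__(self):
        self.slices = {}

    def SliceCreate(self, name):                   # called by the PI
        self.slices[name] = {"users": [], "nodes": [], "attrs": {}}

    def SliceUsersAdd(self, name, users):          # called by the PI
        self.slices[name]["users"] += users

    def SliceNodesAdd(self, name, nodes):          # called by a user
        self.slices[name]["nodes"] += nodes

    def SliceAttributeSet(self, name, key, val):   # called by a user
        self.slices[name]["attrs"][key] = val

    def SliceGetAll(self):
        # What each Node Manager periodically fetches (slices.xml) to learn
        # which VMs its VMM should instantiate.
        return self.slices

plc = PLC()
plc.SliceCreate("princeton_codeen")
plc.SliceUsersAdd("princeton_codeen", ["alice"])
plc.SliceNodesAdd("princeton_codeen", ["node1.cs.princeton.edu"])
```

Note the pull model: nodes fetch state from PLC rather than PLC pushing to nodes, which keeps slice creation working across flaky connectivity.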
Slice Creation (delegated via tickets)
• PI calls SliceCreate( ), SliceUsersAdd( )
• User calls SliceAttributeSet( ), then SliceGetTicket( )
• User distributes the ticket to a slice creation service, which calls SliverCreate(ticket) on each Node Manager to create the VM
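The ticket is what makes delegation safe: the SA binds a slice to its rspec and signs the binding, and each Node Manager checks the signature before creating a sliver. The HMAC scheme below is a toy stand-in for the real ticket format, which the slide does not specify.

```python
# Toy ticket scheme for delegated slice creation. SliceGetTicket and
# SliverCreate are named on the slide; the HMAC signing is an assumption.
import hashlib
import hmac

SA_KEY = b"sa-secret-key"   # hypothetical slice-authority signing key

def SliceGetTicket(slice_name: str, rspec: str) -> dict:
    """SA issues a signed binding of slice name to resource spec."""
    payload = f"{slice_name}|{rspec}".encode()
    sig = hmac.new(SA_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def SliverCreate(ticket: dict) -> bool:
    """Node Manager verifies the SA's signature before creating the VM."""
    expect = hmac.new(SA_KEY, ticket["payload"], hashlib.sha256).hexdigest()
    return hmac.compare_digest(expect, ticket["sig"])

ticket = SliceGetTicket("princeton_codeen", "cpu_share=32")
```

Because the ticket is self-certifying, the slice creation service that redeems it never needs to be trusted by the SA or the node owner.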
Brokerage Service
• PI calls SliceCreate( ), SliceUsersAdd( )
• Broker calls SliceAttributeSet( ), SliceGetTicket( )
• Broker distributes the ticket to a brokerage service, which calls rcap = PoolCreate(ticket) on each Node Manager
Brokerage Service (cont)
• User calls BuyResources( ) at the broker
• Broker contacts the relevant nodes, calling PoolSplit(rcap, slice, rspec) on each Node Manager to bind resources from its pool to the user's VM
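The two-step brokerage flow can be sketched end to end: PoolCreate turns the broker's ticket into a resource capability (rcap) over a pool on the node, and PoolSplit later carves slivers out of that pool for buying slices. The call names follow the slides; the accounting below is an illustrative assumption.

```python
# Toy Node Manager accounting for the brokerage flow. PoolCreate and
# PoolSplit are named on the slides; this bookkeeping is assumed.
pools = {}          # rcap -> remaining cpu_share in the pool
_next_rcap = [0]

def PoolCreate(ticket_cpu_share: int) -> int:
    """Broker redeems its ticket for a capability over a resource pool."""
    _next_rcap[0] += 1
    pools[_next_rcap[0]] = ticket_cpu_share
    return _next_rcap[0]

def PoolSplit(rcap: int, slice_name: str, cpu_share: int) -> str:
    """Carve a sliver out of the pool and bind it to the buying slice."""
    if pools[rcap] < cpu_share:
        raise ValueError("pool exhausted")
    pools[rcap] -= cpu_share
    return f"{slice_name}:{cpu_share}"

rcap = PoolCreate(64)                            # broker redeems ticket
sliver = PoolSplit(rcap, "umd_scriptroute", 32)  # after user's BuyResources()
```

Separating the capability (rcap) from the later split is what lets market policy live entirely in the broker's slice, per the unbundled-management idea.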
Policy Proposals
• Suspend a site's slices while its nodes are down
• Resource allocation for
  – brokerage services
  – long-running services
• Encourage measurement experiments via ScriptRoute
  – lower scheduling latency for select slices
• Distinguish PL versus non-PL traffic
  – remove per-node burst limits
  – replace with sustained rate caps
  – limit slices to 5GB/day to non-PL destinations (with exceptions)