Diversifying the Network Edge Fred Kuhns
Host and LAN Support for Network Diversification • Motivation: • solution to network ossification: difficult to field new protocols or technologies which address limitations in current data networks • create common substrate layer over which new networking protocols, services and technologies may be deployed • common substrate layer provides virtualized links, routers and end systems. • key issue is how to realize virtualization with isolation at the network edge (LAN and end system)
Introduction - Diversified networking at the edge • Isolating vNet traffic in the LAN • Define substrate packet format and protocol • Reserve portion of LAN bandwidth for vNet traffic • Determining topology and available bandwidth • Establish substrate links between end systems and substrate routers • Establishing virtual links between virtual end systems and associated virtual routers • Mechanisms for realizing BW reservation • Isolating vNet traffic in the end system • Supporting the common substrate layer: managing the network resource • network interface access control • bandwidth allocation and enforcement • delivering to neighbor • management and control interface (accounting, configuration) • OS extensions to support new networking protocol instantiation and isolation • specifying and enforcing isolation • maintain kernel integrity • required safety and liveness properties of protocols • mechanisms to guard against ill-behaved vNet protocol instances: due to unsafe behavior (bugs, malicious) or excessive resource use • mechanisms for protocol developers to use for enforcing safety/security • optimizing performance • software development environment for protocols (TBD) • User versus kernel space protocol implementation and necessary kernel support
Context: Network Diversification (vNets) [Figure labels: substrate link, substrate router, virtual router, virtual link, virtual end-system]
Concepts • Intranet versus Internet: • intranet (no routing) use existing model and protocols • internet (routing) use diversified networking model • Diversified Networking Model: • multiple networks coexisting within common infrastructure (virtual networks or vNets) • each distinct network instance operates as though it has dedicated resources (non-interfering) • vNet specific routers (virtual routers) interconnected through simplex, point-to-point links (virtual links) • common substrate layer used for delivering vNet packets to neighbor (provides a simple wire-like service) • Current model: • Dominant networking protocol: IP • Shared, heterogeneous physical networks (ATM, Ethernet, Frame Relay, wireless, packet over SONET, etc.) • Links interconnecting packet switches • Interconnection links may be tunneled (Link Virtualization) through intermediate devices: ATM, Packet over SONET (or PPP-over-X), MPLS. • Challenges at the network edge: • partition LAN into virtual links and access routers • end-system support for virtual networks • isolation mechanisms for virtualized resources • bind virtualized resources to network instances
Terminology • Network Diversification: • Virtual Network (vNet): distinct vNets coexist within a common physical network • Diversification layer: common substrate layer, provides isolation and point-to-point link services • vNet is composed of one or more virtual routers (VR) interconnected by virtual links. Virtual routers and links are direct analogues of their physical counterparts … Network resources are virtualized. • An end-system implements vNet protocols and provides connectivity services within a virtualized network protocol environment (virtual end-system). The virtual end-system provides mechanisms for protocol implementation, resource control and isolation. • Diversification layer provides two levels of abstraction (i.e. two core services): • Substrate: encapsulate existing layer 1 and layer 2 technologies and provide a single, consistent framework for implementing virtualized links and routers. substrate link: abstraction that behaves like a point-to-point connection between communicating end points. Provides isolation services to different virtual networks using a common substrate link. substrate router: a physical device which forwards network traffic based on its vNet membership. Provides sharing and isolation services to disparate vNets and hosts virtual routers. • Virtual: framework providing a simple model and set of interfaces for implementing virtual networks. The model defines virtual routers, end-systems and links. The goal is for virtual links and routers to behave like their physical counterparts. virtual link: simulates the behavior of a dedicated point-to-point link interconnecting virtual end points (virtual routers and/or virtual end systems). A virtual link is implemented by one or more substrate links. virtual router: implements a particular vNet’s routing logic. The underlying substrate router provides the necessary isolation and resource management functions.
Related Work • Current virtualization efforts on the end system are driven by the desire to support many concurrently running, non-interfering, secure server applications • The goal is to completely isolate applications running on a common hardware platform. It appears to each application as though it is running on a dedicated platform (hardware and operating system). • The framework enforces resource constraints and access controls • In this model the isolation is complete and transparent • each operational environment appears as a complete end system with independent operating system instances. • however this is too coarse-grained for our purposes, where we want to support multiple networks per OS instance. • Mention VMware, Xen, Denali • Network Protocol extension or composition: • x-kernel, SPIN • Protocol development environment and patterns • ??? • Extensible Operating Systems (loading extensions into kernel), see next slide
Related Work: Extensible Operating Systems • Issues: • safety, liveness, performance • Techniques: • Safe Execution Environment/Virtual machines: Java, KaffeOS, packet filters • Language based (type safety): OKE, mobile code (STP), SPIN • Proofs: proof carrying code (PCC) • Software Fault Isolation (SFI): VINO • Hardware Fault Isolation (HFI): kernel plugins, Denali, Xen, Exokernel, Palladium, Nooks. See VMM next page. • we focus on two approaches: • kernel extension to support simple interpreted environment (packet filtering) with protocols implemented in user space • sandbox for in-kernel protocol implementations using a type safe language and run-time support. In the spirit of OKE and mobile code (with concepts from OKE
Modeling the LAN Environment • The effort to provide a simple, common infrastructure layer for creating new or specialized networks has parallels in operating system and middleware research. Both attempt to offer two key services[1]: • Resource management: time and space sharing (multiplex resources); synchronization and deadlock handling (buffers, link access, link BW, non-preempted transmission of packet); accounting and status • User friendliness: convenient and consistent operational environment (see the many RFCs); error detection and handling; protection and security; fault tolerance and failure recovery. [1] Singhal, Shivaratri, Advanced Concepts in Operating Systems, McGraw-Hill, 1994 • A core technique is to export an extended, virtualized machine providing the illusion of dedicated resources (though the level of abstraction and degree of virtualization differ between systems) • extended machine: abstraction to deal with complexity • virtual machine: controlled sharing • Define an administrative entity to represent clients (of the service). • For example, operating systems define processes to represent resource ownership and protection domains. An IP network may define flows, or flow aggregates, to represent an abstract client to which resources (buffers and bandwidth) are assigned.
LAN Virtualization • Goal: enable unrelated entities, vNets, to transparently share a common set of underlying resources, similar to how processes transparently share the underlying computer platform. • Abstract Resources (to create the extended Net): links, routers, end-systems • Virtualization (make the virtual resources behave as though they were real, physical devices): • End-system: network subsystem interface, protocol implementation, device interface for point-to-point links • LAN: links, switches and buffers • WAN/MAN: LANs, packet switches (beyond scope of this ppt) • We would like to virtualize LAN resources such that registered vNets and local traffic are isolated. • As an example we consider an Ethernet LAN: We can realize this with Ethernet and IEEE standards 802.1P/Q (VLANs and Priorities): • star topology • tree topology • layered tree (with priorities) • If a virtual link must pass through an existing IP router (the vNet router is not directly attached to the same LAN) then tunnels may be used: IPIP, GRE, MPLS etc.
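The 802.1P/Q mechanism named above carries both the VLAN identifier and the priority in a single 4-byte tag. The following is a minimal sketch of that tag layout (TPID 0x8100, then a 3-bit priority, a 1-bit drop-eligible flag and a 12-bit VLAN ID). The function names are illustrative and are not part of the slides.

```python
import struct
from typing import Tuple

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def pack_vlan_tag(vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag: TPID, then TCI = PCP(3) | DEI(1) | VID(12)."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= pcp <= 7:
        raise ValueError("priority (PCP) must fit in 3 bits")
    if dei not in (0, 1):
        raise ValueError("DEI is a single bit")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order


def unpack_vlan_tag(tag: bytes) -> Tuple[int, int, int]:
    """Return (vid, pcp, dei) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0xFFF, tci >> 13, (tci >> 12) & 1


# Hypothetical example: VLAN 100 at priority 5 for vNet traffic.
tag = pack_vlan_tag(100, pcp=5)
```

In the schemes that follow, the VLAN ID would select the substrate link and the priority bits would separate vNet traffic from local intranet traffic.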
Simulates Star Topology for Substrate Links [Figure: switched LAN with VLANX1, VLANX2, …, VLANXN and vNetX VR1] • Internetworking over a diversified network • Substrate function with Ethernet: • Substrate links: use VLANs to provide the equivalent of a virtualized “wire” connecting an end system to a specific substrate router. • Sharing and Isolation: • All vNet traffic uses assigned VLANs • Use priority queuing (802.1P/Q) • All intranet traffic uses lower priority queues. • Resource management: • LAN: Use admission control (static or dynamic) to provide bandwidth guarantees to vNet traffic. • End system: the substrate layer on the end system enforces per-VLAN and per-vNet bandwidth constraints • Virtual links: In this simple example there is exactly one virtual link for each substrate link. • Each host-to-substrate-router connection is assigned a distinct VLAN, so N hosts imply N VLANs on the Ethernet. • Alternative is to define one VLAN tree for each protocol suite (i.e. vNet).
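The slide does not say how the end-system substrate layer enforces per-VLAN and per-vNet bandwidth constraints. One common choice is a token bucket per vNet; the sketch below assumes that technique, and the class name and parameters are hypothetical. Time is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Per-vNet transmit limiter (assumed technique, not taken from the slides)."""

    def __init__(self, rate_bps: float, burst_bytes: int) -> None:
        self.rate = rate_bps / 8.0        # refill rate in bytes per second
        self.burst = float(burst_bytes)   # bucket depth in bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0                   # time of the previous update (seconds)

    def allow(self, nbytes: int, now: float) -> bool:
        """Return True and consume tokens if a packet of nbytes may be sent at time now."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Hypothetical configuration: one bucket per vNet on this end system.
limits = {"vNetX": TokenBucket(rate_bps=8000, burst_bytes=1500)}
```

A packet that is refused could be queued or dropped; which of the two is appropriate is a policy decision the slides leave open.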
Traffic isolation with priority aware substrate [Figure: Ethernet hub with High and Low priority TX queues; vNet traffic goes to High, otherwise Low. Traffic labels: local control/management; legacy internet traffic; all vNet traffic; vNet traffic (internet); local traffic (intranet). Endpoint: vNetX VR1.]
Substrate Link as a VLAN Tree [Figure: Ethernet switched LAN with a single VLAN, VLANX] • Internetworking over a diversified network • Substrate function with Ethernet: • Substrate links: The VLAN creates a tree interconnecting all end-systems to the substrate router. The substrate end-point then uses the VLAN tag and source/destination address to realize the logical point-to-point substrate link. • Sharing and Isolation: • No change from the substrate star topology. The only difference is the shared VLAN domain. The scheme provides traffic isolation. • Resource management: • Same • Virtual links: Same.
[Figure: two switched LANs with VLANdgram, VLANhigh, VLANmed and VLANX]
Multiple Substrate Links [Figure: Ethernet switched LAN with VLANdgram, VLANhigh and VLANmed] • Internetworking over a diversified network • Substrate function with Ethernet: • Substrate links: Three VLAN trees are used for all virtual net traffic to/from a substrate router: • Low priority: default for best-effort traffic • Medium priority for virtual nets with soft performance requirements (average bandwidth) • High priority for isochronous or low-delay, interactive applications • Sharing and Isolation: See above. • Resource management: See above. • Virtual links: Same.
Multiple vNets per Host [Figure: end system on an Ethernet LAN with virtual interfaces (VLI), substrate interfaces (ether addr/vlan) on VLAN1, VLAN2 and VLAN3, and virtual routers (VR1). The VLAN tag and dst addr identify the substrate router; the VLI tag is used to route packets.] • The full model: • Substrate link: connects end-system to substrate router. Virtualization of a physical cable or wire. A packet enters one end, exits the other and is opaque within. • Simplex or Duplex? • Substrate interface: end-system abstraction • Ethernet: <interface, VLAN, dst_addr> • tunnel: MPLS, IP, IPsec, L2TPv3, GRE, AToM • Layer 2: ATM, others? • Virtual link: Logical interconnection (virtual wire) of adjacent vNet nodes. • Point-to-point, Simplex or Duplex? • Virtual interface: end-system abstraction representing one end of a virtual link. Substrate defines mechanism for multiplexing onto common substrate link. For example a virtual link identifier (VLI) in a substrate header • Simplex or Duplex?
[Figure labels: Ethernet LAN; vNet1, vNet2, vNet3; virtual interfaces (VLI); substrate interface (ether addr/vlan); substrate links SL1, SL2, SL3; substrate routers SR1, SR2, SR3 hosting virtual routers (VR)]
[Figure labels: substrate routers SR1 to SR6 hosting virtual routers (VR); substrate interfaces; virtual interfaces (VLI); vNet1, vNet2, vNet3]
Multiple next hop VRs [Figure: Host A (enetAddrA), a member of vNetX and vNetY, on an Ethernet switched LAN. VLANA1 leads to substrate router 1 (enetAddrSR1) hosting vNetX VR1 and vNetY VR1 (VLI1, VLI2); VLANA2 leads to substrate router 2 (enetAddrSR2) hosting vNetX VR2 (VLI1); VLANA3 leads to substrate router 3 (enetAddrSR3) hosting vNetX VR3 (VLI1).] • Multiple Next Hop Virtual Routers: • Substrate link: per end-system, substrate router pair. • Substrate interface: three substrate interfaces: • SI1 = <eth0, VLANXA1, enetAddrSR1> • SI2 = <eth0, VLANXA2, enetAddrSR2> • SI3 = <eth0, VLANXA3, enetAddrSR3> • Virtual link: Logical point-to-point connection between virtual end-system and access virtual router. Since we model a point-to-point link there is no need for link addresses. • Virtual interface: Representation of virtual link on the end-system. The substrate assigns a per-substrate-link virtual link identifier (VLI) for each virtual link. • VI1 = <SI1, VLI1> • VI2 = <SI1, VLI2> • VI3 = <SI2, VLI1> • VI4 = <SI3, VLI1>
TCP/IP as an Example Protocol (vNet protocol = IP) [Figure labels: vNet framework; vint0; eth0 standard Ethernet interface; eth0 direct connect Ethernet device; VLANX; Ethernet LAN; Ethernet dest. addr; Substrate Router SR1; IP; VLI] • Substrate Interface: • Directly connected: destination IP address + ARP = enet addr • Gateway: (Gateway’s IP + ARP = enet addr) + VLAN • Virtual Interface: • Directly connected: Not used, model only for internetworking • Gateway: VLI assigned by substrate. How is this integrated into the current ARP/route interface?
Using Tunnels for the substrate layer • Need to look into the various tunneling approaches/protocols. How can we leverage these? • MPLS and MPLS VPNs • Generic Routing Encapsulation (GRE): RFC 2784 • Point-to-point tunneling protocol (PPTP) • Secure VPN • Any transport over MPLS (AToM) • IP tunnel • IPsec VPNs • Layer 2 Tunneling Protocol version 3 (L2TPv3) • version3 is a draft standard • RFC 2661: Layer 2 tunneling protocol • 802.1Q Tunneling: Cisco 802.1Q-in-Q VLAN Extension Services • What about MPLS over IP tunnels: what was done there?
Supporting Diversified Networking on the End System • vNet framework • substrate layer design and implementation on end system. Policies. • integration with existing networking subsystem and isolation mechanisms • packet processing and forwarding rules for both substrate and diversified networking layer. Includes address resolution rules and techniques. • how do we coordinate substrate and vNet link establishment? VLAN label assignments, substrate router address (IP? ethernet?), VLI assignments? • establishing links and assigning identifier and integrating with existing network infrastructure/tables. • controlling bandwidth allocations and link access • Supporting the common substrate layer: managing the network resource • what accounting functions are needed? • What control interface is exported? • OS extensions to support new networking protocol instantiation and isolation • to what degree do we “protect” the kernel? • buggy code? malicious code? the more protection the greater the performance hit. • specifying and enforcing isolation • performance: interface access and bandwidth; CPU; buffer (buffer hoarding) • kernel integrity: corrupt data structs, exceptions, unauthorized access, improper interface use, other safety issues • other vNet protocol instance integrity: vNet instance may be able to corrupt another module but not the kernel • do we attempt to monitor network traffic to ensure one vNet instance is not masquerading as another? Or other types of abuse? • techniques to require/enforce safety and liveness properties of protocols – or to detect violations (prevent or recover) • type-safe compiler and run-time checks • hardware fault isolation • software fault isolation • cross our fingers • optimizing performance • for user space protocols use safe execution environment for interpreting packet filters • for kernel space protocols use ??? 
• Software development environment for protocols • utility libraries and wrappers • patterns and OO models • compositional techniques or even automation
Background: Traditional Commodity OS Environments • Traditional general purpose operating system • Process model (resource ownership and execution context): associate programs with resource usage (allocation, scheduling, access control, synchronization) and accounting (historical data). • Isolation and accounting fall on this process boundary (or possibly thread) • the process model as implemented is not good at capturing resource usage resulting from hidden scheduling (kernel performing work for a process asynchronously, such as when network packets are received) • likewise, the trust model assumes the OS kernel is trustworthy, which may not be true for dynamically extensible systems • the virtualization and scheduling of the CPU and memory is well developed (out of necessity); however, managing I/O access and bandwidth is a more recent concern • With the increasing importance of networking and multimedia, new techniques have been developed to manage I/O access and bandwidth • Network transmit bandwidth is typically managed with the use of packet classifiers (map packet to flow or flow aggregate) and queuing disciplines. This allocation and accounting model differs from the process centric model. • Disk I/O scheduling shifted from simply optimizing overall throughput to ensuring time critical operations completed on time. • For the majority of desktop systems network bandwidth is not a limiting factor (1Gbps interfaces are common on new systems). Rather, memory and disks remain the critical performance bottleneck. • Much research and design has been directed at managing either per process or per flow (or flow aggregate) I/O usage. Neither is the correct approach for this effort, where we want per vNet resource management.
OS Kernel Block Diagram [Figure labels: User Space (Applications); file interface and socket interface ops; basic I/O interface; FS management (open files, buffer cache); TCP module (TCP1, TCP2, …, TCPn), UDP, RAW IP, IP, routes; task management (tasks, scheduler, process accounting, scheduling); SW interrupt (AST) processing, callbacks, callout queue, clock handler, time management; device independent I/O core (qdisc, txqueue, rxqueue, poll); hardware independent layer; hardware dependent layer (interrupt processing, OS ISR demux, system exception handlers, configuration of registers, MMU (TLB, cache, VM), bus and peripherals); device drivers (ethernet, uart, timer); Hardware (HW interrupt/exception)]
End-System Support for Network Diversification • What needs to change? • Process model (applications and programs need not change): No • process model is sufficient for application isolation • Trust model (is the network subsystem trusted?): Yes • current trust model is not good: need to dynamically load/unload new protocols which may not be trusted. Even user space applications will require mechanisms in the kernel to ensure non-interference • Resource Management for the Network Subsystem: Yes • The network subsystem's degree of isolation is no longer adequate. vNet protocols must be separately contained, isolated, identifiable, preemptable and cancelable. • Network and processor usage accounting is not adequate. We need to keep track of per vNet resource usage and constraints. Asynchronous network events (aka hidden scheduling) must be properly accounted for and scheduled (per vNet basis). • User friendliness (for the Network Subsystem - vNets): Yes • Provide simple mechanisms for adding and removing vNet protocol instances. • Convenient environment for implementing, testing and debugging new protocols. • Support per vNet protection boundaries • mechanisms for implementing different security policies both within a given vNet and between different vNets. • Ensure the system as a whole is not adversely impacted by faulty or poorly implemented protocols
Virtual End System • Comments and assumptions • assume that the creation/deletion of new vNets is infrequent • an application may open connections on one or more different vNets • unrelated applications must be able to engage in IPC using any available mechanism (pipes, shared memory, TCP/IP etc) • continue support for IP. In fact, IP can be considered to be the least common denominator network instance. We could use the existing IP network for control to establish and/or manage vNets. • support both user and kernel space protocol instances • provide isolation and resource guarantees on a per vNet basis • poorly behaved protocol instances (for a given vNet) will be detected, stopped and expelled from an end system. Applications using this protocol stack will be informed via a socket error return value. • intra-VN, implementers should have the mechanisms to support QoS and Security – what are they? • simple mechanism for adding new protocols/VNs
Block Diagram [Figure labels: application; vNet1, vNet2, vNet3 and TCP/IP protocol stack; vNet Framework with vNet mux/demux; proto mux/demux; network device]
User or kernel Space protocols? • Each has pros and cons • User space protocols: • easier to implement and debug • easier to introduce new protocols (not tightly dependent on socket layer knowing about the new protocol) • easier to isolate and protect protocols and apps from each other (leverage process model) • kernel level protocols • easier to integrate into existing framework (simplifies support for system interface functions like select/poll) • simplifies intra-protocol security and protection (since protocol runs within trusted kernel) • simplifies (well, more direct) kernel demultiplexing to correct protocol context (endpoint) • increased efficiency
User Space Protocol Implementation • Uncommon outside of the high-performance community, which wants zero-copy and specialized demux keys. • Problems: asynchronous processing, life cycle, authentication and demultiplexing to endpoints • latency in delivering packets (i.e. acks) to user space • increased overhead in per packet processing before a drop/keep decision is made • processing received acks • timeouts and retransmissions • establishing connections and security: snooping, masquerading • supporting select and poll • protocols where connection may outlive process (TCP’s TIME_WAIT) • global routing and address resolution tables • global connection tables • need to know what other ports are being used (locally) • accepting/rejecting new connections
User-space protocols: Global Issues • Routing: Direct packets to/from correct endpoint/interface • How is traffic demultiplexed and sent to the correct endpoint/process? • In-kernel filters • Where are the routing tables and how are they maintained? • route fixed when connection established or located in shared memory • Control (using IPv4 as an example): • Address resolution protocols/tables? • Other control protocols. For example ICMP, IGRP, others? • Where are the routing protocols implemented? • Management: • Must manage a protocol's namespace (for example, port numbers in IPv4). • Common programming technique: allow the protocol instance to select the local address part • specify port = 0 and addr = 0 and the implementation will assign correct values • Passive connect model? • In IPv4 a server listens on a port (host:port:proto) for a connection request. To establish a connection a unique (to the end system) port number is assigned and a new socket allocated. • socket-oriented system calls must be supported. On UNIX must support non-blocking I/O with select and poll. • Connection lifetime may outlast process. • For example TCP TIME_WAIT, or simply waiting for a final ack or resending if no ack is received. • Security: we must provide sufficient mechanisms for protocol developers • implementations must be able to guard against masquerading and eavesdropping
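The namespace-management point (port = 0 means "assign one for me") can be sketched as a small allocator owned by whichever component manages a vNet's namespace. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the default range is the IANA dynamic port range, which a new vNet protocol would not be obliged to use.

```python
class PortNamespace:
    """Tracks local ports in use for one protocol on one end system (illustrative)."""

    def __init__(self, first: int = 49152, last: int = 65535) -> None:
        self._first = first
        self._last = last
        self._next = first      # next candidate for automatic assignment
        self._in_use = set()

    def bind(self, port: int = 0) -> int:
        """Reserve a specific port, or pick a free one when port == 0."""
        if port != 0:
            if port in self._in_use:
                raise OSError("address already in use")
            self._in_use.add(port)
            return port
        # Scan the automatic range once, starting from the last position.
        for _ in range(self._last - self._first + 1):
            candidate = self._next
            self._next = self._first if candidate == self._last else candidate + 1
            if candidate not in self._in_use:
                self._in_use.add(candidate)
                return candidate
        raise OSError("no free ports")

    def release(self, port: int) -> None:
        """Return a port to the namespace (e.g. once TIME_WAIT has expired)."""
        self._in_use.discard(port)
```

Because a connection may outlive the process that opened it, release would have to be driven by the long-lived owner of the namespace rather than by process exit.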
User Space: Configurations • Given these global issues there are two likely configurations: • all traffic passes through a common protocol daemon in user space • a control daemon implements a basic set of control functions while a user library implements the majority of data path functions • prior work has shown the latter approach to be superior. • Having all traffic pass through a common protocol daemon => at least one extra copy operation (kernel -> daemon -> user process) • A better solution is for a daemon to insert relatively simple packet filters in the kernel for established connections; these direct incoming packets to endpoints and filter outgoing packets from endpoints.
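To make the second configuration concrete, the following sketch simulates the filter table that a control daemon might install in the kernel, indexed by VLI. Everything here is hypothetical (the class, the header field names, the most-specific-match rule); the slides specify only that simple per-connection filters direct packets to endpoints.

```python
from typing import Dict, List, Optional, Tuple

Filter = Tuple[Dict[str, object], str]  # (header fields to match, endpoint name)


class FilterTable:
    """Toy stand-in for in-kernel connection filters (hypothetical API)."""

    def __init__(self) -> None:
        self._incoming: Dict[int, List[Filter]] = {}

    def insert(self, vli: int, match: Dict[str, object], endpoint: str) -> None:
        """Called by the control daemon when a connection is set up."""
        self._incoming.setdefault(vli, []).append((dict(match), endpoint))

    def remove(self, vli: int, endpoint: str) -> None:
        """Called by the control daemon when a connection is torn down."""
        kept = [f for f in self._incoming.get(vli, []) if f[1] != endpoint]
        self._incoming[vli] = kept

    def demux(self, vli: int, header: Dict[str, object]) -> Optional[str]:
        """Return the endpoint of the most specific matching filter, or None (drop)."""
        best: Optional[Filter] = None
        for match, endpoint in self._incoming.get(vli, []):
            if all(header.get(k) == v for k, v in match.items()):
                if best is None or len(match) > len(best[0]):
                    best = (match, endpoint)
        return best[1] if best else None


table = FilterTable()
# Listening filter: connection requests go to the control daemon.
table.insert(7, {"dst_port": 80}, "control-daemon")
# Established connection: data goes straight to the application endpoint.
table.insert(7, {"dst_port": 80, "src_addr": "hostB", "src_port": 4321}, "app-1")
```

With this arrangement only connection setup and teardown pass through the daemon; established traffic reaches the application without the extra kernel -> daemon -> process copy.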
User-Space: Passive Open [Figure labels: application with vnetX protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet] • Steps: 0. listen/accept (passive open) 1. connection request (in) 2. ack (out) 3. insert incoming and outgoing filters for vnetX connection 4. new connection 5. data • Outgoing: established connections are compared against the connection-specific outgoing filter. • Incoming: use the VLI to access the incoming filters, then demux to the filter set and/or socket.
User-Space: Active Open [Figure labels: application with vnetX protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet] • Steps: 0. connect 1. connection request (out) 2. ack (in) 3. insert incoming and outgoing filters for vnetX connection 4. new connection 5. data • Outgoing: established connections are compared against the connection-specific outgoing filter. • Incoming: use the VLI to access the incoming filters, then demux to the filter set and/or socket.
User-Space: Datagram (Connectionless) • The daemon fills in the local address and binds to the socket. No restrictions on destination. [Figure labels: application with vnetX protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet] • Steps: 0. open(any) 1. insert incoming and outgoing filters for vnetX connection 2. new connection (local address) 3. data • Outgoing: established connections are compared against the “connection” specific outgoing filter. • Incoming: use the VLI to access the incoming filters, then demux to the socket. In this case only the local part is used.
User-Space: Datagram (Connectionless) • The daemon fills in both local and destination addresses. Destination restricted. [Figure labels: application with vnetX protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; data copy; vnet demux; connection filters; ethernet] • Steps: 0. open(local and remote addr) 1. insert incoming and outgoing filters for vnetX connection 2. new connection (local and remote) 3. data • Outgoing: established connections are compared against the “connection” specific outgoing filter. • Incoming: use the VLI to access the incoming filters, then demux to the socket.
User-Space: App exits • TCP enters TIME_WAIT after close [Figure labels: application with vnetX protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux; connection filters; ethernet; drop] • Steps: 1. connection close (out) 2. ack (in/out) 3. remove filters
Considerations For Kernel Extensions • Identified areas where modules may impact system behavior • software bugs (implementation errors) which may result in the kernel or another vNet protocol stack becoming corrupted. • dereference invalid pointer: corrupt kernel memory, cause exception (invalid address), read invalid data • incorrect parameter usage • indexing beyond end of an array • incorrect locking protocol or deadlock • overflowing stack (large local variables, recursion etc) • memory management errors: using freed memory, memory leaks, incorrect allocation sizes • not checking return values • design errors leading to kernel corruption • misuse of kernel interfaces • improper control processing • improper data output • performance/efficiency errors: use too many resources (buffers, I/O bandwidth, CPU cycles, locks, time) • adversely impacts kernel and application processes • adversely impacts other vNet protocol stacks • adversely impacts network traffic (remote hosts or network devices) • security or protection violation either compromising confidentiality or altering data • unauthorized read/write of kernel/user data • unauthorized use of a resource (invalid packets sent on network) • unauthorized read/write on another vNet protocol stack environment • possible isolation mechanisms: • static and dynamic enforcement of kernel module (interface) access restrictions • Bounded (deterministic or limited) • buffers: common buffer pool but thresholds on number that can be in use at any one time. Easy for tx, what about receive (do we drop packets)? • Bandwidth • Locks?? • other resources? • hard/soft bounds? Deterministic or Statistical? • ???
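The buffer bound mentioned above (a common pool with a threshold on how many buffers each protocol stack may hold) can be sketched as follows. The class and its methods are hypothetical, and the sketch uses a hard, deterministic bound; the slide leaves open whether bounds should be hard or soft.

```python
from typing import Dict


class BufferPool:
    """Shared buffer pool with a per-vNet cap on buffers in use (illustrative)."""

    def __init__(self, total: int, limits: Dict[str, int]) -> None:
        self.free_count = total                       # buffers left in the pool
        self.limits = dict(limits)                    # per-vNet thresholds
        self.in_use = {vnet: 0 for vnet in limits}    # per-vNet accounting

    def alloc(self, vnet: str) -> bool:
        """Try to take one buffer for vnet; False means the caller must drop or wait."""
        if vnet not in self.limits:
            return False                              # unregistered vNet
        if self.free_count == 0 or self.in_use[vnet] >= self.limits[vnet]:
            return False                              # pool empty or threshold hit
        self.free_count -= 1
        self.in_use[vnet] += 1
        return True

    def release(self, vnet: str) -> None:
        """Return one buffer previously allocated to vnet."""
        if self.in_use.get(vnet, 0) == 0:
            raise ValueError("vNet holds no buffers")
        self.in_use[vnet] -= 1
        self.free_count += 1


pool = BufferPool(total=4, limits={"vNetX": 3, "vNetY": 3})
```

On transmit a failed alloc can simply block or fail the sender. On receive the packet has already arrived, so a failed alloc amounts to a drop, which is the open question the slide raises.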
Pushing protocols into the Kernel • Positives: • All the issues associated with user-space protocols simply go away: global tables, and state with the lifetime of the kernel • Performance, efficiency, existing code base • Enhances intra-protocol security • Simplifies integration with existing network I/O subsystems and interfaces • Negatives: • Isolation: More difficult to isolate system from protocol instances. Inter-protocol isolation difficult. • Security: Proving trust/security more difficult • Implementation and debugging more difficult in kernel
Our Approach • ???
Kernel-Space Protocols (Rework!) [Figure labels: User Space (Applications) with /dev/protoX and /dev/vnet; file interface and socket interface ops (PF_VNET, PF_INET); FS management (open files, buffer cache); socket I/O interface with endpoints vnet:ep, tcp:port, udp:port, rawIP; vnet ops, vnet protocol instances and their state tables; TCP/IP (TCP1, TCP2, …, TCPn, UDP, RAW IP, IP, routes, route to interface); vnet demux; ethernet; eth device driver (eth0, VLAN); SW interrupt; HW interrupt; Hardware (HW interrupt/exception)]
User Space Protocols • Chandramohan A. Thekkath, Thu D. Nguyen, Evelyn Moy, Edward D. Lazowska, Implementing network protocols at user level, IEEE/ACM Transactions on Networking (TON), v.1 n.5, p.554-565, Oct. 1993 • Chris Maeda, Brian Bershad, Protocol Service Decomposition for High-Performance Networking, Proceedings of the 14th ACM Symposium on Operating Systems Principles, December 1993, pp. 244-255. • Aled Edwards, Steve Muir, Experiences implementing a high performance TCP in user-space, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, p.196-205, 1995 • Kieran Mansley, Engineering a User-Level TCP for the CLAN Network, Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications, pages 228-236, 2003
Extensible protocol frameworks in the kernel • Parveen Patel, Andrew Whitaker, David Wetherall, Jay Lepreau, Tim Stack, Upgrading Transport Protocols using Untrusted Mobile Code, Proceedings of the 19th ACM Symposium on Operating Systems Principles, Pages 1-14, October 2003. • Herbert Bos, Bart Samwel, Safe Kernel Programming in the OKE, Proceedings of the fifth IEEE Conference on Open Architectures and Network Programming, June 2002 • Marc Fiuczynski, Brian Bershad, An Extensible Protocol Architecture for Application-Specific Networking, Proceedings of the Winter USENIX Technical Conference, pages 55-64, January, 1996 • Norman Hutchinson, Larry Peterson, The x-kernel: An Architecture for Implementing Network Protocols, IEEE Transactions on Software Engineering, 17(1):64-76, January 1991
Isolation Services • Marko Zec, Implementing a Clonable Network Stack In the FreeBSD Kernel, Proceedings of USENIX Technical Conference, pages 137-150, June 9-14, 2003 • P. H. Kamp, R. N. M. Watson, Jails: Confining the omnipotent root, Proceedings of the 2nd International SANE Conference, May 2000 • A Bavier, M Bowman, B Chun, D Culler, S Karlin, S Muir, L Peterson, T Roscoe, T Spalink, M Wawrzoniak, Operating System Support for Planetary-Scale Network Services, Proceedings of the 1st USENIX Symposium on Networked Systems Design and Implementation, pages 253-266, March 2004 • G Back, W Hsieh, J. Lepreau, Processes in KaffeOS: Isolation, Resource Management, and Sharing in Java, Proceedings of the 4th Symposium on Operating Systems Design and Implementation, pages 333-346, October 2000 • R Wahbe, S Lucco, T Anderson, S Graham, Efficient Software-Based Fault Isolation, Proceedings of the 14th Symposium on Operating Systems Principles, pages 203-216, December 5-8, 1993
VMM • P Barham, B Dragovic, K Fraser, S Hand, T Harris, A Ho, R Neugebauer, I Pratt, A Warfield, Xen and the Art of Virtualization, Proceedings of the 19th Symposium on Operating System Principles, pages 164-177, October 19-22, 2003 • A Whitaker, M Shaw, S Gribble, Scale and Performance in the Denali Isolation Kernel, Proceedings of the 5th Symposium on Operating Systems Design and Implementation, pages 195-210, December 9-11, 2002