  1. Reflections on the Development of Active & Programmable Networks Ken Calvert University of Kentucky Laboratory for Advanced Networking IWAN 2005

  2. Some Personal History • First AN discussions ~1995 w/Zegura, Bhattacharjee • Initial skepticism “Why would you want to do that?” • Excitement • About the power of the concept • About the prospect of developing a new Internet • Resignation • About the difficulty of convincing people of the need “What we have seems to work fine?” • DARPA Active Nets (later FTN) program from June 1997 through June 2004 • CANEs project (Georgia Tech) • ActiveCast (University of Kentucky/Georgia Tech) • Architectural Framework, 1998 • Defined roles, interfaces for EEs, NodeOS

  3. What Active Networking is About (From a presentation ca. 2000) Research Challenge: What abstractions make up the programming interface? • What node functionality is useful to applications? • How can services be composed while preserving correctness and performance? • How can the network be protected against malicious or malfunctioning programs? • How can a programming interface scale to 10^8 users and nodes, 10^6 concurrent flows, 10^9 packets/second?

  4. Three Projects • CANEs (~1997-2002) • Concast (~1999-2003) • Ephemeral State Processing (~1999-present)

  5. CANEs Project Goals • Prototype an EE supporting structured composition of independently-implemented network-based functionality • Modularity = primitive elements + composition mechanism Model: Unix tools awk/grep/ls/cat/sort/... + pipes • Show benefits of user-controlled functionality in the net • “Bring application knowledge and network knowledge together in space-time” • Application-specific adaptation to congestion • In-network caching • Reliable multicast (with UMass/TASC) • Mobility

  6. CANEs Project Goals • Allow for fast forwarding of “plain old vanilla” traffic • Generic Forwarding Function that could be implemented in hardware • Reason formally about composition and resulting global behavior • Establish correctness of the underlying fixed functionality • Identify sufficient conditions for user-supplied code to preserve that correctness

  7. CANEs Packet-Processing Model [Diagram: incoming channels → Generic Forwarding Function with predefined “slots” holding customizing code (e.g. active congestion control) → outgoing channels]
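
The slot model above can be sketched in a few lines of Python. This is only an illustration, not the CANEs implementation: the slot names (`pre_route`, `post_route`), the dictionary packet representation, and the `GenericForwarder` class are all assumptions made for the example. It shows fixed forwarding logic that user-supplied code can customize only at predefined points.

```python
from typing import Any, Callable, Dict, List, Optional

Packet = Dict[str, Any]                          # hypothetical packet representation
SlotCode = Callable[[Packet], Optional[Packet]]  # injected code; returns None to drop


class GenericForwarder:
    """Fixed forwarding path with named slots for injected code (illustrative)."""

    SLOTS = ("pre_route", "post_route")          # hypothetical slot names

    def __init__(self, routes: Dict[str, str]) -> None:
        self.routes = routes                     # destination -> outgoing channel
        self.slots: Dict[str, List[SlotCode]] = {s: [] for s in self.SLOTS}

    def bind(self, slot: str, code: SlotCode) -> None:
        """Attach user-supplied code to one of the predefined slots."""
        if slot not in self.slots:
            raise KeyError(f"unknown slot: {slot}")
        self.slots[slot].append(code)

    def _run(self, slot: str, pkt: Optional[Packet]) -> Optional[Packet]:
        for code in self.slots[slot]:
            if pkt is None:
                break
            pkt = code(pkt)
        return pkt

    def forward(self, pkt: Packet) -> Optional[str]:
        """Return the outgoing channel, or None if the packet was dropped."""
        pkt = self._run("pre_route", pkt)
        if pkt is None:
            return None
        channel = self.routes.get(str(pkt["dst"]))   # the fixed functionality
        if channel is None:
            return None
        pkt = self._run("post_route", pkt)
        return channel if pkt is not None else None


def drop_low_priority(pkt: Packet) -> Optional[Packet]:
    """Example customizing code: discard packets with priority below 1."""
    return None if pkt.get("priority", 0) < 1 else pkt
```

With no code bound, every packet takes the plain forwarding path; binding `drop_low_priority` to a slot changes behavior only for the packets it chooses to discard.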

  8. Application: Intelligent Discard for MPEG • Principle: P, B frames depend on I frames • Frames spread over many packets • GOP = (typically) one I frame, few P frames, many B frames • Discard approaches: • Discard application-layer units (e.g. Frames, GOPs) • Static priorities (e.g., I frame higher than P, B) • Drop P, B if corresponding I already dropped • Evict P, B from queue to make room for I • Evaluation metrics: • Application-layer quality (e.g., SNR, I-frames received) • Network impact (e.g., Received bytes discarded)
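
Two of the discard approaches listed above ("drop P, B if corresponding I already dropped" and "evict P, B from queue to make room for I") can be sketched as a bounded queue. This is a minimal sketch under assumptions: the `Pkt` fields, the per-GOP bookkeeping, and the choice to evict the most recently queued dependent packet are illustrative and not taken from the CANEs code.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Set


@dataclass(frozen=True)
class Pkt:
    gop: int      # index of the group of pictures the packet belongs to
    ftype: str    # frame type: "I", "P", or "B"


class DiscardQueue:
    """Bounded queue applying frame-aware discard (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.queue: Deque[Pkt] = deque()
        self.dropped_i: Set[int] = set()   # GOPs whose I frame lost a packet

    def _evict_dependent(self) -> bool:
        """Evict the most recently queued P or B packet, if there is one."""
        for i in range(len(self.queue) - 1, -1, -1):
            if self.queue[i].ftype != "I":
                del self.queue[i]
                return True
        return False

    def enqueue(self, pkt: Pkt) -> bool:
        """Return True if the packet was queued, False if it was discarded."""
        # P/B data is useless once the I frame it depends on has been dropped.
        if pkt.ftype != "I" and pkt.gop in self.dropped_i:
            return False
        if len(self.queue) >= self.capacity:
            # Only an I packet may displace queued P/B packets.
            if pkt.ftype != "I" or not self._evict_dependent():
                if pkt.ftype == "I":
                    self.dropped_i.add(pkt.gop)
                return False
        self.queue.append(pkt)
        return True
```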

  9. Experiment Configuration [Diagram: background traffic source and MPEG source (avg rate 725 kbps) → active IP router → bottleneck link (2 Mbps)]

  10. Result: I-frames Received (one active router, bottleneck 2 Mbps, MPEG source averages 725 kbps)

  11. Result: Data Discarded at Receiver

  12. Result: Frame-by-frame Behavior

  13. CANEs: Lessons Learned • Few applications need customized processing at every hop • Capsule model is overkill • Useful, powerful model: system-supplied fixed processing + user-supplied variable processing • Fixed functionality can be hardened, optimized • User-supplied functionality can be constrained for safety • Eases burden of proving correctness • Less general than language-based approaches • Importance of timer-driven processing • Importance of naming • topologies • reusable configurations of underlying+injected programs

  14. Three Projects • CANEs (~1997-2002) • Concast (~1999-2003) • Ephemeral State Processing (~1999-present)

  15. ActiveCast (University of Kentucky: K. Calvert, J. Griffioen; Georgia Institute of Technology: E. Zegura) • Network service scalability through anonymity: • Deploying active code must not require • Explicit knowledge of topology • Enumeration of specific sites • → hide details of finding, activating nodes (e.g. dest=any within 2km of pt. x, with capabilities ...) New Ideas: • Concast: N-to-1 service, dual of multicast • Single address represents many senders • Many sends → one receive • Anycast, Speccast • Packets delivered to any/every node satisfying user specification • Ephemeral State Processing • Use small, fixed amount of state; short, fixed lifetime [Diagram labels: Anycast, deploy, Concast, merge] Impact: • Enable application-friendly active networks by packaging the power of programmable network platform into easy-to-use, yet customizable network services. • Manifold increase in efficiency/scalability of applications by hiding details of group size in all aspects; extend benefits of multicast to both directions of transmission. • Applicable to wide range of many-many applications: sensor data collection/fusion, routing/dissemination of real-time data. Schedule (timeline from June 1999 through 2000 and 2001 to May 2002): • prototype anycast implementation • design & specify concast, anycast APIs • prototype concast service • prototype net recon service • evaluate, refine concast spec. • anycast specification language • analyze design parameters for network recon • publish concast, anycast APIs • comparative analysis of anycast performance

  16. Concast: Motivating Problem • Many multicast applications involve feedback: • Retransmission requests for reliability • Loss rates for congestion avoidance • Sending feedback via existing channels is ugly • Sender deals with individual receivers, destroying abstraction • Implosion limits scalability • No Many-to-One channel exists!

  17. Our Solution: Concast Network Service • Scalability through abstraction • Single identifier (concast group ID) represents an arbitrary number of senders. • Benefits both receiver and network • Multiple sends result in a single message delivery • Trade additional processing in routers for reduced bandwidth requirements

  18. Concast Semantics • Conservative: hardwire various merge semantics into the network, user selects at flow setup time • Liberal: user specifies merge computation to be carried out by the network (intermediate systems) • E.g., by downloading Java bytecodes • Challenge: • Allow customization of merge semantics • Within a practically-implementable framework • That limits resource consumption (and other dangers)

  19. Concast Programming Interface Application-defined Merge Specification comprises: • getTag(IPDatagram): Tag • Returns a tag extracted from the packet • Packets p, p’ merged iff getTag(p)=getTag(p’) • merge(MergeState, IPDatagram, FlowState): MergeState • Updates state of the merge computation for incoming packet • done(MergeState): boolean • Returns “true” when ready to forward merged packet • buildDatagram(MergeState): IPDatagram • Constructs the packet to be forwarded from saved state

  20. Generic Hop-by-Hop Processing (the fsb.getTag, fsb.merge, fsb.done and fsb.buildDatagram calls invoke the Merge Specification)
      ProcessDatagram(IPAddr R, ConcastGroupID G, IPDatagram m) {
        FlowState fsb; MergeState s; Tag t;
        fsb = lookupFlow(R, G);
        if (fsb != null) {
          t = fsb.getTag(m);
          s = getMergeState(t, fsb);
          s = updateTTL(s, m);
          s = fsb.merge(s, m, fsb);
          if (fsb.done(s)) {
            (s, m) := fsb.buildDatagram(s);
            forwardDG(fsb, s, m);
          }
          putMergeState(fsb, s, t);
        } // else drop quietly
      }
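
The split between an application-defined merge specification and generic hop-by-hop processing can be sketched in Python as follows. This is a simplified sketch under assumptions, not the Concast implementation: the `Datagram` fields, the `MaxMergeSpec` class (which forwards the maximum once a hypothetical expected number of senders has reported), and the `ConcastNode` class are invented for illustration. Unlike the slide's pseudocode, it omits `updateTTL` and the flow-state argument to `merge`, and it discards merge state after forwarding.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Datagram:
    src: str
    tag: int       # hypothetical field, e.g. a sequence number in the payload
    value: int


@dataclass
class MergeState:
    best: Optional[int] = None
    seen: Set[str] = field(default_factory=set)


class MaxMergeSpec:
    """Hypothetical merge spec: forward the maximum once all senders report."""

    def __init__(self, expected_senders: int) -> None:
        self.expected = expected_senders

    def get_tag(self, m: Datagram) -> int:
        return m.tag                      # packets with equal tags are merged

    def merge(self, s: MergeState, m: Datagram) -> MergeState:
        s.seen.add(m.src)
        s.best = m.value if s.best is None else max(s.best, m.value)
        return s

    def done(self, s: MergeState) -> bool:
        return len(s.seen) >= self.expected

    def build_datagram(self, s: MergeState, tag: int, node: str) -> Datagram:
        assert s.best is not None
        return Datagram(src=node, tag=tag, value=s.best)


class ConcastNode:
    """Generic processing: knows nothing about what the merge computes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.flows: Dict[Tuple[str, int], MaxMergeSpec] = {}
        self.states: Dict[Tuple[str, int, int], MergeState] = {}
        self.forwarded: List[Datagram] = []

    def process_datagram(self, receiver: str, group: int, m: Datagram) -> None:
        spec = self.flows.get((receiver, group))
        if spec is None:
            return                        # no flow state: drop quietly
        tag = spec.get_tag(m)
        key = (receiver, group, tag)
        state = spec.merge(self.states.get(key, MergeState()), m)
        if spec.done(state):
            self.forwarded.append(spec.build_datagram(state, tag, self.name))
            self.states.pop(key, None)    # simplification: clear after forwarding
        else:
            self.states[key] = state
```

Three datagrams with the same tag entering a node configured with `MaxMergeSpec(3)` leave it as a single datagram carrying the largest value.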

  21. Ways to Use Concast • Application-specific merging • Filtering, aggregating telemetry • Merging media streams (demonstrated: audio, video) • Application-independent merging protocols • Collecting maximum (or any associative, commutative operator) of group members’ sent values • E.g. reliable multicast feedback • Protocol-independent generic services • Duplicate suppression (based on hash of IP payload) • Aggregation of small packets (TCP acks) [ICNP 2000]

  22. Small-packet Aggregation: The Problem • Small packets require disproportionate resources • There is always a fixed per-packet overhead • Today: forwarding lookups most expensive • Router performance: packets/second (not bytes/sec) • Goal: minimum-sized datagrams at wire speed • Small packets are a significant fraction of traffic • TCP acknowledgements: 40 bytes • CAIDA (1998 data): • Half of all packets: 50 bytes or less • 60% of all packets: 100 bytes or less

  23. Solution Idea • Aggregate small packets traveling in the same direction into larger packets • Delay small packets for aggregation • Break up downstream (at destination) for ultimate delivery • Benefits • For network: reduced switching load in some places • Amortize one forwarding lookup over multiple packets • For application: • Fewer lost acks → better throughput • Dangers • For network: increased processing load in some places • For application: reduced throughput under certain conditions

  24. Multiplexing With Concast: Senders [Protocol stack diagram: Applications, TCP, UDP, Mux, Concast, IP, Network Interface]

  25. Multiplexing With Concast: Routers [Protocol stack diagram: IP, Concast, Mux Merge, Network Interface, Network Interface]

  26. Multiplexing With Concast: Receiver [Protocol stack diagram: TCP, UDP, Demux, Concast, IP, Network Interface]

  27. Multiplex Packet Structure • Packet layout: IP Header, Multiplex Header, Encapsulation Header 0, Payload 0, Encapsulation Header 1, Payload 1, ... • Multiplex Header fields: Initial Time-to-Live, Max Total Delay Allowed, Amount Delayed So Far, Max Per-hop Delay • Encapsulation Header fields: Source Address, Original TTL, Protocol, Payload Length
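
A possible wire encoding of this structure, sketched with Python's `struct` module. The slide names the fields but gives no widths or units, so everything about the byte layout here is an assumption: 1-byte TTLs and protocol numbers, 2-byte delay fields in milliseconds, 4-byte IPv4 source addresses, and a 2-byte payload length. The outer IP header is left out.

```python
import socket
import struct
from typing import List, Tuple

# Assumed layouts (network byte order); field widths are NOT from the slides.
MUX_HDR = struct.Struct("!BHHH")     # initial TTL, max total delay, delayed so far, max per-hop delay
ENCAP_HDR = struct.Struct("!4sBBH")  # source address, original TTL, protocol, payload length

Item = Tuple[str, int, int, bytes]   # (source address, original TTL, protocol, payload)


def pack_multiplex(initial_ttl: int, max_total_ms: int, delayed_ms: int,
                   max_hop_ms: int, items: List[Item]) -> bytes:
    """Build a multiplex packet body: multiplex header, then header/payload pairs."""
    out = [MUX_HDR.pack(initial_ttl, max_total_ms, delayed_ms, max_hop_ms)]
    for src, ttl, proto, payload in items:
        out.append(ENCAP_HDR.pack(socket.inet_aton(src), ttl, proto, len(payload)))
        out.append(payload)
    return b"".join(out)


def unpack_multiplex(data: bytes) -> Tuple[Tuple[int, int, int, int], List[Item]]:
    """Split a multiplex packet body back into its header and original packets."""
    header = MUX_HDR.unpack_from(data, 0)
    offset = MUX_HDR.size
    items: List[Item] = []
    while offset < len(data):
        raw_src, ttl, proto, length = ENCAP_HDR.unpack_from(data, offset)
        offset += ENCAP_HDR.size
        items.append((socket.inet_ntoa(raw_src), ttl, proto, data[offset:offset + length]))
        offset += length
    return header, items
```

The per-packet encapsulation header is what lets the receiver restore each original datagram's source address, TTL, and protocol when the aggregate is broken up.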

  28. Router Processing Context [Diagram: multiplex packets and non-concast packets enter Concast Processing, which has a holding area for delayed multiplex packets]

  29. Evaluating Effectiveness: ns2 Simulation Study • Example application: Web Server • Many simultaneous TCP connections • Multiplex TCP acks only • Simulated workload • 4KB web page transmitted to 200 clients • Two traffic scenarios • Low-loss: minimal UDP cross traffic • High-loss: add 40 TCP flows cross traffic • All senders specify same Max Local Delay • Effectiveness Metrics • Total throughput of all connections • Fraction of aggregated acks

  30. Expected Behavior • Increasing delay → increased aggregation • Decreased loss due to packet-oriented queue sizes • Increased throughput • Increasing delay increases RTT • Longer slow-start • Longer completion time • Decreased throughput • Aggregation more effective when queues are full

  31. High-Loss Scenario

  32. Low-loss Scenario

  33. Concast Partial Deployment Benefits • Edges of the network generally have greater compute/transmission bandwidth ratio • Deploy concast at domain egress nodes

  34. Partial Deployment Effectiveness [Plot: number of packet-hops to collect a value from every group member; 4900-node transit-stub graphs]

  35. Concast: Lessons Learned • Partial deployment (at domain boundaries) can provide substantial benefits • Fixed+variable framework provides a “defensible” programming interface • Trust is a potential show-stopper • Problem: setting up concast sessions across multiple service providers • Mutual distrust • Providers want only paying customers to get this premium service • Users want only trustworthy providers’ nodes handling their packets • Anonymity means users have to rely on providers to enforce their policies • Problem-specific solutions are easier to sell than generic platforms • Experience with IETF Reliable Multicast Transport working group

  36. Three Projects • CANEs (~1997-2002) • Concast (~1999-2003) • Ephemeral State Processing (~1999-present)

  37. The Building-Block Approach to Extending Router Functionality • Internet Protocol philosophy: • Keep router functions simple • Push responsibility for constructing end-to-end services to End Systems • Building block function should be: • Flexible: Applicable to more than one kind of problem ...including presently unknown problems • Useful: Deployable today ... to solve (or assist in solving) one or more “real” problems • Scalable: IP-like (i.e. bounded) resource requirements ...that can be handled on/near the fast path in hardware

  38. User-controlled State • Conventional wisdom: Not scalable • Too expensive to provide for 100K flows • Overhead: setup, soft-state refresh, garbage collection • Limiting factors • Time-space product of memory usage (per flow) • Management overhead (including trust) • Solution: Ephemeral State • A fixed-lifetime, associative store • Store and retrieve fixed-size values • Identified by fixed-size, randomly-chosen tags • Bindings persist for a fixed time then vanish • No management overhead

  39. Ephemeral State Store • Set of pairs of natural numbers (t, v) • At most one pair in the set for any value of t • Access functions • put(t, v): establishes “the set contains (t, v)” • get(t): if ∃ v such that (t, v) is in the set, returns v; else returns null • Example timeline: put(37051, 1); get(37051) returns 1; put(37051, 4); get(37051) returns 4; once the binding’s lifetime of τ seconds has elapsed, get(37051) returns null
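
A minimal Python sketch of such a store, assuming a single fixed lifetime per store and lazy expiry on access. The class name, the injectable `clock` parameter, and the decision that overwriting a live binding keeps its original expiry time are assumptions for illustration; the slides state only that bindings persist for a fixed time and then vanish.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class EphemeralStateStore:
    """Associative store whose bindings vanish after a fixed lifetime (illustrative)."""

    def __init__(self, lifetime: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.lifetime = lifetime            # seconds a binding persists
        self.clock = clock                  # injectable so tests need not sleep
        self._bindings: Dict[int, Tuple[int, float]] = {}   # tag -> (value, expiry)

    def _live(self, tag: int) -> Optional[Tuple[int, float]]:
        """Return the binding for tag if it has not expired, else None."""
        entry = self._bindings.get(tag)
        if entry is not None and self.clock() >= entry[1]:
            del self._bindings[tag]         # expired: remove lazily
            return None
        return entry

    def put(self, tag: int, value: int) -> None:
        entry = self._live(tag)
        # Assumption: updating a live binding does not extend its lifetime.
        expiry = entry[1] if entry is not None else self.clock() + self.lifetime
        self._bindings[tag] = (value, expiry)

    def get(self, tag: int) -> Optional[int]:
        entry = self._live(tag)
        return entry[0] if entry is not None else None
```

Because every binding disappears on its own, the store needs no explicit delete operation, refresh messages, or garbage-collection protocol.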

  40. ESP: Ephemeral State Processing • Ephemeral State Store (ESS) • Associative memory = set of (64-bit tag, 64-bit value) pairs • One ESS per processing context • Packet-borne “instructions” (one per packet) • Each instruction defines a fixed-length computation • Operands: values in ESS, packet fields, node-specific values • Comparable to machine instructions of general-purpose computer • On termination, forward or discard packet • Routers support a common instruction set • Wire protocol • ESP instructions carried in payload or shim header • Packets recognized and executed hop-by-hop • End-to-End services • Construct by sequencing packets in time and space

  41. Example ESP Instruction: COUNT • p: packet • p.C: tag carried in ESP hdr • p.thresh: immediate value carried in ESP hdr • Effect: bind tag C to count of packets processed; forward ones for which count < p.thresh on arrival
      count = get(p.C)
      if (count is null) count = 0
      count = count + 1
      put(p.C, count)
      if (count ≤ p.thresh) forward p
      else discard p
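
The COUNT instruction can be sketched in Python as below. This is an illustration under assumptions: `DictStore` is a stand-in for the ephemeral state store that omits expiry, `EspPacket` carries only the two fields named on the slide, and the function returns a boolean instead of forwarding. The post-increment comparison is taken to be "less than or equal", which matches the stated effect of forwarding packets whose count on arrival is below the threshold.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EspPacket:
    C: int          # tag carried in the ESP header
    thresh: int     # immediate threshold carried in the ESP header


class DictStore:
    """Stand-in for the ephemeral state store; binding expiry is omitted here."""

    def __init__(self) -> None:
        self._bindings: Dict[int, int] = {}

    def get(self, tag: int) -> Optional[int]:
        return self._bindings.get(tag)

    def put(self, tag: int, value: int) -> None:
        self._bindings[tag] = value


def count(ess: DictStore, p: EspPacket) -> bool:
    """Execute COUNT for packet p; return True to forward, False to discard."""
    c = ess.get(p.C)
    if c is None:
        c = 0
    c += 1
    ess.put(p.C, c)
    return c <= p.thresh
```

With a threshold of 2, the first two packets carrying a given tag are forwarded and later ones are discarded, while the stored count keeps increasing.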

  42. Input/Output/Central Processing Contexts [Diagram: Normal IP Input Processing, Switch Fabric, Normal IP Output Processing, with ESP blocks attached; highlighted: Output Context]

  43. Input/Output/Central Processing Contexts [Diagram: Normal IP Input Processing, Switch Fabric, Normal IP Output Processing, with ESP blocks attached; highlighted: Output Context, Input Context]

  44. Input/Output/Central Processing Contexts [Diagram: Normal IP Input Processing, Switch Fabric, Normal IP Output Processing, with ESP blocks attached; highlighted: Output Context, Input Context, Both Contexts]

  45. Input/Output/Central Processing Contexts [Diagram: Normal IP Input Processing, Switch Fabric, Normal IP Output Processing, with ESP blocks attached; highlighted: Output Context, Input Context, Both Contexts, Central Context]

  46. Uses for ESP • Controlling packet flow • Make drop/forward decisions based on state at node or i/f Example: Duplicate suppression • Identifying interior nodes with specific properties • Reveal just enough of topology to find what is needed Example: Finding multicast branch points for unicast-based service • Processing user data • Simple hierarchical computations scale better Example: Aggregating feedback from a multicast group (a la Concast)

  47. Network-Processor Implementation • Add ESP to an existing router • Non-ESP packets pass straight through [Diagram: Network Processor, Off-the-shelf Router, Network Processor]

  48. Example Application • Add ESP to an existing router • Non-ESP packets pass straight through [Diagram: Off-the-shelf Router, MAC, ESP, ESP]

  49. Example Application • Add ESP to an existing router • Non-ESP packets pass straight through • ESP packets diverted for processing [Diagram: Off-the-shelf Router, MAC, ESP, ESP]

  50. Implementing ESP on the IXP 2400: Performance • One MicroEngine (out of 8), 600 MHz • Stream = “COUNT” instructions • Input rate = 954 Mbps • F = fraction of packets creating new tag bindings

