Scalability and Accuracy in a Large-Scale Network Emulator Amin Vahdat, Ken Yocum, Kevin Walsh, Priya Mahadevan, Dejan Kostic, Jeff Chase, and David Becker Presented by Stacy Patterson
Outline • Motivation • ModelNet Design • Evaluation • Conclusion
Motivation • Need a way to test large-scale Internet services • Peer-to-peer, overlay networks, wide-area replication • Testing in the real world • Results not reproducible or predictable • Difficult to deploy and administer research software • Simulation tools • Allow control over the test environment • May miss important system interactions • Emulation • Network emulators can subject application traffic to the end-to-end bandwidth constraints, latencies, and loss rates of a user-specified topology • Previous implementations not scalable
ModelNet • A scalable, cluster-based, comprehensive network emulation environment
Design • Users run a configurable number of application instances on edge nodes within a dedicated server cluster • Each instance is a Virtual Edge Node (VN) • Each VN has a unique IP address • Edge nodes route traffic through a cluster of core routers • Core routers are equipped with large memories and modified FreeBSD kernels • They emulate the traffic characteristics of the configured target network • Core routers route traffic through emulated links, or “pipes,” each with its own packet queue and queuing discipline
ModelNet Phases • (1) Create • Generates a network topology in GML: a graph whose vertices are clients, stubs, and transits, and whose edges are network links • Can be generated from Internet traces, BGP dumps, synthetic topology generators, etc. • Users can annotate the graph with packet loss rates, failure distributions, etc. • (2) Distillation • Transforms the GML graph into a pipe topology (see the sketch below)
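A minimal sketch of what distillation consumes and produces. The attribute names (`bw_mbps`, `delay_ms`, `loss`) are illustrative assumptions, not ModelNet's actual schema: the annotated GML graph is treated as an edge list, and pure hop-by-hop distillation emits one pipe per link.

```python
from dataclasses import dataclass

@dataclass
class Pipe:
    src: str
    dst: str
    bw_mbps: float      # bandwidth constraint
    delay_ms: float     # propagation delay
    loss: float         # drop probability

# Vertices are clients, stubs, and transits; edges are network links
# annotated with bandwidth, delay, and loss (as in the GML graph).
links = [
    ("client0", "stub0",    {"bw_mbps": 100,  "delay_ms": 1,  "loss": 0.0}),
    ("stub0",   "transit0", {"bw_mbps": 1000, "delay_ms": 10, "loss": 0.001}),
]

def distill_hop_by_hop(links):
    """One pipe per physical link; the result is isomorphic to the target."""
    return [Pipe(a, b, **attrs) for a, b, attrs in links]

pipes = distill_hop_by_hop(links)
```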
ModelNet Phases • (3) Assignment • Maps the pipe topology to core nodes, distributing emulation load across them • Finding the ideal mapping is an NP-complete problem; it depends on routing, link properties, and traffic load • ModelNet uses a greedy k-clusters assignment • For k core nodes, randomly select k nodes in the distilled topology, then greedily assign links adjacent to each cluster's connected component in round-robin order (sketched below)
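A sketch of the greedy k-clusters heuristic as described above. The function name and data layout are assumptions; a real assignment would also weight links by expected traffic load rather than balancing link counts alone.

```python
import random

def greedy_k_clusters(nodes, edges, k, seed=0):
    """Seed k clusters with random nodes of the distilled topology, then
    grow them round-robin, each cluster greedily taking an unassigned
    link adjacent to its connected component."""
    random.seed(seed)
    seeds = random.sample(sorted(nodes), k)
    members = [{s} for s in seeds]           # nodes reached by each cluster
    remaining = {frozenset(e) for e in edges}
    assignment = {}                           # link -> core node index
    progress = True
    while remaining and progress:
        progress = False
        for i in range(k):                    # round-robin over core nodes
            for e in list(remaining):
                a, b = tuple(e)
                if a in members[i] or b in members[i]:
                    assignment[e] = i
                    members[i] |= {a, b}
                    remaining.remove(e)
                    progress = True
                    break
    for e in remaining:                       # links unreachable from any seed
        assignment[e] = 0
    return assignment

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(greedy_k_clusters({"a", "b", "c", "d"}, edges, k=2))
```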
ModelNet Phases • (4) Binding • Multiplexes multiple VNs onto each physical edge node • Binds each physical edge node to a core router • Generates shortest-path routes between all VNs and installs them in the core routing tables (see the sketch below) • (5) Run • Executes target application code on the edge nodes
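The route-generation step, sketched with unweighted BFS (the slides do not say which shortest-path algorithm ModelNet uses, so BFS is an assumption). Each route is stored as the ordered list of pipes it traverses, which is what produces the O(n²) routing-table space noted on the next slide.

```python
from collections import deque

def shortest_path_routes(links, vns):
    """All-pairs shortest-path routes between VNs; each route is the
    ordered list of pipes (hops) from source to destination."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    routes = {}
    for src in vns:
        prev = {src: None}                  # BFS tree rooted at src
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in prev:
                    prev[v] = u
                    q.append(v)
        for dst in vns:
            if dst == src or dst not in prev:
                continue
            path, n = [], dst               # walk the tree back to src
            while prev[n] is not None:
                path.append((prev[n], n))   # pipe for hop prev[n] -> n
                n = prev[n]
            routes[(src, dst)] = list(reversed(path))
    return routes
```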
Inside the Core • Route traffic through emulated “pipes” • Each route is an ordered list of pipes • Packets move through pipes by reference • Routing table requires O(n²) space • Packet Scheduling • When a packet arrives, it is placed at the tail of the first pipe in its route • The scheduler keeps a heap of pipes sorted by earliest deadline: the exit time of the first packet in each pipe's queue • Once every clock tick: traverse the pipes in the heap for packets that are ready to exit, move them to the tail of the next pipe or schedule them for delivery, and calculate new deadlines (sketched below) • Multi-core Configuration • The next pipe in a route may be on a different machine • If so, the core node tunnels the packet descriptor to that node
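A user-space Python sketch of this scheduler loop. The real implementation lives in a modified FreeBSD kernel and moves packet descriptors by reference, so the class and field names here are illustrative only.

```python
import heapq

class Pipe:
    def __init__(self, bw_bps, delay_s):
        self.bw_bps = bw_bps
        self.delay_s = delay_s
        self.busy_until = 0.0     # when the link finishes its current packet
        self.queue = []           # FIFO of (packet, rest_of_route, deadline)

class CoreScheduler:
    """Heap of pipes keyed by the exit deadline of each pipe's head packet."""
    def __init__(self):
        self.heap = []            # (deadline, tie_breaker, pipe)
        self.n = 0

    def enqueue(self, pipe, packet, rest, now):
        tx = len(packet) * 8 / pipe.bw_bps        # transmission time
        start = max(now, pipe.busy_until)         # queue behind earlier packets
        pipe.busy_until = start + tx
        deadline = start + tx + pipe.delay_s      # exit time through this pipe
        pipe.queue.append((packet, rest, deadline))
        if len(pipe.queue) == 1:                  # pipe was idle: schedule it
            heapq.heappush(self.heap, (deadline, self.n, pipe))
            self.n += 1

    def tick(self, now, deliver):
        """Once per clock tick: move every expired head packet along its route."""
        while self.heap and self.heap[0][0] <= now:
            _, _, pipe = heapq.heappop(self.heap)
            packet, rest, _ = pipe.queue.pop(0)
            if rest:                              # next pipe in the route
                self.enqueue(rest[0], packet, rest[1:], now)
            else:                                 # end of route: deliver to VN
                deliver(packet)
            if pipe.queue:                        # reschedule for the new head
                heapq.heappush(self.heap, (pipe.queue[0][2], self.n, pipe))
                self.n += 1
```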
Scalability Issues • Traffic traversing the core is limited by the cluster's physical internal bandwidth • ModelNet must buffer up to the full bandwidth-delay product of the target network • E.g., 250 MB of packet buffer space to carry flows at an aggregate bandwidth of 10 Gb/s with 200 ms round-trip latency (worked below) • Assumes a perfect routing protocol
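The 250 MB figure is just the bandwidth-delay product:

```python
# Worked check of the buffering figure: the bandwidth-delay product of
# 10 Gb/s of aggregate traffic with a 200 ms round-trip time.
aggregate_bps = 10e9                      # 10 Gb/s
rtt_s = 0.200                             # 200 ms
buffer_bytes = aggregate_bps * rtt_s / 8  # bits in flight -> bytes
print(buffer_bytes / 1e6)                 # 250.0 MB, matching the slide
```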
Evaluation • Core routers: 1.4 GHz Pentium III, 1 GB memory • Connected by a gigabit Ethernet switch • Edge nodes: 1 GHz Pentium III, 256 MB memory • Connected at 100 Mb/s
Baseline Accuracy • Want to ensure that under load, packets are subject to correct end-to-end delays • Used kernel logging to track ModelNet performance and accuracy • Results show that, with the ModelNet scheduler running at the highest kernel priority: • Packets are delivered within 1 ms of the target end-to-end value • Accuracy is maintained up to 100% CPU usage
Capacity • Quantify capacity of ModelNet as function of load and number of emulated hops • Tested 1-5 edge nodes • Each edge node hosts up to 24 netperf senders and 24 netperf receivers • Topology connects each sender to a receiver
Scalability • Additional Cores • Adding core routers allows ModelNet to deliver higher throughput • Communication between core routers introduces overhead. Higher cross-core communication results in less throughput benefit • VN Multiplexing • Higher degrees of multiplexing enable larger network emulation • Inaccuracies introduced due to context switching, scheduling, resource contention, etc
Accuracy vs. Scalability • Reduce overhead by deviating from target network requirements • Changes should minimally impact application behavior • Ideally, system reports degree and nature of emulation inaccuracy
Distillation • Pure hop-by-hop emulation • Distilled topology is isomorphic to the target network • High per-packet overhead • End-to-end distillation • Remove all interior nodes in the network • Collapse each path into a single pipe (see the sketch below) • Latency = sum of latencies along the path • Reliability = product of link reliabilities along the path • Low per-packet overhead • Does not emulate link contention along the path
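A sketch of collapsing a path into a single pipe. The latency and reliability rules come from the slide; taking the minimum bandwidth along the path is an added assumption.

```python
def collapse_path(pipes):
    """End-to-end distillation: replace a multi-hop path with one pipe.
    Latencies add and reliabilities multiply along the path; bandwidth
    (an assumption here) is taken as the path minimum."""
    reliability = 1.0
    for p in pipes:
        reliability *= 1.0 - p["loss"]
    return {
        "delay_ms": sum(p["delay_ms"] for p in pipes),
        "loss": 1.0 - reliability,
        "bw_mbps": min(p["bw_mbps"] for p in pipes),
    }

# e.g. two 10 ms hops with 0.1% loss each -> one 20 ms pipe, ~0.2% loss
path = [{"delay_ms": 10, "loss": 0.001, "bw_mbps": 100},
        {"delay_ms": 10, "loss": 0.001, "bw_mbps": 1000}]
print(collapse_path(path))
```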
Distillation - continued • Walk-in • Preserves the first walk-in links outward from each edge node (walk-in is a tunable depth) • Interior links are replaced with a full mesh (sketched below) • Does not model contention in the interior • Walk-out • Extension to walk-in that supports interior link contention • Preserves a set of links in the interior • Collapses paths between the walk-out and walk-in sets
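A sketch of the walk-in step under assumed data structures: BFS outward from each edge node to depth `walk_in`, keeping those links and full-meshing the frontier nodes that bound the collapsed interior.

```python
from collections import deque

def walk_in_distill(adj, edge_nodes, walk_in):
    """Keep the first `walk_in` links outward from every edge node;
    full-mesh the frontier to stand in for the collapsed interior."""
    kept, frontier = set(), set()
    for c in edge_nodes:
        depth, q = {c: 0}, deque([c])
        while q:
            u = q.popleft()
            if depth[u] == walk_in:
                frontier.add(u)              # boundary of preserved region
                continue
            for v in adj.get(u, ()):
                kept.add(frozenset((u, v)))  # preserved hop-by-hop link
                if v not in depth:
                    depth[v] = depth[u] + 1
                    q.append(v)
    # interior collapsed: one pipe per frontier pair (parameters would be
    # derived per path, as in the end-to-end sketch above)
    mesh = {frozenset((a, b)) for a in frontier for b in frontier if a != b}
    return kept, mesh
```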
Evaluating Distillation • Ring topology of 20 routers interconnected at 20 Mb/s each • Each router has 20 VNs • Routers partitioned into generator and receiver sets • 419 pipes shared between 400 VNs • End-to-end distillation contains 79,800 pipes (one per VN pair; see below) • Last-mile distillation preserves 400 edge links • Test distribution of bandwidth between nodes
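The end-to-end pipe count follows from one pipe per unordered VN pair:

```python
from math import comb
print(comb(400, 2))   # 79800 pipes: one per VN pair under end-to-end distillation
```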
Changing Network Statistics • ModelNet allows users to modify pipe parameters while an emulation is in progress • Users can change the bandwidth, delay, and loss rate of a set of links (toy illustration below) • Also supports modeling node and link failures
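A toy illustration (not ModelNet's actual interface) of why runtime changes are cheap: pipe parameters live in a shared table, so rewriting an entry affects the next packet scheduled through that link.

```python
# Hypothetical pipe table keyed by link endpoints.
pipes = {("stub0", "transit0"): {"bw_mbps": 1000, "delay_ms": 10, "loss": 0.001}}

def set_link(pipe_id, **params):
    pipes[pipe_id].update(params)          # takes effect mid-emulation

set_link(("stub0", "transit0"), bw_mbps=100, delay_ms=50)  # degrade the link
set_link(("stub0", "transit0"), loss=1.0)                  # model a link failure
```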
Case Studies • Gnutella • Able to evaluate a 10,000-node network of unmodified Gnutella clients • 100 edge nodes with 100 VNs each • Ad hoc wireless • Extensions support emulation of ad hoc wireless networks • Broadcast communication and node mobility • CFS • Able to reproduce the behavior of the CFS implementation running on the RON testbed • ModelNet results closely match CFS/RON in all cases
Case Studies • Replicated Web Services • Need to investigate replica placement and routing policies under realistic wide-area conditions • Studied the effects of replication on client latencies using a 2.5-minute trace of requests to www.ibm.com • Adaptive Overlays • ACDC, an adaptive overlay system that attempts to build routes with lower cost, lower delay, or both • 600 nodes in the topology, 120 of them in the overlay network • Tested the system's behavior under increasing link delays • Results very similar to the same experiment performed under ns-2
Conclusion • ModelNet provides an emulation environment that allows • Testing of unmodified applications • Reproducible results • Experimentation using a broad range of network topologies and characteristics • Large-scale experiments (thousands of nodes and gigabits per second of cross traffic)