This presentation discusses the challenges and methods of simulating the Internet, including models of behavior, easily-varied parameters, controlled environments, and reproducible results. It also explores the problems in characterizing the Internet: large scale, drastic change, link and topology heterogeneity, protocol heterogeneity, application dynamics, and traffic.
Simulating the Internet: challenges & methods Kevin Fall Network Research Group, Lawrence Berkeley National Laboratory Berkeley, CA USA
LBNL’s Network Research Group: Members: Van Jacobson (group leader) • Kevin Fall • Sally Floyd * • Craig Leres • Vern Paxson * http://www-nrg.ee.lbl.gov
Outline • Simulating the Internet is not easy • The VINT project: an effort in Internet-style simulation
Simulations for Network Research • Models of interesting behavior • Easily-varied parameters • Controlled environment, reproducible results
Problems in Characterizing the Internet • Large Scale: • even a small fraction of misbehaving entities is non-negligible • scale stresses assumptions in protocol design and implementation • Drastic Change: • will the rate of change continue? • predominant use not obvious (e.g. the web, continuous media, ?) • Heterogeneity everywhere!
Link and Topology Heterogeneity • Delay and bandwidth span 5 to 6 orders of magnitude! • 20msec to 2s round-trip prop delay • 10Kb/s to 10Gb/s bandwidth range • Topology • hierarchy and clustering chosen by ISPs • performance tied to which path packets take in network • paths may change dynamically • IP routes are frequently asymmetric
Protocol Heterogeneity • Adaptive and non-adaptive Internet protocols • react to congestion (TCP) • nonreactive (UDP) • Application Dynamics • multi-protocol interactions • user activity • application mix varies greatly by site • Implementations may not be consistent
Traffic • Internet traffic not easily characterized • no commonly accepted model • traffic may be shaped by congestion response • Dependent on source behavior • application protocol limitations • new applications • pricing policies
So, what can be done in simulation? • Strategy • 1: Look for invariants • 2: Explore the parameter space • 3: Understand the limits of simulation
1: Searching for Invariants • What do we really know about Internet dynamics? • How to characterize statistically? • traffic • users • sessions • congestion, etc. • Mathematical simplicity does not imply accuracy
The Self-Similar Nature of Traffic • packet interarrival times are not exponentially distributed • thus, the arrival process is not Poisson • bursts over multiple time-scales • they exhibit long-range dependence • suggests self-similar models • (there is still contention on this point) • Implications • aggregation does not “smooth out” variation • traffic synthesis more difficult • network buffering may be much less effective than thought based on Markovian models
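One standard way to state the aggregation point (a general result about self-similar traffic, not specific to these measurements): for second-order self-similar traffic with Hurst parameter H,

    Var(X^{(m)}) \approx \sigma^2 \, m^{2H-2}, \qquad 1/2 < H < 1

where X^{(m)} is the traffic averaged over blocks of m time units. For Poisson-like (short-range dependent) traffic H = 1/2 and the variance falls off as 1/m; for H near 1, aggregation barely smooths the bursts.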
User-generated Sessions look Poisson • user-generated session arrivals look Poisson (machine-generated connection arrivals are not) • distribution is invariant, parameterized only by a (fixed, hourly) rate
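For reference, the Poisson model implied here (standard definition): within a period with fixed hourly rate \lambda, session interarrival times T are independent and exponentially distributed,

    P(T > t) = e^{-\lambda t}, \qquad E[T] = 1/\lambda

Machine-generated connection arrivals within a session do not fit this model.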
Network Activity tends to have a heavy-tailed distribution • Examples: packets in a user’s TELNET session; bytes in FTP-DATA transfers • distribution looks Pareto with shape 0.9 < b < 1.0 • Pareto distribution with shape b has: • infinite variance if b <= 2 • infinite mean if b <= 1 • so this type of Pareto has infinite mean and variance (and is very unlike an exponential) • burstiness remains across aggregation
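The Pareto form behind these statements, included here for reference (standard definition):

    P(X > x) = (x / x_m)^{-b}, \qquad x \ge x_m
    E[X] finite only if b > 1;  Var(X) finite only if b > 2

so with shape 0.9 < b < 1.0, both the mean and the variance diverge.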
2: Exploring the Parameter Space • Consider a large range for parameters • recall, 5-6 orders of magnitude range in bandwidth and delay • note that behavior is often non-linear in parameter values • Repeat, repeat, repeat • topology generators • randomness
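As a hedged illustration of this kind of sweep, a minimal ns-2-style OTcl sketch (run-one-simulation is a hypothetical helper; the specific values are only examples):

    # sweep bandwidth and delay over several orders of magnitude,
    # repeating each point with different random seeds
    set bandwidths { 64Kb 1.5Mb 10Mb 155Mb }
    set delays     { 1ms 10ms 100ms 1s }
    foreach bw $bandwidths {
        foreach d $delays {
            for {set seed 1} {$seed <= 10} {incr seed} {
                # hypothetical helper: builds the topology with the given
                # link parameters and seed, then runs the simulator
                run-one-simulation $bw $d $seed
            }
        }
    }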
3: The Limits of Simulation • Simplified Models • useful for gaining intuition and exploring parameters • danger of oversimplification • Need for a Reality Check • compare simulation results with measurement • Internet measurements often offer “surprises”
The VINT Project(Virtual InterNet Testbed) • USC/ISI: Deborah Estrin, Mark Handley, John Heideman, Ahmed Helmy, Polly Huang, Satish Kumar, Kannan Varadhan, Daniel Zappala • LBNL: Kevin Fall, Sally Floyd • UCBerkeley: Elan Amir, Steven McCanne • Xerox PARC: Lee Breslau, Scott Shenker • VINT is currently funded by DARPA through mid-1999
VINT Goals • provide common platform for network research • explore issues of scale and multi-protocol interaction • Specific Areas: • multicast, end-to-end transport • simulation scaling • traffic management • emulation
Multicast Research • Reliable Multicast Transport • Large Scale • “SRM”-- Scalable Reliable Multicast • Multicast Congestion Management • Group formation • (still ongoing) • Layered Transmission • layered encoding • dynamic multi-group join/leave
Simulation Scaling • Simulator capable of 1000s of nodes • Want 100,000s of nodes (or more) • “Session” Abstraction • abstract away some simulation details • trade detail for time/space • scales simulation by about 10X
Traffic Management • Active Buffer Management • Random Early Detection Gateways • Explicit Congestion Notification (ECN) • Packet Scheduling • Class-Based Queuing (CBQ) • Round-Robin and Fair Queuing Variants • Differentiated Services • Admission Control • Reservation Support
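As a minimal sketch of how such queueing disciplines are selected in an ns-2-style script (parameter values here are illustrative, not recommended settings):

    set ns [new Simulator]
    set n0 [$ns node]
    set n1 [$ns node]
    # illustrative RED thresholds; class defaults are set before the link
    # is created so the new queue picks them up
    Queue/RED set thresh_ 5
    Queue/RED set maxthresh_ 15
    # bottleneck link using RED active queue management instead of drop-tail
    $ns duplex-link $n0 $n1 1.5Mb 20ms RED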
Emulation • Interface Simulator with Live Network • Live Traffic Passes through Simulated Topology • Special “Real-Time” Scheduler • may not stay synchronized under load
The VINT Simulation Environment • Components: ns2 and nam • ns2 (network simulator, version 2): • Discrete-event C++ simulation engine • scheduling, timers, packets • Split OTcl/C++ object “library” • protocol agents, links, nodes, classifiers, routing, error generators, traces, queuing, math support (random variables, integrals, etc.) • nam (network animator) • Tcl/Tk application for animating simulator traces • available on UNIX and Windows 95/NT
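To give a feel for the scripting interface, a minimal scenario in the style of the standard ns examples (object names and timings are illustrative):

    set ns [new Simulator]
    set nf [open out.nam w]
    $ns namtrace-all $nf                ;# trace file for the nam animator

    set n0 [$ns node]
    set n1 [$ns node]
    $ns duplex-link $n0 $n1 1Mb 10ms DropTail

    set tcp  [new Agent/TCP]            ;# adaptive (congestion-reactive) sender
    set sink [new Agent/TCPSink]
    $ns attach-agent $n0 $tcp
    $ns attach-agent $n1 $sink
    $ns connect $tcp $sink

    set ftp [new Application/FTP]       ;# traffic source driving the TCP agent
    $ftp attach-agent $tcp

    proc finish {} {
        global ns nf
        $ns flush-trace
        close $nf
        exit 0
    }
    $ns at 0.1 "$ftp start"
    $ns at 5.0 "finish"
    $ns run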
NS Supported Components • Protocols: • TCP (2 modes + variants), UDP, IP, RTP/RTCP, SRM, 802.3 MAC, 802.11 MAC • Routing • global topology map, classifiers • static unicast, dynamic unicast (distance-vector), multicast • Queuing and packet scheduling • FIFO/drop-tail, RED, CBQ, WRR, DRR, SFQ • Topology: nodes, links • Failures: link errors/failures • Emulation: interface to a live network
Benefits • Common simulation environment • simulations expressed in scripting language • separate visualization tool • topology and “scenario” generators • modular structure is extensible; sources provided • Unique Features • Rich Protocol Set • “Session” abstraction • scales simulations by roughly a factor of 4 • Visualization and Emulation capabilities • separate Network Animator (nam) tool • low-level interface to system’s protocols
The NS Architecture • Simulator is an Object-Tcl (OTcl) “shell” • Split Objects • fine-grain, easily composed • objects exist in both C++ and Tcl contexts • library handles object consistency
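A small illustration of the split-object idea from the OTcl side (a sketch; the variable values are arbitrary):

    # the Agent/TCP handle created here is shadowed by a C++ object;
    # variables such as window_ are bound to C++ members, so setting them
    # from Tcl updates the C++ state directly
    set tcp [new Agent/TCP]
    $tcp set window_ 20
    $tcp set packetSize_ 1000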
Work in Progress • Adaptive Web Caching (LBNL, UCLA) • Nam Improvements (USC, ISI) • Simulator Scaling (USC, ISI) • Simulator Addressing Hierarchy (USC, ISI) • Protocol Robustness (USC, ISI) • Emulation (LBNL, UCB) • Quality of Service (Xerox PARC) • Router-Based Congestion Control (LBNL) • Topology and “Scenario” Generation
Router-Based Congestion Control • Two main classes of traffic on Internet: • TCP (reduces sending rate in face of loss) • UDP (application decides when and how much to send) • Internet stability due in large part to TCP’s congestion response • Danger with growing use of UDP-based applications • UDP will “steal” bandwidth from TCP • currently no incentives to prevent this behavior
Encouraging Congestion Control • Combine RED Gateway with analysis and regulation • RED (Random Early Detection) Gateways: • keep smoothed average queue size measure • when measure exceeds threshold, drop or mark packets with increasing probability • a flow’s fraction of the aggregate random packet drop rate is roughly equal to its fraction of the aggregate arrival rate • Select candidate “bad” flows with high drop rate
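In symbols (notation mine, restating the bullet above): if \lambda_i is flow i's arrival rate at the gateway,

    drops_i / drops_total \approx \lambda_i / \sum_j \lambda_j

so flows claiming a disproportionate share of the random drops are natural candidates for closer analysis.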
“Bad” Flow Selection Criteria • Flow is not “TCP-friendly” • throughput exceeds a small factor times the analytic TCP model (shown below) • Flow is not responsive • does not alter arrival rate with increased packet drops • Flow is “high-bandwidth” • uses more than its “fair share”
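The analytic model referred to above is presumably the standard TCP-friendly bandwidth bound used in related RED/congestion-control work; stated here for completeness:

    T \le \frac{1.5 \sqrt{2/3} \; B}{R \sqrt{p}}

where B is the packet size, R the round-trip time, p the packet drop rate, and T the maximum sending rate of a conformant TCP. A flow whose throughput exceeds a small factor times T is flagged as not TCP-friendly.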
Flow Regulation • Need bandwidth-regulating packet scheduler • CBQ • others • Use “good” and “bad” scheduling partitions • Bad partition gets allocation below current usage • decays over time with continued offered load • flows may be reclassified as “ok” if they adapt
Conclusion • Simulating the Internet is difficult • Simulation is useful, but must be used carefully • The VINT project provides a common simulation framework that addresses many of these issues
Additional Information • Web pages: • http://www-nrg.ee.lbl.gov/ • http://www-mash.cs.berkeley.edu/ns • http://netweb.usc.edu/vint • http://www.ito.darpa.mil/Summaries97/E243_0.html • NS Users Mailing list: • majordomo@mash.cs.berkeley.edu • “subscribe ns-users”