Network Performance Management S. Keshav C/NRG (with Rosen Sharma, Andy Choi, Wilson Huang, Lili Qiu, Russell Schwager, Rachit Siamwalla, Jia Wang, and Yin Zhang)
Motivation • Networks are increasing in breadth…. • greater density of connections • PCs come with built-in networking • ADSL and cable modems • wireless networking • as well as in depth • variety of qualities, policies, and media
The current situation • Loss of productivity from • slow file access • web site disconnection • slow access to a web site • no one knows exactly why! • Greater breadth and depth => even more dependency on the network => even more problems
Is QoS enough? • Lots of research in the area of QoS • RSVP, differentiated services, etc. provide a good overall user experience, one stream at a time • Is QoS all there is to a good user experience? • An incorrect reservation => poor service for one stream • A misconfigured router => complete loss of service to one or more ports!
Aha! • User experience is affected more by ‘mundane’ network management than by ‘exotic’ QoS research • This motivates our entire research effort
Why networks fail • Link or router failure • Transient overload • Unanticipated increase in load • Misconfiguration (from first to last: increasingly harder to detect)
Need Better Network Management • Current approaches • GUI-centric • lots of flashing lights, but no intelligence • Can detect failures but... • ad hoc capacity planning • ad hoc configuration • no way of testing other than “just try it!” • Can’t manage network performance
Performance management • Topology discovery • Configure new hardware (simulation) • Collect statistics (monitoring) • Fix problems (AI and simulation) • Identify problems (display and simulation)
Discovery: Project Octopus [Diagram labels: Temporary Set, Heuristic, Permanent Set]
Techniques • DNS-ls • SNMP • Random probe • Traceroute • Directed broadcast ping
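A minimal sketch of how the techniques above might feed a discovery loop. It assumes that the Temporary Set holds candidate addresses awaiting a probe and the Permanent Set holds confirmed nodes; this is one reading of the Project Octopus slide, not its actual implementation. The `probe` callback and the toy topology are hypothetical stand-ins for SNMP, traceroute, or ping.

```python
from collections import deque

def discover(seeds, probe):
    """Expand a temporary set of candidates into a permanent set of confirmed nodes.

    probe(addr) is a hypothetical callback: it returns the neighbours learned
    from addr (e.g. via SNMP or traceroute), or None if addr did not respond.
    """
    temporary = deque(seeds)      # candidates not yet verified
    permanent = set()             # confirmed nodes
    seen = set(seeds)             # everything ever queued, to avoid re-probing
    while temporary:
        addr = temporary.popleft()
        neighbours = probe(addr)
        if neighbours is None:    # no response: drop the candidate
            continue
        permanent.add(addr)
        for n in neighbours:
            if n not in seen:
                seen.add(n)
                temporary.append(n)
    return permanent

# Toy topology (made up for illustration); "10.0.0.9" never responds.
TOPOLOGY = {
    "10.0.0.1": ["10.0.0.2", "10.0.0.3"],
    "10.0.0.2": ["10.0.0.1", "10.0.0.9"],
    "10.0.0.3": ["10.0.0.1"],
}
found = discover(["10.0.0.1"], TOPOLOGY.get)
```

Passing the probe in as a callback keeps the loop independent of any one technique, so several probes could be tried in turn for the same candidate.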
Results • Have automatically discovered entire CS department topology • As well as entire Stanford topology (> 220 subnets) • Cornell topology is being discovered as we speak! • info being shared with CIT
Monitoring • A PERL script uses SNMP and queries a router using various MIB entries. • The MIB entries are stored in an input file. • The values gathered from the router are stored in a file. • The script works on both UNIX and WinNT.
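The polling flow described above could look roughly like the following. This is a Python sketch, not the authors' Perl script: the file names, the CSV layout, and the `snmp_get` callback are assumptions, and the demo uses a fake getter so that it runs without a router. `ifInOctets` and `ifOutOctets` are standard MIB-II interface counters, chosen here only as examples.

```python
import csv
import os
import tempfile
import time

def load_mib_entries(path):
    """Read one MIB entry name per line, skipping blanks and '#' comments."""
    with open(path) as f:
        return [ln.strip() for ln in f
                if ln.strip() and not ln.lstrip().startswith("#")]

def poll_once(router, entries, snmp_get, out_path, now=None):
    """Query each MIB entry and append 'timestamp,router,entry,value' rows."""
    now = time.time() if now is None else now
    with open(out_path, "a", newline="") as f:
        writer = csv.writer(f)
        for entry in entries:
            writer.writerow([int(now), router, entry, snmp_get(router, entry)])

# Demo with a fake SNMP getter (hypothetical values, no network access).
def fake_snmp_get(router, entry):
    return {"ifInOctets.1": 1200, "ifOutOctets.1": 800}[entry]

workdir = tempfile.mkdtemp()
mib_file = os.path.join(workdir, "mib_entries.txt")
log_file = os.path.join(workdir, "csgate2.csv")
with open(mib_file, "w") as f:
    f.write("# entries to poll\nifInOctets.1\nifOutOctets.1\n")

poll_once("csgate2", load_mib_entries(mib_file), fake_snmp_get, log_file, now=0)
with open(log_file) as f:
    rows = f.read().splitlines()
```

Keeping the MIB entries in a separate input file, as the slide describes, means the set of polled statistics can change without touching the script.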
Monitoring (contd.) • Other PERL scripts parse the data and convert it to other formats. • Currently supported formats: • HTML - The data is presented in a table format in HTML. • GNUPlot graphs - The data can be graphed or saved in pbm format
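As an illustration of the HTML output format, the sketch below renders gathered values as a table. It is written in Python rather than Perl, and the column names and sample values are made up for the example.

```python
import html

def to_html_table(header, rows):
    """Render rows as a plain HTML table, escaping the text of every cell."""
    def tr(cells, tag):
        cols = "".join(f"<{tag}>{html.escape(str(c))}</{tag}>" for c in cells)
        return "<tr>" + cols + "</tr>"
    lines = ["<table>", tr(header, "th")]
    lines += [tr(r, "td") for r in rows]
    lines.append("</table>")
    return "\n".join(lines)

# Hypothetical samples: (time in seconds, MIB entry, value).
table = to_html_table(
    ["time", "entry", "value"],
    [[0, "ifInOctets.1", 1200], [300, "ifInOctets.1", 7200]],
)
```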
A Case Study: CSGate2 • From 2/19/98 to 2/23/98, the router CSGate2 was probed every 5 minutes, recording various statistics on the data coming into and going out of the router. [Figure: Incoming bytes at CSGate2]
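Byte counts gathered this way are usually cumulative counters, so plotting traffic requires differencing successive samples. The sketch below assumes the statistic is a 32-bit SNMP counter (as `ifInOctets` is), which wraps at 2^32, and that at most one wrap occurs between samples. The sample values are invented for illustration, not taken from the CSGate2 data.

```python
COUNTER32_MOD = 2 ** 32  # SNMP Counter32 values wrap at 2^32

def byte_rates(samples):
    """Turn (timestamp_seconds, counter) samples into bytes/second per interval.

    Assumes at most one counter wrap between consecutive samples.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = (c1 - c0) % COUNTER32_MOD   # modular difference absorbs one wrap
        rates.append(delta / (t1 - t0))
    return rates

# Made-up samples at 5-minute (300 s) spacing; the last one wraps past 2^32.
samples = [(0, 4294960000), (300, 4294966000), (600, 22704)]
rates = byte_rates(samples)   # [20.0, 80.0]
```

On a fast link a 32-bit counter can wrap more than once in 5 minutes, in which case this simple difference undercounts.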
Display goals • We want to display multiple views • Views should be dynamic • Should allow expansion and contraction • Rapid creation of user interface • Reusability of GUI components
Solution: Script Java • Component-based system • Reusable manageable components • Can build large manageable applications • Sharing over the web • Record and playback
Architecture • Use JavaScript/Visual Basic as the scripting language • Use Java to write components • Create an adapter hierarchy for the current AWT components
Script Java [Diagram labels] • Objects: HTML pages, Java structures, intelligence, protection by namespace • Data Model: linearized data structures (java, perl, javascript) • Communication Abstraction: multicast channels
Advantages • Allows us to glue components using a scripting language, allowing rapid prototyping and development • New components can be easily integrated • For large applications, a lot of the complexity and chaos can be taken out of scripting
Advantages (cont.) • JavaScript can be streamed from the server, allowing for presentations and sharing • Dynamic HTML • layers are windows • these windows render HTML
Storage goals • We need to store topology and monitoring results somewhere • Database: too structured and too much overhead • File system: not enough semantics • Idea: treat URL as a file system link and HTML tags as associated semantics
WebFS • HTML tags allow arbitrary semantic abstractions • Manipulate these abstractions to present a virtualized file system • grep -headings *.html • sed '/<annot tag=foo>/jdbc("tags.db", "foo")/'
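To make the `grep -headings` idea concrete, here is a small Python sketch that treats heading tags as the "semantics" of a page and extracts their text. It only illustrates the concept using the standard `html.parser` module; it is not the WebFS implementation, and the sample page is invented.

```python
from html.parser import HTMLParser

class HeadingGrep(HTMLParser):
    """Collect the text of <h1>..<h6> elements."""
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._buf = None   # text pieces of the heading being read, or None

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._buf is not None:
            self.headings.append("".join(self._buf).strip())
            self._buf = None

def grep_headings(html_text):
    """Return the heading texts of one HTML document, in document order."""
    parser = HeadingGrep()
    parser.feed(html_text)
    parser.close()
    return parser.headings

page = ("<html><body><h1>Topology</h1><p>body text</p>"
        "<h2>CSGate2 <i>stats</i></h2></body></html>")
headings = grep_headings(page)   # ["Topology", "CSGate2 stats"]
```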
The magic bullet: simulation • Realistic simulation where networking subsystem interacts with other parts of kernel • Fast simulation for large networks ( > 1000 hosts) • Hide the simulated network behind an abstraction: applications use the same API as real system calls
[Diagram labels: a simulated machine (gated, Telnetd, ping; msg; Kernel wrapper; Kernel core) shown alongside a FreeBSD kernel (User Space: gated, Telnetd, ping; traps; Sockets; Network Stack)]
Simulated machine • Task-based approach • a trap sends a message to the kernel • an upper call is a message from the kernel • All components of a simulated machine live in the same process [Diagram labels: machine with gated, Telnetd, ping; msg; Kernel wrapper; Kernel core; Simulated link]
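The task-based approach can be sketched as a message queue inside one process: a trap enqueues a message for the kernel, and a call up from the kernel enqueues a message for an application. The class and its toy "echo" kernel behaviour are hypothetical, written in Python only to show the message flow; the real simulator is built on FreeBSD kernel code.

```python
from collections import deque

class SimulatedMachine:
    """Single-process, single-threaded machine: apps and kernel exchange messages."""

    def __init__(self):
        self.queue = deque()   # pending messages, handled one at a time
        self.log = []          # (source, destination, what) for each delivery

    def trap(self, app, call, arg):
        """An application 'system call' becomes a message to the kernel."""
        self.queue.append(("kernel", app, call, arg))

    def upcall(self, app, event, arg):
        """A kernel notification becomes a message to an application."""
        self.queue.append((app, "kernel", event, arg))

    def run(self):
        """Deliver messages until the queue is empty."""
        while self.queue:
            dest, src, what, arg = self.queue.popleft()
            self.log.append((src, dest, what))
            if dest == "kernel" and what == "send":
                # Toy kernel behaviour (hypothetical): echo data back to sender.
                self.upcall(src, "recv", arg)

m = SimulatedMachine()
m.trap("ping", "send", b"echo-request")
m.run()
```

Because every message is handled in turn by a single loop, no locking is needed around the machine's state.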
More on simulated machine • Capture network-related system calls, with automatic file descriptor re-mapping • Virtual file system root • Single-threaded kernel, therefore no need for locking
Simulated network [Diagram labels: machine with gated, Telnetd, ping; msg; Kernel core]
Integrating with real network • Use U-Net to interact with external device • Router has the illusion of being in a physical network • Test equipment before actual deployment [Diagram labels: U-Net, Physical Router]
Tradeoffs • Balance between realism and speed • Using FreeBSD as basis for realistic simulation • Using session level simulation to speed up • Ease of porting applications
Open issues • Fault identification • Bayesian networks? • Ensemble of experts? • Other AI approaches? • How to do session-level simulation? • Configuring real systems • IP9000