New Directions in Enterprise Network Management. Aditya Akella University of Wisconsin, Madison MSR Networking Summit June 2006. Enterprise Network Management. Very broad topic… Tuning performance and availability of network-attached services Traffic sniffing for trouble-shooting
New Directions in Enterprise Network Management. Aditya Akella, University of Wisconsin, Madison. MSR Networking Summit, June 2006.
Enterprise Network Management • Very broad topic… • Tuning performance and availability of network-attached services • Traffic sniffing for trouble-shooting • Monitoring utilization • Mapping network topology and resources, etc. • Several tools (both commercial and free) • Tailored to enterprises of different sizes, requirements
Outline • Enterprises desire specific management functionalities that current tools fundamentally cannot provide • Three examples • Inability arises from how enterprises are designed and operated today (IP-based) • Decentralization and no control over routing • Thoughts on enterprise network design principles • … Simplified management is a side-effect
So What’s Missing? • Cumbersome or impossible to support • What-If analysis • Effective trouble-shooting • Fine-grained resource management • Some tools may provide one of these • No tool provides all of them
1. What-If Analysis • What will happen if I change X in my network? • Policy/control-plane level • Reason about connectivity before installing changes • Example changes: new link/network upgrade; new policies for sales; alternate configuration • Example questions: Is the new config stable? Will the bottleneck disappear? Will the upgrade violate policy? • Obstacle: decentralized config specification • Complex config/policy split across several devices/mechanisms • Firewalls, proxies, NATs, router ACLs, VLANs, port filtering • … and across different network layers • Hard to reason about cross-layer, cross-device interaction
2. Trouble-Shooting • What is the current “status” of my network? • Who is talking to whom, and how? Resource consumption? • Avoid overload; control-plane trouble-shooting • Information at arbitrary granularities • Users, machines, groups… • Ability to go back in time • Unexpected patterns of communication; protocol usage • Example queries: How many connections from sales? Who is using the access link? How many connections from guests? Finance group protocol usage last week?
2. Trouble-Shooting • Today… • SNMP for tracking resource consumption: coarse-grained • Monitoring key resources: application-specific; not network-wide • Inference: relies on heuristics, error-prone • Not fine-grained enough • Cause: distributed decision on whether to allow flows • Distributed and/or local to services and devices • By default, all-to-all communication is allowed • If something is undesirable, add local restrictions • Use the appropriate mechanism (ACLs, port filters, firewalls, etc.) • Poll to figure out what’s going on, or infer • Hard to archive control-plane events
3. Resource Management • Route around overloaded/failed switches and links • Connection latency • Availability • Control levels of resource consumption • Prioritize applications or users • Restrict bandwidth consumption of “sales” • Middle-boxes and proxies • Placed at network choke points • Ideally, deploy at diverse locations • Route different classes of flows via different middle-boxes • Figure labels: Sales: virus-1 + image-filter + compression; Guests: restrict b/w X; Products: virus-2 + compression
3. Resource Management • Limited or no support in enterprises today • SNMP-based/manual tuning, OSPF, load-balancing using DNS • Cause: lack of tight control over routing • Forwarding tables with hop-by-hop, destination-IP-based routing are inflexible • Very little info used for routing • Adding info to forwarding tables brings complexity and slow look-up • Aggregation leaves no control over flows or groups of flows • Need tighter, app flow-level control • Forwarding tables fundamentally insufficient
Desiderata • Centralization: • Of config specification (who can access what and how) • Of enterprise-wide decision-making (should flow X be allowed) • What-if analysis of connectivity becomes trivial • (Offline) analysis of a central database of policies • Troubleshooting and forensics are simple • Current set or complete log of accepted connection requests or active flows • Example policies: A → B using HTTP; C → D using AIM via proxy; A → D using AIM via filter; … Should A → D be allowed? • Figure: hosts A, B, C, D
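The idea of answering "should this flow be allowed?" and what-if questions from one central policy database can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rule` and `PolicyDB` names, the default-deny behaviour, and the `what_if` helper are hypothetical and not from the talk; the three rules mirror the example policies on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One centrally stored policy entry (hypothetical schema)."""
    src: str
    dst: str
    app: str
    via: Optional[str] = None  # middlebox the flow must traverse, if any


class PolicyDB:
    """Central database of policies; anything not listed is denied (assumption)."""

    def __init__(self, rules):
        self.rules = set(rules)

    def allowed(self, src, dst, app=None):
        """Return the rules that permit src -> dst, optionally for one application."""
        matches = (
            r for r in self.rules
            if r.src == src and r.dst == dst and (app is None or r.app == app)
        )
        return sorted(matches, key=lambda r: (r.app, r.via or ""))

    def what_if(self, add=(), remove=()):
        """Return a new database with the change applied; the original is untouched."""
        return PolicyDB((self.rules - set(remove)) | set(add))


# The three example policies from the slide.
db = PolicyDB([
    Rule("A", "B", "HTTP"),
    Rule("C", "D", "AIM", via="proxy"),
    Rule("A", "D", "AIM", via="filter"),
])

# Offline what-if analysis: evaluate a candidate change before installing it.
candidate = db.what_if(add=[Rule("A", "D", "HTTP")])
```

Because the whole policy lives in one place, checking connectivity is a lookup, and a proposed change can be evaluated on a copy (`candidate`) without touching the running network.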
Desiderata • Tight control over routing: • Centrally pre-ordain the path of each flow • No more designing around choke-points • Easy to integrate an arbitrary number/type of middle-boxes • Fine-grained resource control • Also aids trouble-shooting and what-if analysis • Example: route A → D (HTTP) through s1 → p1 → s3 → s2; route A → D (AIM) through s1 → p1 → p2 → s2 • Figure: hosts A, B, C, D
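Centrally pre-ordaining a path that passes through chosen middle-boxes can be sketched as a shortest-path search stitched together across waypoints. The topology below is an assumption made up for illustration (the slide's figure is not recoverable), chosen so that the two routes named on the slide exist; `shortest_path` and `route` are hypothetical helper names.

```python
from collections import deque

# Hypothetical topology: hosts A and D, switches s1..s3, middle-boxes p1 and p2.
TOPOLOGY = {
    "A":  ["s1"],
    "s1": ["A", "p1"],
    "p1": ["s1", "s3", "p2"],
    "s3": ["p1", "s2"],
    "p2": ["p1", "s2"],
    "s2": ["s3", "p2", "D"],
    "D":  ["s2"],
}


def shortest_path(graph, src, dst):
    """Breadth-first search; ties are broken by adjacency-list order."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # dst unreachable


def route(graph, src, dst, waypoints=()):
    """Concatenate shortest legs src -> waypoint(s) -> dst, in the given order."""
    stops = [src, *waypoints, dst]
    path = [src]
    for a, b in zip(stops, stops[1:]):
        leg = shortest_path(graph, a, b)
        if leg is None:
            return None
        path.extend(leg[1:])  # drop the leg's first node; it is already in path
    return path


http_path = route(TOPOLOGY, "A", "D", waypoints=["p1"])
aim_path = route(TOPOLOGY, "A", "D", waypoints=["p1", "p2"])
print(http_path)  # ['A', 's1', 'p1', 's3', 's2', 'D']
print(aim_path)   # ['A', 's1', 'p1', 'p2', 's2', 'D']
```

A real controller would also weigh load and failures when picking legs; this sketch only shows that per-flow waypoints are easy to express once a single entity sees the whole topology.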
An Architectural View • Take all configuration and decision-making out of switches, routers • Put all eggs in one basket • Central entity tells switches how to forward packets • Wire a circuit for each new flow… • … or hand out a source route • Switches have no forwarding table • Dumb forwarding elements • Under the direct control of the central controller (via control channels)
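The "hand out a source route" option can be illustrated with a toy model in which switches keep no forwarding table and simply follow the port list carried in each packet. Everything here is a hypothetical sketch: the `Switch`, `Host`, and `Controller` classes, the port names, and the dictionary packet format are assumptions, not an actual protocol from the talk.

```python
class Host:
    """End host that records delivered payloads."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def receive(self, packet):
        self.inbox.append(packet["payload"])
        return self.name


class Switch:
    """Dumb forwarding element: no forwarding table, only wired ports."""

    def __init__(self, name):
        self.name = name
        self.ports = {}  # port name -> neighbouring Switch or Host

    def receive(self, packet):
        # Pop the next output port from the packet's source route and forward.
        next_port = packet["route"].pop(0)
        return self.ports[next_port].receive(packet)


class Controller:
    """Central entity that decides the route for each admitted flow."""

    def __init__(self, routes):
        self.routes = routes  # (src, dst) -> list of output ports, one per switch

    def source_route(self, src, dst):
        return list(self.routes[(src, dst)])  # copy, since switches consume it


# Hypothetical two-switch network: A -- s1 -- s2 -- D.
s1, s2, host_d = Switch("s1"), Switch("s2"), Host("D")
s1.ports["to_s2"] = s2
s2.ports["to_D"] = host_d

controller = Controller({("A", "D"): ["to_s2", "to_D"]})
packet = {"payload": "hello", "route": controller.source_route("A", "D")}
delivered_to = s1.receive(packet)
```

The only per-switch state is the physical wiring; all path decisions live in the controller, which is the property the slide is after.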
Effect on Management • Control-plane related management or monitoring is easy to do • How many connections per user? • Do upgrades violate policy? • Who accessed service X? • Route different flows differently • React to failures/overload • “Data-plane management” is harder to do • Bandwidth-related • E.g. restrictions on users; monitoring utilization
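If the central controller logs every accepted connection request, control-plane questions such as "how many connections per user?" or "who accessed service X?" reduce to simple queries over that log. The sketch below assumes a hypothetical record layout (`FlowRecord`) and made-up sample entries purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One accepted connection request (hypothetical schema)."""
    time: int      # admission time, in arbitrary units
    user: str
    group: str
    service: str
    protocol: str


# Made-up sample log, for illustration only.
LOG = [
    FlowRecord(10, "alice", "sales", "web", "HTTP"),
    FlowRecord(20, "bob", "sales", "mail", "SMTP"),
    FlowRecord(30, "carol", "finance", "web", "HTTP"),
    FlowRecord(40, "alice", "sales", "web", "HTTP"),
]


def connections_per(log, field, since=0, until=float("inf")):
    """Count connections grouped by any record field, within a time window."""
    return Counter(getattr(r, field) for r in log if since <= r.time <= until)


def who_accessed(log, service, since=0, until=float("inf")):
    """Return the users that connected to a service within a time window."""
    return sorted({
        r.user for r in log
        if r.service == service and since <= r.time <= until
    })


per_group = connections_per(LOG, "group")
recent_users = connections_per(LOG, "user", since=25)
web_users = who_accessed(LOG, "web")
```

The `field` argument gives the "arbitrary granularity" (user, group, protocol), and the time window gives the ability to go back in time, provided the log is archived.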
Data Plane Management • Switches need to be slightly less dumb • Minimal management support to enable data plane management? • Counters per-flow? • Per-flow queuing? • Up-to-date link utilization? • Push vs pull based?
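To make the per-flow counter and push-versus-pull questions concrete, here is a minimal sketch of a switch that counts bytes per flow, lets a controller poll the counters (pull), and optionally reports a flow once when it crosses a threshold (push). The class name, the threshold mechanism, and the flow identifiers are assumptions for illustration, not a design proposed in the talk.

```python
from collections import defaultdict


class CountingSwitch:
    """Slightly-less-dumb switch: keeps one byte counter per flow."""

    def __init__(self, push_threshold=None, on_push=None):
        self.bytes = defaultdict(int)
        self.push_threshold = push_threshold  # bytes; None disables push
        self.on_push = on_push                # callback(flow_id, byte_count)
        self._reported = set()                # flows already pushed once

    def forward(self, flow_id, size):
        """Account for a forwarded packet; push a report at most once per flow."""
        self.bytes[flow_id] += size
        if (
            self.push_threshold is not None
            and self.on_push is not None
            and flow_id not in self._reported
            and self.bytes[flow_id] >= self.push_threshold
        ):
            self._reported.add(flow_id)
            self.on_push(flow_id, self.bytes[flow_id])

    def read_counters(self):
        """Pull: the controller polls and receives a snapshot of all counters."""
        return dict(self.bytes)


alerts = []
switch = CountingSwitch(
    push_threshold=1000,
    on_push=lambda flow, count: alerts.append((flow, count)),
)
switch.forward("A->D/HTTP", 600)
switch.forward("A->D/HTTP", 600)   # crosses the threshold, triggers one push
switch.forward("C->D/AIM", 100)
switch.forward("A->D/HTTP", 600)   # already reported, no second push
snapshot = switch.read_counters()
```

Pull keeps the switch simpler but the controller's view is only as fresh as its polling interval; push adds a little logic to the switch in exchange for timely reports.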