New Directions in Enterprise Network Management

Aditya Akella, University of Wisconsin, Madison. MSR Networking Summit, June 2006.

Presentation Transcript


  1. New Directions in Enterprise Network Management • Aditya Akella, University of Wisconsin, Madison • MSR Networking Summit, June 2006

  2. Enterprise Network Management • Very broad topic… • Tuning performance and availability of network-attached services • Traffic sniffing for trouble-shooting • Monitoring utilization • Mapping network topology and resources, etc. • Several tools (both commercial and free) • Tailored to enterprises of different sizes, requirements

  3. Outline • Enterprises desire specific management functionalities that current tools fundamentally cannot provide • Three examples • Inability arises from how enterprises are designed and operated today (IP-based) • Decentralization and no control over routing • Thoughts on enterprise network design principles • … Simplified management is a side-effect

  4. So What’s Missing? • Cumbersome or impossible to support • What-If analysis • Effective trouble-shooting • Fine-grained resource management • Some tools may provide one of these • No tool provides all of them

  5. 1. What-If Analysis • Example questions: New config stable? Will bottleneck disappear? Will upgrade violate policy? • Decentralized config specification • Complex config/policy split across several devices/mechanisms • Firewalls, proxies, NATs, router ACLs, VLANs, port filtering • … And across different network layers • Hard to reason about cross-layer, cross-device interaction • What will happen if I change X in my network? • Policy/control plane level • Reason about connectivity before installing changes • Example changes: new link/network upgrade; new policies for sales; alternate configuration

  6. 2. Trouble-Shooting • What is the current “status” of my network? • Who is talking to who and how? Resource consumption? • Avoid overload; control plane trouble-shooting • Information at arbitrary granularities • Users, machines, groups… • Ability to go back in time • Unexpected patterns of communication; protocol usage • Example questions: How many conns from sales? Who is using access link? How many connections from guests? Finance group protocol usage last week?

  7. 2. Trouble-Shooting • Today… • SNMP for tracking resource consumption → Coarse-grained • Monitoring key resources → Application specific; not network-wide • Inference → Rely on heuristics, error prone • Not fine-grained enough • Distributed decision on whether to allow flows • Distributed and/or local to services and devices • By default all-to-all is allowed • Something is undesirable → local restrictions • Use appropriate mechanism (ACLs, port filters, firewalls, etc.) • Poll to figure out what’s going on, or infer • Hard to archive control-plane events

  8. 3. Resource Management • Route around overloaded/failed switches and links • Connection latency • Availability • Control levels of resource consumption • Prioritize applications or users • Restrict bandwidth consumption of “sales” • Middle-boxes and proxies • Placed at network choke points • Ideally, deploy at diverse locations • Route different classes of flows via different middleboxes • Example policies: Sales → virus-1 + image-filter + compression; Guests restrict b/w X; Products → virus-2 + compression

  9. 3. Resource Management • Limited or no support in enterprises today • SNMP-based/manual tuning, OSPF, load-balancing using DNS → Lack of tight control over routing • Forwarding tables, hop-by-hop dst IP based routing inflexible • Very little info used for routing • Additional info into forwarding tables → complexity; slow look-up • Aggregation → No control over flows or groups of flows • Need tighter, app flow-level control • Forwarding tables fundamentally insufficient

  10. Desiderata • Centralization: • Of config specification (who can access what and how) • Of enterprise-wide decision-making (should flow X be allowed) • What-if analysis or connectivity becomes trivial • (Offline) Analysis of a central database of policies • Troubleshooting and forensics is simple • Current set or complete log of accepted conn requests or active flows • Example policies: A → B using HTTP; C → D using AIM via proxy; A → D using AIM via filter; … Should A → D be allowed? • [Figure: hosts A, B, C, D]
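
The centralization idea on this slide can be sketched as a single policy database that answers "should flow X be allowed" and supports offline what-if analysis. This is only an illustrative sketch: the `Rule`, `PolicyDB`, and `connectivity_diff` names, the exact-match lookup, and the default-deny behavior are assumptions, not something the talk specifies. The three rules mirror the slide's example policies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One centrally stored policy entry (hypothetical schema)."""
    src: str
    dst: str
    proto: str
    via: Optional[str] = None  # middlebox the flow must traverse, if any


class PolicyDB:
    """Central database of policies; default deny is an assumption."""

    def __init__(self, rules):
        self.rules = list(rules)

    def lookup(self, src, dst, proto):
        # Return the first rule matching the flow exactly, or None.
        for rule in self.rules:
            if (rule.src, rule.dst, rule.proto) == (src, dst, proto):
                return rule
        return None

    def allowed(self, src, dst, proto):
        return self.lookup(src, dst, proto) is not None

    def what_if(self, added=(), removed=()):
        # Build a candidate database without touching the live one.
        kept = [r for r in self.rules if r not in removed]
        return PolicyDB(kept + list(added))


def connectivity_diff(old, new, flows):
    """Flows whose allow/deny verdict differs between two databases."""
    return [f for f in flows if old.allowed(*f) != new.allowed(*f)]


live = PolicyDB([
    Rule("A", "B", "HTTP"),
    Rule("C", "D", "AIM", via="proxy"),
    Rule("A", "D", "AIM", via="filter"),
])
proposed = live.what_if(removed=[Rule("A", "D", "AIM", via="filter")])
probes = [("A", "B", "HTTP"), ("A", "D", "AIM")]
print(live.allowed("A", "D", "AIM"))             # True
print(connectivity_diff(live, proposed, probes))  # [('A', 'D', 'AIM')]
```

Because the proposed change is evaluated on a copy, connectivity can be reasoned about before anything is installed in the network.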

  11. Desiderata • Tight control over routing: • Centrally pre-ordain the path of each flow • No more designing around choke-points • Easy to integrate arbitrary number/type of middle-boxes • Fine-grained resource control • Also aids trouble-shooting and what-if analysis • Example routes: Route A → D (HTTP) through s1 → p1 → s3 → s2; Route A → D (AIM) through s1 → p1 → p2 → s2 • [Figure: hosts A, B, C, D]
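
One way to picture "centrally pre-ordain the path of each flow" is a controller that stitches together shortest-path legs between the middleboxes a flow must visit. The sketch below is an assumption-laden illustration: the topology in `links` is invented so that it reproduces the slide's two example routes, and `build_graph`, `shortest_path`, and `route_via` are hypothetical helper names. The talk does not say how paths would actually be computed.

```python
from collections import deque


def build_graph(links):
    """Undirected adjacency lists from a list of (u, v) links."""
    graph = {}
    for u, v in links:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)
    return graph


def shortest_path(graph, src, dst):
    """Breadth-first search; returns a node list or None if unreachable."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def route_via(graph, src, dst, waypoints=()):
    """Concatenate shortest legs src -> waypoints... -> dst."""
    stops = [src, *waypoints, dst]
    path = [src]
    for a, b in zip(stops, stops[1:]):
        leg = shortest_path(graph, a, b)
        if leg is None:
            return None
        path.extend(leg[1:])  # skip the leg's first node, already in path
    return path


# Hypothetical topology: switches s1..s3, middleboxes p1 and p2.
links = [("A", "s1"), ("s1", "p1"), ("p1", "s3"), ("p1", "p2"),
         ("s3", "s2"), ("p2", "s2"), ("s2", "D")]
graph = build_graph(links)
print(route_via(graph, "A", "D", ["p1", "s3"]))  # HTTP route
print(route_via(graph, "A", "D", ["p1", "p2"]))  # AIM route
```

Note that stitching legs together can revisit a node when waypoints are far apart; a real controller would need to decide whether that is acceptable.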

  12. An Architectural View • Take all configuration and decision-making out of switches, routers • Put all eggs in one basket • Central entity tells switches how to forward packets • Wire a circuit for each new flow… • … Or hand out a source route → Switches have no forwarding table • Dumb forwarding elements • Under the direct control of the central controller (via control channels)
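
The "hand out a source route" option can be illustrated with a controller that turns a chosen path into a list of output ports, and switches that simply consume the next port from the packet header. Everything here is a hypothetical sketch: the `wiring` map, the port numbers, and the `Controller`, `switch_forward`, and `deliver` names are assumptions made for illustration.

```python
class Controller:
    """Central entity that knows the wiring and issues source routes."""

    def __init__(self, wiring):
        # wiring maps (node, out_port) -> neighbor reached via that port.
        self.wiring = wiring
        self.port_to = {(node, nbr): port
                        for (node, port), nbr in wiring.items()}

    def source_route(self, path):
        # One output port per hop along the pre-ordained path.
        return [self.port_to[(u, v)] for u, v in zip(path, path[1:])]


def switch_forward(header):
    """Table-less forwarding: use the first port, pass on the rest."""
    return header[0], header[1:]


def deliver(wiring, start, route):
    """Simulate a packet carrying `route` from `start`; return nodes visited."""
    node, hops = start, [start]
    while route:
        port, route = switch_forward(route)
        node = wiring[(node, port)]
        hops.append(node)
    return hops


# Hypothetical wiring for a four-hop path A -> s1 -> p1 -> s2 -> D.
wiring = {
    ("A", 0): "s1",
    ("s1", 0): "A", ("s1", 1): "p1",
    ("p1", 0): "s1", ("p1", 1): "s2",
    ("s2", 0): "p1", ("s2", 1): "D",
}
controller = Controller(wiring)
route = controller.source_route(["A", "s1", "p1", "s2", "D"])
print(route)                        # [0, 1, 1, 1]
print(deliver(wiring, "A", route))  # ['A', 's1', 'p1', 's2', 'D']
```

The forwarding elements hold no per-destination state in this sketch; all knowledge of the topology lives in the controller.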

  13. Effect on Management • Control-plane related management or monitoring easy to do • How many connections per user? • Upgrades violate policy? • Who accessed service X? • Route different flows differently • React to failures/overload • “Data-plane management” harder to do • Bandwidth-related • E.g. restrictions on users; monitor utilization
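
If every accepted connection request passes through a central controller, the control-plane questions on this slide reduce to queries over its log. The following is a minimal sketch under that assumption; the `FlowRecord` fields, the sample entries, and the function names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One accepted connection request (hypothetical log schema)."""
    day: int
    user: str
    group: str
    service: str
    proto: str


def connections_per_user(log):
    """How many connections per user?"""
    return Counter(r.user for r in log)


def who_accessed(log, service):
    """Who accessed service X?"""
    return sorted({r.user for r in log if r.service == service})


def protocol_usage(log, group, first_day, last_day):
    """Protocol usage of one group over an inclusive range of days."""
    return Counter(r.proto for r in log
                   if r.group == group and first_day <= r.day <= last_day)


# Made-up sample log, for illustration only.
log = [
    FlowRecord(1, "alice", "finance", "payroll", "HTTP"),
    FlowRecord(2, "alice", "finance", "chat", "AIM"),
    FlowRecord(3, "bob", "sales", "payroll", "HTTP"),
    FlowRecord(9, "carol", "finance", "payroll", "HTTP"),
]
print(connections_per_user(log)["alice"])   # 2
print(who_accessed(log, "payroll"))         # ['alice', 'bob', 'carol']
print(protocol_usage(log, "finance", 1, 7))
```

Keeping the full log rather than only the active flows is what makes the "go back in time" style of question answerable.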

  14. Data Plane Management • Switches need to be slightly less dumb • Minimal management support to enable data plane management? • Counters per-flow? • Per-flow queuing? • Up-to-date link utilization? • Push vs pull based?
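
The slide leaves these as open questions. As one possible reading of "counters per-flow" combined with a pull-based design, the sketch below has a switch keep cumulative byte counters and a poller turn successive readings into per-flow rates and a link utilization figure. The `Switch`, `Poller`, and `link_utilization` names, the flow identifiers, and the polling interval are assumptions, not a design from the talk.

```python
from collections import defaultdict


class Switch:
    """Forwarding element with minimal management support: byte counters."""

    def __init__(self):
        self.flow_bytes = defaultdict(int)

    def forward(self, flow_id, nbytes):
        # Count bytes per flow as packets are forwarded.
        self.flow_bytes[flow_id] += nbytes

    def read_counters(self):
        # Snapshot of cumulative counters, as a pull-based query would see.
        return dict(self.flow_bytes)


class Poller:
    """Controller-side component that pulls counters periodically."""

    def __init__(self, switch):
        self.switch = switch
        self.last = {}

    def poll(self, interval_s):
        # Convert the growth since the previous poll into bits per second.
        now = self.switch.read_counters()
        rates = {flow: (count - self.last.get(flow, 0)) * 8 / interval_s
                 for flow, count in now.items()}
        self.last = now
        return rates


def link_utilization(rates, capacity_bps):
    """Fraction of link capacity used by the measured flows."""
    return sum(rates.values()) / capacity_bps


switch = Switch()
poller = Poller(switch)
switch.forward("sales-1", 1000)
first = poller.poll(interval_s=1.0)    # {'sales-1': 8000.0}
switch.forward("sales-1", 500)
second = poller.poll(interval_s=2.0)   # {'sales-1': 2000.0}
print(first, second, link_utilization(first, 16000))
```

A push-based variant would have the switch report counters on its own schedule; the trade-off between the two is exactly the question the slide raises.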
