MPLS Network Tuning: How to Squeeze Most From Your Network. Swarup Acharya ( acharya@bell-labs.com ) Technical Manager, Multiservice Networking Research Group Optical Networking Division, Bell Laboratories.
Multi-Protocol Label Switching • MPLS has emerged as the foundation of next-generation data networks • Provides the underpinnings of the converged network vision • “Connection-oriented” veneer on an inherently connectionless IP network • IP/MPLS market continues to grow • Year-over-year traffic growth: 119% (2002), 118% (2003), 84% (2004) [Infonetics Research] • Drivers for data network convergence • CapEx savings, OpEx savings, competition, convergence • This talk will focus on network management challenges in delivering the lower CapEx, OpEx promise of MPLS
MPLS Traffic Engineering • The cornerstone of MPLS is Traffic Engineering (TE) • Given a new demand, how best to route it in the network? • No longer limited by IGP (OSPF, IS-IS) restrictions of: • Destination-based forwarding • Simple “additive” min-cost routing, ignorant of bandwidth • MPLS TE enables: • Constraint-based routing • Bandwidth, delay, etc. • Explicit routing (aka source routing) • Label Switched Paths (LSPs) • With appropriate resource reservations via RSVP
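The constraint-based routing idea on this slide can be sketched as a two-step procedure: prune every link that lacks the requested bandwidth, then run an ordinary min-cost search on what remains. This is a minimal illustration, not any router's implementation; the `cspf` function, the link-table layout and the example topology are assumptions made for this sketch.

```python
import heapq

def cspf(links, src, dst, bandwidth):
    """Min-cost path from src to dst over links with enough spare bandwidth.

    links: dict mapping directed (u, v) -> (cost, available_bandwidth).
    Returns a list of nodes, or None if no feasible path exists.
    """
    # Step 1: apply the constraint by dropping links that cannot carry the demand.
    adj = {}
    for (u, v), (cost, avail) in links.items():
        if avail >= bandwidth:
            adj.setdefault(u, []).append((v, cost))

    # Step 2: plain Dijkstra on the pruned graph.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical topology: a cheap, thin path via B and a costlier, fat path via D.
links = {
    ("A", "B"): (1, 5), ("B", "C"): (1, 5),
    ("A", "D"): (2, 10), ("D", "C"): (2, 10),
}
small = cspf(links, "A", "C", 4)    # fits on the cheap path
large = cspf(links, "A", "C", 8)    # forced onto the longer path
too_big = cspf(links, "A", "C", 20) # no link can carry it
```

A bandwidth-blind IGP would pick the path via B for every demand; the pruning step is what lets the larger demand find the longer path.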
TE Benefits • [Figure: path from L1-L3 and path from L2-L3 across nodes A-E, shown under IGP shortest-path routing and under traffic-engineered paths] • IGP Shortest Path Routing: “longer” paths under-utilized, “shorter” paths bottlenecked! • Traffic Engineered Paths: TE enables a more load-balanced network
TE Necessary, But is it Sufficient? • General belief that Traffic Engineering enables efficient MPLS networks • “Traffic engineering reduces the overall cost of operations by more efficient use of bandwidth resources…” [Cisco documentation] • Will TE alone provide all the efficiency you can get? • MPLS + TE moves the network from packet-switched to a “virtual” circuit-switched IP network • What about bandwidth inefficiencies inherent in circuit-switched environments such as SONET, SDH, ATM…? • Will MPLS suffer the same fate?
SONET/SDH Bandwidth Fragmentation • Primary cause of poor SONET/SDH efficiency • Circuit churn leaves behind “stranded” capacity • New requests often denied even if sufficient capacity exists • Often at only ~30-40% network utilization • SONET NMSs have had “constraint-based” routing for a while now • Bandwidth accounted for in routing • Arcane SONET/SDH constraints • Increasingly, defragmentation tools used to “recover” capacity • Often in conjunction with newer Optical Control Planes
SONET/SDH Link Fragmentation • By rearranging traffic carried on different time slots, capacity on the link can be freed and reused • [Figure: a 2.5 Gbps link on a fiber, drawn as four 622 Mbps slots carrying circuits a-f. Before: new request for a 622 Mbps demand rejected! After: request accommodated!]
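The before/after picture can be reproduced with a small slot model. This is a sketch under stated assumptions: the 2.5 Gbps link is modeled as 16 slots of roughly 155 Mbps each, a 622 Mbps demand is assumed to need one aligned group of four entirely free slots, and the circuit layout is made up for illustration. The names `GROUP`, `free_groups` and `defragment` are hypothetical.

```python
GROUP = 4  # assumed number of small slots per 622 Mbps group

def free_groups(slots):
    """Count aligned groups of GROUP slots that are entirely free."""
    return sum(
        all(s is None for s in slots[i:i + GROUP])
        for i in range(0, len(slots), GROUP)
    )

def defragment(slots):
    """Pack occupied slots to the left, keeping each circuit's slots together."""
    used = [s for s in slots if s is not None]
    return used + [None] * (len(slots) - len(used))

# Before: 9 of 16 slots are free, yet every 622 Mbps group holds some traffic.
before = ["a", None, None, None,
          "b", "b", None, None,
          "c", None, None, None,
          "d", "d", "d", None]
after = defragment(before)

groups_before = free_groups(before)  # no whole group is free: request rejected
groups_after = free_groups(after)    # whole groups freed: request accommodated
```

The link carries the same circuits in both layouts; only their slot positions differ.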
SONET/SDH Ring Fragmentation • [Figure: time slots versus nodes on a 7-node STS-48 BLSR ring] • New STS-3 demand from Node 2-4 rejected! • Alternate routing of same demands on ring: accepted here • Is there an MPLS analogue?
MPLS Network Example • [Figure: network of nodes A-F carrying LSPs L1, L2 and L3] • Bandwidth of all Links • L1, L2, L3: LSPs of /2 b/w • New Service Request: Setup LSP L4 between A and C, Bandwidth • Router A rejects request: No available route meeting b/w requirements • No available bandwidth? Or, is the bandwidth fragmented?
Avoiding Bandwidth Fragmentation • [Figure: same network of nodes A-F, with L1-L3 re-routed and L4 provisioned] • Alternate routing for L1-L3 enables L4 to be met • TE alone does not guarantee high efficiency • L1-L3 were “optimally” routed in both cases, yet fragmentation occurs • Key: Can the demands be routed without adding new hardware? • Lower Fragmentation → Higher Utilization → Lower CapEx
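A toy version of this effect, with a made-up four-node topology rather than the one in the figure: two equal-length paths join A and C, every link has capacity 2, and two existing LSPs each use 1 unit. Both layouts below route the existing LSPs on shortest paths, but only one leaves room for a new 2-unit LSP. All names and numbers are assumptions for illustration.

```python
from collections import Counter

def link_loads(routes, bandwidth):
    """Sum reserved bandwidth per undirected link for the given LSP routes."""
    loads = Counter()
    for lsp, path in routes.items():
        for u, v in zip(path, path[1:]):
            loads[frozenset((u, v))] += bandwidth[lsp]
    return loads

def path_fits(path, demand, loads, capacity):
    """True if every link on the path has at least `demand` spare bandwidth."""
    return all(capacity - loads[frozenset((u, v))] >= demand
               for u, v in zip(path, path[1:]))

CAPACITY = 2                                   # same for every link (hypothetical units)
bandwidth = {"L1": 1, "L2": 1}                 # existing LSPs
candidate_paths = [["A", "B", "C"], ["A", "D", "C"]]

# Two layouts for the same demands; both use 2-hop (shortest) paths.
spread = {"L1": ["A", "B", "C"], "L2": ["A", "D", "C"]}
packed = {"L1": ["A", "B", "C"], "L2": ["A", "B", "C"]}

def admits(routes, demand):
    """Can a new A-to-C LSP of size `demand` be routed on top of `routes`?"""
    loads = link_loads(routes, bandwidth)
    return any(path_fits(p, demand, loads, CAPACITY) for p in candidate_paths)

spread_ok = admits(spread, 2)  # each path has only 1 unit spare: rejected
packed_ok = admits(packed, 2)  # path via D is untouched: accepted
```

Total spare capacity is identical in both layouts; what differs is whether it sits on a single path.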
Network Tuning • Fragmentation is a problem for MPLS networks too • In general, Network Management systems need to provide Network Engineering tools to address fragmentation • “Traffic Engineering puts traffic where the bandwidth is, Network Engineering creates bandwidth where the traffic will be.” • Relatively little focus on engineering tools • Network engineering requires “global” knowledge, TE is a per-LSP optimization • However, network engineering operations cannot be service-disruptive • Network Tuning: Hitless, Disruption-free Network Engineering • Network tuning is NOT network planning • For live, operational networks, not greenfield designs
Re-cap • Did the Net-Heads check with the Bell-Heads as to what NM quagmire they were getting into? • MPLS Traffic Engineering: • Helps avoid congestion prevalent in native IP networks • Limited ability to mitigate circuit-switched inefficiencies • On its own, cannot extract the most juice from the network • Critical need for Network Tuning tools • No reason why MPLS cannot become equally inefficient down the road.. Key Tradeoff: Grow infrastructure to meet traffic demand [CapEx Hit], OR, Tune network for improved efficiency [OpEx Hit]?
Network Tuning Scenarios • The router rejected a new LSP setup request due to insufficient bandwidth. Can I engineer the current LSP routes to “free” the necessary bandwidth for the new one? • I need to bring down a router for an OS upgrade. Can I: • Re-route the LSPs on the router to avoid bringing them down? • Upgrade the router OS and then revert them back to their original routes? • The traffic on a node/set of links has exceeded the recommended load threshold. Can I move traffic from the “hot zone” to minimize damage in case of failure? • Can I have automated, scalable Network Tuning tools?
Bell Labs Möbius Tool: MPLS Provisioning, Tuning System • Support for: Cisco 72*/75*/120*, JNPR M*, T*, ERX • Link color indicates load (Red: high, Green: acceptable load)
Network Tuning: Fail-Setup Optimization • [Screenshot: Optimization Done!]
Impacted Circuits (Old + New Routes) • 1: Re-route • 2: Re-route • 3: Provision
Network Tuning: Load Balance (“Hot Zone” Clearing) • Clear traffic in this “Hot Zone” below specific threshold • [Screenshots: Network View (Before), Network View (After)]
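One simple way to decide which LSPs to move off an overloaded link is a largest-first greedy pass, sketched below. This is not the Möbius algorithm, which the slides do not describe; the function name, the threshold semantics (a fraction of link capacity) and the sample loads are assumptions. Finding new routes for the selected LSPs is the harder part and is not covered here.

```python
def clear_hot_link(lsps_on_link, capacity, threshold):
    """Pick LSPs to move until the link load is at or below threshold * capacity.

    lsps_on_link: dict mapping LSP name -> reserved bandwidth on this link.
    Returns (names_to_move, remaining_load). Greedy, largest LSP first.
    """
    load = sum(lsps_on_link.values())
    target = threshold * capacity
    moved = []
    for name, bw in sorted(lsps_on_link.items(), key=lambda kv: -kv[1]):
        if load <= target:
            break
        moved.append(name)
        load -= bw
    return moved, load

# Hypothetical link of capacity 100 carrying 90 units, to be brought under 60%.
moved, remaining = clear_hot_link({"L1": 40, "L2": 30, "L3": 20},
                                  capacity=100, threshold=0.6)
```

Moving the largest LSP first keeps the number of re-routes small, at the cost of needing a larger free path elsewhere for each move.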
Network Engineering Requirements • Step-by-step Migration Sequence • Operating on live traffic: providing a design for an “optimized” layout does not help • How do I get from current LSP layout to the new layout? • Original LSP QoS constraints have to be maintained on new route • [Figure: three views of the network of nodes A-F showing the sequence] 1. Re-route L2 • 2. Re-route L3 • 3. Provision L4
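The ordering problem can be illustrated with a greedy search: repeatedly pick any pending move whose new route fits right now, apply it, and continue. This is only a sketch; it is not the Bell Labs algorithm, it can get stuck where a smarter search would succeed, and the topology, capacities and the `migration_order` name are assumptions. Links shared by an LSP's old and new route are assumed to need no extra room during the move.

```python
from collections import Counter

def edges(path):
    """Undirected links traversed by a node path."""
    return {frozenset(e) for e in zip(path, path[1:])}

def migration_order(current, target, bandwidth, capacity):
    """Greedy search for an order of moves in which every step fits.

    Returns a list of LSP names, or None if the greedy pass gets stuck.
    """
    routes = dict(current)
    loads = Counter()
    for lsp, path in routes.items():
        for e in edges(path):
            loads[e] += bandwidth[lsp]

    pending = [l for l in target if routes.get(l) != target[l]]
    order = []
    while pending:
        for lsp in pending:
            old = edges(routes[lsp]) if lsp in routes else set()
            new = edges(target[lsp])
            # The new route must fit while the old one is still up.
            if all(capacity - loads[e] >= bandwidth[lsp] for e in new - old):
                for e in old - new:
                    loads[e] -= bandwidth[lsp]
                for e in new - old:
                    loads[e] += bandwidth[lsp]
                routes[lsp] = target[lsp]
                order.append(lsp)
                pending.remove(lsp)
                break
        else:
            return None  # no pending move is feasible right now
    return order

# Hypothetical layout: every link has capacity 2.
bandwidth = {"L1": 1, "L2": 1, "L3": 1, "L4": 2}
current = {"L1": ["A", "D", "C"], "L2": ["A", "D", "C"], "L3": ["A", "B", "C"]}
target = {"L4": ["A", "B", "C"], "L3": ["A", "D", "C"],
          "L2": ["A", "E", "C"], "L1": ["A", "D", "C"]}
steps = migration_order(current, target, bandwidth, capacity=2)
```

In this example L4 cannot be provisioned until L3 leaves A-B-C, and L3 cannot move until L2 vacates A-D-C, so the only workable order is L2, L3, L4.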
Algorithmic Challenge • Migration is a very challenging theoretical problem (“NP-hard”) • Problem of scale: exponential search space • Requires innovative algorithms for large networks • How to scale to a network with 10s of routers, 100s of links and 1000s of LSPs? • At Bell Labs, we have patent-pending algorithms to provide the migration sequence in “real-time” • Milliseconds to seconds for reasonably sized networks • [Chart: Hot Zone Load Balancing] • Hot Zone size = 10% of network • Goal: Clear all LSPs in Hot Zone • Chart shows contribution of LSPs outside the hot zone in meeting the goal as network loads increase
Requirements II: Hitless, Disruption-free Engineering • “Hitless” is not zero packet loss • In reality, everything is only near-hitless • Requirement: It should be perceived as hitless from the application’s perspective • SONET/SDH has a 50 ms grace during protection switching • Even with a 100 ms hit, can do >500 re-routes before a four-nines (99.99%) reliability SLA is broken • MPLS provides infrastructure for hitless re-routing
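The re-route budget follows from simple arithmetic on the allowed downtime. The slide does not say over what period the SLA is measured, so the yearly and 30-day windows below are assumptions, as is the `max_hits` helper. Integer arithmetic is used to avoid rounding surprises.

```python
def max_hits(window_seconds, hit_ms=100, unavailability_ppm=100):
    """Number of hits of hit_ms each that fit in the downtime budget.

    99.99% availability leaves 0.01% downtime, i.e. 100 parts per million.
    """
    budget_ms = window_seconds * 1000 * unavailability_ppm // 1_000_000
    return budget_ms // hit_ms

per_year = max_hits(365 * 24 * 3600)   # about 3153.6 s of allowed downtime
per_month = max_hits(30 * 24 * 3600)   # about 259.2 s of allowed downtime
```

Under either assumed window the budget is well above 500 re-routes of 100 ms each, provided nothing else consumes the downtime allowance.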
MPLS make-before-break • Mechanism to achieve hitless LSP re-routing • Signal new route, switch traffic and delete old route • Signaling protocols use intelligence when reserving bandwidth on new route • E.g., Shared Explicit (SE) style flag in RSVP to avoid double bandwidth reservation on common links • Possibility of packets going out-of-order • If new route is significantly shorter than original route • Even if it occurs, very short duration and bounded
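The effect of Shared Explicit style on bandwidth accounting can be shown with a few lines. This sketch assumes the old and new routes belong to the same LSP and request the same bandwidth; the `extra_reservation` function and the example paths are hypothetical.

```python
def extra_reservation(old_path, new_path, bandwidth, shared=True):
    """Extra bandwidth needed on each link of new_path while both routes coexist.

    With shared=True (SE style), links common to both routes reuse the
    existing reservation instead of reserving a second time.
    """
    old_links = set(zip(old_path, old_path[1:]))
    extra = {}
    for link in zip(new_path, new_path[1:]):
        if shared and link in old_links:
            extra[link] = 0          # one reservation shared by old and new
        else:
            extra[link] = bandwidth  # fresh reservation on this link
    return extra

old = ["A", "B", "C", "D"]
new = ["A", "B", "E", "D"]
with_se = extra_reservation(old, new, 10)
without_se = extra_reservation(old, new, 10, shared=False)
```

Without sharing, link A-B would briefly have to hold two reservations for the same LSP, and the re-route could fail on a link the LSP already occupies.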
Inducing make-before-break • Typically, make-before-break is an internal function • Used by router to re-route LSPs (e.g., if ‘re-optimize’ flag on) • For network engineering, make-before-break needs to be triggered from the outside. Also: • New path is given (as opposed to router calculating it) • If the new path is bad, traffic should not switch and bring the LSP down! • Routers need to provide mechanism to trigger make-before-break • Backdoors available, varies by vendor OS • Insert the new path with a higher priority (lower path-option number) and force a re-optimization • Replace the Explicit Route Object (ERO) with a new route
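The safety rule on this slide, that traffic moves only once the supplied path is up, can be expressed as a small guard. The three callbacks are hypothetical stand-ins for whatever vendor-specific mechanism signals, switches and deletes a path; nothing here corresponds to a real router API.

```python
def reroute(lsp, new_path, signal_path, switch_traffic, tear_down):
    """Externally triggered make-before-break with a caller-supplied path.

    Returns True if traffic was moved, False if the LSP was left untouched.
    """
    old_path = lsp["path"]
    if not signal_path(new_path):
        return False              # new path did not come up: keep the old route
    switch_traffic(lsp["name"], new_path)
    lsp["path"] = new_path
    tear_down(old_path)           # delete the old route only after the switch
    return True

# Usage with stub callbacks that just record what happened.
log = []
lsp = {"name": "L2", "path": ["A", "B", "C"]}
moved = reroute(
    lsp, ["A", "D", "C"],
    signal_path=lambda path: True,
    switch_traffic=lambda name, path: log.append(("switch", name)),
    tear_down=lambda path: log.append(("delete", tuple(path))),
)
refused = reroute(
    lsp, ["A", "E", "C"],
    signal_path=lambda path: False,   # simulate a bad path
    switch_traffic=lambda name, path: log.append(("switch", name)),
    tear_down=lambda path: log.append(("delete", tuple(path))),
)
```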
Requirements III: Preserve Network Stability • Should not bring down customer traffic • Key: Traffic should switch only if new path is up! • LSP re-routes do not change the IP topology • Only the path is changed, not the connectivity • Will cause OSPF updates • Bandwidth on links will change • Should be attempted during “lean” traffic periods
CapEx Savings from Tuning • More traffic for same infrastructure • Simulation Model • 40-node, 100-link network (to start) • LSP traffic randomly generated and routed on shortest available path • On failure to set up LSP due to lack of bandwidth: • Case I: No Engineering • A new link added between source and destination • Case II: With Engineering • Attempt to re-route LSPs to create “free” space for new one • On failure, new link added between source and destination • [Chart: Network Growth with Increased Traffic, comparing No Engineering and With Engineering]
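The two cases can be mimicked with a much-simplified model. The sketch below is not the simulator behind the chart: it routes on fewest hops, its "engineering" step tries moving only one existing LSP, it ignores whether the move itself is hitless, and it assumes each demand is no larger than one link's capacity. The topology and demands are hand-picked, where the slide uses random traffic.

```python
from collections import deque

def hops(path):
    """Undirected links traversed by a node path."""
    return [frozenset(h) for h in zip(path, path[1:])]

def place(load, path, bw):
    """Add bw (negative to release) to every link on the path."""
    for h in hops(path):
        load[h] += bw

def find_path(cap, load, src, dst, bw):
    """Fewest-hop path over links with at least bw spare, or None."""
    prev, queue = {src: None}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for link in cap:
            if u in link and cap[link] - load[link] >= bw:
                (v,) = link - {u}
                if v not in prev:
                    prev[v] = u
                    queue.append(v)
    return None

def reroute_one(cap, load, lsps, src, dst, bw):
    """Try moving a single existing LSP so that the new demand fits."""
    for lsp in lsps:
        old_path, old_bw = lsp
        place(load, old_path, -old_bw)                 # lift the LSP out
        new_path = find_path(cap, load, src, dst, bw)
        if new_path is not None:
            place(load, new_path, bw)                  # tentatively place the demand
            alt = find_path(cap, load, old_path[0], old_path[-1], old_bw)
            place(load, new_path, -bw)
            if alt is not None:
                place(load, alt, old_bw)               # commit the re-route
                lsp[0] = alt
                return new_path
        place(load, old_path, old_bw)                  # undo
    return None

def simulate(links, demands, engineer, capacity=2):
    """Return how many links had to be added to carry all demands."""
    cap = {frozenset(l): capacity for l in links}
    load = {l: 0 for l in cap}
    lsps, added = [], 0
    for src, dst, bw in demands:
        path = find_path(cap, load, src, dst, bw)
        if path is None and engineer:
            path = reroute_one(cap, load, lsps, src, dst, bw)
        if path is None:                               # grow: add a direct link
            direct = frozenset((src, dst))
            cap[direct] = cap.get(direct, 0) + capacity
            load.setdefault(direct, 0)
            added += 1
            path = [src, dst]
        place(load, path, bw)
        lsps.append([path, bw])
    return added

links = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("B", "D")]
demands = [("A", "B", 1), ("A", "D", 1), ("A", "C", 2)]
links_added_plain = simulate(links, demands, engineer=False)
links_added_tuned = simulate(links, demands, engineer=True)
```

In this hand-built case the third demand is blocked on both routes out of A; moving the first LSP onto A-D-B frees the path A-B-C, so the tuned run needs no new link.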
CapEx Savings - II • Alternative Simulation Model • 10-node, 15-link network (to start) • LSP traffic randomly generated and routed on shortest available path • On failure to set up LSP due to lack of bandwidth: • Case I: No Engineering • A new link added on the hop that is out of capacity • Case II: With Engineering • Attempt to re-route LSPs to create “free” space for new one • On failure, new link added on the hop that is out of capacity
Network Management Systems • Traditional IP networking view is that router has all the smarts • Engineering “intelligence” requires network-wide view • Has to reside in a single “entity” • Too complicated to co-ordinate engineering operations across different routers in a distributed fashion • Good choice: MPLS Network Management System (NMS) • NMS needs to provide support for: • Algorithmic and graphical tools for what-if scenarios • Seamless, point-n-click support to execute optimizations • Support for proactive engineering • No longer limited to “reactive” operational mindset • Requisite NMS tools can lower operations overhead from hours/days to minutes!
Lucent’s Navis Provisioning Manager: Component-based, Multi-vendor Layer 2/3 NMS • [Architecture diagram: Navis Provisioning Manager with Order Gateway, Work Manager and Inventory Gateway; Service Modules (ATM, xDSL, Frame Relay, VPN, MPLS, ATMoMPLS, Ethernet, IPSEC, Routing & MPLS TE); Möbius; L2 & L3 Network Adaptors; Activation Flow and Configuration Flow down to Core IP/MPLS EMS / NE, FR/ATM EMS / NE and Access EMS / NE] • Fast Network Adaptors Creation • Large Multi-vendor Testing Labs • Committed Corporate Partnership with Hardware Vendors • Component-Based, Multi-Vendor Activation • Rich set of Services over L2 & L3 • Software Development Kits for Network Element Support
Conclusions • MPLS gaining momentum in service provider networks • Industry focus on Traffic Engineering • Does not suffice to get the most from the network • Need to also consider network engineering and hitless tuning • MPLS provides the necessary infrastructure for network tuning • Necessary requirement to avoid inefficiencies in circuit-switched environments • Effective network tuning improves network utilization, lowers CapEx!