
Evolution of IP/OL Performance Management

Robert Doverspike, Jennifer Yates, Jorge Pastor, Martin Birk – AT&T Labs Research





  1. Evolution of IP/OL Performance Management Robert Doverspike, Jennifer Yates, Jorge Pastor, Martin Birk – AT&T Labs Research

  2. Outline
     • Key Takeaways
     • Performance Management – must consider interlayer (focus IP)
     • Evolution story for IP/OL
     • Architecture for Long Haul Networks
     • Example problems
     • Next chapter in evolution
     • Let’s get it right this time

  3. Key Takeaways
     • Optical PM goals should focus on use in IP layer
     • Links in the IP layer form connections in the optical layer
     • Virtually all high-rate connections are IP links (between either routers or Ethernet switches)
     • Perfect optical layer detection is a lofty goal, but will fall short if architected in isolation
       • E.g., need to have strong inter-layer coordination
     • Why do we stress this for OL?
       • Inter-layer fault management has many flaws in practice, even after 15 years of perfecting SONET
       • Need adequate mechanisms across layers to handle scenarios when things go wrong or confusion reigns

  4. Evolution Story for Long Haul Networks
     [Figure: layered network diagram. Labels: IP Layer; 1st Generation; SONET Ring Layer; Pt-Pt WDM Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal]

  5. Evolution Story for Long Haul Networks
     [Figure: layered network diagram. Labels: IP Layer; 1st Generation; SONET Ring Layer; DCS Layer; Pt-Pt WDM Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal]

  6. Evolution Story for Long Haul Networks
     [Figure: layered network diagram. Labels: IP Layer; 2nd Generation; 1st Generation; SONET Ring Layer; ULH/WXC Layer; Pt-Pt WDM Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal]

  7. Evolution Story for Long Haul Networks
     [Figure: layered network diagram. Labels: IP Layer; 3rd Generation; 2nd Generation; 1st Generation; SONET Ring Layer; ULH/WXC Layer; Pt-Pt WDM Layer. Legend: Router, ADM, DCS/Intelligent Optical Switch, Degree-n OADM/WXC, WDM Terminal]

  8. Some of the problems we’ve encountered: ring switching impact on higher layers
     • Upper layer has timer – waits for lower layer to restore – Done!
     • Wrong! – not a simple decision on when to take IP link up and down
     [Figure: failure (X) in the SONET Ring Layer beneath the IP Layer]
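The naive timer model that this slide calls insufficient can be sketched as follows. This is only an illustration under stated assumptions: the function name `link_down_times`, the interval representation, and the 100 ms hold-down value are hypothetical and do not come from the presentation.

```python
def link_down_times(alarm_intervals, holddown_ms):
    """Naive hold-down model: declare the IP link down only if a
    lower-layer alarm persists for at least `holddown_ms`.

    alarm_intervals: list of (start_ms, end_ms) alarm periods (hypothetical input).
    Returns the times (ms) at which the link would be declared down.
    """
    return [
        start + holddown_ms
        for start, end in alarm_intervals
        if end - start >= holddown_ms
    ]


# Hypothetical events: a 50 ms ring protection switch, then a long outage.
protection_switch = (0, 50)
unrestored_failure = (1000, 5000)
decisions = link_down_times([protection_switch, unrestored_failure], holddown_ms=100)
```

With these assumed values the short protection switch is ignored and only the long outage takes the link down. The slide's point is that real behavior is less clean than this model, because alarms can be ambiguous and can arrive at different times.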

  9. Some of the problems we’ve encountered: 1st Generation of IP/OL
     • SONET alarms received by upper layer are ambiguous and conflicting
       • Many error types in SONET: BER, AIS, P-LOS, clear during protection switching
       • Arrive at different times
     • Software bugs – routers don’t behave as expected
     • Inconsistencies in calculation of BER and IP layer holddown timer
     [Figure: alarm timeline. IP Layer: PPP ACK, OSPF ping; path alarms AIS-P, BER-P, CLR (sequence shown twice); failure (X) in the SONET Ring Layer with LOS-L and AIS-P]

  10. What is the source of these problems?
     • No standards for inter-layer interaction
       • Physical layer: testers need requirement scripts to test – no standard, no script
       • No industry requirement often means no testing, no sharing of behavior
     • Historically, L1 and L3 labs have been separate
       • Some members of the telecom community have integrated their labs
     • Software bugs – routers don’t behave as expected
     • No specification of common parameters and metrics
       • Example: router measures BER in fixed timer intervals
       • Router takes link down upon TCA (threshold exceeded)
       • Protection switching results in a VERY short but high burst of errors → crosses router threshold even though the burst is << 10 ms!
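The BER example on this slide can be worked through numerically. The sketch below is illustrative only: the line rate (10 Gb/s), the 1 s measurement interval, the 5 ms burst, the burst error ratio of 1e-3, and the 1e-6 threshold are all assumed values, and the function names are hypothetical rather than taken from any router implementation.

```python
def windowed_ber(line_rate_bps, window_s, burst_s, burst_ber, background_ber=0.0):
    """Average BER over one fixed measurement window that contains a
    single error burst of duration `burst_s` (all inputs are assumptions)."""
    bits_in_window = line_rate_bps * window_s
    burst_bits = line_rate_bps * burst_s
    errored_bits = (burst_bits * burst_ber
                    + (bits_in_window - burst_bits) * background_ber)
    return errored_bits / bits_in_window


def crosses_tca(measured_ber, threshold):
    """True if the windowed BER exceeds the threshold-crossing-alert level."""
    return measured_ber > threshold


# Assumed scenario: a 5 ms protection-switch burst inside a 1 s window.
ber = windowed_ber(line_rate_bps=10e9, window_s=1.0, burst_s=0.005, burst_ber=1e-3)
alert = crosses_tca(ber, threshold=1e-6)
```

Under these assumptions the windowed BER comes out at roughly 5e-6, above the assumed 1e-6 threshold, so the router would raise a TCA and take the link down even though the link was impaired for only 5 ms.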

  11. What is the source of these problems?
     • Shared Risk Groups still not well modeled
       • Single failure at lower layer results in multiple, scattered link failures at higher layer – network unprepared to restore
       • Example: portions of dual IP access links routed over same ring – both links taken down due to previous confusion
     [Figure: IP (logical) layer and physical (fibre) layer, each showing LA, NY, SF, Washington, with a common SRLG marked]
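The dual-access-link example above amounts to checking whether two IP links share any lower-layer risk group. A minimal sketch, assuming a hypothetical mapping from IP link names to sets of SRLG identifiers (the link and SRLG names below are invented for illustration):

```python
from itertools import combinations


def shared_risks(link_to_srlgs):
    """Return {(link_a, link_b): shared SRLGs} for every pair of IP links
    whose lower-layer routes have at least one risk group in common."""
    conflicts = {}
    for a, b in combinations(sorted(link_to_srlgs), 2):
        common = link_to_srlgs[a] & link_to_srlgs[b]
        if common:
            conflicts[(a, b)] = common
    return conflicts


# Hypothetical data: two "diverse" access links that both ride ring-7.
links = {
    "access-1": {"ring-7", "fibre-A"},
    "access-2": {"ring-7", "fibre-B"},
    "backbone-1": {"fibre-C"},
}
conflicts = shared_risks(links)
```

Here the two access links are flagged because both depend on the same ring, which is the situation the slide describes. The check is only as good as the SRLG data fed into it, which is why the slide stresses that these groups are still not well modeled.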

  12. Identity Crisis: 2nd Generation of IP/OL
     • High-speed (2.5/10/40 Gb/s) IP links skip the SONET ring/cross-connect layer and instead route over long sequences of point-to-point WDM systems, interconnected by O/E/O optical transponders
     • Should the optical path pretend to be transparent (like dark fiber)?
       • E.g., no AIS/BER TCA – re-transmit all LOS/LOP to Path Termination Points
       • How does one isolate faults for repair (OTs, amplifiers, WDM terminals)?
     • OR: should it display characteristics of the SONET Section/Line/Path fault management architecture?
       • However: then confusion similar to 1st Gen IP/SONET ring occurs
     • Practicality dictated that the industry implement a combination of both approaches

  13. ULH/WXC: The Final Solution? 3rd Generation of IP/OL
     • Use long-term model of all-optical path to IP layer link
     • Two major issues to resolve
       • What if intermediate OEO exists in the near term?
       • How do we model restoration at OL and how does IP layer interact?
     • IP layer responsible for deciding link health
       • Fast link layer detection (LOS)
     • GigE and other signals are going to be transported over the 3rd Gen OL
       • Is the set of PM alarms and TCAs we inherit from SONET appropriate for 3rd Gen OL?
       • If not, which ones should we keep, or which new ones should we define?

  14. Some potential approaches
     • OL only passes simple alarms to the upper layer, e.g., LOS
     • Upper layer makes its own assessments of BER, packet coding violations, ACK failures
     • OL still does fault isolation for OEO components or amplifiers (e.g., to WDM Term or EMS or Fault OSS), but this is NOT passed up to IP layer
     • Where/how do we do this? Standards, fora, vendor interactions, carrier requirements?
     • Repair process:
       • Need to correlate what fails in the OL with what fails in the IP layer (1-to-many map)
       • Network discovery of IP/OL relationships (e.g., SRLG) across layers would facilitate the fault correlation process
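The 1-to-many correlation in the repair process can be sketched as an inverted index from optical-layer components to the IP links that ride on them. This is a minimal illustration, assuming the IP/OL relationships have already been discovered; the link names, component names, and function names are hypothetical.

```python
from collections import defaultdict


def build_component_index(link_paths):
    """Invert {ip_link: [ol_components]} into {ol_component: {ip_links}},
    i.e. the 1-to-many map from one OL failure to the IP links it affects."""
    index = defaultdict(set)
    for link, components in link_paths.items():
        for component in components:
            index[component].add(link)
    return dict(index)


def common_components(link_paths, down_links):
    """OL components traversed by every failed IP link: candidate fault locations."""
    component_sets = [set(link_paths[link]) for link in down_links]
    return set.intersection(*component_sets) if component_sets else set()


# Hypothetical discovered topology: OTs and amplifiers under three IP links.
link_paths = {
    "NY-LA": ["ot-1", "amp-3", "ot-2"],
    "NY-SF": ["ot-4", "amp-3", "ot-5"],
    "LA-SF": ["ot-6", "amp-9"],
}
index = build_component_index(link_paths)
suspects = common_components(link_paths, ["NY-LA", "NY-SF"])
```

In this assumed topology, two IP links failing together point to the one amplifier they share. Intersection is the simplest possible correlation rule and only handles a single common fault; it is shown here because it matches the single-failure, many-links case the slide describes.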
