Maximizing IP/MPLS Network Availability - Methods and Calculations

Understand the importance of network availability in IP/MPLS networks, measurement methods, design examples, and the impact on services. Learn about the Port Method, Bandwidth Method, and the calculations involved in ensuring network reliability.

swansonj

Presentation Transcript


  1. Availability of IP/MPLS networks. Sanjay Kalra, October 2002

  2. Agenda • Introduction • How to measure Availability • Network Design example • One Router vs. Two Routers • Software Dependability • Summary

  3. Definition of Availability Availability is the probability that an item will be able to perform its designed functions at the stated performance level, within the stated conditions and in the stated environment when called upon to do so. Availability = Reliability / (Reliability + Recovery)

  4. Quantification

  5. PSTN: The Yardstick? • PSTN end-to-end availability: 99.94% • Individual elements have an availability of 99.99% • One cut-off call in 8,000 calls (3 min for an average call); five ineffective calls in every 10,000 calls. [Figure: end-to-end PSTN path with per-segment unavailability labels of 0.005%, 0.005%, 0.01%, 0.01%, 0.005%, 0.005% and 0.02% across the NI, AN, LE (facility entrance) and LD segments.] NI: Network Interface, LE: Local Exchange, LD: Long Distance, AN: Access Network. Source: http://www.packetcable.com/downloads/specs/pkt-tr-voipar-v01-001128.pdf
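The 99.94% end-to-end figure on this slide is consistent with simply adding up the unavailability labels in the diagram. A minimal Python sketch, assuming the seven labels (four of 0.005%, two of 0.01%, one of 0.02%) are independent segments in series. The variable names are illustrative, and which label belongs to which segment is not recoverable from the transcript:

```python
# Per-segment unavailability labels (percent) as they appear on the slide.
# Assumption: each label is one segment of a single series path.
segment_unavailability_pct = [0.005, 0.005, 0.01, 0.01, 0.005, 0.005, 0.02]

# For small values, the unavailability of a series path is roughly the sum
# of the unavailabilities of its segments.
total_unavailability_pct = sum(segment_unavailability_pct)
end_to_end_availability_pct = 100 - total_unavailability_pct

print(f"{end_to_end_availability_pct:.2f}%")  # 99.94%
```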

  6. Effect of Services on Network Availability • In an IP network, availability is a function of the service being offered. Source: www.t1.org

  7. IP Network Expectations [Table not recoverable from the transcript; surviving cell values: H, L, L.] Legend: L: Low, M: Medium, H: High

  8. Agenda • Introduction • How to measure Network Availability • Network Design example • One Router vs. Two Routers • Software Dependability • Summary

  9. The Port Method • Based on the port count in the network • Does not take into account the bandwidth of ports, e.g. OC-192 and 64k are both counted as one port • Good for dedicated access service because ports are tied to customers. Availability (%) = [(total number of ports x sample period) - (number of impacted ports x outage duration)] / (total number of ports x sample period) x 100

  10. The Port Method Example • Network with 10,000 active access ports • An access router with 100 access ports fails for 30 minutes. • Total available port-hours = 10,000 * 24 = 240,000 • Total down port-hours = 100 * 0.5 = 50 • Availability for a single day = ((240,000 - 50) / 240,000) * 100 = 99.979166%
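The port-method arithmetic can be checked with a few lines of Python. This is a sketch only; the function name and parameter names are illustrative and do not come from the presentation:

```python
def port_availability(total_ports, sample_hours, impacted_ports, outage_hours):
    """Percentage availability by the Port Method."""
    total_port_hours = total_ports * sample_hours
    down_port_hours = impacted_ports * outage_hours
    return (total_port_hours - down_port_hours) / total_port_hours * 100

# Slide example: 10,000 ports over one day, 100 ports down for 30 minutes.
daily_availability = port_availability(10_000, 24, 100, 0.5)
print(f"{daily_availability:.4f}%")  # 99.9792%
```

Note that the subtraction has to happen before the division, i.e. (240,000 - 50) / 240,000.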

  11. The Bandwidth Method • Based on the amount of bandwidth available in the network • Takes into account the bandwidth of ports • Good for core routers. Availability (%) = [(total amount of BW x sample period) - (amount of BW impacted x outage duration)] / (total amount of BW in network x sample period) x 100

  12. The Bandwidth Method Example • Total capacity of network: 100 Gigabits/sec • An access router with 1 Gigabit/sec of BW fails for 30 minutes. • Total BW available in network for a day = 100 * 24 = 2,400 Gigabit/sec-hours • Total BW lost in outage = 1 * 0.5 = 0.5 Gigabit/sec-hours • Availability for a single day = ((2,400 - 0.5) / 2,400) * 100 = 99.979166%
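A minimal Python sketch of the bandwidth method follows. The function name and the list-of-outages argument are assumptions made for illustration; the slide only covers a single outage:

```python
def bandwidth_availability(total_gbps, sample_hours, outages):
    """Percentage availability by the Bandwidth Method.

    outages is a list of (impacted_gbps, outage_hours) pairs.
    """
    capacity = total_gbps * sample_hours               # Gbit/s-hours offered
    lost = sum(gbps * hours for gbps, hours in outages)  # Gbit/s-hours lost
    return (capacity - lost) / capacity * 100

# Slide example: 100 Gbit/s network, 1 Gbit/s lost for 30 minutes in one day.
daily_availability = bandwidth_availability(100, 24, [(1, 0.5)])
print(f"{daily_availability:.4f}%")  # 99.9792%
```

The result matches the port-method example only because both examples happen to lose the same fraction of the network (1% of it for half an hour).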

  13. Defects Per Million • Used in PSTN networks, defined as the number of blocked calls per one million calls, averaged over one year. DPM = [(number of impacted customers x outage duration) / (total number of customers x sample period)] x 10^6

  14. Defects Per Million Example • Network with 10,000 active access ports • An access router with 100 access ports fails for 30 minutes. • Total available port-hours = 10,000 * 24 = 240,000 • Total down port-hours = 100 * 0.5 = 50 • Daily DPM = (50 / 240,000) * 1,000,000 ≈ 208
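DPM is the same unavailability ratio as in the port method, scaled to one million instead of one hundred. A short Python sketch, with illustrative function and variable names, shows the calculation and the conversion back to a percentage:

```python
def defects_per_million(total_units, sample_hours, impacted_units, outage_hours):
    """DPM: impacted unit-hours per million offered unit-hours."""
    return impacted_units * outage_hours / (total_units * sample_hours) * 1_000_000

# Slide example: 100 of 10,000 ports down for 30 minutes in one day.
daily_dpm = defects_per_million(10_000, 24, 100, 0.5)
print(round(daily_dpm))  # 208

# The same outage expressed as percentage availability.
availability_pct = 100 - daily_dpm / 10_000
print(f"{availability_pct:.4f}%")  # 99.9792%
```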

  15. Agenda • Introduction • How to measure Availability • Network Design example • One Router vs. Two Routers • Software Dependability • Summary

  16. Calculating Availability: Series (elements E1, E2, E3 in series) • Multiplicative method: As = E1 x E2 x E3 = .999999 x .999999 x .999991 = .9999890 • Additive method, summing unavailability (UA): UA = .000001 + .000001 + .000009 = .0000110 • Total availability of a system (As) is always less than that of the least available element. One weak link significantly weakens this chain!
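Both series methods can be compared in a short Python sketch. The function names are illustrative. The additive method is an approximation that ignores products of unavailabilities, which is why it agrees with the exact product only to the precision shown on the slide:

```python
import math

def series_exact(availabilities):
    """Multiplicative method: every element must be up."""
    return math.prod(availabilities)

def series_additive(availabilities):
    """Additive method: one minus the summed unavailabilities (approximate)."""
    return 1 - sum(1 - a for a in availabilities)

elements = [0.999999, 0.999999, 0.999991]
exact = series_exact(elements)
approx = series_additive(elements)
print(f"{exact:.7f}")   # 0.9999890
print(f"{approx:.7f}")  # 0.9999890
```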

  17. Calculating Availability: Parallel (elements E1, E2 in parallel) • For 1-out-of-2 redundancy • Additive rule: As = E1 + E2 - E1 x E2 = .999999 + .999999 - (.999999 x .999999) = .999999999999 • Multiplicative rule: As = 1 - [(1 - E1)(1 - E2)] • Not for parallel systems where both elements are required • Assumes that switchover time is zero
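The two parallel rules are algebraically identical, which a short Python sketch can confirm. The function name is illustrative, and the sketch carries the slide's assumptions: failures are independent, any one element is enough, and switchover time is zero:

```python
def parallel_availability(availabilities):
    """Multiplicative rule: the system is down only if every element is down."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1 - a
    return 1 - all_down

e1 = e2 = 0.999999
multiplicative = parallel_availability([e1, e2])
additive = e1 + e2 - e1 * e2  # additive rule for two elements

print(f"{multiplicative:.12f}")  # 0.999999999999
```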

  18. System Calculation: Series • Simple E-3 network with one E-3 trunk [Figure: Server, ATM, E-3 trunk, ATM; five numbered elements with availabilities of 99.98%, 99.99%, 99.992%, 99.992% and 99.95%, plus five labels of 99.9959%.] • Availability: 99.8835% • Yearly downtime = (1 - Availability) * 525,600 minutes/year
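The yearly-downtime formula on this slide translates directly into code. A small Python sketch, with an illustrative function name, applies it to the slide's 99.8835% figure and to a few common "nines" levels for comparison:

```python
MINUTES_PER_YEAR = 525_600  # 365 days * 24 hours * 60 minutes

def yearly_downtime_minutes(availability_pct):
    """Expected downtime per year for a percentage availability."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR

slide_downtime = yearly_downtime_minutes(99.8835)
print(f"{slide_downtime:.1f} minutes/year")  # 612.3 minutes/year

# Comparison points: three, four and five nines.
for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% -> {yearly_downtime_minutes(pct):.2f} minutes/year")
```

At 99.8835% the example network is down for a little over ten hours a year.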

  19. System Calculation: Parallel (1) [Figure: two parallel paths between the Data Centre and the customer CPE, built from Server, Core, Internet Gateway, Edge, ATM Hub, E-3, STM-1 and STM-16 elements; per-element availabilities range from 99.82% to 99.9932%.] • System 1 availability: 99.6341% • System 2 availability: 99.4311% • S1 & S2 network: 99.9979% • Availability, Data Centre to customer CPE: 99.9661%
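The "S1 & S2 network" value follows from the parallel rule applied to the two system availabilities. A minimal Python sketch, with an illustrative function name, assuming the two paths fail independently and switchover is instantaneous:

```python
def parallel_pct(*paths_pct):
    """Availability (percent) of independent parallel paths, any one sufficient."""
    unavailability = 1.0
    for pct in paths_pct:
        unavailability *= 1 - pct / 100
    return (1 - unavailability) * 100

# System 1 and System 2 availabilities from the slide.
s1_and_s2 = parallel_pct(99.6341, 99.4311)
print(f"{s1_and_s2:.4f}%")  # 99.9979%
```

The end-to-end figure (99.9661%) is lower than this because the parallel section is itself in series with elements that both paths share.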

  20. System Calculation: Parallel (2) [Figure: two-path design between the Data Center and the customer CPE, built from Server, Core, Internet Gateway, Edge, NxE-1, E-3, STM-1 and STM-16 elements; per-element availabilities range from 99.82% to 99.999%.] • System 1 availability: 99.6958% (was 99.6341%) • System 2 availability: 99.4828% (was 99.4311%) • Availability, Data Centre to customer CPE: 99.9974% (was 99.9661%) • From three 9's to four 9's!

  21. Agenda • Introduction • How to measure Availability • Network Design example • One Router vs. Two Routers • Software Dependability • Summary

  22. Router Redundancy • Typical network designs have two routers for: redundancy, capacity planning • Redundancy within routers: power supply, fans, routing engines, switching planes, forwarding plane • Do we still need two routers, or is one enough?

  23. One Router Versus Two Routers • One router with full internal redundancy (redundant control plane, forwarding plane, power supply, fan, line card): router availability 99.99979%; link availability = router availability = 99.99979% • Two routers with no redundancy at router level (99.99015% each), OC-48 LH links in the figure: link availability = parallel system availability = 99.999999% • HW cost of the two-router configuration is 110% of the one-router configuration
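The comparison on this slide can be reproduced from the two router availabilities. A rough Python sketch with illustrative names, assuming the two non-redundant routers fail independently and that switchover between them takes no time:

```python
SECONDS_PER_YEAR = 525_600 * 60

def parallel_pct(a_pct, b_pct):
    """Availability (percent) of two independent elements in parallel."""
    return (1 - (1 - a_pct / 100) * (1 - b_pct / 100)) * 100

def yearly_downtime_seconds(availability_pct):
    return (1 - availability_pct / 100) * SECONDS_PER_YEAR

single_router_pct = 99.99979                      # fully redundant single router
two_router_pct = parallel_pct(99.99015, 99.99015)  # two non-redundant routers

print(f"{two_router_pct:.6f}%")                                # 99.999999%
print(f"{yearly_downtime_seconds(single_router_pct):.1f} s")   # 66.2 s
print(f"{yearly_downtime_seconds(two_router_pct):.1f} s")      # 0.3 s
```

Under these assumptions the two-router design wins on paper by about a minute a year. Real switchover and convergence times are not zero, so the gap would be smaller in practice.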

  24. One Router Advantages • Cost savings • Lower OPEX • Faster convergence • For some PE routers, a single router might be the only option: service state is maintained on a per-flow basis for some network-based services (e.g. firewall, NAT); TDM links are usually connected to a single edge router; a lot of customers terminate on a single router

  25. One Router Disadvantages • Single point of failure • Configuration and upgrades have to be exact • Capacity management has to be exact • The main cost of a router is the line cards, not the chassis • What if there is a DoS attack against the router?

  26. One Router Disadvantages • Physical maintenance (e.g. a location change) is not possible without downtime • Still need protection against link failure • Physical separation to protect against natural disasters is not possible • Networks have always been designed with two routers!

  27. Agenda • Introduction • How to measure Availability • Network Design example • One Router vs. Two Routers • Software Dependability • Summary

  28. SW to HW Reliability Differences • Software reliability is not a function of manufacturing • Software does not degrade over time • Physical environmental changes have no effect • All software failures are the result of design/user errors

  29. SW to HW Reliability Differences • Software can only be repaired by redesign • MTTR is not measurable since code must be rewritten to fix a bug. • Software bugs can be highly contagious • The science of software correctness is still immature and is difficult to apply to software as complex and quickly changing as IP routing

  30. Agenda • Introduction • How to measure Availability • Network Design example • One Router vs. Two Routers • Software Dependability • Summary

  31. Summary • There is no standard way to measure IP availability • Availability in IP networks depends on the service being offered • The choice between one and two routers depends on requirements • A lot of development is happening in IP networks to improve availability, e.g. Graceful Restart, NSF, Fast Reroute
