Evaluate, measure, & report network reliability for commercial communications & internet. Analyze outage trial results & provide recommendations for improvement. Enhance outage reporting guidelines. Report on outage causes & best practices to ensure network service quality.
Focus Group 2: Network Reliability • PJ Aduskevicz, AT&T • Ross Callon, Juniper Networks • Wayne Hall, Comcast Cable Communications
Focus Group Mission Statement • Define reliability measurements (units) for commercial communications networks (i.e., wireline and wireless transport networks, including satellite and cable) and for the Internet by March 22, 2003. • Define reasonable, measurable customer-affecting outage reporting thresholds for commercial communications networks (i.e., wireline and wireless transport networks, including satellite and cable) and for the Internet by March 22, 2003. • Conduct voluntary outage reporting trial, collect data, analyze results, and report on the validity, usefulness, and timeliness of the process and information obtained, and make recommendations for improvement. • Based on trial results (including information on services affected by an outage), evaluate and report on the reliability of public communications network services in the United States. • Should the Commission initiate an inquiry or rulemaking with respect to any of the above-mentioned issues, the Focus Group will provide input to the NRIC, which may make formal recommendations as a part of such proceeding(s). • Evaluate, and report on, the reliability of public telecommunications network services in the United States.
Voluntary Outage Reporting Trial Results to 9/4/03 Note: Reporting Organizations are participating companies that have submitted either an outage report or a Positive Report during the specified month Note: Data collection for August in progress
Voluntary Outage Reporting Trial Results to 9/4/03 Note: Positive Reports are reports to the NCS/NCC that the participating organization did not experience an outage during the specified month Note: Failure Location for one outage report in May was undetermined Note: Data collection for August in progress
Voluntary Outage Reporting Trial Results to 9/4/03 Note: Data collection for August in progress
Voluntary Outage Reporting Trial Results to 9/4/03 • Causes of Outages Reported Included: • Fiber Optic Cable Damage, Procedural Errors, Hardware Failures, Software Design Errors, and Power Equipment Failures • Applicable Best Practices Included: • 6-5-0662 on routine testing of generator engines and power alarms, • 6-5-0536 on deployment of security and reliability related software updates, • 6-5-0710 on excavator protection of underground facilities, • 6-5-0588 on Awareness Training for field and management personnel, • 6-5-0697 on employing an "Ask Yourself" program to reinforce the responsibility every employee has to ensure flawless network service
Voluntary Outage Reporting Trial: Initial Observations • Participation in the outage trial improved as a result of direct outreach to technical contacts of the participating organizations. • Outage reporting process and guidelines have been significantly upgraded based on trial learnings. • Analysis of reports by multi-industry Subject Matter Experts improved understanding of the nature and scope of outages. • Root/direct causes of voluntary trial outages generally mirror those now used by NRSC. • Existing NRIC Best Practices have sufficed for analysis to date. • Useful data can be obtained from a voluntary outage trial in a multi-network environment. • Non-disclosure agreements, while difficult to negotiate, were essential to increase participation in the voluntary outage trial.
Voluntary Trial Process Improvements • Outage Reporting Guidelines • Re-notification to all participating enterprises' Technical Contacts • Positive Reporting • NCS/NCC Process • Resources for NCC to assist process implementation • Process Flow Enhancement and Clarification of Scrubbed Data • Clarification of Service Provider review of scrubbed data before it is passed from the NCC to the Focus Group, and clarification and understanding of the Data Elements to be passed • Confirms enterprise sensitive data is removed • Sufficiently protects “security” information • Model Outage Reports • Outage Report Withdrawal Process • Positive Communication with All Technical Contacts
Focus Group 2 Final Report Contents • Voluntary Outage Reporting Trial Guidelines • Voluntary Outage Reporting Trial Data Analysis Results • Comparison to CFR 47 §63.100 • Data Analysis Findings and Lessons Learned • Recommendations on Voluntary Outage Reporting Trial • Recommendations on CFR 47 §63.100 • Network Reliability Steering Committee (NRSC) Report on the Public Switched Telecommunications Network (PSTN)
Reliability Reporting – NRSC 2Q03 Report
Reliability Reporting: Network Reliability Steering Committee (NRSC) Analysis Reports
Reliability Reporting – NRSC 2Q03 Report: FCC Reportable Service Outages (by number of events) Total outages (18) were at the lowest level of any quarter, and lower than any four consecutive quarters since the start of the Baseline Period (94).
Reliability Reporting – NRSC 2Q03 Report: FCC Reportable Service Outages (by outage index) The aggregated outage index was at the lowest level (1084) of any four consecutive quarters since the start of the Baseline Period.
Reliability Reporting – NRSC 2Q03 Report
Analysis of the outages for 2Q03 indicates: • Total outages (18) were at the lowest level of any quarter and significantly lower than the Baseline Level. Total outages were lower than in any four consecutive quarters since the start of the Baseline Period (94). • Facility outages (7) were significantly lower than the Baseline Level. Facility outages were lower than in any four consecutive quarters since the start of the Baseline Period (36). • CO Power outages (8) were the lowest of any four consecutive quarters since 2Q96 to 1Q97. • The aggregated outage index was at the lowest level (1084) of any four consecutive quarters since the start of the Baseline Period. • Procedural Error as a root cause of outages (5) was at the lowest level of any quarter and significantly lower than its Baseline Level. Procedural Error outages were lower than in any four consecutive quarters since the start of the Baseline Period (35).
Based upon analysis of all outages reported from 1Q93 through 2Q03, the NRSC notes that: • There is a statistically significant decreasing trend in total outages since 2000. • There is a statistically significant decreasing trend in frequency of Facility outages since 1995 and in the aggregated outage index since the start of the Baseline Period. • There is a statistically significant decreasing trend in frequency of Local Switch outages since 1997 and in the aggregated outage index since the start of the Baseline Period. • There is a statistically significant decreasing trend in frequency of CO Power outages over the last two years. • There is a statistically significant increasing trend in frequency of CCS outages over the last seven years and in the aggregated outage index since 1994. • The outage index for DCS outages is significantly higher from 1997-2002 as compared to 1993-1996. • Procedural Error as the root cause of outages has exhibited a statistically significant decline in frequency since 1997.
NRSC 2002 Annual Report Table of Contents • Introduction • Major Findings • Background • State of the Network • Root Cause Analysis • “Special” Outages • Conclusion Report Due Out October 1, 2003
Reliability Reporting – NRSC 2002 Annual Report 2002 Snapshot F = Outage Frequency I = Aggregated Outage Index
Reliability Reporting – NRSC Reports The NRSC urges all service providers and equipment vendors to review all best practices for application in their operations. These Best Practices may be found at: http://www.nric.org/
Focus Group 2 Back Up
Focus Group Membership • PJ Aduskevicz, AT&T • Bonnie Amann, Sprint • Jay Bennett, Telcordia • Johnathan Boynton, SBC • Ken Buckley, Federal Reserve • Bob Burkhardt, Nextel • Ross Callon, Juniper • Rick Canaday, AT&T • Kevin Cavanagh, AT&T Wireless • John Chapa, SBC • John Clarke, NCS/NCC • Wayne Chiles, Verizon • Joe Craig, Qwest • Bernie Farrell, NCS • David Fears, Cox • Lee Fitzsimmons, Nextel • Brian Goemmer, Western Wireless • Jeff Goldthorp, FCC • Wayne Hall, Comcast • John Healy, FCC • Dean Henderson, Nortel • Michael Hill, Level 3 • Bob Holley, Cisco • Robin Howard, Verizon • Bruce Johnson, Verisign • Rick Kemper, CTIA • Percy Kimbrough, SBC • Bill Klein, ATIS • Bernie Ku, MCI • Jim Lankford, SBC • Greg Larson, Exodus/CWUSA • Mike Lecocke, SBC • Chris Liljenstolpe, CW • Chris MacFarland, Allegiance • Spilios Makris, Telcordia • Archie McCain, BellSouth • Dave McDysan, MCI • Brian Micene, AT&T Wireless • Denny Miller, Nortel • Erick Mogelgaard, Cox • Brad Nelson, Marconi • Kent Nilsson, FCC • Chris Oberg, Verizon Wireless • Dennis Pappas, Qwest • Gary Pellegrino, CommFlow Resources • Christopher Quesada, PAIX.net • Karl Rauscher, Lucent • Tony Reed, Charter • Arthur Reilly, Cisco Systems • Ira Richer, The Telesis Group • Jim Runyon, Lucent • Falguni Sarkar, AT&T Wireless • Andy Scott, NCTA • Don Smith, NCS • Scott Smith, Cox • Ron Stear, C&W • Sandy Stephens, Focal • Dorothy Stout, NCS/NCC • Lee Taylor, RoxTel • Whitey Thayer, FCC • Nate Wann, NCS/NCC • Frances Wentworth, NCS/NCC • Chris Whyte, Microsoft • Doug Williams, Comcast Cable • Linna Zile, Cox
Internet/IP Service Provider [Network diagram; layout not recoverable from the extracted text. Labels shown: XYZ Corp; DNS; RADIUS; DHCP; DSL; Wireless; Cable; Dial-Up; Core Backbone; Inter-Domain Routers; Distribution; PSTN; Service Aggregation]
Outage Reporting Customer Definitions
Type of Internet Access | Customer Definition | Comments
Cable | Household | Whether they are actively using it or not at the time of the outage
Dial-Up | Dial-Up Port | Whether or not port is in use at time of outage
DSL | Household | Whether they are actively using it or not at the time of the outage
Satellite | Household | Whether they are actively using it or not at the time of the outage
Wireless | Customers or Blocked Calls | Historical trends may be used
NRIC VI Voluntary Outage Reporting Trial Process [Flow diagram; layout and arrow order not recoverable from the extracted text. Lanes: Service Provider; NCS/NCC; Focus Group 2; NRIC VI & Industry. Step labels: Outage Occurs; Determine if Outage meets Trial Criteria (branches labeled "If no" and "If yes"); Local Analysis; Collect, Validate and Send Data; Create Initial Report Within 3 days; Send to NCS/NCC; Conduct Root Cause Analysis; Identify Best Practice; Develop Final Report to Include Recommendations; Send Final Report Within 30 Days; NCS/NCC Logs Report; Scrub Data per Criteria established by FG2; Approve Scrubbed Report; File Final Report; Scrubbed Data Received by FG2; Analyze Scrubbed Data, Review BP coverage; Make Recommendations; Monitor Progress; Provide Status at NRIC VI Council meetings; Concur or Provide Input on Recommendations. Flow labels: initial; final; numbered markers 1, 2, 3]
Voluntary Trial Outage Reporting Guidelines Compiled Guidelines for use by Service Providers, which includes: • Units and Thresholds – To determine which outages are to be reported; • Report Contents – To determine what information will be reported in confidential outage reports to a trusted third party under NDA; • Report Sanitizing – To determine what information is to be scrubbed from the confidential report before the report is made available to NRIC participants; • Confidential Report Repository – To determine which organization will be responsible for handling and sanitizing the confidential reports; • Reporting Process – To determine the process for reporting during the voluntary trial period. The data collected during the voluntary outage reporting trial is intended for use in improving network reliability, such as by providing information useful in order to verify and improve the NRIC best practices, or to create study groups to understand and improve issues identified as a result of the data. Specifically, it is not appropriate for reported data to be used for Marketing nor for Public Relations purposes.
Contents of Confidential Outage Reports • Reporting carrier / service provider • Contact person • Telephone number of contact person • Start date • Start time of impact • Geographic area affected • Estimated number of customers affected • Types of services affected (if applicable) • Duration of outage (hours and minutes) • Apparent or known cause • Name of equipment involved [OPTIONAL] • Type of equipment involved [OPTIONAL] • Specific part of network involved • Methods used to restore service [OPTIONAL] • Steps taken to prevent recurrence • Root cause and trouble found [OPTIONAL] • Applicable best practice [OPTIONAL]
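The report contents listed above could be captured in a simple record type. The following is a minimal sketch, not the trial's actual data format: the class name, field names, field types, and example values are all assumptions made for illustration, and it assumes each [OPTIONAL] marker on the slide applies to the item immediately before it.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ConfidentialOutageReport:
    """Hypothetical record mirroring the fields listed on the slide."""
    reporting_provider: str
    contact_person: str
    contact_telephone: str
    start_date: str                 # date format is an assumption (ISO 8601)
    start_time_of_impact: str
    geographic_area_affected: str
    estimated_customers_affected: int
    duration_hours: int
    duration_minutes: int
    apparent_or_known_cause: str
    specific_part_of_network: str
    steps_to_prevent_recurrence: str
    # Items marked "(if applicable)" or "[OPTIONAL]" on the slide.
    types_of_services_affected: Optional[str] = None
    equipment_name: Optional[str] = None
    equipment_type: Optional[str] = None
    restoration_methods: Optional[str] = None
    root_cause_and_trouble_found: Optional[str] = None
    applicable_best_practice: Optional[str] = None


def unset_optional_fields(report: ConfidentialOutageReport) -> List[str]:
    """Return the names of optional fields that were left unset (None)."""
    return [f.name for f in fields(report) if getattr(report, f.name) is None]


# All values below are invented for illustration only.
example = ConfidentialOutageReport(
    reporting_provider="Example Provider",
    contact_person="Jane Doe",
    contact_telephone="555-0100",
    start_date="2003-06-01",
    start_time_of_impact="14:30",
    geographic_area_affected="Example City, downtown area",
    estimated_customers_affected=12000,
    duration_hours=2,
    duration_minutes=15,
    apparent_or_known_cause="Fiber Optic Cable Damage",
    specific_part_of_network="Distribution",
    steps_to_prevent_recurrence="Reviewed excavation notification process",
    applicable_best_practice="6-5-0710",
)
print(unset_optional_fields(example))
```

Keeping the optional items as `None` by default makes it easy to see which parts of a report were not filled in, without treating their absence as an error.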
Scrubbing of Confidential Reports
Scrubbing of Outage Reports • Reporting carrier / service provider is deleted and replaced by a unique numerical identifier for the outage. • Contact person (name, telephone number, email address if present) is deleted. • Date of incident is left unchanged. • Time of incident is left unchanged. • Geographic Area affected is made less specific. Only the city or general geographic area is maintained in the scrubbed report. The reporting service provider can work with the NCS/NCC to determine how general the geographic area should be after the scrubbing operation. • Name and type of equipment involved are deleted. • Other fields are left unchanged in the scrubbed report.
Scrubbing of Positive Reports • For positive reports, the only information maintained after scrubbing is the industry segment and the month.
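The scrubbing rules above can be sketched as a small transformation over a report. This is an illustrative sketch only: the dictionary keys, function names, and example values are assumptions, not part of the trial guidelines. Because the generalized geographic area is agreed between the service provider and the NCS/NCC, the sketch takes it as an input rather than deriving it.

```python
from typing import Any, Dict

# Hypothetical key names for the fields the rules say are deleted.
DELETED_FIELDS = frozenset({
    "reporting_provider",
    "contact_person",
    "contact_telephone",
    "contact_email",
    "equipment_name",
    "equipment_type",
})


def scrub_outage_report(report: Dict[str, Any], outage_id: int,
                        general_area: str) -> Dict[str, Any]:
    """Return a scrubbed copy of a confidential outage report."""
    # Drop provider identity, contact details, and equipment name/type.
    scrubbed = {k: v for k, v in report.items() if k not in DELETED_FIELDS}
    # The provider is replaced by a unique numerical identifier for the outage.
    scrubbed["outage_id"] = outage_id
    # The geographic area is replaced by the agreed, less specific value.
    scrubbed["geographic_area_affected"] = general_area
    # All other fields (including date and time) pass through unchanged.
    return scrubbed


def scrub_positive_report(report: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the industry segment and the month of a positive report."""
    return {"industry_segment": report["industry_segment"],
            "month": report["month"]}


# All values below are invented for illustration only.
confidential = {
    "reporting_provider": "Example Provider",
    "contact_person": "Jane Doe",
    "contact_telephone": "555-0100",
    "start_date": "2003-06-01",
    "start_time_of_impact": "14:30",
    "geographic_area_affected": "123 Main St, Example City",
    "equipment_name": "Example Switch 9000",
    "apparent_or_known_cause": "Procedural Error",
}
scrubbed = scrub_outage_report(confidential, outage_id=17,
                               general_area="Example City")
print(sorted(scrubbed))
```

The function returns a new dictionary and leaves the confidential original untouched, which matches the idea that the trusted third party retains the full report while only the scrubbed copy is passed on.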
Voluntary Outage Report Withdrawal Process • Initial Outage Reports are filed within 3 days of the event. • If the Service Provider determines, upon further investigation, that the outage did not meet the Voluntary Outage Reporting Trial criteria, then the Service provider should notify the NCS/NCC that the Initial Outage Report is being Withdrawn. • The NCS/NCC will track the number of Initial Outage Reports that have been Withdrawn for inclusion in analysis reports. • Otherwise, the Final Outage Report is submitted within 30 days of the event.
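The timing rules in the withdrawal process can be illustrated with a short sketch. It assumes "within 3 days" and "within 30 days" mean calendar days counted from the event date, which the slide does not specify; the function names and the example date are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict

INITIAL_REPORT_DAYS = 3   # Initial Outage Report window, per the slide
FINAL_REPORT_DAYS = 30    # Final Outage Report window, per the slide


def report_deadlines(event_date: date) -> Dict[str, date]:
    """Latest filing dates, assuming calendar days from the event date."""
    return {
        "initial": event_date + timedelta(days=INITIAL_REPORT_DAYS),
        "final": event_date + timedelta(days=FINAL_REPORT_DAYS),
    }


def next_step(meets_trial_criteria: bool) -> str:
    """Step that follows an Initial Outage Report after further investigation."""
    if meets_trial_criteria:
        return "submit Final Outage Report"
    return "notify NCS/NCC that the Initial Outage Report is withdrawn"


# Hypothetical event date, for illustration only.
deadlines = report_deadlines(date(2003, 6, 1))
print(deadlines["initial"], deadlines["final"])
print(next_step(meets_trial_criteria=False))
```

If the guidelines intended business days instead, only `report_deadlines` would need to change.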
Participating Organizations Providing Technical Contacts • Allegiance • AT&T • AT&T Wireless • BellSouth • C&W • CenturyTel • Charter Communications • Cingular • Comcast Cable Communications • Cox Communications • EarthLink • Focal Communications • Intelsat • Level 3 • MCI • McLeod • Nextel • NSF • PanAmSat • Qwest • Sprint • SBC • T-Mobile • Time Warner Cable • Verisign • Verizon • Verizon Wireless • Western Wireless
NRSC Back Up
Reliability Reporting – NRSC 2Q03 Report: Incidents by Failure Category (Facility) There is a statistically significant decreasing trend in the number of Facility outages since 1995.
Reliability Reporting – NRSC 2Q03 Report: Incidents by Failure Category (Local Switch) There is a statistically significant decreasing trend in the number of Local Switch outages since 1997.
Reliability Reporting – NRSC 2Q03 Report: Incidents by Failure Category (CO Power) There is a statistically significant decreasing trend in the number of CO Power outages over the last two years.
Reliability Reporting – NRSC 2Q03 Report: Incidents by Failure Category (Common Channel Signaling) There is a statistically significant increasing trend in the number of CCS outages over the last seven years.
Reliability Reporting – NRSC 2Q03 Report: Procedural Error Attributed Outages (by number of events) Procedural Error as the root cause of outages has exhibited a statistically significant decline in frequency since 1997.
Reliability Reporting – NRSC 2002 Annual Report: Outage Frequency by Year
Reliability Reporting – NRSC 2002 Annual Report: Aggregated Outage Index by Year
Reliability Reporting – NRSC 2002 Annual Report Outage Frequency Failure Category Distribution by Year
Reliability Reporting – NRSC 2002 Annual Report Failure Category Distribution 2002 Versus Baseline
Reliability Reporting – NRSC 2002 Annual Report Root Cause Category Distribution 2002 Versus Baseline
Reliability Reporting – NRSC 2002 Annual Report Summary
Overall • Outage frequency and aggregated outage index in Green region • Below network growth rates since 1993 • Lowest to date • Outage frequency significantly lower than in Baseline Years
Facility • Outage frequency in Green region: Lowest to date; Significantly lower than in Baseline Years; Below baseline level for 3rd consecutive year • Aggregated outage index in Green region: Lowest to date; Significantly lower than in Baseline Years; Below baseline level for 4th consecutive year
Local Switch • Outage frequency in Green region: Lowest to date; Significantly lower than in Baseline Years; Below baseline level for 4th consecutive year • Aggregated outage index in Green region: Significantly lower than in Baseline Years; Below baseline level for 5th consecutive year
Tandem Switch • Outage frequency in Green region: Lowest to date; Below baseline level for 2nd consecutive year • Aggregated outage index in Green region: Below baseline level for first year since 1999
Reliability Reporting – NRSC 2002 Annual Report Summary
CCS • Outage frequency in Green region: Above baseline level for 3rd consecutive year; Declined for 2nd consecutive year • Aggregated outage index in Yellow region: Highest to date; Above baseline level for 3rd consecutive year
CO Power • Outage frequency in Green region: Below baseline level for first year since 1996 • Aggregated outage index in Green region: Below baseline level for first year since 1997
DCS • Outage frequency in Green region: Above baseline level for 5th year out of last 6 years • Aggregated outage index in Green region: Below baseline level for 2nd year out of last 6 years
Other • Outage frequency and aggregated outage index in Green region • Both below baseline levels for first year since 1999 • Outage frequency matched lowest value to date
Procedural Errors • Outage frequency in Green region: Lowest to date; Significantly lower than in Baseline Years; Below baseline level for first year since 1996 • Aggregated outage index in Green region: Below baseline level for first year since 1999