
Analysis of Trouble Tickets Issued by APAN JP NOC

Analysis of Trouble Tickets Issued by APAN JP NOC. Jin Tanaka tanaka@kddnet.ad.jp KDDI. APAN NOC Session in Busan, Korea on 27 August 2003. Agenda. Introduction to APAN JP Site NOC Statistics of Trouble Tickets Trouble analysis Equipment in TokyoXP TransPAC


Presentation Transcript


  1. Analysis of Trouble Tickets Issued by APAN JP NOC Jin Tanaka tanaka@kddnet.ad.jp KDDI APAN NOC Session in Busan, Korea on 27 August 2003

  2. Agenda • Introduction to APAN JP Site NOC • Statistics of Trouble Tickets • Trouble analysis • Equipment in TokyoXP • TransPAC • Characteristics of Our Trouble • Proposal for improving Network Service Level

  3. APAN JP Site NOC

  4. APAN JP Site NOC: Location: • Physically located on the 12F of the KDDI Otemachi Bldg in Tokyo; the APAN Tokyo XP equipment is installed on the 5F Staff: • Operators on standby 24×7 • Operators are also charged with operations for other networks (scientific, academic, commercial) Duties: • Opening and closing of trouble tickets • Receiving problem reports • Troubleshooting • Development and maintenance of measurement and operation tools

  5. APAN JP Site NOC: Monitoring Environment [Diagram labels: KDDI Circuit Division; Open View NNM; Mail & Web Client; Physical Layer Monitor; NOC; 12F Operation Staff; 5F APAN Equipment] • HP Open View works independently in the NOC segment. • The NOC staff use mail and web clients to detect alerts. • KDDI's Physical Layer Monitor system observes the circuits. • When an alert is detected, we can check the same status as the KDDI Circuit Division.

  6. Statistics of Trouble Tickets

  7. Statistics of Trouble Tickets: • Objects • All trouble tickets issued by APAN JP NOC over the last 12 months (August 2002 to July 2003) • About 200 tickets in total • Issue-selecting rules • Trouble • All outages on TransPAC are covered. For other networks, outages of 15 minutes or more are covered. • Maintenance • All maintenance work is covered (including circuit switchovers that cause hits of less than 1 msec).
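The issue-selecting rules on this slide can be read as a simple filter. The following is a minimal sketch, assuming a hypothetical `Ticket` record whose field names and labels (`kind`, `network`, `outage`) are illustrative only and do not come from the NOC's actual ticket system.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Ticket:
    """Hypothetical ticket record; field names are assumptions."""
    kind: str          # "trouble" or "maintenance"
    network: str       # e.g. "TransPAC" (illustrative label)
    outage: timedelta  # length of the outage


# Threshold stated on the slide for non-TransPAC troubles
MIN_OUTAGE = timedelta(minutes=15)


def is_covered(ticket: Ticket) -> bool:
    """Return True if the ticket falls under the slide's selection rules."""
    if ticket.kind == "maintenance":
        return True                     # all maintenance work is covered
    if ticket.network == "TransPAC":
        return True                     # every TransPAC outage is covered
    return ticket.outage >= MIN_OUTAGE  # others: 15 minutes or more
```

Under these assumptions, a one-minute TransPAC outage is counted, while a 14-minute outage on any other network is not.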

  8. Statistics of Trouble Tickets: Trouble Tickets on Tokyo XP Fig1: Trouble Tickets on Tokyo XP

  9. Statistics of Trouble Tickets: Number of Monthly Tickets for Trouble/Maintenance Fig2: Number of Monthly Tickets/Maintenance

  10. Statistics of Trouble Tickets: Number of Monthly Tickets for Trouble Fig2: Number of Monthly Tickets for Trouble on Circuit/Equipment/Others/Unknown

  11. Statistics of Trouble Tickets: Number of Monthly Tickets for Maintenance Fig3: Number of Monthly Tickets for Maintenance on Circuit/Equipment

  12. Statistics of Trouble Tickets: Total Length of Time of Trouble/Maintenance of APAN Tokyo XP Fig4: Time Volume of Trouble/Maintenance of APAN Tokyo XP

  13. Statistics of Trouble Tickets: Total Availability of APAN Network Fig5: Total Availability of APAN Network

  14. Results of Trouble Tickets Statistics • The total numbers of trouble and maintenance tickets are almost equal • The number of tickets varies mainly with circuit trouble and maintenance, which is especially evident on TransPAC • Availability of the whole APAN network is 96.83% (97.45% when maintenance is excluded from outage time)
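The slides do not state how the availability figures were derived. One common definition is the share of the observation period that was free of outage, and the sketch below assumes that definition. The input durations are hypothetical and are not taken from the presentation.

```python
from datetime import timedelta


def availability_percent(period: timedelta, outage: timedelta) -> float:
    """Availability as the percentage of the period without outage."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    return 100.0 * (1.0 - outage / period)


# Hypothetical inputs for illustration only
period = timedelta(days=365)
trouble = timedelta(hours=100)
maintenance = timedelta(hours=50)

# Counting maintenance as outage lowers the figure; excluding it raises it
with_maintenance = availability_percent(period, trouble + maintenance)
trouble_only = availability_percent(period, trouble)
```

Computing both values side by side mirrors the way the slide reports one figure including maintenance and one excluding it.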

  15. Trouble Analysis

  16. Trouble Analysis: Trouble Tickets Classified by Area Fig6: Trouble Tickets Classified by Area

  17. Trouble Analysis: Total Outage Time Classified by Area Fig7: Total Outage Time Classified by Area

  18. Trouble Analysis: Average Outage Time Classified by Area Fig8: Average Outage Time Classified by Area

  19. Trouble Analysis: Number of Trouble Tickets by Trouble-occurring Area Fig9: Number of Trouble Tickets by Trouble-occurring Area [Chart callouts: Routing trouble of TokyoXP; Int'l circuit to TransPAC; Equipment of TokyoXP; Local circuit in China; Equipment of PHnet]

  20. Trouble Analysis: Distribution by reason for Amount of Troubles Fig10 : Distribution by reason for Amount of Trouble

  21. Trouble Analysis: Distribution by Reason for Outage Time Fig11 : Distribution by Reason for Outage Time

  22. Equipment Trouble Analysis in TokyoXP

  23. Equipment Trouble Analysis in TokyoXP: Classification by Vendor for TokyoXP Fig12: Classification by Vendor for TokyoXP

  24. Equipment Trouble Analysis in TokyoXP: Classification by Software/Hardware for TokyoXP Fig13: Classification by Software/Hardware for TokyoXP

  25. Trouble Analysis on TransPAC

  26. Trouble Analysis on TransPAC: Fig14: Tickets Volume on Northern/Southern links Fig15: Total Outage Time on Northern/Southern links

  27. Trouble Analysis on TransPAC: Fig16:Ticket Volume on TransPAC links Classified by Circuit/Equipment Fig17: Total Outage Time on TransPAC links Classified by Circuit/Equipment

  28. Trouble Analysis on TransPAC: Fig18: Ticket Volume of Circuit Troubles on TransPAC links Classified by reason Fig19: Time Volume of Circuit Troubles on TransPAC links Classified by reason

  29. Trouble Analysis on TransPAC: Fig20: Ticket Volume of Equipment Troubles on TransPAC links Classified by reason Fig21: Time Volume of Equipment Troubles on TransPAC links Classified by reason

  30. Trouble Analysis on TransPAC: Availability of TransPAC • Northern link Availability = 99.819422% (including trouble and maintenance) • Southern link Availability = 99.807319% (including trouble and maintenance) • Total Availability = 100 - ( (100 - 99.819422) * (100 - 99.807319) / 100 ) = 99.999652% (assuming the two links fail independently) • Redundancy is achieved by the northern and southern links • Fortunately, we have had no simultaneous outage on both links! Fig22: Availability of TransPAC Northern link Fig23: Availability of TransPAC Southern link
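The total availability on this slide follows from multiplying the unavailability of the two links, with each unavailability expressed as a fraction rather than a percentage. The sketch below reproduces that arithmetic. It assumes the links fail independently, and the function name `combined_availability` is illustrative.

```python
def combined_availability(*link_percents: float) -> float:
    """Availability (in %) of redundant links, assuming independent failures.

    The service is down only when every link is down at once, so the
    combined unavailability is the product of the per-link unavailabilities.
    """
    unavailable = 1.0
    for percent in link_percents:
        unavailable *= (100.0 - percent) / 100.0  # fraction of time down
    return 100.0 * (1.0 - unavailable)


# Northern and southern link availabilities from the slide
total = combined_availability(99.819422, 99.807319)
print(f"{total:.6f}%")
```

With a single link the function returns that link's own availability, which makes the gain from the second link easy to compare.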

  31. Characteristics of Our Trouble

  32. Characteristics of Our Trouble: Fig22: APAN Network Outages Table1: APAN Network Outages Minutes • Longest outage time per trouble 34:45:00 • Average outage time per trouble 2:32:09 Fig23: Distribution of APAN Network Outages by Length of Time

  33. Characteristics of Our Trouble: • 70% of all troubles are cleared within 60 minutes • Equipment troubles are noticeable and cause long outages in many cases • Using housing sites and cooperating with vendors are important • Domestic troubles are noticeable, but their average outage time is short • Sharing trouble information internationally is difficult (time zone, language) • Troubles on lower layers such as Layer 1 (circuit) and Layer 2 (Ethernet switch) are noticeable • Having redundant circuits and equipment, as seen on the TransPAC network, will be useful for shortening outage time
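Figures such as the longest outage, the average outage, and the share of troubles cleared within 60 minutes can be derived from a list of outage durations. The following is a minimal sketch of that calculation. The sample durations are hypothetical, and only the `H:MM:SS` notation is borrowed from the slides.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'H:MM:SS' duration such as '34:45:00'."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def summarize(outages: list) -> dict:
    """Longest, average, and share of outages cleared within 60 minutes."""
    if not outages:
        raise ValueError("at least one outage is required")
    within_hour = sum(1 for o in outages if o <= timedelta(minutes=60))
    return {
        "longest": max(outages),
        "average": sum(outages, timedelta(0)) / len(outages),
        "share_within_60_min": within_hour / len(outages),
    }


# Hypothetical outage durations, not data from the presentation
sample = [parse_hms(t) for t in ["0:20:00", "0:45:00", "1:30:00", "0:05:00"]]
stats = summarize(sample)
```

Keeping durations as `timedelta` values avoids unit mistakes when mixing minutes and hours.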

  34. Proposal for Improving Network Service Level

  35. Proposal for Improving Network Service Level: • Shortening of trouble-handling time • Start trouble handling and announce the information quickly • Operation tools that enable us to issue trouble tickets automatically and announce information quickly • Shorten troubleshooting time • Remote troubleshooting from other areas (cf. Router Proxy on Global NOC) • These are under examination in TokyoXP • Worldwide information sharing • Installation of a shared information server providing the following information: • Performance and operation status of the whole APAN network (cf. Animated Traffic Map on Global NOC) • Trouble and maintenance information • Syslog of routers in XPs and APs ※ It is desirable that such a server be installed at a commercial ISP, distant from the APAN networks.

  36. Proposal for Improving Network Service Level: • Redundant network configuration • The TransPAC links show that a redundant configuration is very effective in realizing high availability. It is desirable that we establish redundant configurations as much as possible. • Monitoring of lower layers • For the operation of worldwide networks, it is very important to check the status of international circuits in cooperation with circuit carriers. • Possibility of using new Ethernet technologies, e.g., • BNDP - Bridge Neighbor Discovery Protocol • LFS - Link Fault Signaling (10GbE: 802.3ae)
