Toward Stable Operation of Genkai/Hyeonhae/APII Link
Yoshinori Kitatsuji
APAN Tokyo XP
KDDI R&D Laboratories, Inc.
Overview
• Test before starting service
  • Checking availability
• Staff in service
  • Routes
  • Link utilization
  • Bandwidth availability
• Troubleshooting
  • Tools
  • Status to public
• Future arrangement
  • Resource management
  • Scheduling
  • Project
  • Maintenance
  • Making requirements clear
  • Protocol, path, traffic bandwidth, schedule
Test before starting service
• Understanding availability is very important
  • Specifications written in the catalogue sometimes don't match actual performance.
  • A combination of equipment usually performs only as well as its weakest element.
  • Both users and operators are more comfortable using a network when its limits are well known.
  • It helps in scheduling network upgrades and enhancing performance.
• Performance known at the moment
  • Meinohama-Busan
    • 930 Mbit/s (IP packets, 1000 bytes long)
  • Meinohama-Tokyo
    • 250 Mbit/s (UDP/TCP), because a shaper (271 Mbit/s) is applied
    • 430 Mbit/s (TCP over a Tokyo-Tenjin-Tokyo loop)
• Still to be measured: how much performance has improved
  • between QGPOP and Seoul
  • between Seoul and Tokyo
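The point that a chain of equipment performs only as well as its weakest element can be sketched in a few lines of Python. This is an illustration only: the segment names and the 1000 Mbit/s figure for the GbE segment are assumptions, and only the 271 Mbit/s shaper rate comes from the slide.

```python
def path_capacity(segments):
    """Return (name, capacity_mbps) of the slowest segment on a path."""
    if not segments:
        raise ValueError("path has no segments")
    name = min(segments, key=segments.get)
    return name, segments[name]


# Hypothetical Meinohama-Tokyo path; only the shaper rate is from the slide.
meinohama_tokyo = {
    "GbE access (assumed nominal rate)": 1000.0,
    "shaper": 271.0,
}

bottleneck, mbps = path_capacity(meinohama_tokyo)
print(f"bottleneck: {bottleneck} at {mbps} Mbit/s")
```

A measured rate somewhat below the bottleneck figure (250 Mbit/s against a 271 Mbit/s shaper) is what one would expect once protocol overhead is counted, which is why testing before service matters more than reading the catalogue.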
Genkai Network Configuration at L3
[Figure: L3 topology connecting Busan XP, Genkai XP (Meinohama), Kyushu U, and Tokyo XP (KDDI Otemachi). Devices shown: tpr2, tpr3, Cisco GSR, APII-Juniper, fidc-juniper, hakozaki-cisco2, hakozaki-juniper, SuperSINET. AS numbers shown: 9270, 7660, 2523, 2907. Link types: ATM/OC12, ATM/OC3, GbE. Rates annotated: 430 Mbit/s and 250 Mbit/s. Subnets shown: 203.181.249.160/29, 203.181.249.104/30, 203.181.249.184/29, 203.181.249.168/29, 203.181.248.192/30, 203.181.249.176/29, 133.69.164.0/28, 133.69.152.0/26.]
Presented in NOC meeting, APAN Fukuoka Meeting
Configuration of Loopback Test
[Figure: loopback test path between two PCs. Elements shown: APII Juniper, APAN Tokyo XP, CRL, JGN ATM network, connected by OC12 and Gigabit Ethernet links. Legend: orange = ATM switch, blue = Ethernet switch.]
Staff prepared in service
• Making the acceptable use policy clear
  • A simple, open AUP helps operators manage trouble.
• Monitoring in APAN Tokyo XP
  • Traffic, number of routes
    • MRTG, RRDTool, OCXmon/DAG
  • Reachability
    • Ping major targets and report every 5 minutes
  • Performance
    • SLAC's performance monitor using iperf
• Opening operational information to the public
  • Assignment information
    • IP addresses, ATM PVCs, VLAN IDs, port usage of equipment
  • Monitoring results
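The reachability check listed above (ping major targets and report every 5 minutes) might look roughly like the following. This is a minimal sketch, not the script used at APAN Tokyo XP: the function names are hypothetical, and the `ping -c 1 -W` flags assume the Linux iputils `ping` (other systems interpret `-W` differently).

```python
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Return True if one ICMP echo is answered (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def reachability_report(targets, probe=ping_once):
    """Probe every target once and return {host: reachable}."""
    return {host: probe(host) for host in targets}


def format_report(report):
    """Render one line per host, sorted by name."""
    return "\n".join(
        f"{host}: {'up' if ok else 'DOWN'}" for host, ok in sorted(report.items())
    )


def monitor(targets, interval_s=300, rounds=None, probe=ping_once, sleep=time.sleep):
    """Print a report every interval_s seconds (300 s = 5 minutes)."""
    done = 0
    while rounds is None or done < rounds:
        print(format_report(reachability_report(targets, probe)))
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_s)
```

Calling `monitor(["host1.example", "host2.example"])` would run indefinitely; the `probe` and `sleep` parameters exist only so the loop can be exercised without sending packets.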
MRTG
• Multi Router Traffic Grapher
  • Most common tool for operation
  • Plots 2 data sources against time
  • Uses SNMP to collect interface counters
  • 5-minute averages and 2 data sources per figure
• Advantages
  • Ability to create scripts to feed data into MRTG
  • Mature code because of very wide deployment in large networks
  • Any variable can be graphed
  • Easy-to-understand configuration
• Disadvantages
  • Can't create custom graph periods
  • Keeps 5-minute data for only 2.5 days, after which it is aggregated into longer periods.
  • Hard to modify for a sampling rate higher than once every 5 minutes
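The step from SNMP interface counters to a 5-minute average can be sketched as below. This is not MRTG's own code; it only illustrates the arithmetic, assuming octet counters that wrap at 2^32 (or 2^64 for high-capacity counters). The function names are hypothetical.

```python
def counter_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bit rate between two octet-counter samples, allowing one wrap."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    delta = (curr_octets - prev_octets) % (1 << counter_bits)
    return delta * 8 / interval_s


def wrap_time_s(link_bps, counter_bits=32):
    """Seconds for an octet counter to wrap at a sustained bit rate."""
    return (1 << counter_bits) * 8 / link_bps


# 37.5 MB transferred in 300 s is an average of 1 Mbit/s.
print(counter_rate_bps(0, 37_500_000, 300))

# At the 930 Mbit/s measured on Meinohama-Busan, a 32-bit counter wraps
# well inside one 5-minute polling interval.
print(round(wrap_time_s(930e6), 1), "seconds")
```

The second figure suggests why 64-bit counters (such as `ifHCInOctets` in the IF-MIB) are the safer choice on near-gigabit links: with 32-bit counters the modulo correction above cannot tell one wrap from several.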
RRDTool
• Round Robin Database Tool
  • Successor of MRTG
  • Ability to customize graphs across user-defined intervals
  • File size stays constant regardless of the number of updates, because of the fixed-size round-robin binary file format
• Advantages
  • 25-30 percent faster than MRTG
  • Ability to have multiple data sources in a single graph.
  • Ability to generate a graph from multiple data files.
• Disadvantages
  • No collector
  • Doesn't create graphs and web pages without additional scripts
  • Hard to merge multiple files into one.
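The constant-file-size property comes from the round-robin idea: a fixed number of slots, with the oldest row overwritten by the newest. The sketch below illustrates that idea in memory; it is not RRDTool's file format or API, and the class name is hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: keeps the last `slots` consolidated rows."""

    def __init__(self, slots, steps_per_slot=1):
        self.rows = deque(maxlen=slots)      # oldest row drops out automatically
        self.steps_per_slot = steps_per_slot
        self._pending = []

    def update(self, value):
        """Add one sample; store an average once enough samples arrive."""
        self._pending.append(value)
        if len(self._pending) == self.steps_per_slot:
            self.rows.append(sum(self._pending) / len(self._pending))
            self._pending.clear()


# Three slots of raw samples: after five updates only the last three remain.
raw = RoundRobinArchive(slots=3)
for sample in range(5):
    raw.update(sample)
print(list(raw.rows))

# Two slots, each the average of two samples (a coarser, longer-term archive).
coarse = RoundRobinArchive(slots=2, steps_per_slot=2)
for sample in (1, 3, 5, 7, 9):
    coarse.update(sample)
print(list(coarse.rows))
```

Keeping several such archives at different resolutions is what allows graphs over user-defined intervals without the storage growing over time.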
CoralREEF/DAG
• TCPDUMP for ATM/POS
  • Packet capturing system
  • POS OC48 is under development
    • Actual performance is not yet known
• CoralREEF
  • Successor of OC3mon developed by MCI & NLANR
  • Runs with a Fore PCI200 or Point ATM card
• DAG
  • Developed by the University of Waikato in New Zealand
  • Runs with the DAG card developed by the University of Waikato
• Advantages
  • Headers or whole packets are captured
  • Ability to convert the original format to TCPDUMP (pcap library) format
• Disadvantages
  • Runs only on FreeBSD or Linux
  • The splitter divides the laser light, so optical power drops to half
  • Needs multiple PCs to capture a high-speed link.
  • May need a GPS clock rather than NTP to get high accuracy
  • Needs an interface for generating graphs continuously, like MRTG
[Figure labels: Router, ATM Switch]
Traffic since …
[Figure: traffic graphs for the Hyeonhae/Genkai link and the TransPAC link to Seattle, annotated "Here operation started" and "Here Genkai link operation started".]
SLAC performance monitor
• TCP performance monitor
  • Generates TCP traffic for 1-2 minutes.
  • Measures RTT and bandwidth of
    • iperf, BBCPMEM, BBFTP
  • http://perf3-fe.jp.apan.net/~cottrell/html/slac_wan_bw_tests.html
• Advantages
  • Requires only ssh and iperf
  • Multiple targets are measured round robin every two hours automatically
• Disadvantages
  • Generated traffic affects regular traffic.
  • Takes some time (2-3 minutes) to measure.
[Figure annotation: "Here operation started". Figure labels: APAN Tokyo XP, Stanford Univ, AIST, QGPOP]
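The round-robin measurement of multiple targets every two hours can be pictured as a staggered schedule, so that no two tests load the link at once. The following is a sketch of that idea only, not the SLAC monitor's actual scheduler; the even spacing, the function name, and the use of the figure's site labels as targets are assumptions.

```python
def round_robin_offsets(targets, cycle_s=7200, test_s=180):
    """Spread one test per target evenly over a cycle.

    cycle_s: length of one full round (two hours).
    test_s:  worst-case duration of one test (3 minutes).
    Returns a list of (start_offset_seconds, target).
    """
    if not targets:
        return []
    if len(targets) * test_s > cycle_s:
        raise ValueError("too many targets to test within one cycle")
    slot = cycle_s // len(targets)
    return [(i * slot, target) for i, target in enumerate(targets)]


# Illustrative target list taken from the figure labels.
for offset, target in round_robin_offsets(["Stanford Univ", "AIST", "QGPOP"]):
    print(f"+{offset // 60:3d} min  {target}")
```

Spacing the tests also limits the stated disadvantage: generated traffic still affects regular traffic, but only for one short window per target per cycle.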
Open Operation
• Opening assignment policy and operation status
  • Reduces queries
  • Makes troubleshooting easier
• Sharing accounts between neighbor networks
  • May be considered a security weakness, but is fruitful.
  • Ability to understand neighbor networks deeply
  • Account holders can troubleshoot by themselves
• Opening the status of trouble
  • Users and neighbor networks feel relieved.
  • May get help from outside
Troubleshooting
• Tools
  • ping/traceroute for unicast
  • mtrace, multicast beacon, and debug output for multicast
  • MRTG or more accurate graphs
• Sharing information
  • Maintenance reports
    • schedule, status, and post-maintenance results
  • Traceable status board
  • Make as much information open as possible
    • Neighbors may do the troubleshooting
Multicast Beacon
• Every beacon periodically sends packets to, and receives packets from, a multicast group, and all beacons report their statistics to the MB server.
• Developed by NLANR
  • http://dast.nlanr.net/Projects/Beacon/
• Popular in AccessGrid
• An MB server in an XP may help debug multicast trouble, especially in locating where the problem lies.
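How per-beacon reports help locate a problem can be sketched as a sender-by-receiver loss matrix. This is an illustration under assumed inputs, not the NLANR beacon's data format: the function name, the dictionaries, and the site names are hypothetical.

```python
def loss_matrix(sent, received):
    """Compute loss fraction for every ordered (sender, receiver) pair.

    sent:     {beacon: packets_sent_to_group}
    received: {(sender, receiver): packets_received}
    """
    matrix = {}
    for sender, n_sent in sent.items():
        if n_sent == 0:
            continue  # nothing sent, loss is undefined
        for receiver in sent:
            if receiver == sender:
                continue
            got = received.get((sender, receiver), 0)
            matrix[(sender, receiver)] = 1.0 - got / n_sent
    return matrix


# Hypothetical reports from two beacons.
sent = {"tokyo": 100, "busan": 100}
received = {("tokyo", "busan"): 100, ("busan", "tokyo"): 75}

for (sender, receiver), loss in sorted(loss_matrix(sent, received).items()):
    print(f"{sender} -> {receiver}: {loss:.0%} loss")
```

Because multicast paths are often asymmetric, a matrix like this shows at a glance which direction and which site the loss clusters around.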
Future operation
• Resource scheduling will become important
  • We can now generate traffic of hundreds of Mbit/s between Korea and Europe through Japan and the USA
  • Scheduling events is more important than sharing resources across events
    • Multiple traffic flows easily disrupt each other.
• Supporting events will take more time
  • Demonstrations will require high performance
    • Ex. AIST tried to record a high-speed TCP transmission test with 2 TransPAC links in parallel at SC2002.
  • Need to collect detailed requirements
• Regular traffic increases gradually
• Installing tools and devices
  • XPs are required to be equipped with tools and devices to support all kinds of services run by projects
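The scheduling concern above (several high-rate flows disrupting each other) amounts to checking whether requested bandwidth ever exceeds what a link can carry. Below is a minimal sketch of such a check; the `Reservation` type, the project names, and the hour-based time unit are hypothetical, and the 250 Mbit/s capacity is borrowed from the measured Meinohama-Tokyo rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    project: str
    start: int    # hour the event begins (hypothetical unit)
    end: int      # hour the event ends, exclusive
    mbps: float   # requested bandwidth


def find_overcommit(reservations, capacity_mbps):
    """Return [(time, total_mbps)] for each time demand rises above capacity."""
    changes = {}
    for r in reservations:
        changes[r.start] = changes.get(r.start, 0.0) + r.mbps
        changes[r.end] = changes.get(r.end, 0.0) - r.mbps
    load = 0.0
    over = []
    for t in sorted(changes):
        load += changes[t]
        if load > capacity_mbps:
            over.append((t, load))
    return over


# Two hypothetical demonstrations that overlap between hours 2 and 4.
requests = [
    Reservation("demo-A", start=0, end=4, mbps=200.0),
    Reservation("demo-B", start=2, end=6, mbps=100.0),
]
print(find_overcommit(requests, capacity_mbps=250.0))
```

A check like this only works if each project states its protocol, path, bandwidth, and schedule in advance, which is the reason for collecting detailed requirements.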