Experiences with setting up the CHEETAH network. Xuan Zheng, Xiangfei Zhu, Malathi Veeraraghavan University of Virginia. CHEETAH network overview. Wide-area circuits. OC-192 lambda from MCNC between MCNC and NLR Raleigh PoP (Cisco 15454 MSTPs)
Experiences with setting up the CHEETAH network Xuan Zheng, Xiangfei Zhu, Malathi Veeraraghavan University of Virginia
Wide-area circuits • OC-192 lambda from MCNC between MCNC and NLR Raleigh PoP (Cisco 15454 MSTPs) • OC-192 lambda from NLR between NLR Raleigh PoP and NLR Atlanta PoP (Cisco 15808 MSTPs) • OC-192 lambda from SLR between NLR Atlanta PoP and SLR/SoX/Telx Atlanta PoP (Movaz) • 2x1GbE MPLS tunnels from ORNL between SLR/Telx Atlanta PoP and ORNL • Will be upgraded to OC-192 lambda (Ciena Corestream) • Purchasing issues: • Wide-area circuits not “capital equipment”
CHEETAH nodes (circuit gateways) • Cisco ONS 15454 MSPP in original proposal • Bought 10/100 Ethernet, GbE, and OC-3 interface cards • Watch out for the differences between the three types of Ethernet cards (E-series, ML-series, G-series) • Powerful TL1/CTC interfaces • Does not support dynamic circuit setup through GMPLS; needs external signaling software in order to be deployed in the CHEETAH network (UNI support is claimed, but only from the client side) • Now being used in our lab with two Cisco GSR 12008 routers for local-area CHEETAH research
The circuit gateway • SDM to TDM • More specifically, Ethernet/VLAN at user side to SONET at CHEETAH network side • Implement GFP and VCAT to map Ethernet signals into SONET signals in an efficient manner • For example, 1Gbps -> 21xOC-1 • Why GMPLS • Value of a network grows with the square of the number of endpoints (Metcalfe's law) • CHEETAH network aims to serve a wide range of applications instead of just scientific applications • Need a distributed, dynamic control plane to build a scalable network
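The "1Gbps -> 21xOC-1" figure on this slide can be checked with a short calculation. The sketch below assumes the standard SONET payload capacities (48.384 Mbit/s for an STS-1 SPE, 149.76 Mbit/s for an STS-3c SPE) and ignores GFP framing overhead; the function names are illustrative and are not taken from the CHEETAH software.

```python
import math

# Payload capacity of one SONET container in Mbit/s (standard SONET figures).
STS1_PAYLOAD_MBPS = 48.384    # 756 payload bytes x 8 bits x 8000 frames/s
STS3C_PAYLOAD_MBPS = 149.76   # 2340 payload bytes x 8 bits x 8000 frames/s


def vcat_members(client_rate_mbps, member_payload_mbps):
    """Smallest number of VCAT members whose combined payload covers the client rate."""
    return math.ceil(client_rate_mbps / member_payload_mbps)


def vcat_efficiency(client_rate_mbps, member_payload_mbps):
    """Fraction of the reserved SONET payload actually used by the client signal."""
    members = vcat_members(client_rate_mbps, member_payload_mbps)
    return client_rate_mbps / (members * member_payload_mbps)


if __name__ == "__main__":
    # Gigabit Ethernet over STS-1 members (assumption: GFP overhead ignored).
    print(vcat_members(1000, STS1_PAYLOAD_MBPS))
    print(round(vcat_efficiency(1000, STS1_PAYLOAD_MBPS), 4))
```

With STS-1 members the group is sized to 21, matching the slide, and about 98% of the reserved payload carries client data; with coarser STS-3c members the same GbE signal needs 7 members and wastes more capacity.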
CHEETAH nodes • Sycamore SN16000 intelligent optical switch • GbE, 10GbE, and SONET interface cards (we have OC192s) • Networks/nodes are managed through Silvx NMS, limited support for CLI and TL1 interface • Switch OS - BroadLeaf – implements GMPLS protocols since Release 7.0 • Pure SONET circuits – excellent GMPLS signaling and routing support. • Ethernet-to-SONET GMPLS not officially released yet. • But proprietary solution works!
CHEETAH nodes • Sycamore SN16000 intelligent optical switch • Before purchasing, signaling interoperability testing was conducted between the SN16000 and our RSVP-TE client software at Chelmsford, MA. • Three SN16000s were purchased without Silvx management software (would recommend getting Silvx NMS) • Promising future upgrade of signaling capability • Ethernet-to-SONET is already available in unofficial patches • LMP, L2SC, unidirectional circuits, CSPF, UNI, etc. • Purchasing issues: SONET switch equipment vendors’ contracts quite elaborate • Very different from data equipment • Couldn’t reach agreement between UVA (state regulations) and Sycamore Networks!
CHEETAH network connection • Three SN16000s were purchased and are located in MCNC, SLR/SoX/Telx Atlanta PoP, and ORNL. • SN16000s are interconnected by OC-192 circuits • End hosts are connected to SN16000s’ GbE interfaces by direct fiber, VLAN or MPLS tunnels. • VLAN & MPLS tunnels not in original proposal • But see a clear need for these, for cost reasons • Testing shows packet loss/out-of-sequence on these segments!
NC configuration – User Plane [Figure: network diagram. Labels recoverable from the diagram: Orbitty compute nodes at NCSU, Compute-0-0 to Compute-0-4 (152.48.249.2 to 152.48.249.6), each on a 1G link to a 1G/10G Ethernet switch; 5Gbps VLAN between NCSU and MCNC; SN16000 at MCNC with GbE ports 1-8-33 to 1-8-38 and OC192 ports 1-6-1, 1-7-1, 1-6-17, 1-7-17; OC192 link to Atlanta; end host wukong (152.48.249.102) at cheetah-nc. Legend: direct fibers, VLAN connections.]
Atlanta configuration – User Plane [Figure: network diagram. Labels recoverable from the diagram: SN16000 Cheetah-atl in Atlanta with GbE ports 1-7-33 to 1-7-39, OC192 ports 1-6-1 and 1-6-17, and 10GbE port 1-7-1; OC192 link to NC; end hosts Zelda1 to Zelda5 (10.0.0.11 to 10.0.0.15) on 1G links; Juniper routers; 2x1GbE MPLS tunnels toward ORNL; 1GbE link to GaTech. Legend: direct fibers, MPLS tunnels.]
Things learned during node installation and maintenance • Rack installation • Physical dimension: 19” or 23” width, height, depth? • two-post or four-post rack? • Not just the SONET gateway. Need space for other equipment, such as PCs, Ethernet switches, console servers, PDU, etc. • Power supply • Need careful power calculations – allow for growth • DC power for switch: voltage, current, etc. • AC for other equipment: current, connector type, etc. • Remote power management • Need the capability to remotely (through the Internet) power cycle equipment. • Switched PDU (Power Distribution Unit) for AC power
Things learned during the node installation and maintenance • Remote management • Internet access • Console server • Connect the serial port on SN16000s to PCs to allow remote management when Ethernet management port on a SN16000 is down • Need better specification of remote manual support from collocation service providers • Network security • Protect the switch from Internet attacks • Provide the integrity and authentication of control-plane traffic • Our solution: • Juniper Netscreen-5XT firewall/VPN server (hardware) for SN16000s • Openswan software for Linux end hosts. • CHEETAH control-plane traffic is protected by IPsec tunnels • These tunnels allow for the use of private IP addresses behind the firewall device • Other considerations • Network measurement • Need Ethernet hub (old style) or high-end Ethernet switch
Things learned for CHEETAH end hosts • End hosts • CPU: frequency, single or dual, cache size • Bus speed • Memory: size, speed • Disk: volume, speed, SATA or SCSI, Raid • GbE Interface Card (NIC) • Optical or copper • Bus type: PCI, PCI-X • Connector: SC, LC • Protocol: 1000Base-T, SX, LX (need to match the protocol of Ethernet interfaces on CHEETAH nodes) • Operating system: Windows, Linux, Unix, etc. • Linux kernel version • Software • Security software: VPN client, ssh/ssl library • Development tools: gcc compiler • Network tools: Iperf
Control-plane design • More difficult than user-plane design • Must consider requirements of GMPLS protocols, security, robustness • DCC-inband or out-of-band signaling? • Must do out-of-band because of end hosts involved. • Security • Control-plane traffic must be protected • Private or public IP addresses • Part of the routing domain on the Internet or endpoints?
NC configuration – Control Plane [Figure: network diagram. Labels recoverable from the diagram: Orbitty compute nodes at NCSU, Compute-0-0 to Compute-0-4 (128.109.45.160 to 128.109.45.164), reached over the Internet; SN16000 at MCNC (Router ID = switch IP = 192.168.5.1) with GbE ports 1-8-33 to 1-8-38, OC192 ports 1-6-1, 1-7-1, 1-6-17, 1-7-17, and a control Ethernet port behind an NS-5 firewall (192.168.4.2); addresses 128.109.34.18 and 128.109.34.19; end host wukong (128.109.34.20) at cheetah-nc. TE links: OC-192 TE link to ORNL16K1, 5 x 1Gbps TE link to orbitty, 1Gbps TE link.]
Atlanta configuration – Control Plane [Figure: network diagram. Labels recoverable from the diagram: SN16000 Cheetah-atl in Atlanta (Router ID = switch IP = 192.168.3.1) with GbE ports 1-7-33 to 1-7-39, OC192 ports 1-6-1 and 1-6-17, 10GbE port 1-7-1, and a control Ethernet port behind an NS-5 firewall (192.168.2.2); end hosts Zelda1 to Zelda3 (130.207.252.131 to 130.207.252.133); addresses 130.207.252.136 and 130.207.252.138; Internet; at ORNL, an NS-5 firewall, address 198.124.42.3, and end hosts Zelda4 (10.1.1.4) and Zelda5 (10.1.1.5). TE links: OC-192 TE link to cheetah-nc, 3x1Gbps TE link, 1Gbps TE link.]
End-to-End GMPLS Signaling in CHEETAH Project Xiangfei Zhu xzhu@cs.virginia.edu 9/1/2005
Outline • Optical Network signaling overview • Sycamore SN16000 and Cisco 15454 • End host software for GMPLS signaling • External GMPLS signaling Engine for Cisco 15454 • Demo • Conclusion and future work
Optical Network Signaling • GMPLS: Work in progress at IETF • RFC2205 – RSVP for IP network • RFC3209 – RSVP-TE for MPLS • RFC3471 & RFC3473 – RSVP-TE for GMPLS • RFC3946 – SONET and SDH • Optical Internetworking Forum (OIF) • UNI, I-NNI, E-NNI • International Telecommunication Union (ITU) • G.8080 (G.ASON)
Vendor Support for GMPLS • Some vendors provide varying levels of GMPLS support in their products • E.g.: CIENA CoreDirector, Sycamore SN16000, etc. • Successful multi-vendor GMPLS interoperability demos at ISOCORE and Supercomputing 05. • The implementations of GMPLS signaling by different vendors are basically compatible
University GMPLS Code • KOM RSVP Engine – Technische Universität Darmstadt [7] • partial support of RFC 2205, 2210 & 3209 • Dragon RSVP-TE code – MAX/ISI [4] • partial support of RFC 3471, 3473 & 3946 • Integrate with OSPF-TE
Work at UVA • Implement GMPLS software for end hosts – End host RSVP-TE client • Integrate GMPLS signaling with • Admission control • OCS (Optical Connectivity Service) (Not fully done) • Integrate with 15454 control software • Interoperability test with Sycamore SN16000 • Add support for CAC and route computation to VLSR (work at CUNY)
SN16000: Optical Control Plane Features • GMPLS RFCs and Drafts • RFC 3471: GMPLS Signaling Functional Description • RFC 3473: GMPLS Signaling RSVP-TE Extensions • (Draft) Routing Extensions in Support of GMPLS • (Draft) OSPF Extensions in Support of GMPLS • (Draft) Generalized Multi-Protocol Label Switching Architecture • (Draft) GMPLS Extensions for SONET and SDH Control • (Draft) Framework for GMPLS-based Control of SDH/SONET Networks • In-fiber (in-band) and Out of fiber (out of band) Control Plane • Fiber-Switch Capable Support • enables communication w/FSC devices (in addition to TDM devices) • OIF UNI 1.0 (Slide from Sycamore)
Cisco 15454 • Cisco SONET Multiservice Provisioning Platform (MSPP) • Doesn’t support GMPLS signaling
End host context [3] [Figure: two PCs, each labeled with Routing decision, FTP, TCP/IP, RSVP-TE client, FRTP, NIC I, and NIC II, drawn between the Internet and the CHEETAH network.] • Routing decision: Decide whether to use a CHEETAH circuit or the Internet to transfer the file, based on the Internet congestion status and the file size • FRTP: Fixed-Rate Transport Protocol, designed for circuit-switched networks [6] • RSVP-TE client: Dynamic provisioning of the circuit
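The routing decision described on this slide can be illustrated by comparing estimated transfer times on the two paths. This is only a sketch: the slide does not give the actual decision rule, so the time-comparison model, the function names, and the use of a measured Internet throughput as the "congestion status" are assumptions. The 4.5-second default is the average end-to-end setup delay reported on the Performance slide.

```python
def estimated_transfer_time(file_size_bytes, rate_bps, setup_delay_s=0.0):
    """Time to move the file at a constant rate, plus any circuit setup delay."""
    return setup_delay_s + file_size_bytes * 8 / rate_bps


def choose_path(file_size_bytes, internet_rate_bps, circuit_rate_bps,
                circuit_setup_delay_s=4.5):
    """Return "circuit" if the dedicated circuit is expected to finish sooner.

    internet_rate_bps stands in for the Internet congestion status: a
    congested path is modeled simply as a lower achievable throughput.
    """
    t_internet = estimated_transfer_time(file_size_bytes, internet_rate_bps)
    t_circuit = estimated_transfer_time(
        file_size_bytes, circuit_rate_bps, circuit_setup_delay_s)
    return "circuit" if t_circuit < t_internet else "internet"


if __name__ == "__main__":
    # A small file is not worth the setup delay; a large one is.
    print(choose_path(1_000_000, 100e6, 1e9))
    print(choose_path(10_000_000_000, 100e6, 1e9))
```

Under this model, the setup delay dominates for small files, so the Internet path wins; for large files or a congested Internet path, the circuit wins.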
bwrequestor • Command for end users to request a circuit. • bwrequestor DESTINATION-DOMAIN-NAME BANDWIDTH
bwmgr Daemon • Read the configuration from a configuration file. • Initiate circuit setup. • Accept the circuit request • Check if destination is in CHEETAH network (OCS) • Bandwidth management (CAC) • Create RSVP session and send out PATH message • Update ARP/IP table for data-plane • Accept circuit setup requests • Register a default session with the RSVPD and listen for PATH messages • If it is a new session • Fork a new process, create a new session, update CAC table, and send back RESV message • Update ARP/IP table for data-plane
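The bandwidth-management (CAC) step in the list above can be pictured as a small table of reservations per TE link. The sketch below is a hypothetical illustration, not the bwmgr code: the class name, method names, and the idea of keying reservations by session ID are all assumptions.

```python
class AdmissionControl:
    """Toy connection admission control (CAC) table, one entry per session."""

    def __init__(self, link_capacity_mbps):
        # link name -> total bandwidth in Mbit/s
        self.capacity = dict(link_capacity_mbps)
        # session id -> (link name, reserved bandwidth in Mbit/s)
        self.sessions = {}

    def available(self, link):
        """Bandwidth on the link not yet reserved by any session."""
        used = sum(bw for l, bw in self.sessions.values() if l == link)
        return self.capacity[link] - used

    def admit(self, session_id, link, bandwidth_mbps):
        """Reserve bandwidth if the request fits; return True on success."""
        if session_id in self.sessions or link not in self.capacity:
            return False
        if bandwidth_mbps <= 0 or bandwidth_mbps > self.available(link):
            return False
        self.sessions[session_id] = (link, bandwidth_mbps)
        return True

    def release(self, session_id):
        """Free the reservation when the circuit is torn down."""
        return self.sessions.pop(session_id, None) is not None


if __name__ == "__main__":
    cac = AdmissionControl({"eth2": 1000})
    print(cac.admit("s1", "eth2", 600), cac.admit("s2", "eth2", 600))
```

In this sketch a request is rejected before any PATH message would be sent, which is where the slide places the CAC check on the initiating side.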
Bwmgr Configuration File • The configuration file includes: • The control-plane address of the node • The address of the edge switch the node is connected to • TE-link information: • Local data-plane interface • Link type (Ethernet/SONET) and bandwidth • Interface types (numbered or unnumbered) of the two interfaces (local and remote) • IP (numbered interface) / IFID (unnumbered) of each interface • A sample configuration file:
CTRL-PLANE-IP = 130.207.252.131
EDGE-ROUTER-IP = 192.168.2.2
# TE-Links
# TELink  data-plane interface  link type (0-Ethernet, 1-SONET)  bandwidth (unit: Mbit)  local interface type (0-unnumbered, 1-numbered)  local interface IP/ID  remote interface type  remote interface IP/ID
TELink eth2 0 1000 0 1 0 1
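To make the field order of the sample configuration concrete, here is a minimal parsing sketch. It assumes the file layout reconstructed from the slide (one `KEY = value` pair or one `TELink` record per line, `#` starting a comment); the `TELink` dataclass and `parse_bwmgr_config` function are hypothetical names, not part of bwmgr.

```python
from dataclasses import dataclass

LINK_TYPES = {"0": "Ethernet", "1": "SONET"}

SAMPLE = """\
CTRL-PLANE-IP = 130.207.252.131
EDGE-ROUTER-IP = 192.168.2.2
# TE-Links
TELink eth2 0 1000 0 1 0 1
"""


@dataclass
class TELink:
    interface: str         # local data-plane interface
    link_type: str         # "Ethernet" or "SONET"
    bandwidth_mbit: int
    local_numbered: bool   # True: local_id is an IP; False: an interface ID
    local_id: str
    remote_numbered: bool
    remote_id: str


def parse_bwmgr_config(text):
    """Return (settings dict, list of TELink) from configuration text."""
    settings, links = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
        elif line.startswith("TELink"):
            f = line.split()
            if len(f) != 8:
                raise ValueError(f"malformed TELink line: {line!r}")
            links.append(TELink(f[1], LINK_TYPES[f[2]], int(f[3]),
                                f[4] == "1", f[5], f[6] == "1", f[7]))
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return settings, links


if __name__ == "__main__":
    print(parse_bwmgr_config(SAMPLE))
```

Read this way, the sample describes a single 1000 Mbit Ethernet TE link on `eth2` whose local and remote interfaces are both unnumbered with interface ID 1.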
Bwmgr Library • Provides two interfaces: • BWRequest() – set up circuits • BWTeardown() – tear down circuits • Can be easily integrated with user applications
External GMPLS Engine for Equipment without GMPLS Capability • Dragon’s VLSR (Virtual Label Switching Router) [4] as an external GMPLS engine. • RSVP-TE message parsing and construction • Fabric programming module for some Ethernet switches through SNMP • Adapt VLSR for Cisco 15454 • Monfox TL1 Library • Provides an interface for an external program to provision circuits by issuing TL1 commands to the 15454 • Difficulty: Library is in Java while the Dragon code is in C++ • Figured out how to integrate Java code with C++ through CNI (Cygnus Native Interface) (by Lingling Cui) • Integrate Dragon’s RSVP-TE software with 15454 control software
External GMPLS Engine for Cisco 15454 [Figure: diagram showing PATH and RESV messages.]
Demo [Figure: network diagram. Labels recoverable from the diagram: in Atlanta, an SN16000 with GbE ports 1-7-33 to 1-7-39, OC192 ports 1-6-1 and 1-6-17, 10GbE port 1-7-1, and a control Ethernet port behind an NS-5 firewall (192.168.2.2); end hosts Zelda1 (control plane 130.207.252.131, data plane 10.0.0.11), Zelda2, and Zelda3 on 1G links; Juniper router; link to ORNL. In NC, an SN16000 with GbE ports 1-8-33 to 1-8-38, OC192 ports 1-6-1 and 1-6-17, and a control Ethernet port behind an NS-5 firewall (192.168.4.2); link to Orbitty; end host Wukong (control plane 128.109.34.20, data plane 152.48.249.102). Control traffic crosses the Internet.]
Performance • Average end-to-end setup delay is around 4.5 seconds • A detailed breakdown of the delay has not been measured
Performance • Performance of external GMPLS engine for MSPP[5] • Time for crossconnection setup: • STS-1: 17.833 ± 0.184 ms • STS-3: 18.000 ± 0.081 ms • Time for crossconnection delete: • STS-1: 16.400 ± 0.175 ms • STS-3: 16.300 ± 0.145 ms
Conclusion • It is feasible to extend dynamic circuits to end hosts by running RSVP-TE software on end hosts • It is feasible to add GMPLS signaling capability to devices without built-in GMPLS capability • The standards are mature and vendor implementation is good
Future Work • Development part • Alpha and Beta test • Hand out to scientists to use • Finish the development of VLSR at CUNY • Research part • Bandwidth scheduling of circuit-switched network • Immediate call vs. scheduled call • Distributed bandwidth scheduling
References [1] http://cheetah.cs.virginia.edu/ [2] CHEETAH overview, John H. Moore, Xuan Zheng, Malathi Veeraraghavan, http://cheetah.cs.virginia.edu/networks/Cheetah%20Overview.jpg [3] CHEETAH network, Malathi Veeraraghavan, Nagi Rao, July 7, 2004 [4] http://dragon.east.isi.edu/ [5] External Switch Control Software, Lingling Cui, CHEETAH project year 1 demo, September 01, 2004 [6] X. Zheng, A. P. Mudambi, and M. Veeraraghavan, FRTP: Fixed Rate Transport Protocol -- A modified version of SABUL for end-to-end circuits, Pathnets2004 on Broadnet2004, Sept. 2004, San Jose, CA [7] KOM RSVP Engine, http://www.kom.e-technik.tu-darmstadt.de/rsvp/ [8] Monfox DynamicTL1 SDK, http://www.monfox.com/dtl1_sdk.html
Acronyms • CHEETAH – Circuit-switched High-speed End-to-End Transport ArcHitecture • RSVP – Resource Reservation Protocol • RSVP-TE – RSVP with Traffic Engineering extensions • GMPLS – Generalized Multi-Protocol Label Switching • SONET – Synchronous Optical NETwork • SDH – Synchronous Digital Hierarchy • IETF – Internet Engineering Task Force • RFC – Request for Comments • UNI – User-Network Interface • I-NNI – Internal Network-Network Interface • E-NNI – External Network-Network Interface
Outline • Transport protocol functionality • Requirements for dedicated circuits • User-space implementation • Kernel-space implementation
Transport protocols • End-to-end functionality falls in transport layer’s domain • Error control • Congestion control • Flow control
Error control • End-to-end reliability; no missing data, no reordering, no duplicates • Data packets get lost (thrown away due to lack of buffer space) • Bit errors cause packet corruption in transit • Error control: Infer that something is wrong and take steps to correct it. • Retransmit missing packets, buffer out of order data until the missing data arrives, suppress duplicates
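The receiver-side part of error control listed above (buffer out-of-order data, suppress duplicates, identify what is missing) can be sketched in a few lines. This is a generic illustration with made-up class and method names; it is not taken from SABUL or FRTP.

```python
class ReorderingReceiver:
    """Delivers payloads in sequence order, exactly once."""

    def __init__(self):
        self.next_seq = 0      # next sequence number owed to the application
        self.buffer = {}       # out-of-order packets waiting for a gap to fill
        self.delivered = []    # payloads handed to the application, in order

    def on_packet(self, seq, payload):
        # Suppress duplicates: already delivered, or already buffered.
        if seq < self.next_seq or seq in self.buffer:
            return
        self.buffer[seq] = payload
        # Deliver as far as the contiguous run of buffered packets allows.
        while self.next_seq in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

    def missing(self):
        """Sequence numbers the sender should retransmit (gaps below the highest seen)."""
        if not self.buffer:
            return []
        return [s for s in range(self.next_seq, max(self.buffer))
                if s not in self.buffer]


if __name__ == "__main__":
    rx = ReorderingReceiver()
    for seq, data in [(0, "a"), (2, "c"), (2, "c"), (3, "d"), (1, "b")]:
        rx.on_packet(seq, data)
    print(rx.delivered)
```

The `missing()` list is what a receiver would report back so that the sender can retransmit; how and when that report is sent is left out of the sketch.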
Congestion control • Avoid or reduce the damage due to buffers in the network overflowing • Inferring that the network is congested: • Explicit signals from the node where the buffer overflows • Guess from indirect signals such as packet loss and RTT variation • To reduce congestion, slow the rate of putting packets into the network
Flow control • Avoid over-running the receiver • If receiver is the bottleneck maintain a sending rate that matches the receiver rate • Receiver signals its willingness to accept more data • Sender sends data if receiver is ready
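One common way to realize "receiver signals its willingness, sender sends if receiver is ready" is a credit (window) scheme. The sketch below assumes that scheme for illustration; the slide does not say which mechanism is used, and the names are hypothetical.

```python
class WindowedSender:
    """Sends only as many chunks as the receiver has advertised room for."""

    def __init__(self, chunks):
        self.pending = list(chunks)  # data not yet sent
        self.credit = 0              # chunks the receiver is willing to accept

    def on_window_update(self, credit):
        """Receiver advertises how many more chunks it can buffer."""
        self.credit = credit

    def send_ready(self):
        """Return the chunks that may be sent now and consume the credit."""
        n = min(self.credit, len(self.pending))
        sent, self.pending = self.pending[:n], self.pending[n:]
        self.credit -= n
        return sent


if __name__ == "__main__":
    tx = WindowedSender(["c0", "c1", "c2", "c3", "c4"])
    print(tx.send_ready())   # nothing yet: no credit advertised
    tx.on_window_update(2)
    print(tx.send_ready())
```

Because the sender starts with zero credit, a slow receiver is never over-run: nothing leaves until the receiver says it has room.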
Transport protocol for dedicated circuits • Error control: Errors can still occur, so error control is needed • Congestion control: Bandwidth is reserved in the network, so there is no congestion if the sender always sends below the reserved circuit rate • Explicit congestion signals will never arrive, so they cause no problem; inferring congestion from packet loss is a problem, because loss due to other causes is wrongly assumed to signify congestion • Flow control: The receiver can still be over-run, so flow control is required
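The congestion-control point above amounts to pacing: if packets are spaced so that the sending rate never exceeds the reserved rate, the circuit cannot be congested by this sender. The sketch below computes that spacing. It is a simplified model (protocol header overhead is ignored, and the function names are assumptions), not the FRTP implementation.

```python
def inter_packet_gap_s(circuit_rate_bps, packet_size_bytes):
    """Minimum spacing between packet starts so the rate stays at the circuit rate."""
    return packet_size_bytes * 8 / circuit_rate_bps


def send_schedule(num_packets, circuit_rate_bps, packet_size_bytes):
    """Departure time of each packet, in seconds from the start of the transfer."""
    gap = inter_packet_gap_s(circuit_rate_bps, packet_size_bytes)
    return [i * gap for i in range(num_packets)]


if __name__ == "__main__":
    # 1500-byte packets on a 1 Gbit/s circuit: one packet every 12 microseconds.
    print(inter_packet_gap_s(1e9, 1500))
    print(send_schedule(3, 1e9, 1500))
```

A sender that wants to stay strictly below the reserved rate, as the slide suggests, would use a slightly larger gap than the one computed here.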
User-space implementation • UDP: Minimal transport protocol. No error/congestion/flow control • UDP: Interface to the IP layer. Low overhead • Perfect for adding extra functionality as needed • Many variations of UDP-based protocols: SABUL, Hurricane, Tsunami, UDT …
User-space implementation • We took SABUL and modified it for our needs • Simple Available Bandwidth Utilization Library • Uses UDP for the data transfer • Adds reliability, congestion/flow control by using a control channel that runs over TCP