This presentation discusses the networking requirements and challenges faced by CERN in the era of the Large Hadron Collider (LHC), including updates on CERN connectivity, academic and research networking in Europe, the GEANT project, interconnections with global networks, and the DataTAG project. It also addresses the current state of the internet, growth, technologies, trends, and challenges such as QoS, gigabit/second file transfer, and security architecture.
Wide Area Networking requirements and challenges at CERN for the LHC era
Presented at the NEC’01 conference, 17th September, Varna, Bulgaria
Olivier H. Martin, CERN - IT Division, September 2001
Olivier.Martin@cern.ch
NEC'01 Varna (Bulgaria)
Presentation Outline
• CERN connectivity update
• Academic & Research Networking in Europe
  • TEN-155
  • GEANT project update
• Interconnections with Academic & Research networks worldwide
  • STAR TAP
  • STAR LIGHT
  • DataTAG project
• Internet today
  • What it is
  • Growth
  • Technologies
  • Trends
• Challenges ahead:
  • QoS
  • Gigabit/second file transfer
  • Security architecture
Main Internet connections at CERN
[Diagram. Labels: Swiss National Research Network (SWITCH); IN2P3; Mission Oriented & World Health Org. (WHO); TEN-155 (39/155Mbps) and GEANT (1.25/2500Mbps) towards Europe; USA; CIXP (commercial); general purpose Internet connections (Europe/USA/World). Link speeds shown: 2.5Gbps, 1Gbps, 155Mbps, 45Mbps.]
CERN’s Distributed Internet Exchange Point (CIXP)
• Telecom Operators & dark fibre providers: Cablecom, COLT, diAx, France Telecom, Global Crossing, GTS/EBONE, KPNQwest, LTT(*), Deutsche Telekom/Multilink, MCI/Worldcom, SIG, Sunrise, Swisscom (Switzerland), Swisscom (France), Thermelec.
• Internet Service Providers include: Infonet, AT&T Global Network Services (formerly IBM), Cablecom, C&W, Carrier1, Colt, DFI, Deckpoint, Deutsche Telekom, diAx (dplanet), Easynet, Ebone/GTS, Eunet/KPNQwest, France Telecom OpenTransit, Global-One, Globix, HP, INSnet/Wisper InterNeXt, ISDnet/Ipergy, IS Internet Services (ISION), LTT(*), Madge.web, Net Work Communications (NWC), PSI Networks (IProlink), MCI/Worldcom, Petrel, Renater, Sita/Equant(*), Sunrise, Swisscom IP-Plus, SWITCH, TEN-155, Urbanet, VTX, Uunet(*).
[Diagram. Labels: ISPs and telecom operators attached to the CIXP; CERN firewall; CERN internal network.]
CERN co-location status (Chicago)
[Diagram. Labels: CERN (Geneva); KPNQwest; Qwest PoP (Chicago); Qwest-IP (21 Mbps / T3); CERN-USA; ESnet; CERNH9; LS1010; STAR TAP; links marked STM-1 and T3.]
Academic & Research Networking in Europe
• Focus on Research & Education; also includes access to the commodity Internet.
• The TERENA compendium of NRENs is an excellent source of information (e.g. funding, focus, budget, staff).
• Hierarchical structure with a pan-European backbone, co-funded by the European Union (EU), interconnecting the various national networks.
• TEN-155 (Trans-European 155 Mbps)
  • 155Mbps ATM core initially, recently upgraded to 622Mbps Packet over SONET (POS)
  • still strong ATM focus
  • Managed Bandwidth Service (MBS) to support special needs & European Union projects
  • Will terminate at the end of November 2001
  • Map available at: http://www.dante.net/ten-155/ten155net.gif
The GEANT Project
• Four-year project, started on December 1, 2000
  • call for tender issued June 2000, closed September 2000, adjudication made in June 2001
  • first parts to be delivered in September 2001
• Provisional cost estimate: 233 MEuro
  • EC contribution 87 MEuro (37%), of which 62% will be spent during year 1 in order to keep the NRENs' contributions stable (i.e. 36 MEuro/year).
• Several 2.5/10Gbps rings (mostly over unprotected lambdas), three main telecom providers (Colt/DT/Telia):
  • West ring (2*half rings) provided by COLT (UK-SE-DE-IT-CH-FR)
  • East legs and miscellaneous circuits provided by Deutsche Telekom (e.g. CH-AT, CZ-DE)
The GEANT Project (cont.)
• Access: mostly via 2.5 Gbps circuits
• Routers: Juniper
• CERN
  • CERN & SWITCH will share a 2.5 Gbps access
  • CERN should purchase 50% of the total capacity (i.e. 1.25 Gbps).
  • It is also expected that the access bandwidth to GEANT will double every 12-18 months.
• The Swiss PoP of GEANT will be located at the Geneva airport (Ldcom@Halle de Fret)
The GEANT Project (cont.)
• Connection to other World Regions
  • in principle via core nodes only; together, they will form a European Distributed Access (EDA) “point” conceptually similar to the STAR TAP.
• Projected Services & Applications
  • standard (i.e. best effort) IP service
  • premium IP service (diffserv’s Expedited Forwarding (EF))
  • guaranteed capacity service (GCS), using diffserv’s Assured Forwarding (AF)
  • Virtual Private Networking (VPN), layer 2 (?) & layer 3
  • Native Multicast
• Various Grid projects (e.g. DataGrid) expected to be leading applications.
• Use of MPLS anticipated for traffic engineering, VPN (e.g. DataGrid, IPv6).
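The premium and guaranteed capacity services listed above rely on diffserv code points carried in the IP header. The following is a minimal Python sketch of how an application could request such treatment, assuming the standard code point values (EF = 46, AFxy = 8x + 2y) and a platform that exposes `IP_TOS` on sockets. The helper names are illustrative only and are not taken from any GEANT specification.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point (binary 101110)


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP for Assured Forwarding class 1-4, drop precedence 1-3 (AFxy)."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF class must be 1-4 and drop precedence 1-3")
    return 8 * af_class + 2 * drop_precedence


def dscp_to_tos(dscp: int) -> int:
    """The 6-bit DSCP occupies the upper six bits of the former ToS byte."""
    return (dscp & 0x3F) << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Hypothetical helper: UDP socket whose outgoing packets carry `dscp`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Marking a packet is only a request: whether EF or AF traffic actually receives premium treatment depends on how each network along the path is configured.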
STAR TAP
• Science, Technology And Research Transit Access Point
• International Connection Point for Research and Education Networks at the Ameritech NAP in Chicago
• Project goal: to facilitate the long-term interconnection and interoperability of advanced international networking in support of applications, performance measuring, and technology evaluations.
• Hosts the 6TAP (IPv6 Meet Point)
• http://www.startap.net/
STAR TAP (cont)
• One of three Internet eXchange Points provided by AADS (Ameritech Advanced Data Services) out of a huge ATM switch, namely:
  • Chicago NAP
  • MREN (Metropolitan Research and Education Network), the local Internet2 GigaPoP
  • STAR TAP
• A by-product is a full mesh of ATM VCs with ALL the connected ISPs, thus making it easy to establish peerings and/or to buy commercial Internet services (e.g. NAP.NET).
• No transit issues because of its non-distributed nature.
StarLight: The Optical STAR TAP
[Diagram. Labels: AADS ATM; STAR TAP; Star Light; UC; UIC; ANL; UIUC; Purdue; NU Evanston; NU Chicago; iCAIR; IUPUI; IU Bloomington; SURFnet; BN; CERN; CA*net4; Bell Nexxia (Chicago); I-WIRE & Optical MREN; links marked OC-12 and GigE. This diagram subject to change.]
The STAR LIGHT
• Next generation STAR TAP with the following main distinguishing features:
  • Neutral location (Northwestern University)
  • 1/10 Gigabit Ethernet based
  • Multiple local loop providers
  • Optical switches for advanced experiments
  • GMPLS, OBGP
• The STAR LIGHT will provide 2*622 Mbps ATM connections to the STAR TAP
• Started in July 2001
• Also hosting other advanced networking projects in Chicago & State of Illinois
StarLight Connections
• STAR TAP (AADS NAP) is connected via two OC-12c ATM circuits, now operational
• The Netherlands (SURFnet) is bringing two OC-12c POS from Amsterdam to StarLight on September 1, 2001 and a 2.5Gbps lambda to StarLight on September 15, 2001
• Abilene will soon connect via GigE
• Canada (CA*net3/4) is connected via GigE, soon 10GigE
• I-WIRE, a State-of-Illinois-funded dark-fiber multi-10GigE DWDM effort involving Illinois research institutions, is being built. 36 strands to the Qwest Chicago PoP are in.
• NSF Distributed Terascale Facility (DTF) 4x10GigE network being engineered by PACI and Qwest.
• NORDUnet will be using StarLight’s OC-12 ATM connection
• CERN should come in March 2002 with OC-12 from Geneva. A second 2.5 Gbps research circuit is also expected to come during the second half of 2002 (EU DataTAG project).
StarLight Infrastructure
• …Soon, Star Light will be an optical switching facility for wavelengths
Evolving StarLight Optical Network Connections
[Map. Labels: Asia-Pacific; SURFnet, CERN; CA*net4; Vancouver; Seattle; Portland; San Francisco; U Wisconsin; Chicago*; NYC; PSC; IU; NCSA; DTF 40Gb; Caltech; SDSC; Atlanta; AMPATH.]
*ANL, UIC, NU, UC, IIT, MREN
DataTAG project
• Main aims:
  • Ensure maximum interoperability between USA and EU Grid projects
  • Transatlantic test bed for advanced Grid applied network research
  • 2.5 Gbps circuit between CERN and StarLight (Chicago)
• Partners:
  • PPARC (UK)
  • University of Amsterdam (NL)
  • INFN (IT)
  • CERN (Coordinating Partner)
• Negotiation with EU is well advanced
• Expected project start: 1/1/2002
• Duration: 2 years.
DataTAG project
[Diagram. Labels: UK SuperJANET4; NL SURFnet; IT GARR-B; GEANT; Geneva; New York; Abilene; ESNET; MREN; STAR-LIGHT; STAR-TAP.]
CERN–USA access requirements (2002)
[Diagram. Labels: CERN; CIXP; CERN PoP (USA); StarLight co-location facility (NWU); STAR TAP; MREN; Abilene; vBNS; ESnet; ANL; FNAL; Japan; Canada; Commodity Internet; Production (622 Mbps); DataTAG (2.5 Gbps); other link labels: 2*OC-12, T3, E3.]
Internet Backbone Speeds
[Chart. Axis: Mbps. Labels: T1 lines; T3 lines; ATM-VCs; OC3c; IP/OC12c.]
High Speed IP Network Transport
[Diagram of protocol stacks:]
• IP over ATM: IP / ATM / SONET/SDH / Optical
• IP over SONET/SDH: IP / SONET/SDH / Optical
• IP over Optical: IP / Optical
• B-ISDN (IP, Signalling, ATM, SONET/SDH, Optical): multiplexing, protection and management at every layer
• Trend: higher speed, lower cost, complexity and overhead
Transmission Systems of The Recent Past
[Diagram: low-rate data, electronic multiplexer (EMUX), transmitter (XMTR, DFB laser), opto-electronic regenerative repeaters every 30-50 km, regenerative receiver (RCVR), electronic demultiplexer (EDMUX), low-rate data.]
• Single channel operation
• Opto-electronic regenerative repeaters: one per 50 km per fiber
• 30-50 km repeater spacing
• Capacity upgrades: increased speed
Still found in legacy network systems
Today’s Transmission System
[Diagram: transmitters (XMTR) on wavelengths λ1, λ2, … λn; optical multiplexer (OMUX); optical amplifiers every 80-140 km; regenerative repeater; optical demultiplexer (ODMUX); receivers (RCVR) on λ1, λ2, … λn.]
• Multi-channel WDM operation
• One amplifier supports many channels
• 80-140 km amplifier (repeater) spacing; regeneration required every 200-300 km
• Capacity upgrades: adding wavelengths (channels) & increasing speeds
However, regeneration is still very expensive and fixes the optical line rate
Next Generation…The Now Generation
[Diagram: transmitters (XMTR) on wavelengths λ1, λ2, … λn; optical multiplexer (OMUX); 80-140 km amplifier spans over a 1600 km path; optical demultiplexer (ODMUX); receivers (RCVR) on λ1, λ2, … λn.]
• Multi-channel WDM operation
• One amplifier supports many channels
• 80-140 km amplifier (repeater) spacing; regeneration required only every 1600 km
• Capacity upgrades: adding wavelengths (channels) & increasing speeds
Over 1000 km optically transparent research network tested on the Qwest network
IAB Workshop
• The Internet Architecture Board (IAB) held a workshop on the state of the Internet Network Layer in July 1999; a number of problem areas and possible solutions were identified:
  • Network/Port Address Translators (NAT/PAT),
  • Application Level Gateways (ALG) and their impact on existing and future Internet applications.
  • End to end transport & security requirements (IPSEC)
  • Transparency (e.g. H.323)
  • Realm Specific IP (RSIP).
  • Mobility (completely different set of protocol requirements)
  • IPv6
  • Routing (growth of routing table, route convergence)
  • DNS (renumbering)
Loss of End to end transparency
• Loss of end to end transparency due to:
  • proliferation of Firewalls, NATs, PATs
  • Web caches, Content Engines, Content Distribution Networks (CDN),
  • Application Level gateways, Proxies, etc.
• Cons:
  • violation of end to end transport principle,
  • possible alteration of the data,
  • only partially fits the client-server model (i.e. server must be outside)
• Pros:
  • better performance, service differentiation, SLA,
  • cheaper to deliver services to large number of recipients, etc.
Client/Server Architecture is breaking down
• For web-based transactions: sufficient to allow clients in private address spaces to access servers in the global address space
• For telephones and instant messaging (I-Msg): you need to use an address when you call them, and they are therefore servers in a private realm
[Diagram. Labels: Private Address Realm; Global Addressing Realm; Private Address Realm.]
Several major issues
• Quality of Service (QoS)
• High performance (i.e. wire speed) file transfer « end to end »
• Will CDN technology help?
• Is the evolution towards edge services likely to affect global GRID services?
• Impact of security
• Internet Fragmentation, one vs several Internets
  • e.g. GPRS top level domain
• Transition to IPv6 and long term coexistence between IPv4 & IPv6
Quality of Service (QoS)
• Two approaches proposed by the IETF:
  • integrated services (intserv)
    • intserv is an end-to-end architecture based on RSVP that has poor scaling properties.
  • differentiated services (diffserv)
    • diffserv is a newer and simpler proposal that has a much better chance of being deployed, at least in some real Internet Service Provider environments.
    • even though diffserv has good scaling properties and takes the right approach, namely that most of the complexity must be pushed to the edges of the network, there are considerable problems with large diffserv deployments.
• ATM is far from dead, but has serious scaling difficulties (e.g. TEN-155, Qwest/ATM).
• MPLS is extremely promising; today it looks like it is where the future lies (including ATM AAL5 emulation!)
Quality of Service (QoS)
• QoS is an increasing nightmare as the understanding of the implications is growing:
  • Delivering QoS at the edge and only at the edge is not sufficient to guarantee low jitter, delay bound communications,
  • Therefore complex functionality must also be introduced in Internet core routers,
    • is it compatible with ASICs,
    • is it worthwhile?
• Is MPLS an adequate and scalable answer?
• Is circuit oriented technology (e.g. dynamic wavelength) appropriate?
  • If so, for which scenarios?
Internet Backbone Technologies (MPLS/1)
• MPLS (Multi-Protocol Label Switching) is an emerging IETF standard that is gaining impressive acceptance, especially with the traditional Telecom Operators and the large Internet Tier 1 providers.
• Recursive encapsulation mechanism that can be mapped over any layer 2 technology (e.g. ATM, but also POS).
• Departure from destination based routing that has been plaguing the Internet since the beginning.
• Fast packet switching performed on source, destination labels, as well as ToS. Like ATM VP/VC, MPLS labels only have local significance.
• Better integration of layer 2 and 3 than in an IP over ATM network through the use of RSVP or LDP (Label Distribution Protocol).
• Ideal for traffic engineering, QoS routing, VPN, IPv6 even.
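The recursive encapsulation and the local significance of labels can be illustrated with the standard 32-bit MPLS label stack entry (20-bit label, 3-bit experimental/class field, 1-bit bottom-of-stack flag, 8-bit TTL) and a per-router swap table. This is a sketch only: the function names, the table contents and the next-hop name are invented for illustration.

```python
import struct


def pack_entry(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    """Encode one label stack entry as 4 bytes in network byte order."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def unpack_entry(data: bytes):
    """Decode the first entry of a stack into (label, exp, bottom, ttl)."""
    (word,) = struct.unpack("!I", data[:4])
    return word >> 12, (word >> 9) & 0x7, bool((word >> 8) & 1), word & 0xFF


def swap(stack: bytes, table: dict):
    """Swap the top label using this router's own table.

    `table` maps an incoming label to (outgoing label, next hop); the
    labels mean nothing outside this router, like ATM VP/VC values.
    Inner entries of the stack are carried through untouched.
    """
    label, exp, bottom, ttl = unpack_entry(stack)
    if ttl <= 1:
        raise ValueError("TTL expired")
    out_label, next_hop = table[label]
    return pack_entry(out_label, exp, bottom, ttl - 1) + stack[4:], next_hop


# Two-level stack: an outer transport label over an inner (e.g. VPN) label.
stack = pack_entry(100, 0, False, 64) + pack_entry(200, 0, True, 64)
new_stack, hop = swap(stack, {100: (300, "router-b")})
```

Only the outer entry changes at each hop, which is what lets one label-switched path carry other paths or VPN traffic inside it.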
Internet Backbone Technologies (MPLS/2)
• MPLS provides 2 levels of VPNs:
  • Layer 3 (i.e. conventional VPNs)
  • Layer 2 (i.e. encapsulation of various layer 2 frame formats), e.g.
    • Ethernet
    • ATM
    • PPP
• MPLS can also be used for circuit and/or wavelength channel restoration.
  • MPλS (MP”Lambda”S), GMPLS (Generalized MPLS)
Gigabit/second networking
• The start of a new era:
  • Very rapid progress towards 10Gbps networking in both the Local (LAN) and Wide area (WAN) networking environments is being made.
  • 40Gbps is in sight on WANs, but what comes after?
  • The success of the LHC computing Grid critically depends on the availability of Gbps links between CERN and LHC regional centers.
• What does it mean?
  • In theory:
    • 1GB file transferred in 11 seconds over a 1Gbps circuit (*)
    • 1TB file transfer would still require 3 hours
    • and 1PB file transfer would require 4 months
  • In practice:
    • major transmission protocol issues will need to be addressed
(*) according to the 75% empirical rule
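The transfer times quoted above follow from dividing the file size by 75% of the line rate. A small Python sketch of that arithmetic, assuming decimal units (1 GB = 10^9 bytes) and treating the 75% rule as a flat efficiency factor; the function name is illustrative:

```python
def transfer_time_s(size_bytes: float, link_bps: float,
                    efficiency: float = 0.75) -> float:
    """Seconds needed to move size_bytes over a link of link_bps,
    assuming only `efficiency` of the line rate is usable."""
    return size_bytes * 8 / (link_bps * efficiency)


GB, TB, PB = 1e9, 1e12, 1e15  # decimal units (assumption)
LINK_BPS = 1e9                # 1 Gbps circuit

for name, size in (("1 GB", GB), ("1 TB", TB), ("1 PB", PB)):
    secs = transfer_time_s(size, LINK_BPS)
    print(f"{name}: {secs:,.0f} s = {secs / 3600:,.1f} h "
          f"= {secs / 86400:,.1f} days")
```

Under these assumptions the three cases come to roughly 10.7 seconds, 3.0 hours and 123 days, in line with the 11 seconds, 3 hours and 4 months on the slide.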
Very high speed file transfer (1)
• High performance switched LAN assumed:
  • requires time & money.
• High performance WAN also assumed:
  • also requires money but is becoming possible.
  • very careful engineering mandatory.
• Will remain very problematic especially over high bandwidth*delay paths:
  • Might force the use of Jumbo Frames because of interactions between TCP/IP and link error rates.
  • Could possibly conflict with strong security requirements (i.e. throughput, handling of TCP/IP options (e.g. window scaling))
Very high speed file transfer (2)
• Following formula proposed by Matt Mathis/PSC (“The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm”) to approximate the maximum TCP throughput under periodic packet loss: (MSS/RTT)*(1/sqrt(p))
  • where MSS is the maximum segment size (1460 bytes in practice), RTT is the round-trip time, and “p” is the packet loss rate.
• Are TCP's "congestion avoidance" algorithms compatible with high speed, long distance networks?
  • The "cut transmit rate in half on single packet loss and then increase the rate additively (1 MSS per RTT)" algorithm may simply not work.
  • New TCP/IP adaptations may be needed in order to better cope with “lfn”, e.g. TCP Vegas
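To get a feel for the formula, it can be evaluated directly. The Python sketch below uses the form given on the slide, (MSS/RTT)*(1/sqrt(p)), without any additional constant factor; the function name and the sample RTT of 170 ms and loss rates are illustrative choices, not measurements.

```python
from math import sqrt


def mathis_throughput_bps(mss_bytes: float, rtt_s: float,
                          loss_rate: float) -> float:
    """Approximate upper bound on TCP throughput in bits per second:
    (MSS / RTT) * (1 / sqrt(p))."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) / sqrt(loss_rate)


MSS = 1460   # bytes
RTT = 0.170  # seconds, illustrative transatlantic value

for p in (1e-2, 1e-4, 1e-6):
    mbps = mathis_throughput_bps(MSS, RTT, p) / 1e6
    print(f"loss rate {p:.0e}: about {mbps:.2f} Mbps")
```

Because throughput scales with 1/sqrt(p), each hundredfold reduction in loss rate buys only a tenfold increase in achievable throughput.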
Very high speed file transfer (3)
• The Mathis formula shows the extreme variability of achievable TCP throughputs in the presence of:
  • even small packet loss rates (i.e. less than 1%),
  • small packets vs large packets (e.g. Jumbo frames),
  • delay (RTT): paths with large bandwidth*delay products, also called long fat networks (lfn), hence the need for very large windows:
    • 3.3MB over 155Mbps link to Caltech and 170ms RTT.
    • and 53MB over 2.5Gbps to Caltech!
• Consider a 10Gbps link with a RTT of 100ms and a TCP connection operating at 10Gbps:
  • the effect of a packet drop (due to link error) will drop the rate to 5Gbps. It will take 4 *MINUTES* for TCP to ramp back up to 10Gbps.
  • In order to stay in the regime of the TCP equation, 10 Gbit/s for a single stream of 1460 byte segments, a packet loss rate of about 1E-10 is required
    • i.e. you should lose packets about once every five hours.
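The window sizes and the loss rate quoted above can be reproduced with two short calculations: the bandwidth*delay product, and the slide's form of the Mathis formula solved for p. A Python sketch, assuming decimal megabytes; the function names are illustrative.

```python
def window_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth*delay product: bytes that must be in flight to fill the path."""
    return link_bps * rtt_s / 8


def required_loss_rate(target_bps: float, mss_bytes: float,
                       rtt_s: float) -> float:
    """Solve target = (MSS/RTT) / sqrt(p) for the packet loss rate p."""
    return (mss_bytes * 8 / (rtt_s * target_bps)) ** 2


# Windows needed on the path to Caltech (170 ms RTT)
print(f"155 Mbps: {window_bytes(155e6, 0.170) / 1e6:.1f} MB")
print(f"2.5 Gbps: {window_bytes(2.5e9, 0.170) / 1e6:.1f} MB")

# Loss rate a single 10 Gbps stream can tolerate (1460-byte segments, 100 ms RTT)
print(f"required p: {required_loss_rate(10e9, 1460, 0.100):.2e}")
```

Under these assumptions the windows come to about 3.3 MB and 53 MB, and the tolerable loss rate is on the order of 1E-10.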
Acceptable link error rates
Very high speed file transfer (tentative conclusions)
• TCP/IP fairness only exists between similar flows, i.e.
  • similar duration,
  • similar RTTs.
• TCP/IP congestion avoidance algorithms need to be revisited (e.g. Vegas rather than Reno/NewReno).
• Current ways of circumventing the problem, e.g.
  • Multi-stream & parallel socket
  • just bandages or the practical solution to the problem?
• Web100, a 3MUSD NSF project, might help enormously!
  • better TCP/IP instrumentation (MIB)
  • self-tuning
  • tools for measuring performance
  • improved FTP implementation
• Non-TCP/IP based transport solution, use of Forward Error Correction (FEC), Explicit Congestion Notification (ECN) rather than active queue management techniques (RED/WRED)?
CERN’s new firewall: technology and topology
[Diagram. Labels: Cabletron SSR (two instances); Cisco PIX; Cisco RSP7000; Security monitor; Dxmon; FE and FDDI + bridge; links marked Gbit Ethernet, Fast Ethernet and 100/1000 Ethernet.]