The EU DataTAG Project
Presented at the GGF3 conference, 8 October 2001, Frascati, Italy
Olivier H. Martin, CERN IT Division
The EU DataTAG project
• Two main areas of focus:
  • Grid applied network research
  • Interoperability between Grids
• 2.5 Gbps transatlantic lambda between CERN (Geneva) and StarLight (Chicago)
  • Dedicated to research (no production traffic)
• Expected outcomes:
  • Hide the complexity of Wide Area Networking
  • Better interoperability between Grid projects in Europe and North America
    • DataGrid, and possibly other EU-funded Grid projects
    • PPDG, GriPhyN, DTF, iVDGL (USA)
The EU DataTAG project (cont.)
• European partners: INFN (IT), PPARC (UK), University of Amsterdam (NL) and CERN, as project coordinator
• Significant contributions to the DataTAG workplan have been made by Jason Leigh (EVL, University of Illinois), Joel Mambretti (Northwestern University) and Brian Tierney (LBNL)
• Strong collaborations already in place with ANL, Caltech, FNAL, SLAC and the University of Michigan, as well as Internet2 and ESnet
• Budget: 3.9 MEUR
• Expected starting date: 1 December 2001
• NSF support through the existing collaborative agreement with CERN (Eurolink award)
DataTAG project
[Network map: the DataTAG circuit links CERN to STAR-LIGHT/STAR-TAP in Chicago; European national research networks (UK SuperJANET4, NL SURFnet, IT GARR-B) connect via GEANT, with Abilene, ESnet and MREN on the North American side and a New York landing point]
DataTAG planned set-up (second half of 2002)
[Diagram: DataTAG test equipment at CERN (CIXP) and at the CERN PoP in Chicago (STARLIGHT), linked by the 2.5 Gb circuit; further DataTAG test equipment at UvA, INFN, PPARC, ...; interconnections to ESnet, GEANT and Abilene; Grid projects on each side: DataGrid (EU) and PPDG, iVDGL, GriPhyN, DTF (USA)]
DataTAG Workplan
• WP1: Provisioning & Operations (CERN)
  • Will be done in cooperation with DANTE
  • Two major issues:
    • Procurement
    • Routing: how can the DataTAG partners have transparent access to the DataTAG circuit across GEANT and their national networks?
• WP5: Information dissemination and exploitation (CERN)
• WP6: Project management (CERN)
DataTAG Workplan (cont.)
• WP2: High Performance Networking (PPARC)
  • High-performance transport
    • TCP/IP performance over large bandwidth*delay networks
    • Alternative transport solutions
  • End-to-end inter-domain QoS
  • Advance network resource reservation
DataTAG Workplan (cont.)
• WP3: Bulk Data Transfer & Application performance monitoring (UvA)
  • Performance validation
    • End-to-end user performance
    • Validation, monitoring, optimization
  • Application performance
    • NetLogger
DataTAG Workplan (cont.)
• WP4: Interoperability between Grid Domains (INFN)
  • Grid resource discovery
  • Access policies, authorization & security
    • Identify major problems
    • Develop inter-Grid mechanisms able to interoperate with domain-specific rules
  • Interworking between domain-specific Grid services
  • Test applications
    • Interoperability, performance & scalability issues
DataTAG planning details
• Lambda availability is expected in the second half of 2002
• Initially, test systems will either be at CERN or connect via GEANT
  • GEANT is expected to provide VPNs (or equivalent) for DataGrid and/or access to the GEANT PoPs
  • Later, it is hoped that GEANT will provide dedicated lambdas for DataGrid
• Initially a 2.5 Gbps POS link
  • WDM later, depending on equipment availability
The STAR LIGHT
• Next-generation STAR TAP with the following main distinguishing features:
  • Neutral location (Northwestern University)
  • 1/10 Gigabit Ethernet based
  • Multiple local-loop providers
  • Optical switches for advanced experiments
• The STAR LIGHT will provide a 2*622 Mbps ATM connection to the STAR TAP
• Started in July 2001
• Also hosting other advanced networking projects in Chicago & the State of Illinois
N.B. Most European Internet Exchange Points have already been implemented along the same lines.
StarLight Infrastructure
• …Soon, StarLight will be an optical switching facility for wavelengths
Evolving StarLight Optical Network Connections
[Map: planned optical connections through StarLight in Chicago*: CA*net4 (Vancouver, Seattle, Portland, San Francisco, NYC), Asia-Pacific links, SURFnet and CERN, the 40 Gb DTF (IU, NCSA, PSC, SDSC, Caltech), U Wisconsin, Atlanta and AMPATH; *ANL, UIC, NU, UC, IIT, MREN]
Multiple Gigabit/second networking: Facts, Theory & Practice (1)
• FACTS:
  • Gigabit Ethernet (GbE) is nearly ubiquitous
  • 10 GbE is coming very soon
  • 10 Gbps circuits have been available for some time already in Wide Area Networks (WANs)
  • 40 Gbps is in sight on WANs, but what comes after?
• THEORY:
  • A 1 GB file can be transferred in 11 seconds over a 1 Gbps circuit (*)
  • A 1 TB file transfer would still require 3 hours
  • And a 1 PB file transfer would require 4 months
(*) according to the 75% empirical rule
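The slide's arithmetic can be checked with a short script. The 75% rule and file sizes are taken from the slide; the helper name is ours:

```python
def transfer_time_s(size_bytes, line_rate_bps, efficiency=0.75):
    """Transfer time assuming sustained goodput is a fraction of the
    line rate (the slide's 75% empirical rule)."""
    return size_bytes * 8 / (line_rate_bps * efficiency)

print(transfer_time_s(1e9, 1e9))           # 1 GB at 1 Gbps: ~10.7 s (the "11 seconds")
print(transfer_time_s(1e12, 1e9) / 3600)   # 1 TB: ~3 hours
print(transfer_time_s(1e15, 1e9) / 86400)  # 1 PB: ~123 days, i.e. ~4 months
```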
Multiple Gigabit/second networking: Facts, Theory & Practice (2)
• PRACTICE:
  • Assuming a suitable window size is used (i.e. bandwidth*RTT), the achieved throughput also depends on the packet size and the packet loss rate
  • This means that with non-zero packet loss rates, higher throughput will be achieved using Gigabit Ethernet “Jumbo Frames”
    • Could possibly conflict with strong security requirements in the presence of firewalls (e.g. throughput, transparency of the TCP/IP window scaling option)
  • Single stream vs multiple streams
    • Tuning the number of streams is probably as difficult as tuning a single stream
    • However, as explained later, multiple streams are a very effective way to bypass the deficiencies of TCP/IP
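The dependence on packet size and loss rate can be illustrated with the well-known Mathis et al. approximation, throughput ≈ MSS / (RTT·√p). The loss rate below is illustrative, not from the slides:

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. approximation: throughput ~ MSS / (RTT * sqrt(p))."""
    return mss_bytes * 8 / (rtt_s * sqrt(loss_rate))

rtt, p = 0.2, 1e-6                               # 200 ms RTT, one loss per million packets
standard = mathis_throughput_bps(1500, rtt, p)   # standard Ethernet MTU
jumbo = mathis_throughput_bps(9000, rtt, p)      # Gigabit Ethernet jumbo frame
print(standard / 1e6, jumbo / 1e6)               # 60.0 vs 360.0 Mbps
```

At the same loss rate, the larger MSS of jumbo frames buys a proportional (here 6x) increase in achievable throughput, which is the slide's point.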
Single stream vs multiple streams (1)
• Why do multiple streams normally yield higher aggregate throughput than a single stream in the presence of packet losses?
• Assume a 200 ms RTT (e.g. CERN-Caltech) and a 10 Gbps link
• The window size is computed according to the following formula:
  • Window size = bandwidth*RTT (i.e. 250 MB at 10 Gbps and 200 ms RTT)
• With no packet losses, one 10 Gbps stream or two 5 Gbps streams are equivalent,
  • even though the CPU load on the end systems may not be the same
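The window-size formula above is the bandwidth-delay product; a one-liner confirms the 250 MB figure:

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: the TCP window needed to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8  # divide by 8 to convert bits to bytes

print(bdp_bytes(10e9, 0.2) / 1e6)  # 250.0 MB at 10 Gbps and 200 ms RTT
```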
Single stream vs multiple streams (2)
• With one packet loss, the 10 Gbps stream halves its window (dropping to 5 Gbps) and then increases it by one MSS (1500 bytes) per RTT,
  • therefore the average rate during the congestion avoidance phase will be 7.5 Gbps, at best
• With one packet loss and two 5 Gbps streams, only one stream is affected, and its congestion avoidance phase is shorter (almost halved) because RTTs are hardly affected by the available bandwidth, so
  • the affected stream averages 3.75 Gbps during its recovery, for an aggregate throughput of 8.75 Gbps,
  • and in addition the 10 Gbps regime is reached faster
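The length of the congestion avoidance phase follows from the same quantities; a sketch, using the slide's 1500-byte MSS (the helper is ours):

```python
def aimd_recovery_time_s(rate_bps, rtt_s, mss_bytes=1500):
    """Time for TCP to grow its window back from half the bandwidth-delay
    product to the full pipe, adding one MSS per RTT."""
    window_bytes = rate_bps * rtt_s / 8            # full window (BDP)
    rtts_needed = (window_bytes / 2) / mss_bytes   # deficit repaid 1 MSS per RTT
    return rtts_needed * rtt_s

print(aimd_recovery_time_s(10e9, 0.2) / 3600)  # ~4.6 h for the 10 Gbps stream
print(aimd_recovery_time_s(5e9, 0.2) / 3600)   # ~2.3 h for a 5 Gbps stream
```

The 5 Gbps estimate is close to the T = 2.37 hours quoted on the charts that follow.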
Single stream vs multiple streams (3): effect of a single packet loss (e.g. link error, buffer overflow)
[Chart: throughput (Gbps) vs time; recovery time T = 2.37 hours at RTT = 200 ms, MSS = 1500 B]
• 1 stream: Avg. 7.5 Gbps over T
• 2 streams: the affected stream averages 4.375 Gbps over T (3.75 Gbps during its halved recovery, then 5 Gbps), for an aggregate of 9.375 Gbps
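The chart's averages follow from assuming linear additive increase over the single-stream recovery period T (our reading of the figure):

```python
# All rates in Gbps; time normalised so the 10 Gbps stream's recovery takes T = 1.

# One 10 Gbps stream, one loss: rate drops to 5 and climbs back linearly over T.
one_stream_avg = (5 + 10) / 2                     # 7.5 Gbps

# Two 5 Gbps streams, one loss: the hit stream drops to 2.5 and recovers in T/2
# (half the window deficit), then holds 5 Gbps for the remaining T/2.
hit_stream_avg = ((2.5 + 5) / 2) * 0.5 + 5 * 0.5  # 4.375 Gbps over T
aggregate_avg = 5 + hit_stream_avg                # 9.375 Gbps
print(one_stream_avg, hit_stream_avg, aggregate_avg)
```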
Single stream vs multiple streams (4): effect of two packet losses (e.g. link error, buffer overflow)
[Chart: throughput (Gbps) vs time; T = 2.37 hours at RTT = 200 ms, MSS = 1500 B]
• 1 stream, 2 packet losses: Avg. 6.25 Gbps
• 2 streams, 1 packet loss on each of the two 5 Gbps streams: Avg. 4.583 Gbps each, for an aggregate of 9.166 Gbps
• The case of 2 packet losses on one 5 Gbps stream is also shown
Multiple Gigabit/second networking: tentative conclusions
• Are TCP's congestion avoidance algorithms compatible with high-speed, long-distance networks?
  • The "cut the transmit rate in half on a single packet loss, then increase the rate additively (1 MSS per RTT)" algorithm, also called AIMD (additive increase, multiplicative decrease), may simply not work
  • New TCP/IP adaptations may be needed to better cope with LFNs ("long fat networks"), e.g. TCP Vegas, but simpler changes can also be considered
  • Non-TCP/IP-based transport solutions, use of Forward Error Correction (FEC), Explicit Congestion Notification (ECN) rather than active queue management techniques (RED/WRED)?
• We should work closely with the Web100 & Net100 projects
  • Web100 (http://www.web100.org/), a 3 MUSD NSF project, might help enormously!
    • Better TCP/IP instrumentation (MIB)
    • Self-tuning
    • Tools for measuring performance
    • Improved FTP implementation
  • Net100 (http://www.net100.org/), a complementary, DoE-funded project
    • Development of network-aware operating systems