The Advanced Networks and Services Underpinning the Large-Scale Science of DOE's Office of Science: The Evolution of Production Networks Over the Next 10 Years to Support Large-Scale International Science (An ESnet View). William E. Johnston, wej@es.net, ESnet Manager and Senior Scientist, Lawrence Berkeley National Laboratory, www.es.net
DOE Office of Science Drivers for Networking • The DOE Office of Science supports more than 40% of all US R&D in high-energy physics, nuclear physics, and fusion energy sciences (http://www.science.doe.gov) • This large-scale science that is the mission of the Office of Science depends on high-speed networks for • Sharing of massive amounts of data • Supporting thousands of collaborators world-wide • Distributed data processing • Distributed simulation, visualization, and computational steering • Distributed data management • The role of ESnet is to provide networking that supports and anticipates these uses for the Office of Science Labs and their collaborators • The issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)
Increasing Large-Scale Science Collaboration is Reflected in Network Usage • As of May, 2005 ESnet is transporting about 530 Terabytes/mo. • ESnet traffic has increased by 10X every 46 months, on average, since 1990 (Figure: ESnet Monthly Accepted Traffic, Feb. 1990 – May 2005, in TBytes/Month)
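A quick back-of-the-envelope sketch of what that growth rate implies. The 530 TB/month figure and the 10X-per-46-months rate come from the slide; the projected values are just arithmetic and assume the trend simply continues.

```python
# Illustrative extrapolation of ESnet traffic growth (assumes the trend quoted
# on the slide: ~10X every 46 months, ~530 TB/month as of May 2005).
def projected_traffic_tb_per_month(months_from_may_2005: float,
                                   base_tb: float = 530.0,
                                   tenfold_period_months: float = 46.0) -> float:
    """Project monthly traffic volume assuming sustained exponential growth."""
    return base_tb * 10 ** (months_from_may_2005 / tenfold_period_months)

if __name__ == "__main__":
    for years in (1, 3, 5):
        tb = projected_traffic_tb_per_month(12 * years)
        print(f"May {2005 + years}: ~{tb:,.0f} TB/month")
```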
Large-Scale Science Has Changed How the Network is Used • Total ESnet traffic Feb., 2005 = 323 TBy in approx. 6,000,000,000 flows • Top 100 flows = 84 TBy • A small number of large-scale science users now account for a significant fraction of all ESnet traffic • Over the next few years this will grow to be the dominant use of the network (Figure: ESnet Top 100 Host-to-Host Flows, Feb. 2005, TBytes/Month, broken out as DOE Lab-International R&E, Lab-U.S. R&E (domestic), Lab-Lab (domestic), Lab-Comm. (domestic), and all other flows, each < 0.28 TBy/month)
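As an illustration of the flow accounting behind these numbers, the sketch below computes what fraction of total traffic the largest N host-to-host flows carry, given per-flow monthly volumes. The flow records and values are hypothetical; only the 84 TBy (top 100) and 323 TBy (total) figures come from the slide.

```python
# Illustrative only: compute the share of total traffic carried by the
# top-N host-to-host flows from per-flow monthly volumes (made-up data).
from typing import Dict, Tuple

def top_n_share(flow_volumes_tby: Dict[Tuple[str, str], float], n: int = 100) -> float:
    """Return the fraction of total volume contributed by the n largest flows."""
    volumes = sorted(flow_volumes_tby.values(), reverse=True)
    total = sum(volumes)
    return sum(volumes[:n]) / total if total else 0.0

# Example with invented numbers; the Feb. 2005 actuals on the slide were
# 84 TBy in the top 100 flows out of 323 TBy total.
flows = {("slac.stanford.edu", "in2p3.fr"): 9.2,
         ("fnal.gov", "in2p3.fr"): 7.5,
         ("bnl.gov", "cern.ch"): 5.1,
         ("lbl.gov", "ornl.gov"): 0.2}
print(f"top-2 share: {top_n_share(flows, n=2):.0%}")
```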
Large-Scale Science Has Changed How the Network is Used • These flows are primarily bulk data transfer at this point and are candidates for circuit-based services for several reasons • Traffic engineering – to manage the traffic on the backbone • Guaranteed bandwidth is needed to satisfy deadline scheduling requirements • Traffic isolation will permit the use of efficient, but TCP-unfriendly, data transfer protocols
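To make the deadline-scheduling point concrete, here is a small worked sketch (not from the slides) that converts a dataset size and a deadline into the sustained rate a guaranteed-bandwidth circuit would have to provide. The protocol-efficiency factor and the example numbers are assumptions.

```python
# Illustrative deadline-scheduling arithmetic: the sustained bandwidth needed
# to move a dataset of a given size before a deadline (inputs hypothetical).
def required_gbps(dataset_tb: float, hours_until_deadline: float,
                  protocol_efficiency: float = 0.8) -> float:
    """Sustained rate in Gb/s needed to finish the transfer on time."""
    bits = dataset_tb * 8e12                  # terabytes -> bits
    seconds = hours_until_deadline * 3600.0
    return bits / seconds / 1e9 / protocol_efficiency

# e.g. 30 TB of experiment data that must arrive within 24 hours
print(f"{required_gbps(30, 24):.2f} Gb/s sustained")
```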
Virtual Circuit Network Services • A top priority of the science community • Today • Primarily to support bulk data transfer with deadlines • In the near future • Support for widely distributed Grid workflow engines • Real-time instrument operation • Coupled, distributed applications • To get an idea of how circuit services might be used to support the current trends, look at the one year history of the flows that are currently the top 20 • Estimate from the flow history what would be the characteristics of a circuit set up to manage the flow
Between ESnet, Abilene, GÉANT, and the connected regional R&E networks, there will be dozens of lambdas in production networks that are shared between thousands of users who want to use virtual circuits – very complex inter-domain issues (Figure: the US R&E environment, showing ESnet, Abilene, the ESnet-Abilene cross-connects, and the US regionals, with a similar situation in GÉANT and the European NRENs)
OSCARS: Virtual Circuit Service • Despite the long circuit duration, these circuits cannot be managed by hand – too many circuits • There must be automated scheduling, authorization, path analysis and selection, and path setup = management plane and control plane • Virtual circuits must operate across domains • End points will be on campuses or research institutes that are served by ESnet, Abilene's regional networks, and GÉANT's regional networks – typically five domains to cross to do end-to-end system connection • There are many issues here that are poorly understood • A collaboration between Internet2/HOPI, DANTE/GÉANT, and ESnet is building a prototype-production, interoperable service • ESnet virtual circuit project: On-demand Secure Circuits and Advance Reservation System (OSCARS) (Contact Chin Guok (chin@es.net) for information.)
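The slides do not describe OSCARS internals, but the scheduling step can be illustrated with a minimal advance-reservation check: does a requested bandwidth fit under a link's capacity for the whole requested window, given reservations already booked? The class names and capacities below are a hypothetical sketch, not the OSCARS data model.

```python
# Minimal sketch of advance bandwidth reservation on a single link
# (hypothetical; not the actual OSCARS implementation).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Reservation:
    start: float      # hours from now
    end: float
    gbps: float

@dataclass
class Link:
    capacity_gbps: float
    booked: List[Reservation] = field(default_factory=list)

    def can_accept(self, req: Reservation) -> bool:
        """True if capacity is available for the whole requested window."""
        # Check load at every interval boundary that overlaps the request.
        points = {req.start} | {r.start for r in self.booked} | {r.end for r in self.booked}
        for t in sorted(p for p in points if req.start <= p < req.end):
            load = sum(r.gbps for r in self.booked if r.start <= t < r.end)
            if load + req.gbps > self.capacity_gbps:
                return False
        return True

    def reserve(self, req: Reservation) -> bool:
        if self.can_accept(req):
            self.booked.append(req)
            return True
        return False

link = Link(capacity_gbps=10.0)
link.reserve(Reservation(0, 12, 6.0))
print(link.reserve(Reservation(6, 18, 5.0)))   # False: would exceed 10 Gb/s
print(link.reserve(Reservation(12, 18, 5.0)))  # True
```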
What about lambda switching? • Two factors argue that this is a long way out for production networks 1) There will not be enough lambdas available to satisfy the need • Just provisioning a single lambda ring around the US (7,000 miles / 11,000 km) is still about $2,000,000 even on R&E networks • This should drop by a factor of 5-10 over the next decade 2) Even if there were a "lot" of lambdas (hundreds?) there are thousands of large-scale science users • Just considering sites (and not scientific groups) there are probably 300 major science research sites in the US and a comparable number in Europe • So, lambdas will have to be shared for the foreseeable future • Multiple QoS paths per lambda • Guaranteed minimum level of service for best effort traffic when utilizing the production IP networks • Allocation management • There will be hundreds to thousands of contenders with different science priorities
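Since lambdas will have to be shared, allocation management amounts to dividing a fixed amount of circuit capacity among many contenders with different priorities. The weighted-share policy sketched below is purely illustrative; the slide does not specify an allocation policy, and the names and weights are invented.

```python
# Illustrative allocation of a shared 10 Gb/s lambda among contenders with
# different science priorities, using simple weighted proportional shares
# (hypothetical policy; the slides do not define one).
def weighted_shares(requests_gbps: dict, weights: dict, capacity_gbps: float) -> dict:
    """Grant each contender min(request, weighted fair share of capacity)."""
    total_weight = sum(weights[name] for name in requests_gbps)
    grants = {}
    for name, asked in requests_gbps.items():
        fair = capacity_gbps * weights[name] / total_weight
        grants[name] = round(min(asked, fair), 2)
    return grants

requests = {"LHC-CMS": 8.0, "climate": 3.0, "fusion": 2.0}
weights = {"LHC-CMS": 5, "climate": 2, "fusion": 1}
print(weighted_shares(requests, weights, capacity_gbps=10.0))
# -> {'LHC-CMS': 6.25, 'climate': 2.5, 'fusion': 1.25}
```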
References – DOE Network Related Planning Workshops
1) High Performance Network Planning Workshop, August 2002 – http://www.doecollaboratory.org/meetings/hpnpw
2) DOE Science Networking Roadmap Meeting, June 2003 – http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
3) DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003 – http://www.csm.ornl.gov/ghpn/wk2003
4) Science Case for Large Scale Simulation, June 2003 – http://www.pnl.gov/scales/
5) Workshop on the Road Map for the Revitalization of High End Computing, June 2003 – http://www.cra.org/Activities/workshops/nitrd (public report: http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf)
6) ASCR Strategic Planning Workshop, July 2003 – http://www.fp-mcs.anl.gov/ascr-july03spw
7) Planning Workshops – Office of Science Data-Management Strategy, March & May 2004 – http://www-conf.slac.stanford.edu/dmw2004
ESnet Today Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators (Figure: map of the ESnet IP core and Science Data Network (SDN) core with 42 end user sites – 22 Office of Science sponsored, 12 NNSA sponsored, 3 joint sponsored, 6 laboratory sponsored, plus other sponsored sites such as NSF LIGO and NOAA – metropolitan area rings (≥10 Gb/s), commercial and R&E peering points, high-speed peerings with Internet2/Abilene, and international connections including Japan (SINet), Australia (AARNet), Canada (CA*net4), Taiwan (TANet2, ASCC), CERN (USLHCnet, CERN+DOE funded), GÉANT, GLORIAD (Russia, China), and Korea (Kreonet2); link speeds range from 45 Mb/s and less up to the 10 Gb/s IP and SDN cores)
DOE Office of Science Drivers for Networking • The DOE Office of Science supports more than 40% of all US R&D in high-energy physics, nuclear physics, and fusion energy sciences (http://www.science.doe.gov) • This large-scale science that is the mission of the Office of Science depends on networks for • Sharing of massive amounts of data • Supporting thousands of collaborators world-wide • Distributed data processing • Distributed simulation, visualization, and computational steering • Distributed data management • The role of ESnet is to provide networking that supports these uses for the Office of Science Labs and their collaborators • The issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)
CERN / LHC High Energy Physics Data Provides One of Science's Most Challenging Data Management Problems (CMS is one of several experiments at the LHC) • The CMS detector at CERN is 15m x 15m x 22m, 12,500 tons, $700M • Data flows from the online system at ~PByte/sec to Tier 0+1 event reconstruction at CERN (with ~100 MBytes/sec of event simulation), over 2.5-40 Gbits/sec links to Tier 1 regional centers (FermiLab in the USA, plus German, French, and Italian regional centers), at ~0.6-2.5 Gbps to Tier 2 analysis centers, and on to Tier 3 institutes (~0.25 TIPS) and Tier 4 workstations with physics data caches at 100-1000 Mbits/sec • 2000 physicists in 31 countries are involved in this 20-year experiment in which DOE is a major player • Grid infrastructure spread over the US and Europe coordinates the data analysis (Courtesy Harvey Newman, CalTech)
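To give a feel for these rates, a small worked calculation (not from the slide, and with an assumed utilization factor) shows the daily data volume that sustained links of these classes can carry.

```python
# Illustrative arithmetic: daily volume carried by a sustained link rate.
def tbytes_per_day(gbps: float, utilization: float = 0.7) -> float:
    """Terabytes moved per day at a given rate and average utilization."""
    return gbps * 1e9 * utilization * 86400 / 8 / 1e12

for rate in (0.6, 2.5, 10.0):   # Tier-2, Tier-1, and core-class rates (Gb/s)
    print(f"{rate:4.1f} Gb/s -> ~{tbytes_per_day(rate):.0f} TB/day sustained")
```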
LHC Networking • This picture represents the MONARC model – a hierarchical, bulk data transfer model • Still accurate for Tier 0 (CERN) to Tier 1 (experiment data centers) data movement • Not accurate for the Tier 2 (analysis) sites, which are implementing Grid-based data analysis
Distributed Workflow • Distributed / Grid-based workflow systems involve many interacting computing and storage elements that rely on "smooth" inter-element communication for effective operation • The new LHC Grid-based data analysis model will involve networks connecting dozens of sites and thousands of systems for each analysis "center"
Example: Multidisciplinary Simulation • A "complete" approach to climate modeling involves many interacting models and data that are provided by different groups at different locations (Tim Killeen, NCAR) (Figure: coupled climate model components operating on time scales from minutes-to-hours (atmospheric chemistry, biogeophysics, biogeochemistry, carbon assimilation, canopy physiology, aerodynamics, microclimate), days-to-weeks (phenology, hydrology, snow, evaporation, transpiration, runoff), and years-to-centuries (ecosystems, watersheds, vegetation dynamics, disturbance such as fires and hurricanes), exchanging heat, moisture, momentum, CO2, CH4, N2O, VOCs, dust, and other quantities; courtesy Gordon Bonan, NCAR: Ecological Climatology: Concepts and Applications, Cambridge University Press, 2002)
Distributed Multidisciplinary Simulation • Distributed multidisciplinary simulation involves integrating computing elements at several remote locations • Requires co-scheduling of computing, data storage, and network elements • Also Quality of Service (e.g. bandwidth guarantees) • There is not a lot of experience with this scenario yet, but it is coming (e.g. the new Office of Science supercomputing facility at Oak Ridge National Lab has a distributed computing elements model)
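A minimal sketch of what co-scheduling might look like in code: find a time window in which the compute, storage, and network resources are all simultaneously available. The resource model and availability windows below are hypothetical; the slide only states the requirement.

```python
# Hypothetical co-scheduling sketch: pick the earliest start time at which
# compute, storage, and network resources are all available for a job.
from typing import Dict, List, Optional, Tuple

Window = Tuple[float, float]   # (start_hour, end_hour) of availability

def earliest_common_start(resources: Dict[str, List[Window]],
                          duration_hours: float) -> Optional[float]:
    """Earliest start where every resource has a window covering the job."""
    candidates = sorted(start for wins in resources.values() for start, _ in wins)
    for start in candidates:
        end = start + duration_hours
        if all(any(ws <= start and end <= we for ws, we in wins)
               for wins in resources.values()):
            return start
    return None

resources = {
    "compute (ORNL)":  [(0, 6), (10, 24)],
    "storage (NERSC)": [(4, 16)],
    "network circuit": [(2, 8), (12, 20)],
}
print(earliest_common_start(resources, duration_hours=3))   # -> 12
```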
ESnet Goal – 2009/2010 • 10 Gbps enterprise IP traffic • 40-60 Gbps circuit-based transport (Figure: map of major DOE Office of Science sites showing the ESnet IP core (≥10 Gbps), a second ESnet Science Data Network core of 30-50 Gbps on National Lambda Rail, metropolitan area rings, existing and new ESnet hubs, high-speed cross connects with Internet2/Abilene, lab-supplied links, and major international connections to Europe, CERN, Japan, AsiaPac, and Australia)
Observed Drivers for the Evolution of ESnet • ESnet is currently transporting about 530 Terabytes/mo. and this volume is increasing exponentially – ESnet traffic has increased by 10X every 46 months, on average, since 1990 (Figure: ESnet Monthly Accepted Traffic, Feb. 1990 – May 2005, in TBytes/Month)
Observed Drivers: The Rise of Large-Scale Science • A small number of large-scale science users now account for a significant fraction of all ESnet traffic • Total ESnet traffic Feb., 2005 = 323 TBy in approx. 6,000,000,000 flows • Top 100 flows = 84 TBy (Figure: ESnet Top 100 Host-to-Host Flows, Feb. 2005, TBytes/Month, broken out as DOE Lab-International R&E, Lab-U.S. R&E (domestic), Lab-Lab (domestic), Lab-Comm. (domestic), and all other flows, each < 0.28 TBy/month)
Traffic Evolution over the Next 5-10 Years • The current trend – large-scale science projects giving rise to the top 100 data flows, which represent about 1/3 of all network traffic – will continue to evolve • This evolution in traffic patterns and volume is driven by large-scale science collaborations and will result in large-scale science data flows overwhelming everything else on the network in 3-5 yrs. (WEJ predicts) • The top 100 flows will become the top 1000 or 5000 flows • These large flows will account for 75-95% of a much larger total ESnet traffic volume, as the remaining 6 billion flows will continue to account for the remainder of the traffic, which will also grow even as its fraction of the total becomes smaller
Virtual Circuit Network Services • Every requirements workshop involving the science community has put bandwidth-on-demand as the highest priority – e.g. for • Massive data transfers for collaborative analysis of experiment data • Real-time data analysis for remote instruments • Control channels for remote instruments • Deadline scheduling for data transfers • “Smooth” interconnection for complex Grid workflows
What is the Nature of the Required Circuits? • Today • Primarily to support bulk data transfer with deadlines • In the near future • Support for widely distributed Grid workflow engines • Real-time instrument operation • Coupled, distributed applications • To get an idea of how circuit services might be used, look at the one-year history of the flows that are currently the top 20 • Estimate from the flow history what would be the characteristics of a circuit set up to manage the flow
What are Characteristics of Today's Flows – How "Dynamic" a Circuit? LIGO – CalTech: over 1 year the "circuit" duration is about 3 months (Figure: one-year history of daily volume in Gigabytes/day; gaps indicate no data)
What are Characteristics of Today's Flows – How "Dynamic" a Circuit? SLAC – IN2P3 (FR): over 1 year the "circuit" duration is about 1 day to 1 week (Figure: one-year history of daily volume in Gigabytes/day; gaps indicate no data)
What are Characteristics of Today's Flows – How "Dynamic" a Circuit? SLAC – INFN (IT): over 1 year the "circuit" duration is about 1 to 3 months (Figure: one-year history of daily volume in Gigabytes/day; gaps indicate no data)
What are Characteristics of Today's Flows – How "Dynamic" a Circuit? FNAL – IN2P3 (FR): over 1 year the "circuit" duration is about 2 to 3 months (Figure: one-year history of daily volume in Gigabytes/day; gaps indicate no data)
What are Characteristics of Today's Flows – How "Dynamic" a Circuit? INFN (IT) – SLAC: over 1 year the "circuit" duration is about 3 weeks to 3 months (Figure: one-year history of daily volume in Gigabytes/day; gaps indicate no data)
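The "circuit duration" estimates above come from inspecting the daily-volume history of each flow. A rough, hypothetical way to automate that estimate is sketched below: treat any run of consecutive days above a volume threshold as one sustained "circuit" and report the run lengths. The threshold and the sample data are made up.

```python
# Hypothetical estimator of "circuit duration" from a flow's daily history:
# a sustained run of days above a volume threshold counts as one circuit.
from typing import List

def circuit_durations(gb_per_day: List[float], threshold_gb: float = 100.0) -> List[int]:
    """Lengths (in days) of consecutive runs above the threshold."""
    runs, current = [], 0
    for volume in gb_per_day:
        if volume >= threshold_gb:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

# Made-up daily volumes: two sustained transfer periods separated by a quiet gap.
history = [500] * 90 + [5] * 30 + [800] * 45 + [0] * 10
print(circuit_durations(history))   # -> [90, 45]
```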
Characteristics of Today's Circuits – How "Dynamic"? • These flows are candidates for circuit-based services for several reasons • Traffic engineering – to manage the traffic on the IP production backbone • Guaranteed bandwidth to satisfy deadline scheduling requirements • Traffic isolation to permit the use of efficient, but TCP-unfriendly, data transfer protocols • Despite the long circuit duration, this cannot be managed by hand – too many circuits • There must be automated scheduling, authorization, path analysis and selection, and path setup
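Path analysis and selection for a guaranteed-bandwidth circuit can be illustrated with a standard technique: prune every link that cannot carry the requested bandwidth, then find a shortest path over what remains. The topology, costs, and available capacities below are invented for illustration; the slides do not say how OSCARS computes paths.

```python
# Illustrative bandwidth-constrained path selection: prune links with too
# little spare capacity, then run a shortest-path search on the remainder.
# (Hypothetical topology; not ESnet's actual path computation.)
import heapq
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[Tuple[str, float, float]]]   # node -> [(neighbor, cost, avail_gbps)]

def constrained_path(graph: Graph, src: str, dst: str, gbps: float) -> Optional[List[str]]:
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        if d > dist.get(node, float("inf")):
            continue
        for nbr, cost, avail in graph.get(node, []):
            if avail < gbps:               # prune links without enough headroom
                continue
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    return None

topology: Graph = {
    "SLAC": [("SNV", 1, 9.0)],
    "SNV":  [("CHI", 10, 2.0), ("ELP", 12, 8.0)],
    "ELP":  [("ATL", 11, 8.0)],
    "ATL":  [("NYC", 9, 8.0)],
    "CHI":  [("NYC", 8, 9.0)],
    "NYC":  [("BNL", 1, 9.0)],
}
print(constrained_path(topology, "SLAC", "BNL", gbps=5.0))
# -> ['SLAC', 'SNV', 'ELP', 'ATL', 'NYC', 'BNL'] (the SNV-CHI link lacks capacity)
```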
Virtual Circuit Services - What about lambda switching? • Two factors argue that this is a long way out for production networks 1) There will not be enough lambdas available to satisfy the need • Just provisioning a single lambda ring around the US (7,000 miles / 11,000 km) is still about $2,000,000 even on R&E networks • This should drop by a factor of 5-10 over the next 5-10 years 2) Even if there were a "lot" of lambdas (hundreds?) there are thousands of large-scale science users • Just considering sites (and not scientific groups) there are probably 300 major science research sites in the US and a comparable number in Europe • So, lambdas will have to be shared for the foreseeable future • Multiple QoS paths per lambda • Guaranteed minimum level of service for best effort traffic when utilizing the production IP networks • Allocation management • There will be hundreds to thousands of contenders with different science priorities
OSCARS: Guaranteed Bandwidth Service • Virtual circuits must operate across domains • End points will be on campuses or research institutes that are served by ESnet, Abilene’s regional networks, and GÉANT’s regional networks – typically five domains to cross to do end-to-end system connection • There are many issues here that are poorly understood • An ESnet – Internet2/HOPI – DANTE/GÉANT collaboration • ESnet virtual circuit project: On-demand Secure Circuits and Advance Reservation System (OSCARS) (Contact Chin Guok (chin@es.net) for information.)
OSCARS: Guaranteed Bandwidth Service • To address all of the issues is complex • There are many potential restriction points • There are many users that would like priority service, which must be rationed (Figure: OSCARS architecture sketch showing user systems at sites A and B, site resource managers and policers, a shaper, and a bandwidth broker comprising an allocation manager, authorization, and a path manager with a dynamic, global view of the network)
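The policer and shaper boxes in the figure are standard rate-control elements. As a hedged illustration of the idea (not OSCARS code), here is a minimal token-bucket policer that admits traffic up to a committed rate and rejects the excess.

```python
# Minimal token-bucket policer sketch (illustrative; not the OSCARS policer).
class TokenBucketPolicer:
    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last_time = 0.0

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Refill tokens for elapsed time; admit the packet if tokens suffice."""
        self.tokens = min(self.burst, self.tokens + (now - self.last_time) * self.rate)
        self.last_time = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False          # out-of-profile traffic is dropped (or remarked)

policer = TokenBucketPolicer(rate_bytes_per_s=125e6, burst_bytes=1e6)  # ~1 Gb/s
print(policer.allow(9000, now=0.0))   # True: within the burst allowance
```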
ESnet 2010 Lambda Infrastructure and LHC T0-T1 Networking (Figure: map of NLR PoPs, ESnet IP core hubs, ESnet SDN/NLR hubs, and new hubs across the US, showing the ESnet production IP core (10-20 Gbps), the ESnet Science Data Network core (10G/link, incremental upgrades 2007-2010), other NLR links, CERN/DOE-supplied links (10G/link), international IP connections (10G/link), the Tier 1 centers FNAL, BNL, and TRIUMF (via CANARIE), the CERN-1/2/3 and GÉANT-1/2 connections, and cross connects with Internet2/Abilene)
Abilene* and LHC Tier 2, Near-Term Networking (*WEJ projection of future Abilene) • Atlas Tier 2 Centers: University of Texas at Arlington, University of Oklahoma Norman, University of New Mexico Albuquerque, Langston University, University of Chicago, Indiana University Bloomington, Boston University, Harvard University, University of Michigan • CMS Tier 2 Centers: MIT, University of Florida at Gainesville, University of Nebraska at Lincoln, University of Wisconsin at Madison, Caltech, Purdue University, University of California San Diego (Figure: the same backbone map as the previous slide, additionally marking Abilene/GigaPoP nodes, USLHC nodes, <10G connections to Abilene, and 10G connections to USLHC or ESnet)
Between ESnet, Abilene, GÉANT, and the connected regional R&E networks, there will be dozens of lambdas in production networks that are shared between thousands of users who want to use virtual circuits – very complex inter-domain issues (Figure: the US R&E environment, showing ESnet, Abilene, the ESnet-Abilene cross-connects, and the US regionals, with a similar situation in Europe)
ESnet Optical Networking Roadmap, 2005-2010 (Figure: timeline of capabilities – dedicated virtual circuits, dynamic provisioning of MPLS circuits (Layer 3), dynamic virtual circuit allocation, interoperability between VLANs and MPLS circuits (Layer 2 & 3), GMPLS, and interoperability between GMPLS circuits, VLANs, and MPLS circuits (Layer 1-3))
Tying Domains Together (1/2) • Motivation: • For a virtual circuit service to be successful, it must • Be end-to-end, potentially crossing several administrative domains • Have consistent network service guarantees throughout the circuit • Observation: • Setting up an intra-domain circuit is easy compared with coordinating an inter-domain circuit • Issues: • Cross-domain authentication and authorization • A mechanism to authenticate and authorize a bandwidth-on-demand (BoD) circuit request must be agreed upon in order to automate the process • Multi-domain Acceptable Use Policies (AUPs) • Domains may have very specific AUPs dictating what the BoD circuits can be used for and where they can transit/terminate • Domain-specific service offerings • Domains must have a way to guarantee a certain level of service for BoD circuits • Security concerns • Are there mechanisms for a domain to protect itself? (e.g. RSVP filtering)
Tying Domains Together (2/2) • Approach: • Utilize existing standards and protocols (e.g. GMPLS, RSVP) • Adopt widely accepted schemas/services (e.g. X.509 certificates) • Collaborate with like-minded projects (e.g. JRA3 (DANTE/GÉANT), BRUW (Internet2/HOPI)) to: 1. Create a common service definition for BoD circuits 2. Develop an appropriate User-Network-Interface (UNI) and Network-Network-Interface (NNI)
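As a purely illustrative sketch of what a common BoD service definition might look like (the slides do not specify one, and these are not actual UNI/NNI message formats), the snippet below models an inter-domain circuit request that each transit domain authorizes and reserves in turn, succeeding only if every domain along the end-to-end path agrees. All class and field names are invented.

```python
# Hypothetical sketch of an inter-domain bandwidth-on-demand (BoD) request:
# each domain along the end-to-end path must authorize and reserve in turn.
from dataclasses import dataclass
from typing import List

@dataclass
class BodRequest:
    src: str                 # e.g. a campus host behind a regional network
    dst: str
    gbps: float
    start_hour: float        # time fields would feed an advance-reservation
    duration_hours: float    # check like the one sketched earlier
    user_cert_subject: str   # stand-in for an X.509 identity

class Domain:
    def __init__(self, name: str, capacity_gbps: float, allowed_subjects: set):
        self.name = name
        self.capacity = capacity_gbps
        self.allowed = allowed_subjects

    def authorize(self, req: BodRequest) -> bool:
        return req.user_cert_subject in self.allowed     # AUP / authz check

    def reserve(self, req: BodRequest) -> bool:
        return self.authorize(req) and req.gbps <= self.capacity

def setup_circuit(req: BodRequest, path: List[Domain]) -> bool:
    """End-to-end setup succeeds only if every domain accepts the request."""
    return all(domain.reserve(req) for domain in path)

# The slide notes that an end-to-end circuit typically crosses five domains.
path = [Domain("campus LAN", 10, {"/DC=org/CN=physicist"}),
        Domain("regional", 10, {"/DC=org/CN=physicist"}),
        Domain("ESnet", 20, {"/DC=org/CN=physicist"}),
        Domain("GEANT", 10, {"/DC=org/CN=physicist"}),
        Domain("NREN", 10, {"/DC=org/CN=physicist"})]
req = BodRequest("uni-tier2.edu", "in2p3.fr", 5.0, 0, 24, "/DC=org/CN=physicist")
print(setup_circuit(req, path))   # True: all five domains authorize and reserve
```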