
HENP Grids and Networks: Global Virtual Organizations

Harvey B Newman, Professor of Physics, LHCNet PI, US CMS Collaboration Board Chair
WAN In Lab Site Visit, Caltech: Meeting the Advanced Network Needs of Science, March 5, 2003


Presentation Transcript


  1. HENP Grids and Networks Global Virtual Organizations • Harvey B Newman, Professor of Physics • LHCNet PI, US CMS Collaboration Board Chair • WAN In Lab Site Visit, Caltech: Meeting the Advanced Network Needs of Science, March 5, 2003

  2. Computing Challenges: Petabytes, Petaflops, Global VOs • Geographical dispersion: of people and resources (5000+ Physicists, 250+ Institutes, 60+ Countries) • Complexity: the detector and the LHC environment • Scale: Tens of Petabytes per year of data • Major challenges associated with: • Communication and collaboration at a distance • Managing globally distributed computing & data resources • Cooperative software development and physics analysis • New Forms of Distributed Systems: Data Grids

  3. Next Generation Networks for Experiments: Goals and Needs • Providing rapid access to event samples, subsets and analyzed physics results from massive data stores • From Petabytes by 2002, ~100 Petabytes by 2007, to ~1 Exabyte by ~2012 • Providing analyzed results with rapid turnaround, by coordinating and managing the large but LIMITED computing, data handling and NETWORK resources effectively • Enabling rapid access to the data and the collaboration • Across an ensemble of networks of varying capability • Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs • With reliable, monitored, quantifiable high performance • Large data samples explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams

  4. LHC • pp at √s = 14 TeV, L_design = 10^34 cm^-2 s^-1 • Heavy ions (e.g. Pb-Pb at √s ~ 1000 TeV) • 27 km ring, 1232 dipoles, B = 8.3 T (NbTi at 1.9 K) • CMS and ATLAS: pp, general purpose • ALICE: heavy ions, p-ions • LHCb: pp, B-physics • TOTEM • First Beams: April 2007; Physics Runs: July 2007

  5. LHC Collaborations (CMS and ATLAS) • The US provides about 20-25% of the author list in both experiments

  6. US LHC Institutions (map): US CMS, US ATLAS, and Accelerator institutions

  7. Four LHC Experiments: The Petabyte to Exabyte Challenge • ATLAS, CMS, ALICE, LHCb: Higgs + new particles; Quark-Gluon Plasma; CP Violation • Data stored: ~40 Petabytes/Year and UP; CPU: 0.30 Petaflops and UP • 0.1 Exabyte (2008) to ~1 Exabyte (~2013?) for the LHC Experiments (1 EB = 10^18 Bytes)

  8. LHC: Higgs Decay into 4 muons (Tracker only); 1000x the LEP Data Rate • 10^9 events/sec, selectivity: 1 in 10^13 (1 person in a thousand world populations)

  9. LHC Data Grid Hierarchy • CERN/Outside Resource Ratio ~1:2; Tier0 : (Tier1) : (Tier2) ~ 1:1:1 • Online System to Tier 0+1 at CERN (700k SI95; ~1 PB Disk; Tape Robot; HPSS): ~PByte/sec from the experiment, ~100-1500 MBytes/sec online • Tier 0+1 to Tier 1 centers (e.g. FNAL: 200k SI95, 600 TB; IN2P3, INFN and RAL centers; HPSS): ~2.5-10 Gbps • Tier 1 to Tier 2 centers: ~2.5-10 Gbps • Tier 2 to Tier 3 (institutes, ~0.25 TIPS, physics data cache): 0.1-10 Gbps • Tier 4: workstations • Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels

  10. Transatlantic Net WG (HN, L. Price): Bandwidth Requirements [*] • [*] Installed BW; maximum link occupancy of 50% assumed (see the short sketch below) • See http://gate.hep.anl.gov/lprice/TAN
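
A quick illustration of the footnote's sizing rule: installed bandwidth is chosen so that the required flow fits within the assumed 50% maximum link occupancy. A minimal sketch; the 0.62 Gbps requirement is an invented example figure, not a number from the (untranscribed) table.

```python
# Sizing rule of thumb from the footnote above: size the installed link so the
# required flow stays within ~50% occupancy. The 0.62 Gbps figure is invented.
def installed_bw_gbps(required_gbps: float, max_occupancy: float = 0.5) -> float:
    return required_gbps / max_occupancy

print(f"{installed_bw_gbps(0.62):.2f} Gbps installed for a 0.62 Gbps requirement")  # -> 1.24
```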

  11. History – One Large Research Site • Much of the traffic: SLAC to IN2P3/RAL/INFN, via ESnet + France, and Abilene + CERN • Current traffic ~400 Mbps (ESnet limitation) • Projections: 0.5 to 24 Tbps by ~2012

  12. Progress: Max. Sustained TCP Thruput on Transatlantic and US Links • 8-9/01 105 Mbps in 30 Streams: SLAC-IN2P3; 102 Mbps in 1 Stream CIT-CERN • 11/5/01 125 Mbps in One Stream (modified kernel): CIT-CERN • 1/09/02 190 Mbps for One Stream shared on two 155 Mbps links • 3/11/02 120 Mbps Disk-to-Disk with One Stream on a 155 Mbps link (Chicago-CERN) • 5/20/02 450-600 Mbps SLAC-Manchester on OC12 with ~100 Streams • 6/1/02 290 Mbps Chicago-CERN, One Stream on OC12 (modified kernel) • 9/02 850, 1350, 1900 Mbps Chicago-CERN with 1, 2, 3 GbE Streams on an OC48 Link • 11-12/02 FAST: 940 Mbps in 1 Stream SNV-CERN; 9.4 Gbps in 10 Flows SNV-Chicago (window sizes sketched below) • Also see http://www-iepm.slac.stanford.edu/monitoring/bulk/; and the Internet2 E2E Initiative: http://www.internet2.edu/e2e
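
To see why single-stream records at these speeds need kernel and window tuning, a quick bandwidth-delay-product check is useful: a TCP stream must keep roughly rate x RTT bytes in flight. The sketch below is illustrative only; the RTT values are assumptions, not measurements from the slide.

```python
# Bandwidth-delay product for the single-stream results above: the sender's
# window must hold ~rate * RTT bytes in flight. RTT values are assumed.
def tcp_window_bytes(throughput_gbps: float, rtt_ms: float) -> float:
    """Bytes a single TCP stream must keep unacknowledged to fill the path."""
    return throughput_gbps * 1e9 / 8 * (rtt_ms / 1e3)

for label, gbps, rtt_ms in [("SNV-CERN, 1 stream at 940 Mbps", 0.94, 180.0),
                            ("SNV-Chicago, per flow of 10 at 9.4 Gbps", 0.94, 60.0)]:
    print(f"{label}: ~{tcp_window_bytes(gbps, rtt_ms) / 2**20:.1f} MiB in flight")
```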

  13. US-CERN OC48 Deployment • Phase One: OC12 (Production) between the Caltech (DOE) PoP in Chicago (Cisco 7609) and CERN - Geneva (Cisco 7609), linking the American and European partners • Phase Two: adds an OC48 (Develop and Test) link between Cisco 7606 routers at the Chicago PoP and CERN - Geneva, alongside the OC12 (Production) between the Cisco 7609s

  14. OC-48 Deployment (Cont'd) • Phase Three (Late 2002): OC48 (2.5 Gbps) and OC12 (622 Mbps) transatlantic channels between the American and European partners, terminated on Optical Mux/Demux units (DataTAG/CERN) at each end, with Alcatel 7770 (DataTAG/CERN), Cisco 7606 (Caltech/DoE and DataTAG/CERN), Cisco 7609 (Caltech/DoE and DataTAG/CERN) and Juniper M10 (Caltech/DoE and DataTAG/CERN) routers on the two sides • Separate environments for tests and production • Transatlantic testbed dedicated to advanced optical network research and intensive data access applications

  15. DataTAG Project • EU-Solicited Project: CERN, PPARC (UK), Amsterdam (NL), and INFN (IT), with US (DOE/NSF: UIC, NWU and Caltech) partners • Main Aims: Ensure maximum interoperability between US and EU Grid Projects; Transatlantic testbed for advanced network research • 2.5 Gbps Wavelength Triangle from 7/02; to a 10 Gbps Triangle by Early 2003 • Networks in the diagram: SuperJANET4 (UK), Atrium (NL), VTHD, SURFnet, GEANT, GARR-B (It), INRIA (Fr), Abilene, ESnet (2.5 to 10G), CALREN2, STAR-TAP, STARLIGHT, New York, Geneva

  16. LHCnet Network: March 2003 • CERN - Geneva: Cisco 7606 (CERN), Alcatel 7770 (DataTAG/CERN), Cisco 7609 (DataTAG/CERN), Juniper M10 (DataTAG/CERN), Optical Mux/Demux (Alcatel 1670), Linux PC for performance tests & monitoring; connections to GEANT, the Geneva switch, IN2P3 and WHO • Caltech/DoE PoP - StarLight Chicago: Cisco 7606 (Caltech/DoE), Alcatel 7770 (DataTAG/CERN), Cisco 7609 (Caltech/DoE), Juniper M10 (Caltech/DoE), Optical Mux/Demux (Alcatel 1670), Linux PC for performance tests & monitoring; connections to Abilene, ESnet, NASA, MREN, STARTAP • Transatlantic links: OC48 (2.5 Gbps) for development and tests; OC12 (622 Mbps)

  17. FAST (Caltech): A Scalable, "Fair" Protocol for Next-Generation Networks: from 0.1 to 100 Gbps • Theory: the Internet as a distributed feedback system (TCP sources Rf(s), Rb'(s); AQM price p) • Experiment: SC2002 (11/02) testbed spanning Geneva, Chicago, Baltimore and Sunnyvale (legs of ~1000, 3000 and 7000 km) • Highlights of FAST TCP: Standard packet size • 940 Mbps single flow/GE card: 9.4 petabit-m/sec, 1.9 times the I2 LSR • 9.4 Gbps with 10 flows: 37.0 petabit-m/sec, 6.9 times the LSR • 22 TB in 6 hours, in 10 flows • Implementation: sender-side (only) mods; delay (RTT) based; Stabilized Vegas (a toy window-update sketch follows below) • Runs shown: Sunnyvale-Geneva, Baltimore-Geneva, Baltimore-Sunnyvale; SC2002 with 1, 2 and 10 flows; I2 LSR marks 29.3.00 (multiple), 9.4.02 (1 flow), 22.8.02 (IPv6) • URL: netlab.caltech.edu/FAST • Next: 10GbE; 1 GB/sec disk to disk • C. Jin, D. Wei, S. Low, FAST Team & Partners
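
A toy sketch of the delay-based, sender-side window update that characterizes FAST: the window is scaled by baseRTT/RTT so that roughly a fixed number of packets stay queued in the path. The constants and the exact update law below are illustrative assumptions, not the production kernel code.

```python
# Illustrative RTT-based window update in the spirit of FAST TCP
# (sender-side only, delay-based, Stabilized-Vegas-like). alpha and gamma
# are made-up values; this is a sketch, not the published implementation.
def fast_like_window(w: float, base_rtt: float, rtt: float,
                     alpha: float = 100.0, gamma: float = 0.5) -> float:
    """One periodic update: aim to keep ~alpha packets queued in the network."""
    target = (base_rtt / rtt) * w + alpha
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

w = 1000.0                                   # window in packets
for rtt in (0.180, 0.185, 0.190, 0.200):     # seconds; RTT grows as queues build
    w = fast_like_window(w, base_rtt=0.180, rtt=rtt)
    print(f"RTT {rtt * 1e3:.0f} ms -> window {w:.0f} packets")
```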

  18. TeraGrid (www.teragrid.org): NCSA, ANL, SDSC, Caltech, PSC; A Preview of the Grid Hierarchy and Networks of the LHC Era • DTF Backplane: 4 x 10 Gbps • Sites and hubs in the diagram: Chicago (Starlight / NW Univ), Indianapolis (Abilene NOC), Urbana (NCSA/UIUC), Caltech, San Diego (SDSC), UIC, Ill Inst of Tech, ANL, Univ of Chicago; I-WIRE multiple carrier hubs • Links: OC-48 (2.5 Gb/s, Abilene); multiple 10 GbE (Qwest); multiple 10 GbE (I-WIRE Dark Fiber) • Source: Charlie Catlett, Argonne

  19. National Light Rail Footprint • NLR buildout started November 2002 • Initially 4 10G wavelengths; to 40 10G waves in future • Map legend: 15808 Terminal, Regen or OADM site; fiber route; sites include SEA, POR, SAC, BOS, NYC, CHI, OGD, DEN, SVL, CLE, WDC, PIT, FRE, KAN, RAL, NAS, STR, LAX, PHO, WAL, ATL, SDG, OLG, DAL, JAC • Transition now to optical, multi-wavelength R&E networks: US, Europe and Intercontinental (US-China-Russia) initiatives; efficient use of wavelengths is an essential part of this picture

  20. HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps • Continuing the Trend: ~1000 Times Bandwidth Growth Per Decade; We are Rapidly Learning to Use and Share Multi-Gbps Networks

  21. HENP Lambda Grids: Fibers for Physics • Problem: Extract "Small" Data Subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte Data Stores • Survivability of the HENP Global Grid System, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time • Example: take 800 secs to complete the transaction; then a 1 TB transaction needs 10 Gbps of net throughput, 10 TB needs 100 Gbps, and 100 TB needs 1000 Gbps (the capacity of a fiber today), as checked in the sketch below • Summary: Providing switching of 10 Gbps wavelengths within ~3-5 years, and Terabit switching within 5-8 years, would enable "Petascale Grids with Terabyte transactions", as required to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields
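
The 800-second example can be checked directly from throughput = size / time; the short sketch below reproduces the three rows of the slide's table.

```python
# Net throughput needed to move a data subset of a given size in the slide's
# ~800-second example transaction time.
def required_gbps(size_tb: float, seconds: float = 800.0) -> float:
    return size_tb * 1e12 * 8 / seconds / 1e9

for size_tb in (1, 10, 100):
    print(f"{size_tb:>3} TB in 800 s -> ~{required_gbps(size_tb):.0f} Gbps")
# -> 10, 100 and 1000 Gbps, matching the table above.
```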

  22. Emerging Data Grid User Communities • NSF Network for Earthquake Engineering Simulation (NEES) • Integrated instrumentation, collaboration, simulation • Grid Physics Network (GriPhyN) • ATLAS, CMS, LIGO, SDSS • Access Grid; VRVS: supporting group-based collaboration • And • Genomics, Proteomics, ... • The Earth System Grid and EOSDIS • Federating Brain Data • Computed MicroTomography … • Virtual Observatories

  23. COJAC: CMS ORCA Java Analysis Component (Java3D, Objectivity, JNI, Web Services) • Demonstrated Caltech-Rio de Janeiro (2/02) and Chile (5/02)

  24. CAIGEE • CMS Analysis – an Integrated Grid Enabled Environment (CIT, UCSD, Riverside, Davis; + UCLA, UCSB) • NSF ITR – 50% so far • Lightweight, functional, making use of existing software AFAP • Plug-in Architecture based on Web Services • Expose the Grid "Global System" to physicists – at various levels of detail, with Feedback • Supports Request, Preparation, Production, Movement, and Analysis of Physics Object Collections • Initial Target: Californian US-CMS physicists • Future: Whole US CMS and CMS

  25. Clarens • The Clarens Remote (Light-Client) Dataserver: a WAN system for remote data analysis (Client - HTTP/SOAP/RPC - Clarens Web Service - Server) • Clarens servers are deployed at Caltech, Florida, UCSD, FNAL, Bucharest; extending to UERJ in Rio (CHEPREO) • SRB now installed as a Clarens service on the Caltech Tier2 (Oracle backend)

  26. NSF ITR: Globally Enabled Analysis Communities • Develop and build Dynamic Workspaces • Build Private Grids to support scientific analysis communities • Using Agent-Based Peer-to-Peer Web Services • Construct Autonomous Communities Operating Within Global Collaborations • Empower small groups of scientists (Teachers and Students) to profit from and contribute to int'l big science • Drive the democratization of science via the deployment of new technologies

  27. NSF ITR: Key New Concepts • Dynamic Workspaces • Provide capability for individual and sub-community to request and receive expanded, contracted or otherwise modified resources, while maintaining the integrity and policies of the Global Enterprise • Private Grids • Provide capability for individual and community to request, control and use a heterogeneous mix of Enterprise wide and community specific software, data, meta-data, resources • Build on Global Managed End-to-end Grid Services Architecture; Monitoring System • Autonomous, Agent-Based, Peer-to-Peer

  28. Private Grids and P2P Sub-Communities in Global CMS

  29. A Global Grid Enabled Collaboratory for Scientific Research (GECSR) • A joint ITR proposal from Caltech (HN PI, JB CoPI), Michigan (CoPI, CoPI), Maryland (CoPI), and Senior Personnel from Lawrence Berkeley Lab, Oklahoma, Fermilab, Arlington (U. Texas), Iowa, and Florida State • The first Grid-enabled Collaboratory: tight integration between • the Science of Collaboratories • a globally scalable working environment • a sophisticated set of collaborative tools (VRVS, VNC; Next-Gen) • an agent-based monitoring and decision support system (MonALISA)

  30. GECSR • Initial targets are the global HENP collaborations, but GECSR is expected to be widely applicable to other large-scale collaborative scientific endeavors • "Giving scientists from all world regions the means to function as full partners in the process of search and discovery" • The importance of Collaboration Services is highlighted in the Cyberinfrastructure report of Atkins et al. (2003)

  31. Current Grid Challenges: Secure Workflow Management and Optimization • Maintaining a Global View of Resources and System State • Coherent end-to-end System Monitoring • Adaptive Learning: new algorithms and strategies for execution optimization (increasingly automated) • Workflow: Strategic Balance of Policy Versus Moment-to-Moment Capability to Complete Tasks • Balance High Levels of Usage of Limited Resources Against Better Turnaround Times for Priority Jobs • Goal-Oriented Algorithms; Steering Requests According to (Yet to be Developed) Metrics • Handling User-Grid Interactions: Guidelines; Agents • Building Higher Level Services, and an Integrated Scalable User Environment for the Above

  32. 14,000+ Hosts; 8,000+ Registered Users in 64 Countries; 56 Reflectors (7 on I2); Annual Growth 2 to 3x

  33. MonALISA: A Globally Scalable Grid Monitoring System, by I. Legrand (Caltech) • Deployed on the US CMS Grid • Agent-based dynamic information / resource discovery mechanism • Talks with other monitoring systems • Implemented in Java/Jini and SNMP; WSDL/SOAP with UDDI • Part of a Global "Grid Control Room" Service (a toy host-agent sketch follows below)
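
As a toy illustration of the kind of per-host agent such a monitoring fabric aggregates, the sketch below collects a few local metrics and pushes them to a collector over HTTP. The endpoint URL and JSON message format are invented for illustration; the real system is built on Java/Jini, SNMP and WSDL/SOAP.

```python
# Hypothetical lightweight host agent: gather a few metrics and push them to
# a collector. The COLLECTOR URL and message format are invented, not MonALISA's.
import json
import os
import socket
import time
import urllib.request

COLLECTOR = "http://monitor.example.org:8080/report"   # hypothetical endpoint

def snapshot() -> dict:
    load1, load5, load15 = os.getloadavg()             # Unix-only load averages
    return {"host": socket.gethostname(), "time": time.time(),
            "load1": load1, "load5": load5, "load15": load15}

def push(metrics: dict) -> None:
    req = urllib.request.Request(COLLECTOR, data=json.dumps(metrics).encode(),
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req, timeout=5)

if __name__ == "__main__":
    push(snapshot())
```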

  34. Distributed System Services Architecture (DSSA): CIT/Romania/Pakistan • Agents: autonomous, auto-discovering, self-organizing, collaborative • "Station Servers" (static) host mobile "Dynamic Services" • Servers interconnect dynamically and form a robust fabric in which mobile agents travel, with a payload of (analysis) tasks • Components in the diagram: Lookup Discovery Service, Lookup Services, Service Listener, Remote Notification, Registration, Station Servers, Proxy Exchange • Adaptable to Web services (OGSA) and many platforms • Adaptable to ubiquitous, mobile working environments • Managing global systems of increasing scope and complexity, in the service of science and society, requires a new generation of scalable, autonomous, artificially intelligent software systems

  35. MONARC SONN: 3 Regional Centres Learning to Export Jobs (Day 9), by I. Legrand • Sites in the simulation snapshot: CERN (30 CPUs), CALTECH (25 CPUs), NUST (20 CPUs), with inter-site links of 0.8-1.2 MB/s at 150-200 ms RTT • Mean efficiencies shown: <E> = 0.83, 0.73, 0.66 (Day = 9); a toy export-decision sketch follows below
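
The SONN learns its export policy from feedback; the toy sketch below hard-codes a similar trade-off (queue wait plus input-transfer time over the inter-site link) to pick where a job should run. CPU counts and link speeds are taken loosely from the slide; the queue lengths, job parameters and link pairings are invented for illustration.

```python
# Toy job-export decision in the spirit of the MONARC study above: run the job
# wherever queue wait + input transfer + execution finishes soonest.
def completion_time(queue_len, cpus, job_cpu_s, input_mb, link_mb_s, rtt_s):
    wait = queue_len / cpus * job_cpu_s                 # crude queue-wait estimate
    transfer = 0.0 if link_mb_s is None else input_mb / link_mb_s + rtt_s
    return wait + transfer + job_cpu_s

sites = {  # CPU counts and link figures loosely from the slide; queues invented
    "CERN (local)": dict(queue_len=60, cpus=30, link_mb_s=None, rtt_s=0.0),
    "CALTECH":      dict(queue_len=10, cpus=25, link_mb_s=1.2, rtt_s=0.150),
    "NUST":         dict(queue_len=5,  cpus=20, link_mb_s=0.8, rtt_s=0.200),
}
best = min(sites, key=lambda s: completion_time(job_cpu_s=600.0, input_mb=200.0,
                                                **sites[s]))
print("run the job at:", best)
```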

  36. Networks, Grids, HENP and WAN-in-Lab • The current generation of 2.5-10 Gbps network backbones arrived in the last 15 months in the US, Europe and Japan • Major transoceanic links also at 2.5-10 Gbps in 2003 • Capability increased ~4 times, i.e. 2-3 times Moore's Law • Reliable, high end-to-end performance of network applications (large file transfers; Grids) is required. Achieving this requires: • A deep understanding of protocol issues, for efficient use of wavelengths in the 1 to 10 Gbps range now, and higher speeds (e.g. 40 to 80 Gbps) in the near future • Getting high-performance (TCP) toolkits into users' hands • End-to-end monitoring; a coherent approach • Removing regional and last-mile bottlenecks and compromises in network quality, which are now on the critical path in all regions • We will work in concert with AMPATH, Internet2, Terena, APAN, DataTAG, the Grid projects and the Global Grid Forum • A WAN in Lab facility, available to the community, is a key element in achieving these revolutionary goals

  37. Some Extra Slides Follow

  38. Global Networks for HENP • National and International Networks, with sufficient (rapidly increasing) capacity and capability, are essential for • The daily conduct of collaborative work in both experiment and theory • Detector development & construction on a global scale; Data analysis involving physicists from all world regions • The formation of worldwide collaborations • The conception, design and implementation of next generation facilities as “global networks” • “Collaborations on this scale would never have been attempted, if they could not rely on excellent networks”

  39. The Large Hadron Collider (2007-) • The next-generation particle collider; the largest superconductor installation in the world • Bunch-bunch collisions at 40 MHz, each generating ~20 interactions • Only one in a trillion may lead to a major physics discovery • Real-time data filtering: Petabytes per second to Gigabytes per second • Accumulated data of many Petabytes/Year (rough arithmetic in the sketch below)
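
Rough arithmetic behind "many Petabytes/Year": the accept rate and stored event size below are assumptions for illustration (they are not quoted on the slide), and derived copies plus simulation push the real totals higher still.

```python
# Back-of-the-envelope yearly volume after real-time filtering.
# All inputs here are assumed round numbers, not slide figures.
accept_rate_hz = 100.0          # events written out per second (assumed)
event_size_bytes = 1.5e6        # stored size per event (assumed)
live_seconds_per_year = 1e7     # a canonical accelerator year
experiments = 4                 # ATLAS, CMS, ALICE, LHCb

pb_per_year = (accept_rate_hz * event_size_bytes *
               live_seconds_per_year * experiments) / 1e15
print(f"~{pb_per_year:.0f} PB/year of stored events, before derived data copies")
```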

  40. Education and Outreach • QuarkNet has 50 centers nationwide (60 planned) • Each center has: 2-6 physicist mentors and 2-12 teachers* • *Depending on the year of the program and local variations

  41. A transatlantic testbed • Multi-platform and vendor-independent • Interoperability tests • Performance tests • Multiplexing of optical signals into a single OC-48 transatlantic optical channel • Layer 2 services: Circuit Cross-Connect (CCC), Layer 2 VPN • IP services: Multicast, IPv6, QoS • Future: GMPLS (an extension of MPLS)

  42. Service Implementation Example (StarLight - Chicago to CERN - Geneva) • Diagram elements: Abilene & ESnet and GEANT peering via BGP; IBGP over the 2.5 Gb/s transatlantic channel; OSPF in the CERN Network; CCC and VLANs 600/601 carrying Hosts 1-3 at each end; 10 GbE and GbE attachments; optical multiplexers at both ends of the 2.5 Gb/s channel

  43. Logical View of the Previous Implementation (2.5 Gb/s POS between StarLight - Chicago and CERN - Geneva) • Abilene & ESnet / GEANT to the CERN Network: IP over POS • Host-to-host paths on VLAN 600 and VLAN 601: Ethernet over MPLS over POS and IP over MPLS over POS • Additional host pair: Ethernet over POS

  44. Using Web Services for Tag Data (Example) • Use ~180,000 Tag objects derived from di-jet ORCA events • Each Tag: run & event number, OID of ORCA event object, then E,Phi,Theta,ID for 5 most energetic particles, and E,Phi,Theta for 5 most energetic jets • These Tag events have been used in various performance tests and demonstrations, e.g. SC2000, SC2001, comparison of Objy vs RDBMS query speeds (GIOD), as source data for COJAC, etc.
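
For concreteness, a sketch of how the Tag quantities listed above could be laid out as a relational table. The column names are invented for illustration; the slide only lists which quantities are stored.

```python
# Generate a CREATE TABLE for a Tag layout with run/event, the ORCA object ID,
# and E, Phi, Theta (+ particle ID) for the 5 leading particles and 5 leading jets.
# Column names are hypothetical.
particle_cols = [f"{q}_part{i}" for i in range(1, 6) for q in ("e", "phi", "theta", "pid")]
jet_cols = [f"{q}_jet{i}" for i in range(1, 6) for q in ("e", "phi", "theta")]

ddl = "CREATE TABLE tag (\n  run INTEGER,\n  event INTEGER,\n  orca_oid VARCHAR(64),\n"
ddl += ",\n".join(f"  {c} INTEGER" if c.startswith("pid") else f"  {c} FLOAT"
                  for c in particle_cols + jet_cols)
ddl += "\n);"
print(ddl)
```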

  45. DB-Independent Access to Object Collections: Middleware Prototype • First layer ODBC provides database vendor abstraction, allowing any relational (SQL) database to be plugged into the system. • Next layer OTL provides an encapsulation of the results of a SQL query in a form natural to C++, namely STL (standard template library) map and vector objects. • Higher levels map C++ object collections to the client’s required format, and transport the results to the client.
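
The prototype described above is a C++ stack (ODBC for vendor abstraction, OTL to map results into STL containers). As a rough analogue, the sketch below uses Python's DB-API, which plays the same vendor-independent role: any conforming driver (sqlite3, cx_Oracle, psycopg2, ...) can be plugged in underneath, and results come back in a client-neutral form for higher layers to repackage.

```python
# Vendor-independent query layer sketch: swap the connect callable to change
# databases; results are returned as plain dicts for higher layers to reformat.
import sqlite3  # stand-in driver; any DB-API module works the same way

def fetch_collection(connect, query, params=()):
    """Run a SQL query and return rows as a list of {column: value} dicts."""
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute(query, params)
        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]
    finally:
        conn.close()

rows = fetch_collection(lambda: sqlite3.connect(":memory:"),
                        "SELECT 1 AS run, 42 AS event")
print(rows)   # [{'run': 1, 'event': 42}]
```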

  46. Reverse Engineer and Ingest the CMS JETMET nTuples • From the nTuple description: • Derived an ER diagram for the content • An AOD for the JETMET analysis • We then wrote tools to: • Automatically generate a set of SQL CREATE TABLE commands to create the RDBMS tables • Generate SQL INSERT and bulk load scripts that enable population of the RDBMS tables • We imported the ntuples into • SQLServer at Caltech • Oracle 9i at Caltech • Oracle 9i at CERN • PostgreSQL at Florida • In the future: • Generate a “Tag” table of data that captures the most often used data columns in the nTuple
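
A minimal sketch of the code generation described above: from a simple description of the nTuple columns, emit the CREATE TABLE statement and a parameterized INSERT suitable for bulk loading. The column names below are invented placeholders, not the actual JETMET content.

```python
# Generate DDL and a bulk-load INSERT from an nTuple column description.
# The columns listed here are hypothetical examples.
ntuple = {"run": "INTEGER", "event": "INTEGER", "njets": "INTEGER",
          "met": "FLOAT", "jet1_et": "FLOAT", "jet1_eta": "FLOAT"}
table = "jetmet_aod"

create = (f"CREATE TABLE {table} (\n  "
          + ",\n  ".join(f"{name} {sqltype}" for name, sqltype in ntuple.items())
          + "\n);")
insert = (f"INSERT INTO {table} ({', '.join(ntuple)}) "
          f"VALUES ({', '.join('?' * len(ntuple))});")

print(create)
print(insert)   # pair with executemany() to bulk-load rows parsed from the nTuple
```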

  47. iGrid2002 (Amsterdam) GAE Demonstrations

  48. Clarens (Continued) • Servers • Multi-process (based on Apache), using XML-RPC • Similar, but using SOAP • Lightweight (using a select loop), single-process • Functionality: file access (read/download, part or whole), directory listing, file selection, file checksumming, access to SRB, security with PKI/VO infrastructure, RDBMS data selection/analysis, MonALISA integration • Clients • ROOT client … browsing remote file repositories and files • Platform-independent Python-based client for rapid prototyping (see the sketch below) • Browser-based Java/Javascript client (in planning) … for use when no other client package is desired • Some clients of historical interest, e.g. Objectivity • Source/Discussion: http://clarens.sourceforge.net • Future • Expose POOL as a remote data source in Clarens • Clarens peer-to-peer discovery and communication
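
As an illustration of the lightweight Python client mentioned above, the sketch below issues a generic XML-RPC call of the kind such a client makes against an Apache-hosted server. The server URL and the file.ls method name are hypothetical; consult the Clarens documentation for the actual API and its PKI-based authentication.

```python
# Generic XML-RPC client call; the endpoint and remote method are hypothetical.
import xmlrpc.client

server = xmlrpc.client.ServerProxy("https://tier2.example.edu:8443/clarens")
try:
    listing = server.file.ls("/store/tags")      # hypothetical remote method
    for entry in listing:
        print(entry)
except (xmlrpc.client.Fault, OSError) as err:
    print("request failed:", err)
```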

  49. Globally Scalable Monitoring Service (I. Legrand) • Architecture in the diagram: Farm Monitors register with Lookup Services (found via discovery and a proxy); clients or other services locate them through registration and lookup; data gathered by push & pull (rsh & ssh scripts; SNMP; RC Monitor Service) • Component Factory • GUI marshaling • Code Transport • RMI data access

  50. NSF/ITR: A Global Grid-Enabled Collaboratory for Scientific Research (GECSR) and Grid Analysis Environment (GAE)
