International Grid Communities. Dr. Carl Kesselman carl@isi.edu Information Sciences Institute University of Southern California. The Grid Problem. Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations. Enabling International Cooperation.
International Grid Communities Dr. Carl Kesselman carl@isi.edu Information Sciences Institute University of Southern California
The Grid Problem Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Enabling International Cooperation • International cooperation valuable, because • Scale of Grid problem is large • Expertise on both sides of Atlantic & Pacific • Important international applications • Cost of noncooperation can be high • Useful cooperation will not just happen but must be explicitly encouraged • Substantial testbed & application projects, jointly sponsored by EU, US, others • Transatlantic ‘Terabit’ Testbed, etc. • International Virtual Data Grid Laboratory
Grid Forum • IETF-like body to codify standard practice • Two meetings held so far, next in April • European Grid Forum established to address Europe-specific issues
Layered Grid Architecture (By Analogy to Internet Architecture) • Grid layers, top to bottom • Application • Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services • Resource: “Sharing single resources”: negotiating access, controlling use • Connectivity: “Talking to things”: communication (Internet protocols) & security • Fabric: “Controlling things locally”: access to, & control of, resources • Internet Protocol Architecture layers, top to bottom • Application • Transport • Internet • Link
The Grid Physics Network • Petabyte-scale computational environment for data-intensive science • CMS and ATLAS Projects of the Large Hadron Collider • Laser Interferometer Gravitational-Wave Observatory • Sloan Digital Sky Survey (200 million objects each with ~100 attributes)
Data Grids • Integrate data archives into a distributed data management and analysis “Grid” • More than storage & network, also e.g. • Caching and mirroring to exploit locality • Intelligent scheduling to determine appropriate replica, site for (re)computation, etc. • Coordinated resource management for performance guarantees • Embedded security, policy, agent technologies for effective distributed analysis
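The "appropriate replica" bullet above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Replica` fields, the site names, the bandwidth and latency figures, and the simple cost model (startup latency plus size over bandwidth). It is not the scheduling policy of any particular Data Grid system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a dataset (hypothetical model, not a real Grid API)."""
    site: str
    bandwidth_mbytes_per_s: float  # assumed sustained bandwidth to the requester
    latency_s: float               # assumed fixed startup cost, e.g. tape staging


def estimated_transfer_s(replica: Replica, size_mbytes: float) -> float:
    """Estimate transfer time as startup latency plus size over bandwidth."""
    return replica.latency_s + size_mbytes / replica.bandwidth_mbytes_per_s


def choose_replica(replicas: list[Replica], size_mbytes: float) -> Replica:
    """Pick the replica with the lowest estimated transfer time."""
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas, key=lambda r: estimated_transfer_s(r, size_mbytes))


# Illustrative numbers only.
replicas = [
    Replica("archive", bandwidth_mbytes_per_s=10.0, latency_s=60.0),
    Replica("regional-cache", bandwidth_mbytes_per_s=50.0, latency_s=5.0),
    Replica("local-cache", bandwidth_mbytes_per_s=5.0, latency_s=0.0),
]

best_small = choose_replica(replicas, size_mbytes=10.0)    # small request
best_large = choose_replica(replicas, size_mbytes=1000.0)  # large request
print(best_small.site, best_large.site)
```

With these made-up figures the small request is served from the slow but zero-latency local cache, while the large request goes to the faster regional cache. A real scheduler would also weigh policy, load, and the option of recomputing the data instead of moving it.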
Virtual Data Grids • Only raw data must exist • Dynamic data production • Large extent and scale • national or worldwide, multiple distance scales • large numbers of resources • Sophisticated new services • Coordinated use of remote resources • Transparency in data-handling and processing • Optimize for cost, time, policy constraints, …
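The idea that "only raw data must exist" can be sketched as a catalog that stores raw data plus recipes, and produces derived data on demand. The class name `VirtualDataCatalog`, its methods, and the toy transformations are hypothetical and chosen only for illustration. They do not reflect the interface of any GriPhyN tool.

```python
class VirtualDataCatalog:
    """Toy virtual-data catalog: derived products are computed when requested."""

    def __init__(self, raw):
        self._store = dict(raw)   # materialized products, seeded with raw data
        self._recipes = {}        # product name -> (function, input names)
        self.computations = 0     # how many derivations were actually run

    def register(self, name, func, inputs):
        """Record how to derive `name` from other products."""
        self._recipes[name] = (func, tuple(inputs))

    def get(self, name):
        """Return a product, deriving (and caching) it if it does not exist yet."""
        if name in self._store:
            return self._store[name]
        if name not in self._recipes:
            raise KeyError(f"no data or recipe for {name!r}")
        func, inputs = self._recipes[name]
        value = func(*(self.get(i) for i in inputs))
        self.computations += 1
        self._store[name] = value
        return value


# Only the raw data exists up front; the values are made up.
catalog = VirtualDataCatalog({"raw_events": [3, 8, 1, 9, 4]})
catalog.register("selected", lambda ev: [e for e in ev if e > 3], ["raw_events"])
catalog.register("summary", lambda sel: sum(sel) / len(sel), ["selected"])

first = catalog.get("summary")   # triggers two derivations
second = catalog.get("summary")  # served from the cache, no recomputation
print(first, second, catalog.computations)
```

The sketch always caches what it derives. A real system would decide between recomputing and fetching a stored copy based on cost, time, and policy constraints, as the last bullet above suggests.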
Grid Communities & Applications: Data Grids for High Energy Physics (image courtesy Harvey Newman, Caltech) • [Figure: tiered data flow] • Online System: ~PBytes/sec in, ~100 MBytes/sec out • Offline Processor Farm: ~20 TIPS, ~100 MBytes/sec to Tier 0 • Tier 0: CERN Computer Centre • ~622 Mbits/sec, or Air Freight (deprecated), to Tier 1 • Tier 1: FermiLab (~4 TIPS); France, Germany, and Italy Regional Centres • ~622 Mbits/sec to Tier 2 • Tier 2: Caltech (~1 TIPS) and four further Tier2 Centres (~1 TIPS each) • ~622 Mbits/sec to the institutes • Institutes: ~0.25 TIPS each, with a physics data cache • ~1 MBytes/sec to Tier 4 • Tier 4: physicist workstations (Pentium II 300 MHz) • HPSS appears as a label at five points in the figure • There is a “bunch crossing” every 25 nsecs • There are 100 “triggers” per second • Each triggered event is ~1 MByte in size • 1 TIPS is approximately 25,000 SpecInt95 equivalents • Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
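The rates quoted on this slide can be cross-checked with a little arithmetic. The sketch below uses only figures from the slide (25 nsec bunch crossings, 100 triggers per second, ~1 MByte per event, ~622 Mbits/sec links, 25,000 SpecInt95 per TIPS, ~4 TIPS at FermiLab). It assumes decimal units (1 TByte = 10^6 MBytes), 8 bits per byte, and no protocol overhead; the variable names are illustrative.

```python
# Figures taken from the slide.
BUNCH_CROSSING_INTERVAL_S = 25e-9
TRIGGERS_PER_S = 100
EVENT_SIZE_MBYTES = 1.0
LINK_MBITS_PER_S = 622.0
SPECINT95_PER_TIPS = 25_000
FERMILAB_TIPS = 4

# Assumption: decimal units and a full day of continuous running.
SECONDS_PER_DAY = 86_400
MBYTES_PER_TBYTE = 1e6

crossing_rate_hz = 1.0 / BUNCH_CROSSING_INTERVAL_S           # about 40 MHz
trigger_fraction = TRIGGERS_PER_S / crossing_rate_hz         # share of crossings kept
recorded_mbytes_per_s = TRIGGERS_PER_S * EVENT_SIZE_MBYTES   # matches ~100 MBytes/sec
link_mbytes_per_s = LINK_MBITS_PER_S / 8.0                   # raw link capacity
daily_tbytes = recorded_mbytes_per_s * SECONDS_PER_DAY / MBYTES_PER_TBYTE
fermilab_specint95 = FERMILAB_TIPS * SPECINT95_PER_TIPS

print(f"crossing rate:     {crossing_rate_hz:.3g} Hz")
print(f"fraction recorded: {trigger_fraction:.3g}")
print(f"recorded rate:     {recorded_mbytes_per_s:.0f} MBytes/sec")
print(f"622 Mbit/s link:   {link_mbytes_per_s:.2f} MBytes/sec")
print(f"recorded per day:  {daily_tbytes:.2f} TBytes")
print(f"FermiLab capacity: {fermilab_specint95} SpecInt95")
```

Under these assumptions a single ~622 Mbits/sec link carries a little under 78 MBytes/sec, which is below the ~100 MBytes/sec recorded rate, so one such link could not forward the full recorded stream in real time.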
GriPhyN Architecture • Users: Production Team, Individual Investigator, Other Users • Interactive User Tools • Virtual Data Tools • Request Planning and Scheduling Tools • Request Execution Management Tools • Resource Management Services • Security and Policy Services • Other Grid Services • Transforms • Raw data source • Distributed resources (code, storage, computers, and network)
GriPhyN Usage Scenario • Major Archive Facilities • Network caches & regional centers • Local sites
iVDGL • International Virtual-Data Grid Laboratory • A place to conduct Data Grid tests at scale • Concrete manifestation of world-wide grid activity • Continuing activity that will drive Grid awareness • A basis for further funding • Scale of effort • For national, intl scale Data Grid tests, operations • Computationally and data intensive computing • Fast networks • Who • Initially US-UK-EU; Japan, Australia • Other world regions later • Discussions w/ Russia, China, Pakistan, India, South America
iVDGL Architecture • Application Experiments • iGOC • Experiment Management • Experiment Scheduler • Experiment Data Collection • iVDGL Configuration Information • Health and Status Monitoring • iGLS • Access Control and Policy Services • iVDGL Mgmt. Interface • iVDGL Monitoring Interface • iVDGL Control Interface • Local Management Interface • Compute Platform • Storage Platform
iVDGL Map Circa 2003-2004 • [Figure: map] • Legend, facilities: Tier0/1 facility, Tier2 facility, Tier3 facility • Legend, links: 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link
iVDGL as a Laboratory • Grid Exercises • “Easy”, intra-experiment tests first (10-20%, national, transatlantic) • “Harder” wide-scale tests later (50-100% of all resources) • Local control of resources vitally important • Experiments, politics demand it • Strong interest from other disciplines • HEP + NP experiments • Virtual Observatory (VO) community in Europe/US • Gravity wave community in Europe/US/(Japan?) • Earthquake engineering • Bioinformatics • Computer scientists (wide scale tests)
Conclusions • Application communities for major Grid experiments are international • More communities than those mentioned • International testbeds are coming • Wires are only part of the solution • Common middleware architecture is the enabling technology