Computational Infrastructures for Science
Marty Humphrey
Assistant Professor
Computer Science Department
University of Virginia
NeSSI Workshop
October 13, 2003
“Traditional” Computational Science
• SP3, O2K, Linux clusters, etc.
• PBS, LSF, LoadLeveler, etc.
• Archival storage
• MPI
• Viz
• SSH, SCP
Grid Definition (Foster and Kesselman)
• “Coordinates resources that are not subject to centralized control…”
• “Using standard, open, general-purpose protocols and interfaces…”
• “To deliver non-trivial qualities of service.”
Grid “Operating System”
[Diagram: a Grid Computing layer spanning multiple hosts and operating systems (Host/OS 1, Host/OS 2, Host/OS 3, …), playing the role of a Grid-wide operating system.]
Grid User Wish-List
• Who cares where it is? It must always be available when I need it
• Make it secure:
  • No one can steal my data
  • No one can pretend to be me
  • Don't tell me who I will/can trust
• Choose secure, fast, cheap resources
• Give me reasonable quality of service
• Don't make me manually move/copy stuff around
• Don't make me learn a new OS
• Allow me to run my existing apps
• I don't want errors; if errors occur, tell me in plain English how I can avoid them next time
• Allow me to more easily collaborate
• Darnit, make my life easier!
Example: Transparent Remote Execution
• User initiates “run”
• User/Grid SW selects site/resource
• Grid SW copies binaries (if necessary)
• Grid SW copies/moves input files
• Grid SW starts job(s)
• Grid SW monitors progress
• Grid SW copies output files
• Forms the basis of parameter-space studies and Monte Carlo runs (a sketch of these steps follows below)
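As a rough, purely illustrative sketch of what the Grid software automates in the steps above, the Python snippet below performs the same sequence by hand with ssh/scp. The host name, paths, and binary are invented placeholders; real middleware (e.g., GRAM for execution, GridFTP for file movement) replaces these calls and adds resource selection and monitoring.

  import subprocess

  # Hand-rolled version of the remote-execution steps, assuming ssh/scp
  # access to a placeholder host.  Grid middleware automates this same
  # sequence: stage binaries/inputs, start the job, retrieve outputs.
  site = "compute.example.edu"        # in a Grid, a resource broker would pick this

  def stage_in(files, dest):
      # copy binaries and input files to the selected site
      subprocess.run(["scp"] + files + [f"{site}:{dest}"], check=True)

  def run_and_wait(command):
      # start the job and block until it finishes (no real monitoring here)
      subprocess.run(["ssh", site, command], check=True)

  def stage_out(remote_files, local_dir):
      # copy output files back to the user
      subprocess.run(["scp"] + [f"{site}:{p}" for p in remote_files] + [local_dir], check=True)

  stage_in(["my_app", "input.dat"], "/scratch/run1/")
  run_and_wait("cd /scratch/run1 && ./my_app input.dat > output.dat")
  stage_out(["/scratch/run1/output.dat"], ".")

A parameter-space or Monte Carlo study would simply loop over this sequence with different inputs, which is exactly the bookkeeping users want the Grid to take off their hands.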
Grid Focus: Virtual Organizations
• Logical grouping of resources and users
• Support community-specific discovery
• Specialized “views”
• Dynamic collaborations of individuals and institutions
• Policy negotiation and enforcement will be key issues looking forward (a toy sketch of VO-based authorization follows below)
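A minimal sketch, under a made-up data model, of how a virtual organization can group users and resources and answer community-specific discovery and authorization questions. The class, field, and identity names are illustrative only; real VO management (e.g., CAS) involves certificates, delegation, and negotiated policy.

  from dataclasses import dataclass, field

  @dataclass
  class VirtualOrganization:
      # a named, dynamic grouping of users and resources with its own policy
      name: str
      members: set = field(default_factory=set)       # user identities (e.g., certificate DNs)
      resources: dict = field(default_factory=dict)   # resource -> set of allowed actions

      def discover(self, action):
          # community-specific "view": resources this VO can use for an action
          return [r for r, actions in self.resources.items() if action in actions]

      def authorized(self, user, resource, action):
          # policy enforcement: VO membership plus per-resource rights
          return user in self.members and action in self.resources.get(resource, set())

  vo = VirtualOrganization("physics-vo")
  vo.members.add("/O=Grid/OU=example.org/CN=Alice")
  vo.resources["cluster.example.edu"] = {"submit", "read"}
  print(vo.discover("submit"))        # ['cluster.example.edu']
  print(vo.authorized("/O=Grid/OU=example.org/CN=Alice", "cluster.example.edu", "submit"))  # True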
Grid Landscape Today: Globus
• Grid Resource Allocation and Management (GRAM)
  • Gatekeeper, Jobmanager (RSL “schedulerspeak”); see the submission sketch below
• Grid Security Infrastructure (GSI)
• Metacomputing Directory Service (MDS) (via OpenLDAP)
  • Grid Index Information Service (GIIS)
  • Grid Resource Information Service (GRIS)
• GridFTP
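To make the GRAM/RSL piece a bit more concrete, here is a small sketch that builds a GT2-style RSL job description and submits it through the globusrun client from Python. The gatekeeper contact string and executable are placeholders, a valid GSI proxy is assumed to exist already, and client flags vary across Globus releases, so treat this as an illustration rather than a recipe.

  import subprocess

  # RSL ("schedulerspeak"): declares the executable, process count,
  # and where stdout should land on the remote side.
  rsl = "&(executable=/bin/hostname)(count=4)(stdout=hostnames.txt)"

  # Placeholder gatekeeper contact: host plus the jobmanager that talks
  # to the local scheduler (PBS, LSF, LoadLeveler, ...).
  contact = "cluster.example.edu/jobmanager-pbs"

  # Hand the request to the globusrun client; GSI proxy credentials are
  # assumed to be in place already (e.g., via grid-proxy-init).
  subprocess.run(["globusrun", "-r", contact, rsl], check=True)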
Grid Landscape Today: Globus (cont.)
• “Add-ons”:
  • MPICH-G2
  • Replica Catalog and Management
  • Community Authorization Service (CAS)
  • Condor-G
  • etc.
• Basis of many large-scale Grids…
Selected Major Grid Projects
[Tables spanning four slides: selected major Grid projects, circa October 2001 and later, with newly launched projects flagged as “New”.]
PetaScale Virtual-Data Grids (slide courtesy of Paul Avery)
[Diagram: production teams, individual investigators, and workgroups (~1 Petaflop, ~100 Petabytes) interact through interactive user tools; requests pass through request planning & scheduling tools, request execution & management tools, and virtual data tools, backed by resource management, security and policy, and other Grid services, down to distributed resources (code, storage, CPUs, networks), data transforms, and the raw data source.]
Data Grid Architecture (slide courtesy of Ian Foster)
[Diagram: an application hands a DAG to a Planner; an Executor runs the resulting concrete DAG against compute and storage resources, using the supporting services below. Example technologies noted on the slide:]
• Catalog services — MCAT, GriPhyN catalogs
• Monitoring and information services — MDS
• Replica management — GDMP
• Executor — DAGMan, Kangaroo
• Policy/security — GSI, CAS
• Compute resource — Globus GRAM
• Reliable transfer service / storage resource — GridFTP, GRAM, SRM
(A toy planner/executor sketch follows below.)
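As a toy illustration of the planner/executor split in this architecture, the sketch below represents a “plan” as a DAG of named steps and “executes” it in dependency order, which is roughly what DAGMan does for real Condor jobs. The step names and commands are invented for the example.

  from graphlib import TopologicalSorter   # Python 3.9+
  import subprocess

  # The abstract plan: each step lists the steps it depends on.
  dag = {
      "stage_in":  [],
      "transform": ["stage_in"],
      "stage_out": ["transform"],
  }

  # What each step actually runs (placeholders standing in for real
  # transfer and transformation jobs).
  commands = {
      "stage_in":  ["echo", "copying raw data"],
      "transform": ["echo", "running derivation"],
      "stage_out": ["echo", "publishing derived data"],
  }

  # The "executor": run steps in an order that respects dependencies.
  for step in TopologicalSorter(dag).static_order():
      subprocess.run(commands[step], check=True)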
US-iVDGL Data Grid (slide courtesy of Paul Avery)
[Map: Tier-1, Tier-2, and Tier-3 sites across the United States, including SKC, Boston U, Wisconsin, Michigan, PSU, BNL, Fermilab, LBL, Argonne, J. Hopkins, NCSA, Indiana, Hampton, Caltech, Oklahoma, Vanderbilt, UCSD/SDSC, FSU, Arlington, UF, FIU, and Brownsville.]
• Partners? EU, CERN, Brazil, Australia, Korea, Japan
Data Grids for High Energy Physics (image courtesy of Harvey Newman, Caltech)
[Diagram: the LHC multi-tier computing model. There is a “bunch crossing” every 25 nsec and ~100 “triggers” per second, each triggered event ~1 MByte in size. The online system feeds the Tier-0 CERN Computer Centre and its offline processor farm (~20 TIPS) at ~100 MBytes/sec. Data flows at ~622 Mbits/sec (or by air freight, now deprecated) to Tier-1 regional centres (FermiLab ~4 TIPS; regional centres in France, Germany, and Italy), then at ~622 Mbits/sec to Tier-2 centres (~1 TIPS each, e.g. Caltech), and on to institute servers (~0.25 TIPS) with physics data caches fed at ~1 MBytes/sec, and finally to Tier-4 physicist workstations (Pentium II 300 MHz class). 1 TIPS is approximately 25,000 SpecInt95 equivalents.]
Physicists work on analysis “channels”; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server.
Global Grid Forum (GGF)
• Grid standards
• Best practices
• Broad academic, national lab, and industry involvement
• Areas: applications and programming environments; architecture; data; information systems and performance; peer-to-peer; scheduling and resource management; security
• GGF9 was last week in Chicago
Many Excellent DOE Grid and Middleware Projects
• Reliable and Secure Group Communication
• Commodity Grid Kits (CoG Kits)
• Middleware for Science Portals
• Scientific Annotation Middleware (SAM)
• Storage Resource Management for Data Grid Applications
• Common Component Architecture (CCA)
• Scalable Software Initiative
Next-Generation Grids
• Web Services
  • “Semantically encapsulate discrete functionality”
  • Loosely coupled, reusable components
  • XML, SOAP, WSDL, UDDI, etc. (see the SOAP sketch below)
  • Broad industrial support: Microsoft, IBM, Sun, BEA, etc.
• Open Grid Services Architecture (OGSA)
  • Combines Grids (Globus, Legion) with Web Services
  • GT3: Java, AXIS, J2EE, etc.
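For readers new to the Web Services acronyms, here is a bare-bones look at what a SOAP call is on the wire, using only the Python standard library. The endpoint URL, XML namespace, and “getJobStatus” operation are invented placeholders; a real OGSA/GT3 client would be generated from the service's WSDL (e.g., by Apache Axis) rather than hand-built like this.

  import urllib.request

  # Hand-built SOAP 1.1 envelope for a fictitious "getJobStatus" operation.
  envelope = """<?xml version="1.0" encoding="UTF-8"?>
  <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
      <getJobStatus xmlns="http://example.org/gridservice">
        <jobId>job-42</jobId>
      </getJobStatus>
    </soap:Body>
  </soap:Envelope>"""

  request = urllib.request.Request(
      "http://gridservice.example.org/ogsa/JobService",   # placeholder endpoint
      data=envelope.encode("utf-8"),
      headers={"Content-Type": "text/xml; charset=utf-8",
               "SOAPAction": "getJobStatus"},
  )

  # This POST only succeeds against a real service; here it just shows the
  # request/response shape that WSDL-generated stubs hide from the caller.
  with urllib.request.urlopen(request) as response:
      print(response.read().decode("utf-8"))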
OGSI.NET
• University of Virginia hosting environment for Grid Services, based on the Microsoft Web Services approach
• Focus: Grid security (e.g., explicit trust management)
• Focus: Grid programming models
• Focus: Connection between UNIX and Win*
Grid Challenges: “UK e-Science Gap Analysis” (Fox and Walker, 30 June 2003)
• Security: VPNs/firewalls, fine-grain access control
• Workflow (“orchestration”) specs and engines
• Fault tolerance
• Grid adaptability (e.g., real-time support)
• Ease of use
• Grid federations
Future Directions
• Grid has come a long way
• Merging of Grid and Web Services shows promise
• Many difficult issues remain:
  • Manageable security
  • Integration with legacy applications/tools
• Challenge for SNS: identify and meet requirements not being met by current Grid technologies