

  1. Computational Infrastructures for Science. Marty Humphrey, Assistant Professor, Computer Science Department, University of Virginia. NeSSI Workshop, October 13, 2003

  2. “Traditional” Computational Science • SP3, O2K, Linux clusters, etc. • PBS, LSF, LoadLeveler, etc. • Archival storage • MPI • Viz • SSH, SCP
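
To make this toolchain concrete, the following is a minimal sketch (Python shelling out to the tools named above) of the “traditional” cycle: stage files with SCP, submit a batch job over SSH, check the queue, and copy results back for local visualization. The cluster name, directories, and job script are hypothetical placeholders, and a PBS-style scheduler (qsub/qstat) is assumed.

# A minimal sketch of the "traditional" workflow on this slide: stage files with
# SCP, submit a PBS batch job over SSH, and copy results back by hand.
# The hostname, paths, and job script name are hypothetical placeholders.
import subprocess

CLUSTER = "login.cluster.example.edu"   # hypothetical Linux cluster front end

def run(cmd):
    """Run a local command and fail loudly if it returns non-zero."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Copy the input deck and PBS job script to the cluster (SCP).
run(["scp", "input.dat", "job.pbs", f"{CLUSTER}:run01/"])

# 2. Submit the job to the batch scheduler (PBS's qsub) over SSH.
#    job.pbs would request nodes/walltime and launch the MPI binary,
#    e.g. "mpirun -np 16 ./simulate input.dat".
run(["ssh", CLUSTER, "cd run01 && qsub job.pbs"])

# 3. Later: poll the queue, then pull results back for local visualization.
run(["ssh", CLUSTER, "qstat -u $USER"])
run(["scp", f"{CLUSTER}:run01/output.dat", "."])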

  3. Grid Definition (Foster and Kesselman) • “Coordinates resources that are not subject to centralized control…” • “Using standard, open, general-purpose protocols and interfaces…” • “To deliver non-trivial qualities of service.”

  4. Grid “Operating System” [diagram: a “Grid Computing” layer spanning multiple hosts and operating systems (Host/OS 1,1; Host/OS 2,1; Host/OS 3,1)]

  5. Grid User Wish-List • Who cares where it is? It must always be available when I need it • Make it secure: no one can steal my data, no one can pretend to be me, don’t tell me who I will/can trust • Choose secure, fast, cheap resources • Give me reasonable quality of service • Don’t make me manually move/copy stuff around • Don’t make me learn a new OS • Allow me to run my existing apps • I don’t want errors; if errors occur, tell me in plain English how I can avoid them next time • Allow me to more easily collaborate • Darnit, make my life easier!

  6. Example: Transparent Remote Execution • User initiates “run” • User/Grid SW selects site/resource • Grid SW copies binaries (if necessary) • Grid SW copies/moves input files • Grid SW starts job(s) • Grid SW monitors progress • Grid SW copies output files. Forms the basis of parameter-space studies or Monte Carlo runs.
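
The following is a minimal sketch of the same sequence in Python. The “sites” are just local directories and every helper is a hypothetical stand-in for real Grid software, but the flow matches the bullets above: select a resource, stage binaries and inputs, start and monitor the job, and copy outputs back, with a parameter-space / Monte Carlo loop at the end.

# A minimal sketch of the transparent remote execution steps on this slide.
# Every function is a hypothetical placeholder for what the Grid software
# (not the user) would do; no real Grid API is assumed.
import glob, os, shutil, subprocess, time

def select_resource(candidates):
    # Grid SW selects a site/resource; a real scheduler would weigh load,
    # queue depth, policy, and data locality. Here the first candidate wins.
    return candidates[0]

def stage(files, dest_dir):
    # Grid SW copies binaries and input files (if necessary).
    os.makedirs(dest_dir, exist_ok=True)
    for f in files:
        shutil.copy(f, dest_dir)

def run_job(workdir, binary, args):
    # Grid SW starts the job and monitors progress until completion.
    proc = subprocess.Popen([os.path.join(workdir, binary)] + args, cwd=workdir)
    while proc.poll() is None:
        time.sleep(5)          # a real system would report status to the user
    return proc.returncode

def transparent_run(binary, inputs, params):
    site = select_resource(["/scratch/siteA", "/scratch/siteB"])   # hypothetical "sites"
    stage([binary] + inputs, site)
    rc = run_job(site, os.path.basename(binary), params)
    for out in glob.glob(os.path.join(site, "*.out")):   # Grid SW copies output files back
        shutil.copy(out, ".")
    return rc

# Parameter-space / Monte Carlo use: the same run, many times, varying one argument.
for seed in range(10):
    transparent_run("./simulate", ["input.dat"], ["--seed", str(seed)])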

  7. Web Interface for CHARMM and Amber

  8. Status of CHARMM or Amber Run

  9. Grid Focus: Virtual Organizations • Logical grouping of resources and users • Support community-specific discovery • Specialized “views” • Dynamic collaborations of individuals and institutions • Policy negotiation and enforcement will be key issues looking forward

  10. Grid Landscape Today: Globus • Grid Resource Allocation and Management (GRAM) • Gatekeeper, Jobmanager (RSL → “scheduler-speak”) • Grid Security Infrastructure (GSI) • Metacomputing Directory Service (MDS) (via OpenLDAP) • Grid Index Information Service (GIIS) • Grid Resource Information Service (GRIS) • GridFTP
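
As a hedged illustration of how these pieces fit together, the sketch below assumes a Globus Toolkit 2-style command-line installation (grid-proxy-init, globusrun, globus-url-copy) and drives it from Python. The gatekeeper contact, host names, and file paths are hypothetical placeholders, and the RSL attributes shown are typical ones that should be checked against the local jobmanager.

# A hedged sketch of driving the Globus components named on this slide,
# assuming a GT2-style command-line install. Hosts, contact string, and paths
# are hypothetical placeholders.
import subprocess

CONTACT = "compute.example.edu/jobmanager-pbs"   # hypothetical GRAM gatekeeper contact

# GRAM job described in RSL, which the jobmanager translates to "scheduler-speak".
RSL = """& (executable = "/home/jdoe/bin/simulate")
  (arguments = "input.dat")
  (count = 8)
  (jobType = mpi)
  (stdout = "sim.out")
  (stderr = "sim.err")"""

# 1. GSI: obtain a short-lived proxy credential from the user's certificate.
subprocess.run(["grid-proxy-init"], check=True)

# 2. GridFTP: stage the input file to the remote resource.
subprocess.run(["globus-url-copy",
                "file:///home/jdoe/input.dat",
                "gsiftp://compute.example.edu/home/jdoe/input.dat"], check=True)

# 3. GRAM: submit the RSL job description to the gatekeeper/jobmanager.
with open("job.rsl", "w") as f:
    f.write(RSL)
subprocess.run(["globusrun", "-r", CONTACT, "-f", "job.rsl"], check=True)

# 4. GridFTP again: pull results back when the job has finished.
subprocess.run(["globus-url-copy",
                "gsiftp://compute.example.edu/home/jdoe/sim.out",
                "file:///home/jdoe/sim.out"], check=True)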

  11. Grid Landscape Today: Globus (cont.) • “Add-ons”: • MPICH-G2 • Replica Catalog and Management • Community Authorization Service (CAS) • Condor-G • etc. • Basis of many large-scale Grids…

  12. Selected Major Grid Projects (Oct 2001) [table of projects not captured in the transcript]

  13. Selected Major Grid Projects [table continued; several projects marked “New”]

  14. Selected Major Grid Projects [table continued]

  15. Selected Major Grid Projects [table continued]

  16. PetaScale Virtual-Data Grids (slide courtesy of Paul Avery) [architecture diagram: production teams, individual investigators, and workgroups use interactive user tools, which drive request planning & scheduling tools, request execution & management tools, and virtual data tools; these rest on resource management, security and policy, and other Grid services, over transforms, distributed resources (code, storage, CPUs, networks), and a raw data source; scale: ~1 Petaflop, ~100 Petabytes]

  17. Data Grid Architecture (slide courtesy of Ian Foster) [layered diagram: an application hands a DAG to a planner, which uses catalog services (MCAT; GriPhyN catalogs), info services (MDS), and monitoring (MDS); an executor (DAGMan, Kangaroo) drives replica management (GDMP), policy/security (GSI, CAS), and a reliable transfer service (GridFTP) over compute resources (Globus GRAM) and storage resources (GridFTP; GRAM; SRM)]

  18. US-iVDGL Data Grid (slide courtesy of Paul Avery) [map of Tier1, Tier2, and Tier3 sites: SKC, Boston U, Wisconsin, Michigan, PSU, BNL, Fermilab, LBL, Argonne, J. Hopkins, NCSA, Indiana, Hampton, Caltech, Oklahoma, Vanderbilt, UCSD/SDSC, FSU, Arlington, UF, FIU, Brownsville; prospective partners: EU, CERN, Brazil, Australia, Korea, Japan]

  19. Data Grids for High Energy Physics (image courtesy Harvey Newman, Caltech) [tiered data-grid diagram: the online system sees a “bunch crossing” every 25 nsecs and ~100 triggers per second, each triggered event ~1 MByte in size, giving ~100 MBytes/sec into the offline processor farm (~20 TIPS) at the Tier 0 CERN Computer Centre; ~622 Mbits/sec links (or air freight, deprecated) connect to Tier 1 regional centres with HPSS mass storage (FermiLab ~4 TIPS, France, Germany, Italy); further ~622 Mbits/sec links feed Tier 2 centres (~1 TIPS each, e.g. Caltech); institutes (~0.25 TIPS) cache physics data at ~1 MBytes/sec for Tier 4 physicist workstations; 1 TIPS is approximately 25,000 SpecInt95 equivalents; each institute will have ~10 physicists working on one or more analysis “channels”, whose data should be cached by the institute server]
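
The ~100 MBytes/sec figure into the offline farm is simply the product of the trigger rate and event size quoted on the slide:

\[
100~\text{triggers/s} \times 1~\text{MByte/trigger} \approx 100~\text{MBytes/s}
\]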

  20. Global Grid Forum (GGF) • Grid standards • Best practices • Broad academic, national lab, and industry involvement • Areas: Applications and Programming Environments, Architecture, Data, Information Systems and Performance, Peer-to-Peer, Scheduling and Resource Management, Security • GGF9 was last week in Chicago

  21. Many Excellent DOE Grid and Middleware Projects • Reliable and Secure Group Communication • Commodity Grid Kits (CoGKits) • Middleware for Science Portals • Scientific Annotation Middleware (SAM) • Storage Resource Management for Data Grid Applications • Common Component Architecture (CCA) • Scalable Software Initiative

  22. Next-Generation Grids • Web Services • “Semantically encapsulate discrete functionality” • Loosely coupled, reusable components • XML, SOAP, WSDL, UDDI, etc. • Broad industrial support: Microsoft, IBM, Sun, BEA, etc. • Open Grid Services Architecture (OGSA) • Combine Grids (Globus, Legion) with Web Services • GT3: Java, AXIS, J2EE, etc.
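
To ground the Web Services bullets, here is a minimal sketch of invoking a hypothetical Grid service with a hand-built SOAP 1.1 envelope, using only the Python standard library. The endpoint URL, XML namespace, and the submitJob operation are invented for illustration; in practice a client stub would be generated from the service’s WSDL, and OGSA/GT3 services layer Grid-service semantics (factories, lifetime management, service data) on top of this plumbing.

# A minimal sketch of calling a (hypothetical) Grid service over SOAP, using only
# the Python standard library. The endpoint URL, XML namespace, and the
# "submitJob" operation are invented for illustration; a real client would be
# generated from the service's WSDL rather than hand-building the envelope.
import urllib.request

ENDPOINT = "http://grid.example.edu:8080/ogsa/services/JobFactory"   # hypothetical

envelope = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <submitJob xmlns="http://example.edu/grid/jobs">
      <executable>/bin/hostname</executable>
      <count>1</count>
    </submitJob>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": '"submitJob"'},
)

# The response body is itself a SOAP envelope; a generated stub would parse it
# into typed objects instead of printing raw XML.
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))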

  23. OGSI.NET • University of Virginia hosting environment for Grid Services based on Microsoft Web Services approach • Focus: Grid security (e.g., explicit trust management) • Focus: Grid programming models • Focus: Connection between UNIX and Win*

  24. Biomolecular VO based on OGSI.NET

  25. Grid Challenges: “UK E-Science Gap Analysis” (Fox and Walker, June 30, 2003) • Security: VPNs/Firewalls, fine-grain access control • Workflow (“orchestration”) specs and engines • Fault tolerance • Grid adaptability (e.g., real-time support) • Ease of use • Grid federations

  26. Future Directions • Grid has come a long way • Merging of Grid and Web Services shows promise • Many difficult issues remain • Manageable security • Integration with legacy applications/tools • Challenge for SNS: Identify and meet requirements not being met by current Grid technologies
