
GRIDS Center Overview – John McGee, USC/ISI


Presentation Transcript


  1. GRIDS Center Overview • John McGee, USC/ISI • NSF Middleware Initiative • June 26, 2002 • Internet2 – Base CAMP, Boulder, Colorado

  2. GRIDS Center: Grid Research Integration Development & Support http://www.grids-center.org USC/ISI – Chicago – NCSA – SDSC – Wisconsin

  3. Agenda • Vision for Grid Technology • GRIDS Center Operations • Software Components • Packaging and Testing • Documentation and Support • Testbed • Globus Security and Resource Discovery • Campus Enterprise Integration

  4. Vision for Grid Technologies

  5. Enabling Seamless Collaboration • GRIDS help distributed communities pursue common goals • Scientific research • Engineering design • Education • Artistic creation • Focus is on the enabling mechanisms required for collaboration • Resource sharing as a fundamental concept

  6. Grid Computing Rationale • The need for flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (see "The Anatomy of the Grid: Enabling Scalable Virtual Organizations" by Foster, Kesselman, Tuecke at http://www.globus.org, in the "Publications" section) • The need for communities ("virtual organizations") to share geographically distributed resources as they pursue common goals while assuming the absence of: • central location • central control • omniscience • existing trust relationships

  7. Elements of Grid Computing • Resource sharing • Computers, storage, sensors, networks • Sharing is always conditional, based on issues of trust, policy, negotiation, payment, etc. • Coordinated problem solving • Beyond client-server: distributed data analysis, computation, collaboration, etc. • Dynamic, multi-institutional virtual organizations • Community overlays on classic org structures • Large or small, static or dynamic

  8. Resource-Sharing Mechanisms • Should address security and policy concerns of resource owners and users • Should be flexible and interoperable enough to deal with many resource types and sharing modes • Should scale to large numbers of resources, participants, and/or program components • Should operate efficiently when dealing with large amounts of data & computational power

  9. Grid Applications • Science portals • Help scientists overcome steep learning curves of installing and using new software • Solve advanced problems by invoking sophisticated packages remotely from Web browsers or thin clients • Portals are currently being developed in biology, fusion, computational chemistry, and other disciplines • Distributed computing • High-speed workstations and networks can yoke together an organization's PCs to form a substantial computational resource

  10. Mathematicians Solve NUG30 • Looking for the solution to the NUG30 quadratic assignment problem • An informal collaboration of mathematicians and computer scientists • Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in the U.S. and Italy (8 sites) • Solution permutation: 14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23 • MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
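
For context on what a NUG30 "solution" is: a quadratic assignment problem asks for the permutation p of n facilities over n locations that minimizes the flow-weighted distance, the sum over all pairs (i, j) of flow[i][j] * dist[p[i]][p[j]]. Below is a minimal Python sketch of that objective on a made-up 3x3 instance; the matrices are illustrative only and this is not the NUG30 data or the MetaNEOS branch-and-bound code.

from itertools import permutations

def qap_cost(flow, dist, perm):
    # Objective: sum over facility pairs (i, j) of flow[i][j] * dist[perm[i]][perm[j]]
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

# Toy 3x3 instance; NUG30 is the same problem at n = 30.
flow = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
best = min(permutations(range(3)), key=lambda p: qap_cost(flow, dist, p))
print(best, qap_cost(flow, dist, best))

At n = 30 the permutation space is 30!, so exhaustive search is hopeless; that is why the branch-and-bound run needed the CPU time quoted above.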

  11. More Grid Applications • Large-scale data analysis • Science increasingly relies on large datasets that benefit from distributed computing and storage • Computer-in-the-loop instrumentation • Data from telescopes, synchrotrons, and electron microscopes are traditionally archived for batch processing • Grids are permitting quasi-real-time analysis that enhances the instruments’ capabilities • E.g., with sophisticated “on-demand” software, astronomers may be able to use automated detection techniques to zoom in on solar flares as they occur

  12. Data Grids for High Energy Physics [Diagram of the tiered LHC data grid: the detector produces a "bunch crossing" every 25 nsec and ~100 triggers per second, each triggered event ~1 MByte, so the online system streams ~100 MBytes/sec to the Tier 0 CERN Computer Centre and its ~20 TIPS offline processor farm (1 TIPS is approximately 25,000 SpecInt95 equivalents). Data move over ~622 Mbits/sec links (or air freight) to Tier 1 regional centres such as FermiLab (~4 TIPS) and centres in France, Germany, and Italy; then to ~1 TIPS Tier 2 centres such as Caltech; then to institute servers (~0.25 TIPS with a physics data cache); and finally to Tier 4 physicist workstations. Physicists work on analysis "channels"; each institute will have ~10 physicists working on one or more channels, and data for those channels should be cached by the institute server.]

  13. Still More Grid Applications • Collaborative work • Researchers often want to aggregate not only data and computing power, but also human expertise • Grids enable collaborative problem formulation and data analysis • E.g., an astrophysicist who has performed a large, multi-terabyte simulation could let colleagues around the world simultaneously visualize the results, permitting real-time group discussion • E.g., civil engineers collaborate to design, execute, & analyze shake table experiments

  14. iVDGL: International Virtual Data Grid Laboratory [Map of participating sites: Tier0/1, Tier2, and Tier3 facilities connected by 10 Gbps, 2.5 Gbps, 622 Mbps, and other links] U.S. PIs: Avery, Foster, Gardner, Newman, Szalay www.ivdgl.org

  15. The 13.6 TF TeraGrid: Computing at 40 Gb/s [Diagram of the four TeraGrid/DTF sites on a 40 Gb/s backplane: NCSA/PACI (8 TF, 240 TB) and SDSC (4.1 TF, 225 TB) plus Caltech and Argonne, each with site resources, archival storage (HPSS or UniTree), and external network connections] TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org

  16. Portal Example • NPACI HotPage • https://hotpage.npaci.edu/

  17. Software Components

  18. General Approach • Define Grid protocols & APIs • Protocol-mediated access to remote resources • Integrate and extend existing standards • “On the Grid” = speak “Intergrid” protocols • Develop a reference implementation • Open source Globus Toolkit • Client and server SDKs, services, tools, etc. • Grid-enable wide variety of tools • Globus Toolkit, FTP, SSH, Condor, SRB, MPI, … • Learn through deployment and applications

  19. Software Components • GRIDS Center software is a collection of packages developed in the academic research community • Protocol and architecture approach • Reference implementations • Each package has at least two production-level implementations before inclusion in the GRIDS Center software suite

  20. The Hourglass Model • Focus on architecture issues • Propose set of core services as basic infrastructure • Use to construct high-level, domain-specific solutions • Design principles • Keep participation cost low • Enable local control • Support for adaptation • "IP hourglass" model [Hourglass figure: applications at the top, diverse global services above a narrow waist of core services, local OS at the base]

  21. Software Components • Globus Toolkit • Core Grid computing toolkit • Condor-G • Advanced job submission and management infrastructure • Network Weather Service • Network capability prediction • KX.509 / KCA (NMI-EDIT) • Kerberos to PKI

  22. The Globus Toolkit™ • The de facto standard for Grid computing • A modular “bag of technologies” addressing key technical problems facing Grid tools, services and applications • Made available under liberal open source license • Simplifies collaboration across virtual organizations • Authentication • Grid Security Infrastructure (GSI) • Scheduling • Globus Resource Allocation Manager (GRAM) • Dynamically Updated Request Online Coallocator (DUROC) • Resource description • Monitoring and Discovery Service (MDS) • File transfer • Global Access to Secondary Storage (GASS) • GridFTP
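
A rough sketch of how these pieces are exercised in practice, assuming a Globus Toolkit 2.x client installation with its command-line tools on the PATH and a valid user certificate; the host name and file paths are placeholders, and exact commands may differ between toolkit versions.

import subprocess

# GSI: create a short-lived proxy credential (prompts for the private-key passphrase)
subprocess.run(["grid-proxy-init"], check=True)

# GRAM: run a simple command on a remote resource through its gatekeeper
subprocess.run(["globus-job-run", "gridhost.example.edu", "/bin/date"], check=True)

# GridFTP: copy a remote file to the local machine
subprocess.run(["globus-url-copy",
                "gsiftp://gridhost.example.edu/tmp/results.dat",
                "file:///tmp/results.dat"], check=True)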

  23. Condor-G • NMI-R1 will include Condor-G, an enhanced version of the core Condor software optimized to work with Globus Toolkit™ for managing Grid jobs
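
As an illustration of how that combination was driven (hedged: submit-file syntax varied across Condor versions, and the host name and file names here are placeholders), a Condor-G submit description of that era pointed the "globus" universe at a remote GRAM jobmanager:

universe        = globus
globusscheduler = gridhost.example.edu/jobmanager-pbs
executable      = analyze
output          = analyze.out
error           = analyze.err
log             = analyze.log
queue

Submitting this with condor_submit placed the job in the local Condor queue; Condor-G then handled submission to the remote gatekeeper and tracked the job to completion.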

  24. Network Weather Service • From UC Santa Barbara, NWS monitors and dynamically forecasts performance of network and computational resources • Uses a distributed set of performance sensors (network monitors, CPU monitors, etc.) for instantaneous readings • Numerical models’ ability to predict conditions is analogous to weather forecasting – hence the name • For use with the Globus Toolkit and Condor, allowing dynamic schedulers to provide statistical Quality-of-Service readings • NWS forecasts end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage and available non-paged memory • NWS automatically identifies the best forecasting technique for any given resource
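
The "identifies the best forecasting technique" idea can be pictured with a toy sketch (not NWS code): keep several simple predictors, score each against the measurement history seen so far, and report the forecast from whichever has the lowest cumulative error.

from statistics import mean, median

FORECASTERS = {
    "last_value":     lambda hist: hist[-1],
    "running_mean":   lambda hist: mean(hist),
    "running_median": lambda hist: median(hist),
}

def forecast(history):
    # history: past bandwidth/latency/CPU readings, oldest first
    errors = {name: 0.0 for name in FORECASTERS}
    for t in range(1, len(history)):
        for name, f in FORECASTERS.items():
            errors[name] += abs(f(history[:t]) - history[t])
    best = min(errors, key=errors.get)
    return best, FORECASTERS[best](history)

print(forecast([10.2, 9.8, 10.1, 7.5, 9.9, 10.0]))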

  25. KX.509 for Converting Kerberos Credentials to PKI • Stand-alone client program from the University of Michigan • For a Kerberos-authenticated user, KX.509 acquires a short-term X.509 certificate that can be used by PKI applications • Stores the certificate in the local user's Kerberos ticket file • Systems that already have a mechanism for removing unused Kerberos credentials may also automatically remove the X.509 credentials • The Web browser may then load a library (PKCS11) to use these credentials for https • The client reads X.509 credentials from the user's Kerberos cache and converts them to PEM, the format used by the Globus Toolkit

  26. GRIDS Software Packaging • GRIDS Center software uses Grid Packaging Technology (GPT) 2.0 • Perl-based tool eases user installation and setup • GPT 2.0 enables creation of RPMs • Lets users install from binaries with familiar packaging • Includes database of all packages, useful for verifying installations • Packaging enables: • Dependency checking • User customization of configuration • Easy upgrades, patches
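
The dependency checking a packaging tool performs boils down to ordering packages so that prerequisites install first. A toy sketch of the idea follows; it is an illustration of the concept only, not GPT's implementation, and the package names are hypothetical.

def install_order(deps):
    # deps: {package: [packages it requires]}; returns an install order or raises on a cycle
    order, state = [], {}          # state: None = unvisited, 1 = in progress, 2 = done
    def visit(pkg):
        if state.get(pkg) == 2:
            return
        if state.get(pkg) == 1:
            raise ValueError(f"dependency cycle at {pkg}")
        state[pkg] = 1
        for d in deps.get(pkg, []):
            visit(d)
        state[pkg] = 2
        order.append(pkg)
    for pkg in deps:
        visit(pkg)
    return order

# Hypothetical package names for illustration only.
print(install_order({"globus-gram": ["globus-gsi", "globus-io"],
                     "globus-io": ["globus-gsi"],
                     "globus-gsi": []}))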

  27. Software Testing • University of Wisconsin is in charge of testing the GRIDS software for NMI releases • Platforms to date: • RedHat 7.2 on IA 32 • Solaris 8.0 on SPARC • Release 2 additions: • RedHat 7.2 on IA 64 • AIX-L • Testing includes: • Builds • Quality assurance • Interoperability of GRIDS components

  28. Technical Support • First-level tech support handled at NCSA • One-stop-shop address for users: • nmi-support@nsf-middleware.org • All queries go to NCSA, which responds within 24 hours • Help requests that NCSA can’t answer get forwarded to people responsible for each of the components: • Globus Toolkit (U. of Chicago/Argonne/ISI) • Condor-G (U. of Wisconsin) • Network Weather Service (UC Santa Barbara) • KX.509 (Michigan) • PubCookie (U. Washington) • CPM

  29. Integration Issues • NMI testbed sites will be early adopters, seeking integration of campus infrastructure and Grid computing • Via NMI partnerships, GRIDS will help identify points of intersection and divergence between Grid and enterprise computing • Authorization, authentication and security • Directory services • Emphasis is on open standards and architectures as the route to successful collaboration

  30. A few specifics on the Globus Toolkit

  31. Grid Security Infrastructure (GSI) • Globus Toolkit implements GSI protocols and APIs to address Grid security needs • GSI protocols extend standard public key protocols • Standards: X.509 & SSL/TLS • Extensions: X.509 Proxy Certificates & Delegation • GSI extends standard GSS-API

  32. Generic Security Service API • The GSS-API is the IETF draft standard for adding authentication, delegation, message integrity, and message protection to apps • For secure communication between two parties over a reliable channel (e.g. TCP) • GSS-API separates security from communication, which allows security to be easily added to existing communication code. • Effectively placing transformation filters on each end of the communication link • Globus Toolkit components all use GSS-API
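
The "transformation filters" idea can be pictured as a wrap/unwrap pair applied to every message before and after it crosses the channel; in GSI this role is played by the GSS-API wrap and unwrap calls. A toy Python sketch of the pattern, using a plain HMAC integrity tag and a made-up session key; this is not GSS-API itself.

import hmac, hashlib

KEY = b"session-key-established-during-authentication"  # placeholder key

def wrap(message: bytes) -> bytes:
    # Prepend an integrity tag; the result can travel over any reliable channel.
    tag = hmac.new(KEY, message, hashlib.sha256).digest()
    return tag + message

def unwrap(blob: bytes) -> bytes:
    # Verify and strip the tag on the receiving end.
    tag, message = blob[:32], blob[32:]
    if not hmac.compare_digest(tag, hmac.new(KEY, message, hashlib.sha256).digest()):
        raise ValueError("message integrity check failed")
    return message

assert unwrap(wrap(b"globusrun arguments")) == b"globusrun arguments"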

  33. Delegation • Delegation = remote creation of a (second-level) proxy credential • New key pair generated remotely on server • Proxy cert and public key sent to client • Client signs proxy cert and returns it • Server (usually) puts proxy in /tmp • Allows remote process to authenticate on behalf of the user • Remote process “impersonates” the user
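
A minimal sketch of those steps, using the Python cryptography package purely for illustration: real GSI proxies carry X.509 proxy-certificate extensions and the exchange happens inside the authenticated GSI channel, none of which is shown, and the names and 12-hour lifetime below are placeholders.

from datetime import datetime, timedelta
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

# Client side: the user's existing (first-level) proxy credential.
client_proxy_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
client_proxy_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Jane Doe/proxy")])

# Server side: generate a brand-new key pair for the second-level proxy;
# only the public key is sent to the client, the private key never leaves the server.
delegated_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Client side: sign a short-lived certificate binding the server's new public key
# to a name derived from the client's proxy identity, then return it to the server.
now = datetime.utcnow()
second_level_proxy = (
    x509.CertificateBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Jane Doe/proxy/proxy")]))
    .issuer_name(client_proxy_name)
    .public_key(delegated_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + timedelta(hours=12))
    .sign(private_key=client_proxy_key, algorithm=hashes.SHA256())
)
# The server now holds a key pair plus a certificate chaining back to the user,
# so its processes can authenticate on the user's behalf until the proxy expires.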

  34. Limited Proxy • During delegation, the client can elect to delegate only a “limited proxy”, rather than a “full” proxy • GRAM (job submission) client does this • Each service decides whether it will allow authentication with a limited proxy • Job manager service requires a full proxy • GridFTP server allows either full or limited proxy to be used

  35. Sample Gridmap File • Gridmap file maintained by Globus administrator • Entry maps Grid-id into local user name(s)
  # Distinguished name                                    Local username
  "/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup"         rpg
  "/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost"       frost
  "/C=US/O=Globus/O=USC/OU=ISI/CN=Carl Kesselman"         u14543
  "/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"             itf
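
A minimal sketch of how a service might consult such a file, mapping an authenticated distinguished name to a local account; this is illustrative only, the default path is an assumption, and real entries can list several comma-separated local names.

import shlex

def load_gridmap(path="/etc/grid-security/grid-mapfile"):
    # Skip comments and blanks; each remaining line is a quoted DN followed by a local account.
    mapping = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            dn, user = shlex.split(line)   # shlex handles the quoted, space-containing DN
            mapping[dn] = user
    return mapping

gridmap = load_gridmap()
local_user = gridmap.get("/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster")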

  36. Security Issues • GSI handles authentication, but authorization is a separate issue. • Management of authorization on a multi-organization grid is still an interesting problem. • The grid-mapfile doesn’t scale well, and works only at the resource level, not the collective level. • Data access exacerbates authorization issues, which has led us to CAS…
