Tuesday, February 24, 2009 Wisdom consists of knowing when to avoid perfection. • Horowitz
The Globus Project: Infrastructure for Computational Grids • The Globus Project Team • http://www.globus.org/
Session Goals • Provide an introduction to… • computational grids • the capabilities of the Globus Toolkit • pragmatic issues with grids & Globus • Enable attendees to… • start building grids using the Globus Toolkit • start building & using grid applications
Overview • Introduction to computational grids • Introduction to the Globus Toolkit • Portability • Security • Information services • Resource management • Data management • Communication • Case studies • Other Globus services, and future directions
What is a computational grid? • A pool of computational resources that can be “plugged into” via standard interfaces. • Processors • Data storage devices • Instruments • As the power grid is to electrical power, and the telephone grid is to voice communication, so will the computational grid be for computation.
Globus Project Participants • Globus Project is a large community effort • Globus Toolkit core development • Argonne, USC/ISI, NCSA, SDSC • Globus Toolkit contributors • NASA, DOE ASCI DRM (SNL, LBNL, LLNL), Raytheon, and numerous others • Collaborators • University, lab, industrial, and international partners spanning many scientific and engineering disciplines • Active in Grid Forum • http://www.gridforum.org
Globus Approach • A toolkit and collection of services addressing key technical problems • Modular “bag of services” model • Not a vertically integrated solution • General infrastructure tools (aka middleware) that can be applied to many application domains • Inter-domain issues, rather than clustering • Integration of intra-domain solutions • Distinguish between local and global services
Globus Hourglass • Focus on architecture issues • Propose set of core services as basic infrastructure • Use to construct high-level, domain-specific solutions • Design principles • Keep participation cost low • Enable local control • Support for adaptation • “IP hourglass” model [Figure: hourglass, top to bottom: Applications, Diverse global services, Core Globus services, Local OS]
Technical Focus & Approach • Enable incremental development of grid-enabled tools and applications • Model neutral: Support many programming models, languages, tools, and applications • Evolve in response to user requirements • Deploy toolkit on international-scale production grids and testbeds • Large-scale application development & testing • Information-rich environment • Basis for configuration and adaptation
Layered Architecture (top to bottom) • Applications • Application Toolkits: GlobusView, Web Portals, DUROC, MPICH-G, Condor-G, HPC++, Nimrod/G, globusrun • Grid Services: Nexus, GRAM, GSI-FTP, globus_io, HBM, GASS, GSI, MDS • Grid Fabric: Condor, MPI, TCP, UDP, DiffServ, Solaris, LSF, PBS, NQE, Linux, NT
Globus Toolkit Grid Services • Security (GSI) • Information services (MDS) • Resource management (GRAM) • Data management (GASS, GSI-FTP, replicas) • Communication (globus_io, Nexus) • Fault detection (HBM) • Portability (globus_dc, globus_thread)
Other Globus Project Grid Services • Coming Soon • Data transfer (GSI-FTP) • Replica Management http://www.globus.org/datagrid • Experimental Prototypes • Advanced Reservations & QoS (GARA) • Distributed Events & Logging
Sample of High-Level Services • Resource brokers and co-allocators • DUROC, HTB, Nimrod/G, Condor-G, ASCI DRM • Communication & I/O libraries • MPICH-G, PAWS, RIO (MPI-IO), PPFS, MOL • Parallel languages • HPC++, CC++ • Collaborative environments • CAVERNsoft, ManyWorlds • Others • MetaNEOS, NetSolve, LSA, AutoPilot, WebFlow
Condor-G: Condor for the Grid • Condor is a high-throughput scheduler • Condor-G uses Globus Toolkit libraries for: • Security (GSI) • Managing remote jobs on Grid (GRAM) • File staging & remote I/O (GSI-FTP) • Grid job management interface & scheduling • Robust replacement for Globus Toolkit programs • Globus Toolkit focus is on libraries and services, not end user vertical solutions • Supports single or high-throughput apps on Grid • Personal job manager which can exploit Grid resources
Production Grids & Testbeds • Production deployments underway at: • NSF PACIs (National Technology Grid) • NASA Information Power Grid • DOE ASCI • European Grid • Research testbeds • EMERGE: Advance reservation & QoS • GUSTO: Globus Ubiquitous Supercomputing Testbed Organization • Particle Physics Data Grid • Earth Systems Grid
Production Grids & Testbeds [Figures] • NASA’s Information Power Grid • The Alliance National Technology Grid • GUSTO Testbed
Application Experiments • Computed microtomography (ANL, ISI) • Real-time, collaborative analysis of data from X-Ray source (and electron microscope) • Hydrology (ISI, UMD, UT; also NCSA, Wisc.) • Interactive modeling and data analysis • Collaborative engineering (“tele-immersion”) • CAVERNsoft @ EVL • OVERFLOW (NASA) • Large CFD simulations for aerospace vehicles
Application Experiments • Distributed interactive simulation (CIT, ISI) • Record-setting SF-Express simulation • Cactus • Astrophysics simulation, viz, and steering • Including trans-Atlantic experiments • Particle Physics Data Grid • High Energy Physics distributed data analysis • Earth Systems Grid • Climate modeling data management
Where Are We? (August 2000) • Research is focused on data management, resource management, and web portals. • Globus Toolkit v1.1.3 has been released. • Runs on most versions of Unix, Win32 clients. • Production deployment is underway. • NSF PACIs, NASA IPG, DOE ASCI DRM • Many research applications and tools are using these testbeds. • We’re always looking for interesting applications.
For More Information on Globus http://www.globus.org/ • Papers on most components • Tutorials • User, Developer, Administrator • Manuals • Quick Start Guide, System Administration Guide • Mailing lists • discuss@globus.org, announce@globus.org • Software & API documentation • Application descriptions • Attend Supercomputing 2000 (Nov. 2000)
The Grid: Blueprint for a New Computing Infrastructure • I. Foster, C. Kesselman (Eds), Morgan Kaufmann, 1999 • Available July 1998; ISBN 1-55860-475-8 • 22 chapters by expert authors including Andrew Chien, Jack Dongarra, Tom DeFanti, Andrew Grimshaw, Roch Guerin, Ken Kennedy, Paul Messina, Cliff Neuman, Jon Postel, Larry Smarr, Rick Stevens, and many others • “A source book for the history of the future” -- Vint Cerf • http://www.mkp.com/grids
Session Approach • Five sections, each illustrating a basic Globus service • Laboratory material is available to allow practice with the use of each technique • See http://www.globus.org/tutorial/
Desktop Supercomputing • Seamlessly, from the desktop • Sign-on once • Locate available computers • Start computation on an appropriate system • Monitor progress • Get output files • Manipulate locally • E.g. ECCE’, Cactus, Hotpage, Chemical Eng. Workbench, WebFlow, LSA
WebFlow Grid Interface • Dataflow computing interface to grid computing • Fox, Haupt: Syracuse • Globus services for • Authentication • Process creation and management • Applications include nanomaterials
Application Challenges • Security • How do we authenticate ourselves at the remote site? • Resource specification • How do we locate and request a resource? • Staging of code and data • How do we stage a user’s executables and data to the remote resource? • Computation • How do we start & manage computation?
Grid Services • Single sign-on for all resources • No need for user to keep track of accounts and passwords at multiple sites • No plaintext passwords • Uniform interface to various local scheduling mechanisms • PBS, Condor, LSF, NQE, LoadLeveler, fork, etc. • No need to learn and remember obscure command sequences at different sites • Support for file staging, remote I/O, etc.
Grid Authentication Model • Authentication is done on a “user” basis • Single authentication step allows access to all grid resources • No communication of plaintext passwords • Most sites will use conventional account mechanisms • You must have an account on a resource to use that resource • Sites may use “generic” Grid accounts • Not common, but Globus can deal with it
Grid Security Infrastructure (GSI) • Based on public key technology • Standard X.509 certificate, same as certificates used for the Web • Each user has: • a Grid user id (called a Subject Name) • a private key (like a password) • a certificate signed by a Certificate Authority (CA) • A “gridmap” file at each site specifies grid-id to local-id mapping
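The grid-id to local-id mapping can be pictured as a simple lookup table. The following is a minimal sketch, assuming each gridmap line holds a double-quoted Subject Name followed by a local account name; the sample subject names, the account names, and the function names are hypothetical, and this is not the Globus implementation.

```python
import shlex

def parse_gridmap(text):
    """Return {subject_name: local_id} from gridmap-style text.

    Assumed line format: "<quoted subject name>" <local account>
    Blank lines and lines starting with '#' are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # honors the double quotes around the subject
        if len(fields) != 2:
            raise ValueError(f"unrecognized gridmap line: {line!r}")
        subject, local_id = fields
        mapping[subject] = local_id
    return mapping

# Hypothetical entries, for illustration only.
SAMPLE_GRIDMAP = '''
# subject name                                  local account
"/O=Grid/O=Example/OU=example.edu/CN=Jane Doe"  jdoe
"/O=Grid/O=Example/OU=example.org/CN=Raj Patel" rpatel
'''

def local_account(subject, gridmap_text=SAMPLE_GRIDMAP):
    """Look up the local account for a subject; None means no mapping."""
    return parse_gridmap(gridmap_text).get(subject)
```

A subject with no entry maps to None, which corresponds to a user who has not yet arranged a local account at that site.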
Certificate Based Authentication • User has a certificate, signed by a trusted “certificate authority” (CA) • Certificate contains user’s name and public key • Globus project operates a CA • User’s private key is used to encode a challenge string • Public key is used to decode the challenge • If you can decode it, you know the user • Treat your private key carefully!! • Private key is stored in encrypted form
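The encode/decode exchange on this slide can be sketched with textbook RSA arithmetic. This is an illustration only, assuming a toy key pair built from tiny primes (61 and 53); real keys are far larger and real protocols add padding and hashing. All names below are hypothetical and nothing here is taken from the GSI code.

```python
import secrets

# Toy RSA key pair from tiny textbook primes (p=61, q=53). Illustration only.
N = 61 * 53      # public modulus (3233)
E = 17           # public exponent, as would be carried in the certificate
D = 2753         # private exponent, kept secret by the user

def make_challenge():
    """Verifier side: pick a fresh random challenge in [2, N)."""
    return secrets.randbelow(N - 2) + 2

def encode_challenge(challenge, d=D, n=N):
    """User side: transform the challenge with the private key."""
    return pow(challenge, d, n)

def verify_response(challenge, response, e=E, n=N):
    """Verifier side: undo the transform with the public key and compare."""
    return pow(response, e, n) == challenge

challenge = make_challenge()
response = encode_challenge(challenge)
authenticated = verify_response(challenge, response)      # correct key holder
forged = verify_response(challenge, (response + 1) % N)   # tampered response
```

Only the holder of the private exponent can produce a response that decodes back to the challenge, which is why the private key itself never needs to cross the network.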
User Proxies • Minimize exposure of user’s private key • A temporary credential for use by our computations • We call this a user proxy certificate • Allows process to act on behalf of user • User-signed user proxy certificate stored in local file • Proxy’s private key is not encrypted • Rely on file system security: the proxy certificate file must be readable only by the owner
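Because the proxy's private key is stored unencrypted, the file permissions are the only protection. A minimal sketch of an owner-only check on a POSIX file system follows; the helper names and the demo file name are hypothetical, and the demo writes placeholder bytes rather than a real credential.

```python
import os
import stat
import tempfile

def is_owner_only(path):
    """True if no group or other permission bits are set on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

def write_private_file(path, data):
    """Create the file with mode 0600 from the start, then write to it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed

with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "proxy_demo")  # hypothetical file name
    write_private_file(demo, b"placeholder, not a real credential")
    private_ok = is_owner_only(demo)   # owner-only after creation
    os.chmod(demo, 0o644)              # simulate an over-permissive file
    exposed_ok = is_owner_only(demo)   # now readable by group and others
```

Creating the file with restrictive permissions before writing avoids a window in which the key material is readable by other users.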
Delegation • Remote creation of a user proxy • Allows remote process to act on behalf of the user • Avoids sending passwords or private keys across the network
Single sign-on via “grid-id” • Assignment of credentials to “user proxies” • Mutual user-resource authentication • Authenticated interprocess communication • Mapping to local ids • GSSAPI: multiple low-level mechanisms [Figure: a User and User Proxy holding a Globus Credential authenticate through GSI and GRAM to processes at Site 1 and Site 2; the local credentials shown are a Kerberos Ticket at one site and a Public Key Certificate at the other]
Globus Authentication Setup • Before you can run Globus applications: • Install Globus • Obtain a Grid certificate and key • Set up your environment so Globus knows where to find certificates and keys • Contact sites to set up local accounts and gridmap entries • Create proxy certificate for each application run • Documentation • Globus Quick Start Guide (on website)
Simple job submission • globus-job-run provides a simple rsh-compatible interface
% grid-proxy-init
Enter PEM pass phrase: *****
% globus-job-run host program [args]
• Job submission will be covered in more detail in Part 5
Exercise: Sign-On & Remote Process Creation • Use grid-proxy-init to create a proxy certificate:
% grid-proxy-init
Enter PEM pass phrase:
......................................+++++
.....+++++
• Use grid-proxy-info to query proxy:
% grid-proxy-info -subject
• Use globus-job-run to start remote programs:
% globus-job-run jupiter.isi.edu /usr/bin/ls -l /tmp
Globus Components Being Used • GSI: Grid Security Infrastructure • Authenticate to remote system • GRAM: Globus Resource Allocation Manager • Create process on remote resource, deal with local resource managers • GASS: Global Access to Secondary Storage • Redirect standard output (More on GRAM and GASS later!)
Job Submission Interfaces • Globus Toolkit includes several command line programs for job submission • globus-job-run: Interactive jobs • globus-job-submit: Batch/offline jobs • globusrun: Flexible scripting infrastructure • Others are building better interfaces • General purpose • Condor-G, PBS, GRD, Hotpage, etc • Application specific • ECCE’, Cactus, Web portals
globus-job-run • For running of interactive jobs • Additional functionality beyond rsh • Ex: Run 2 process job w/ executable staging
globus-job-run -: host -np 2 -s myprog arg1 arg2
• Ex: Run 5 processes across 2 hosts
globus-job-run \
  -: host1 -np 2 -s myprog.linux arg1 \
  -: host2 -np 3 -s myprog.aix arg2
• For list of arguments run:
globus-job-run -help
globus-job-submit • For running of batch/offline jobs • globus-job-submit: Submit job • Same interface as globus-job-run • Returns immediately • globus-job-status: Check job status • globus-job-cancel: Cancel job • globus-job-get-output: Get job stdout/err • globus-job-clean: Cleanup after job
globusrun • Flexible job submission for scripting • Uses an RSL string to specify job request • Contains an embedded globus-gass-server • Defines GASS URL prefix in RSL substitution variable: (stdout=$(GLOBUSRUN_GASS_URL)/stdout) • Supports both interactive and offline jobs • Complex to use • Must write RSL by hand • Must understand its esoteric features • Generally you should use globus-job-* commands instead
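Writing RSL by hand is error-prone, so scripts often assemble the string programmatically. The sketch below assumes the conjunction form of RSL, in which a request is "&" followed by parenthesized attribute=value relations; the stdout relation is the one shown on the slide, while the executable and arguments relations and the build_rsl helper are assumptions added for illustration. Values are inserted verbatim, so any quoting the RSL parser requires is left to the caller.

```python
def build_rsl(relations):
    """Join (attribute, value) pairs into an '&(attr=value)...' request string.

    Simplification: values are inserted verbatim, so callers must supply
    any quoting the RSL parser needs.
    """
    parts = []
    for attribute, value in relations:
        if not attribute or any(ch in attribute for ch in "()= "):
            raise ValueError(f"bad attribute name: {attribute!r}")
        parts.append(f"({attribute}={value})")
    return "&" + "".join(parts)

# The stdout relation uses the substitution variable defined by globusrun.
request = build_rsl([
    ("executable", "/bin/ls"),
    ("arguments", "-l /tmp"),
    ("stdout", "$(GLOBUSRUN_GASS_URL)/stdout"),
])
```

Keeping the relations as an ordered list of pairs makes it easy for a script to add or drop a relation without string surgery.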
Summary • Grid Security Infrastructure (GSI) provides single sign-on capability • globus-job-run can be used to create a remote process • Differences between local schedulers are managed by Globus • Strong authentication provided • Remote process creation can be added to applications by using Globus services • Rest [Self Study]
MDS Features • White Pages • Look up the IP number, amount of memory, etc., associated with a particular machine • Yellow Pages • Search for computers of a particular class or with a particular property • Information is dynamic! • In a distributed system, things change without warning. • Information often has an expiration date or other measures of uncertainty.
MDS Approach • Based on LDAP • Lightweight Directory Access Protocol v3 (LDAPv3) • Standard data model • Standard query protocol • Globus specific schema • Host-centric representation • Globus specific tools • GRIS, GIIS • Data discovery, publication,… [Figure: layered diagram with the labels Application, Middleware, LDAP API, GIIS, GRIS, SNMP, NWS, NIS, LDAP]
LDAP Details • Lightweight Directory Access Protocol • Stripped down version of X.500 DAP protocol • Supports distributed storage/access (referrals) • Supports replication • Becoming de facto standard • Defines: • Network protocol for accessing directory contents • Information model defining form of information • Namespace defining how information is referenced and organized
LDAP Directory Structure • Directory contents • Called Object Classes and Entries • What information is stored in directory • Group related information into entries • Directory organization • Called Directory Information Tree (DIT) • Objects are organized into tree structure • Position of object in tree uniquely names entry within the server
Sample Object Classes • Compute Resources • Operating System • Memory Hierarchy • Health and Status • Network Interfaces • IP address • Interface types • Performance Data • Schedule Jobs • CPU Loads • Network Traffic • Resource Managers • Contact strings • Scheduled jobs • Free nodes • Software Configuration • Version Control • Contact information • Organizations • People
LDAP Directory Information Tree • Directory entries are organized into a tree. • Called Directory Information Tree (DIT) • Subtrees can be distributed or replicated. • Position in tree uniquely names entry within a server. • Each object in a server is uniquely determined by its distinguished name (DN). • List of unique attribute names and values along path from root of DIT to object, e.g.: <hn=sp2.sdsc.edu, dc=sdsc, dc=edu, o=Grid>
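The relationship between a distinguished name and a position in the tree can be made concrete by splitting the example DN into its components. This is a minimal sketch, assuming components are separated by commas and contain no escaped commas or multi-valued parts; the function names are hypothetical.

```python
def parse_dn(dn):
    """Split 'a=b, c=d' into [('a', 'b'), ('c', 'd')], most specific first."""
    text = dn.strip()
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]  # drop the angle brackets used on the slide
    pairs = []
    for component in text.split(","):
        name, sep, value = component.strip().partition("=")
        if not sep or not name.strip() or not value.strip():
            raise ValueError(f"malformed DN component: {component!r}")
        pairs.append((name.strip(), value.strip()))
    return pairs

def parent_dn(dn):
    """DN of the entry one level closer to the root of the DIT."""
    return ", ".join(f"{name}={value}" for name, value in parse_dn(dn)[1:])

def path_from_root(dn):
    """Components ordered from the root of the DIT down to the entry."""
    return list(reversed(parse_dn(dn)))

example = "<hn=sp2.sdsc.edu, dc=sdsc, dc=edu, o=Grid>"
components = parse_dn(example)
```

Reading the components right to left walks from the root (o=Grid) down to the host entry, which is why dropping the leftmost component yields the parent entry's name.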
MDS Tools • Java LDAP browser • http://www.mcs.anl.gov/~gawor/ldap • Web-based browsers and displays • http://www.globus.org/mds • CGI-based MDS browser • MDS Object Class Browser • Various APIs and search tools • Translators from “Globus Object Definition Language” • Commented LDIF - LDAP schema definition • Converts to LDIF, HTML
MDS Access/Update Commands • LDAP defines a set of standard commands ldapsearch, ldapmodify, ldapdelete, etc. • We also define MDS-specific commands • grid-info-search, grid-info-create, grid-info-update, grid-info-remove, grid-info-host-search • Routines to ensure data consistency and to insert metadata • APIs are defined for C, Java, etc. • C: OpenLDAP client API • ldap_search_s(), ldap_modify_s(), … • Java: JNDI