The Globus Project Infrastructure for Computational Grids The Globus Project Team http://www.globus.org/
Session Goals • Provide an introduction to… • computational grids • the capabilities of the Globus Toolkit • pragmatic issues with grids & Globus • Enable attendees to… • start building grids using the Globus Toolkit • start building & using grid applications
Overview • Introduction to computational grids • Introduction to the Globus Toolkit • Portability • Security • Information services • Resource management • Data management • Communication • Case studies • Other Globus services, and future directions
What is a computational grid? • A pool of computational resources that can be “plugged into” via standard interfaces. • Processors • Data storage devices • Instruments • As the power grid is to electrical power, and the telephone grid is to voice communication, so will the computational grid be for computation.
Computational Collaboration • 1975-1995: Collaboration • We work together, but use our own unique systems. • It’s hard to share data, computing power, instrumentation. • 1995-2005: Virtual Organizations • We build systems that combine our resources with those of our collaborators. • We learn how to manage the heterogeneity of systems, management, and users. • 2005-?: Computational Grid • Computation is a commodity that can be bought and sold by anyone. • Computational services use standard interfaces. • Organizations and individuals typically don’t need to build their own computing infrastructures.
Why do we need the Grid? • The Grid will enable applications that include people, computers, databases, instruments, etc. • Online instruments • Collaborative engineering • Parameter studies • Browsing of remote datasets • Use of remote software • Data-intensive computing • Very large-scale simulation
Online Instruments • Advanced Photon Source • wide-area dissemination • desktop & VR clients with shared controls • real-time collection • archival storage • tomographic reconstruction • DOE X-ray source grand challenge: ANL, USC/ISI, NIST, U.Chicago
Collaborative Engineering • Manipulate shared virtual space, with • Simulation components • Multiple flows: Control, Text, Video, Audio, Database, Simulation, Tracking, Haptics, Rendering • Issues: • (un)reliable uni/multicast • Security • Reservation & QoS CAVERNsoft: UIC, Electronic Visualization Laboratory
Tele-immersion • “5 Gflop/sec, flowspecs, design db” • Multiple access modalities • Multiple flows: Control, Text, Video, Audio, Database, Simulation, Tracking, Haptics, Rendering • Leigh et al: UIC, Electronic Visualization Laboratory
Distributed Supercomputing • Issues: • Resource discovery, scheduling • Configuration • Multiple comm methods • Message passing (MPI) • Scalability • Fault tolerance • Machines shown: Caltech Exemplar, NCSA Origin, Maui SP, Argonne SP • SF-Express Distributed Interactive Simulation: Caltech, USC/ISI
High-Throughput Computing • Schedule many independent tasks • Parameter studies • Data analysis • Issues: • Resource discovery • Data Access • Scheduling • Reservation • Security • Accounting • Code management Deadline Cost Available Machines Nimrod-G: Monash University
Problem Solving Environments • Examples: • Problem solving env. for computational chemistry • Application web portals • Issues: • Remote job submission, monitoring, and control • Resource discovery • Distributed data archive • Security • Accounting ECCE’: Pacific Northwest National Laboratory
The Grid “Dependable, consistent, pervasive access to [high-end] resources” • Dependable: Can provide performance and functionality guarantees • Consistent: Uniform interfaces to a wide variety of resources • Pervasive: Ability to “plug in” from anywhere
Early Steps Toward the Grid • Metacomputing: late 80s • Focus on distributed computation • Gigabit testbeds: early 90s • Research, primarily on networking • I-WAY: 1995 • Demonstration of application feasibility • NSF PACIs (National Technology Grid): 1998 • NASA Information Power Grid: 1999 • DOE ASCI DISCOM DRM: 1999 • European Grid: 2000
Technical Challenges • Complex application structures that combine aspects of parallel, multimedia, distributed, collaborative computing • Resource characteristics that vary dynamically in both time and space. • Requirements for guaranteed high “end to end” performance, despite heterogeneity and lack of global control. • Desire to retain local policies for security, fees, usage restrictions, management, and technical standards.
Issues • Authenticate once • Specify simulation (code, resources, etc.) • Locate resources • Negotiate authorization, acceptable use, etc. • Acquire resources • Initiate computation • Steer computation • Access remote datasets • Collaborate on results • Account for usage Domain 1 Domain 2
Architectural Approaches • Distributed systems: DCE, CORBA, Jini, etc. • Rich functionality eases app development • Complexity hinders deployment • especially in absence of global control • Performance difficulties • Internet/Web Protocols and Tools • Simple protocols facilitate deployment • Missing functionality hinders app development • Performance difficulties
Standards & Commodity Tech • Where appropriate, exploit standards and commodity technology in core infrastructure • LDAP, SSL/TLS, X.509, GSS-API, HTTP, FTP, XML, SOAP, etc. • Provides leverage • Interface with other common standards • CORBA, Java/Jini, DCOM, Web, etc • While our core infrastructure may not be built on one of these distributed architectures, we can and must cleanly interface with them
The Globus Project • Basic research in grid-related technologies • Resource & data management, security, QoS, policy, communication, adaptation, etc. • Development of the Globus Toolkit • Core services for grid-enabled tools & applications • Construction of production grids & testbeds • Environments in which grid software can be deployed and experiments can be run. • Experimentation with real grid applications • Verifying that the grid works and is useful.
Grid Services Architecture • Applications: High Energy Physics Data Analysis, Online Instrumentation, Collab Engineering, Climate Studies • Application Toolkits: High Throughput, Remote Control, Collab Design, Remote Visualization, Message Passing, Data Intensive • Grid Services: Information Services, Security, Data Management, Portability, Resource Management, Fault Detection • Grid Fabric: Data Transport, Control Interfaces, Instrumentation, Schedulers, Operating Systems, QoS Services
Globus Project Participants • Globus Project is a large community effort • Globus Toolkit core development • Argonne, USC/ISI, NCSA, SDSC • Globus Toolkit contributors • NASA, DOE ASCI DRM (SNL, LBNL, LLNL), Raytheon, and numerous others • Collaborators • University, lab, industrial, and international partners spanning many scientific and engineering disciplines • Active in Grid Forum • http://www.gridforum.org
Globus Approach • A toolkit and collection of services addressing key technical problems • Modular “bag of services” model • Not a vertically integrated solution • General infrastructure tools (aka middleware) that can be applied to many application domains • Inter-domain issues, rather than clustering • Integration of intra-domain solutions • Distinguish between local and global services
Globus Hourglass • Focus on architecture issues • Propose set of core services as basic infrastructure • Use to construct high-level, domain-specific solutions • Design principles • Keep participation cost low • Enable local control • Support for adaptation • “IP hourglass” model • Hourglass layers: Applications; Diverse global services; Core Globus services; Local OS
Technical Focus & Approach • Enable incremental development of grid-enabled tools and applications • Model neutral: Support many programming models, languages, tools, and applications • Evolve in response to user requirements • Deploy toolkit on international-scale production grids and testbeds • Large-scale application development & testing • Information-rich environment • Basis for configuration and adaptation
Layered Architecture • Applications • Application Toolkits: GlobusView, Web Portals, DUROC, MPICH-G, Condor-G, HPC++, Nimrod/G, globusrun • Grid Services: Nexus, GRAM, GSI-FTP, globus_io, HBM, GASS, GSI, MDS • Grid Fabric: Condor, MPI, TCP, UDP, DiffServ, Solaris, LSF, PBS, NQE, Linux, NT
Globus Toolkit Grid Services • Security (GSI) • Information services (MDS) • Resource management (GRAM) • Data management (GASS, GSI-FTP, replicas) • Communication (globus_io, Nexus) • Fault detection (HBM) • Portability (globus_dc, globus_thread)
Other Globus Project Grid Services • Coming Soon • Data transfer (GSI-FTP) • Replica Management http://www.globus.org/datagrid • Experimental Prototypes • Advanced Reservations & QoS (GARA) • Distributed Events & Logging
Sample of High-Level Services • Resource brokers and co-allocators • DUROC, HTB, Nimrod/G, Condor-G, ASCI DRM • Communication & I/O libraries • MPICH-G, PAWS, RIO (MPI-IO), PPFS, MOL • Parallel languages • HPC++, CC++ • Collaborative environments • CAVERNsoft, ManyWorlds • Others • MetaNEOS, NetSolve, LSA, AutoPilot, WebFlow
Condor-G: Condor for the Grid • Condor is a high-throughput scheduler • Condor-G uses Globus Toolkit libraries for: • Security (GSI) • Managing remote jobs on Grid (GRAM) • File staging & remote I/O (GSI-FTP) • Grid job management interface & scheduling • Robust replacement for Globus Toolkit programs • Globus Toolkit focus is on libraries and services, not end user vertical solutions • Supports single or high-throughput apps on Grid • Personal job manager which can exploit Grid resources
Production Grids & Testbeds • Production deployments underway at: • NSF PACIs (National Technology Grid) • NASA Information Power Grid • DOE ASCI • European Grid • Research testbeds • EMERGE: Advance reservation & QoS • GUSTO: Globus Ubiquitous Supercomputing Testbed Organization • Particle Physics Data Grid • Earth Systems Grid
Production Grids & Testbeds • NASA’s Information Power Grid • The Alliance National Technology Grid • GUSTO Testbed
Application Experiments • Computed microtomography (ANL, ISI) • Real-time, collaborative analysis of data from X-Ray source (and electron microscope) • Hydrology (ISI, UMD, UT; also NCSA, Wisc.) • Interactive modeling and data analysis • Collaborative engineering (“tele-immersion”) • CAVERNsoft @ EVL • OVERFLOW (NASA) • Large CFD simulations for aerospace vehicles
Application Experiments • Distributed interactive simulation (CIT, ISI) • Record-setting SF-Express simulation • Cactus • Astrophysics simulation, viz, and steering • Including trans-Atlantic experiments • Particle Physics Data Grid • High Energy Physics distributed data analysis • Earth Systems Grid • Climate modeling data management
Where Are We? (August 2000) • Research is focused on data management, resource management, and web portals. • Globus Toolkit v1.1.3 has been released. • Runs on most versions of Unix, Win32 clients. • Production deployment is underway. • NSF PACIs, NASA IPG, DOE ASCI DRM • Many research applications and tools are using these testbeds. • We’re always looking for interesting applications.
For More Information on Globus http://www.globus.org/ • Papers on most components • Tutorials • User, Developer, Administrator • Manuals • Quick Start Guide, System Administration Guide • Mailing lists • discuss@globus.org, announce@globus.org • Software & API documentation • Application descriptions • Attend Supercomputing 2000 (Nov. 2000)
The Grid: Blueprint for a New Computing Infrastructure • I. Foster, C. Kesselman (Eds), Morgan Kaufmann, 1999 • Available July 1998; ISBN 1-55860-475-8 • 22 chapters by expert authors including Andrew Chien, Jack Dongarra, Tom DeFanti, Andrew Grimshaw, Roch Guerin, Ken Kennedy, Paul Messina, Cliff Neuman, Jon Postel, Larry Smarr, Rick Stevens, and many others • “A source book for the history of the future” -- Vint Cerf • http://www.mkp.com/grids
Session Approach • Five sections, each illustrating a basic Globus service • Laboratory material is available to allow practice with the use of each technique • See http://www.globus.org/tutorial/
Desktop Supercomputing • Seamlessly, from the desktop • Sign-on once • Locate available computers • Start computation on an appropriate system • Monitor progress • Get output files • Manipulate locally • E.g. ECCE’, Cactus, Hotpage, Chemical Eng. Workbench, WebFlow, LSA
WebFlow Grid Interface • Dataflow computing interface to grid computing • Fox, Haupt: Syracuse • Globus services for • Authentication • Process creation and management • Applications include nanomaterials
Application Challenges • Security • How do we authenticate ourselves at the remote site? • Resource specification • How do we locate and request a resource? • Staging of code and data • How do we stage a user’s executables and data to the remote resource? • Computation • How do we start & manage computation?
Grid Services • Single sign-on for all resources • No need for user to keep track of accounts and passwords at multiple sites • No plaintext passwords • Uniform interface to various local scheduling mechanisms • PBS, Condor, LSF, NQE, LoadLeveler, fork, etc. • No need to learn and remember obscure command sequences at different sites • Support for file staging, remote I/O, etc.
Grid Authentication Model • Authentication is done on a “user” basis • Single authentication step allows access to all grid resources • No communication of plaintext passwords • Most sites will use conventional account mechanisms • You must have an account on a resource to use that resource • Sites may use “generic” Grid accounts • Not common, but Globus can deal with it
Grid Security Infrastructure (GSI) • Based on public key technology • Standard X.509 certificate, same as certificates used for the Web • Each user has: • a Grid user id (called a Subject Name) • a private key (like a password) • a certificate signed by a Certificate Authority (CA) • A “gridmap” file at each site specifies grid-id to local-id mapping
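The grid-id to local-id mapping on this slide can be pictured as a simple lookup table. The sketch below is a minimal illustration, assuming one entry per line in the form `"subject name" local-id`. The slash-separated subject spelling, the local account name `rfrost`, and the `parse_gridmap` helper are all assumptions made for the example, not details taken from the slides.

```python
import shlex


def parse_gridmap(text: str) -> dict:
    """Map each quoted subject name (grid-id) to a local account name."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex keeps the quoted subject name together despite its spaces
        fields = shlex.split(line)
        if len(fields) != 2:
            raise ValueError(f"expected '\"subject\" local-id', got: {raw!r}")
        subject, local_id = fields
        mapping[subject] = local_id
    return mapping


# Hypothetical entry; the local account name is invented for illustration.
SAMPLE = '''
# grid-id (subject name)                          local-id
"/C=US/O=Globus/O=NACI/OU=SDSC/CN=Richard Frost" rfrost
'''

gridmap = parse_gridmap(SAMPLE)
```

A site would consult such a table after authentication succeeds, so that the authenticated grid identity runs under an ordinary local account.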
Certificate Based Authentication • User has a certificate, signed by a trusted “certificate authority” (CA) • Certificate contains user’s name and public key • Globus project operates a CA • User’s private key is used to encode a challenge string • Public key is used to decode the challenge • If you can decode it, you know the user • Treat your private key carefully!! • Private key is stored in encrypted form
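The challenge exchange described above can be sketched with textbook RSA arithmetic. This is a toy illustration only: the key below is built from the tiny primes 61 and 53, there is no padding or hashing, and the function names are hypothetical. It is not how GSI or any real library implements the step.

```python
import secrets

# Toy RSA key pair (p=61, q=53). Far too small for real use.
N = 3233   # modulus, part of the public key
E = 17     # public exponent, published in the certificate
D = 2753   # private exponent, held only by the user


def encode_with_private_key(challenge: int) -> int:
    """What the user does: transform the challenge with the private key."""
    return pow(challenge, D, N)


def decode_with_public_key(response: int) -> int:
    """What the verifier does: undo the transform with the public key."""
    return pow(response, E, N)


def authenticate(respond) -> bool:
    """Send a fresh random challenge and check that the response decodes to it."""
    challenge = secrets.randbelow(N - 2) + 2  # random value in [2, N-1]
    return decode_with_public_key(respond(challenge)) == challenge


genuine_user_accepted = authenticate(encode_with_private_key)
```

Only the holder of the private exponent can produce a response that decodes back to the challenge, which is why the private key must be guarded and why a fresh challenge is used each time.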
User Proxies • Minimize exposure of user’s private key • A temporary credential for use by our computations • We call this a user proxy certificate • Allows process to act on behalf of user • User-signed user proxy certificate stored in local file • Proxy’s private key is not encrypted • Rely on file system security, proxy certificate file must be readable only by the owner
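Because the proxy's private key is stored unencrypted, the file permission check matters. Here is a minimal sketch of such a check for a POSIX system. The helper name `proxy_file_is_private` is hypothetical, and the demo uses a throwaway temporary file rather than a real proxy file.

```python
import os
import stat
import tempfile


def proxy_file_is_private(path: str) -> bool:
    """True if path is a regular file we own with no group or other access."""
    info = os.stat(path)
    owned_by_us = info.st_uid == os.getuid()
    no_group_or_other = (info.st_mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    return stat.S_ISREG(info.st_mode) and owned_by_us and no_group_or_other


# Demo on a temporary file standing in for a proxy certificate file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o600)            # owner read/write only
private_ok = proxy_file_is_private(demo_path)

os.chmod(demo_path, 0o644)            # now readable by group and others
exposed_ok = proxy_file_is_private(demo_path)

os.remove(demo_path)
```

A tool that loads a proxy credential could refuse to proceed when a check like this fails, since anyone able to read the file could act as the user until the proxy expires.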
Delegation • Remote creation of a user proxy • Allows remote process to act on behalf of the user • Avoids sending passwords or private keys across the network
Single sign-on via “grid-id” • Assignment of credentials to “user proxies” • Mutual user-resource authentication • Mapping to local ids • Authenticated interprocess communication • GSSAPI: multiple low-level mechanisms • (Diagram: a User and a User Proxy holding a Globus credential; GRAM and GSI in front of the processes at Site 1 and Site 2; local mechanisms shown are Kerberos tickets and public key certificates.)
Globus Authentication Setup • Before you can run Globus applications: • Install Globus • Obtain a Grid certificate and key • Set up your environment so Globus knows where to find certificates and keys • Contact sites to set up local accounts and globusmap entries • Create proxy certificate for each application run • Documentation • Globus Quick Start Guide (on website)
Your New Certificate • NTP is highly recommended
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 28 (0x1c)
        Signature Algorithm: md5WithRSAEncryption
        Issuer: C=US, O=Globus, CN=Globus Certification Authority
        Validity
            Not Before: Apr 22 19:21:50 1998 GMT
            Not After : Apr 22 19:21:50 1999 GMT
        Subject: C=US, O=Globus, O=NACI, OU=SDSC, CN=Richard Frost
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:bf:4c:9b:ae:51:e5:ad:ac:54:4f:12:52:3a:69:
                    <snip>
                    b4:e1:54:e7:87:57:b7:d0:61
                Exponent: 65537 (0x10001)
    Signature Algorithm: md5WithRSAEncryption
        59:86:6e:df:dd:94:5d:26:f5:23:c1:89:83:8e:3c:97:fc:d8:
        <snip>
        8d:cd:7c:7e:49:68:15:7e:5f:24:23:54:ca:a2:27:f1:35:17:
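One plausible reason for the NTP recommendation is the certificate's validity window: a machine whose clock is off may reject a certificate that is in fact valid. The sketch below parses the two timestamps shown in the certificate and compares them with a clock reading. The helper names and the five-minute skew are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Matches timestamps such as "Apr 22 19:21:50 1998 GMT".
# Note: %b is locale dependent; this assumes English month abbreviations.
CERT_TIME_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_cert_time(text: str) -> datetime:
    """Parse a certificate timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(text, CERT_TIME_FORMAT).replace(tzinfo=timezone.utc)


def is_within_validity(now: datetime, not_before: datetime, not_after: datetime) -> bool:
    """True if 'now' falls inside the certificate's validity window."""
    return not_before <= now <= not_after


NOT_BEFORE = parse_cert_time("Apr 22 19:21:50 1998 GMT")
NOT_AFTER = parse_cert_time("Apr 22 19:21:50 1999 GMT")

# One minute after issue the certificate is valid by the true time...
true_time = NOT_BEFORE + timedelta(minutes=1)
valid_with_true_time = is_within_validity(true_time, NOT_BEFORE, NOT_AFTER)

# ...but a host whose clock runs five minutes slow sees it as not yet valid.
slow_clock = true_time - timedelta(minutes=5)
valid_with_slow_clock = is_within_validity(slow_clock, NOT_BEFORE, NOT_AFTER)
```

The same effect is sharper for short-lived credentials, where a few minutes of skew is a larger fraction of the lifetime.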
“Logging on” to the Grid
• To run programs, authenticate to Globus:
    % grid-proxy-init
    Enter PEM pass phrase: ******
• Creates a temporary, short-lived credential for use by our computations
• Private key is not exposed past grid-proxy-init
• Options for grid-proxy-init:
    -hours <lifetime of credential>
    -bits <length of key>
    -help
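A script that wraps this step might assemble the command line from the options listed on the slide. The sketch below only builds the argument list and does not run anything. The helper name `grid_proxy_init_argv` and the example values 12 and 1024 are illustrative assumptions; only the `-hours` and `-bits` flags come from the slide.

```python
from typing import List, Optional


def grid_proxy_init_argv(hours: Optional[int] = None,
                         bits: Optional[int] = None) -> List[str]:
    """Build an argument list for grid-proxy-init from optional settings."""
    argv = ["grid-proxy-init"]
    if hours is not None:
        if hours <= 0:
            raise ValueError("hours must be positive")
        argv += ["-hours", str(hours)]   # lifetime of credential
    if bits is not None:
        if bits <= 0:
            raise ValueError("bits must be positive")
        argv += ["-bits", str(bits)]     # length of key
    return argv


default_argv = grid_proxy_init_argv()
custom_argv = grid_proxy_init_argv(hours=12, bits=1024)
```

On a machine with the toolkit installed, the list could be handed to `subprocess.run`. The pass phrase prompt is interactive, so the command needs a terminal.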