Basic Grid Projects - Globus Sathish Vadhiyar Sources/Credits: Project web pages, publications available at Globus site. Some of the figures were also taken from the same
Globus • Open source toolkit used for building Grids • Software for • Security (GSI) • Information infrastructure (MDS) • Resource management (GRAM, job manager, gatekeeper) • Data management (GridFTP, DataGrid) • Communication • Now moving to web services
Grid Security Infrastructure (GSI) • Supports security across organizations. • Single sign-on • Delegation of credentials • Digital signatures based on public key cryptography for verification of messages
Globus/Grid Security Infrastructure (GSI) • Based on PKI, with proxies and delegation (GSI extensions) for secure single sign-on • GSI is layered: proxies and delegation, on top of SSL/TLS, on top of PKI (CAs and certificates) • SSL for authentication and message protection • PKI for credentials • PKI: Public Key Infrastructure; SSL: Secure Sockets Layer; TLS: Transport Layer Security • Credits: Globus course material
Verification of messages / digital signatures • Assures the recipient that the information has not been tampered with after it left the sender • Sender: computes hash(message), encrypts the hash with its private key, and sends the encrypted hash + message • Recipient: computes Hash1 = hash(message) and Hash2 = decrypt(encrypted hash) using the sender's public key • If Hash1 = Hash2, the message has not been altered
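The hash-then-sign check above can be sketched in a few lines of Python. This is only an illustration: it uses textbook RSA with a small, unpadded key built from two known Mersenne primes, which is not secure and is not how GSI or SSL implement signatures. The names `digest`, `sign`, and `verify` are hypothetical.

```python
import hashlib

# Toy RSA key built from two Mersenne primes. Illustration only:
# the key is far too small and unpadded to be secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the hash with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recompute Hash1, recover Hash2 with the public key, compare."""
    hash1 = digest(message)
    hash2 = pow(signature, E, N)
    return hash1 == hash2


message = b"submit job 42"
signature = sign(message)
ok_original = verify(message, signature)
ok_tampered = verify(b"submit job 43", signature)
print(ok_original, ok_tampered)
```

The signature verifies against the original message and fails once a single byte of the message changes, which is the property the slide describes.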
GSI • Every resource is identified by a certificate • Certificate is provided and signed by a CA • Certificate = resource identity + public key of resource + certificate authority + digital signature of CA • GSI certificates are encoded in the X.509 certificate format • Uses SSL for mutual authentication • Parties trust the CAs and possess the CAs' public keys
Mutual Authentication • The parties are assumed to possess the CA's certificate • A to B: I want to communicate. This is my certificate • B: Did the CA sign the certificate, or has the certificate been tampered with? Verify the CA's digital signature • B: OK, the CA signed the certificate. Are you really A, or did you steal the certificate from A? • B sends a random message (challenge) and asks A to encrypt it with A's private key; B decrypts the reply using A's public key and compares it with the challenge
Authentication with Proxy and delegation • Encrypted file for storing private keys. Needs passphrase • Proxy and delegation - More convenience and less security • Also for dynamic delegation for third-party services and dynamic entities • Owner signs proxy certificate • Proxy’s private key is stored in unencrypted files since proxies are for short durations • Chain of trust is established
Mutual Authentication with Proxy • A's proxy sends B the proxy's certificate and A's certificate • B first validates the proxy's certificate and then the owner's certificate
Steps • Private and public key generated for proxy certificate • User’s private key used to sign proxy certificate (proxy public key) • Proxy certificate and proxy private key stored in a file
Steps • Initiator in Host A and target service in Host B perform mutual authentication • Target service generates a new {public, private} key pair • A signed certificate request is created and sent to the initiator • Initiator uses its private key to sign the public key in the certificate request, forming the proxy certificate • Proxy certificate is sent to the target service, which stores the certificate and the new private key in a file
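A minimal sketch of the delegation steps and the resulting chain of trust, in Python. It assumes toy textbook RSA keys built from known Mersenne primes (not secure), a hypothetical `Cert` record instead of real X.509 encoding, and a "/CN=proxy" suffix as an assumed naming scheme for the proxy subject. It omits the certificate request message and lifetimes entirely.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys


@dataclass
class KeyPair:
    n: int  # public modulus
    d: int  # private exponent


def make_key(p: int, q: int) -> KeyPair:
    """Build a toy RSA key from two known primes (illustration only)."""
    return KeyPair(n=p * q, d=pow(E, -1, (p - 1) * (q - 1)))


def digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass
class Cert:
    subject: str
    public_n: int   # subject's public key
    issuer: str
    signature: int  # issuer's signature over subject, key, and issuer

    def body(self) -> bytes:
        return f"{self.subject}|{self.public_n}|{self.issuer}".encode()


def issue(subject: str, public_n: int, issuer: str, issuer_key: KeyPair) -> Cert:
    """Sign the subject's identity and public key with the issuer's private key."""
    cert = Cert(subject, public_n, issuer, 0)
    cert.signature = pow(digest(cert.body(), issuer_key.n), issuer_key.d, issuer_key.n)
    return cert


def signed_by(cert: Cert, issuer_public_n: int) -> bool:
    return pow(cert.signature, E, issuer_public_n) == digest(cert.body(), issuer_public_n)


def validate_chain(proxy: Cert, owner: Cert, ca_public_n: int) -> bool:
    """Validate the proxy's certificate first, then the owner's certificate."""
    return (proxy.issuer == owner.subject
            and proxy.subject.startswith(owner.subject + "/")
            and signed_by(proxy, owner.public_n)
            and signed_by(owner, ca_public_n))


ca_key = make_key(2**61 - 1, 2**89 - 1)
user_key = make_key(2**107 - 1, 2**127 - 1)
proxy_key = make_key(2**521 - 1, 2**607 - 1)  # generated by the target service

user_cert = issue("/O=Grid/CN=Alice", user_key.n, "/O=Grid/CN=CA", ca_key)
# The initiator signs the proxy's public key with the user's private key.
proxy_cert = issue("/O=Grid/CN=Alice/CN=proxy", proxy_key.n, user_cert.subject, user_key)
# A forged proxy, signed with a key that is not the user's.
forged_cert = issue("/O=Grid/CN=Alice/CN=proxy", proxy_key.n, user_cert.subject, proxy_key)

print(validate_chain(proxy_cert, user_cert, ca_key.n))
print(validate_chain(forged_cert, user_cert, ca_key.n))
```

Note that the user's private key never leaves the initiator: only the proxy's public key travels to be signed, and the proxy's private key stays on the target host.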
Globus Resource Management Architecture • For remote job submission and resource management • Designed to address the following problems in metacomputing: • Site autonomy (resource managers) • Co-allocation (co-allocators) • Online control (RSL and resource brokers)
Local Resource Management - GRAM • Provides interfaces to local job scheduling mechanisms • Provides mechanisms to map GSI identities to local user accounts • Processes requests for resources for remote application execution, allocates the required resources, and manages the active jobs • Also returns updated information regarding the capabilities and availability of the computing resources to the Metacomputing Directory Service (MDS) • Provides an API for submitting and canceling a job request, as well as checking the status of a submitted job
GRAM • A Gatekeeper runs on the remote host • Creates jobmanager for the job • Gatekeeper: • mutually authenticates with the client, • maps the requestor to a local user, • starts a job manager on the local host as the local user, and • passes the allocation arguments to the newly created job manager. • Jobmanager: • Common component • Machine-specific component
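The step where the gatekeeper maps the requestor to a local user can be sketched as a lookup from certificate subject to account name. The sketch below assumes a grid-mapfile-style text format in which each line holds a quoted subject name followed by a local account; the sample subjects, account names, and function names are hypothetical.

```python
import shlex


def parse_gridmap(text: str) -> dict:
    """Map each certificate subject (DN) to a local account name."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                       # skip blanks and comments
        parts = shlex.split(line)          # handles the quoted DN
        if len(parts) != 2:
            continue                       # skip malformed lines
        dn, accounts = parts
        mapping[dn] = accounts.split(",")[0]   # use the first listed account
    return mapping


def local_user_for(dn: str, mapping: dict):
    """Return the local account for an authenticated DN, or None to reject."""
    return mapping.get(dn)


SAMPLE = '''
# subject name                         local account
"/O=Grid/OU=Example/CN=Alice Smith" alice
"/O=Grid/OU=Example/CN=Bob Jones" bjones,guest
'''

gridmap = parse_gridmap(SAMPLE)
print(local_user_for("/O=Grid/OU=Example/CN=Alice Smith", gridmap))
print(local_user_for("/O=Grid/OU=Example/CN=Mallory", gridmap))
```

A requestor whose subject is not listed gets no local account, so the gatekeeper would refuse to start a job manager for it.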
MDS • Metacomputing Directory Service, later Monitoring and Discovery Service • For publishing and accessing system and application data • Can restrict access to MDS information by using GSI • Interacts with local information services through an hourglass mechanism • Provides caching to avoid re-transferring information that is still up to date, lessening network overhead
MDS Architecture • GIIS: Grid Index Information Service • GRIS: Grid Resource Information Service
MDS (Monitoring and Discovery Service) • Support for multiple information service providers: information providers are specified on a per-attribute basis • MDS Data: • System information: architecture, OS • Network information • Load status • Additional information sent to GIIS by GRAM reporter • Job status • Queue information • Information viewed through web browser or web client commands
GridFTP • Secure file transfer over Grid • Multiple data channels for parallel transfers – using multiple TCP streams in parallel to improve aggregate bandwidth • Partial file transfers • Third-party (direct server-to-server) transfers by adding GSSAPI security to the existing third-party data transfers in FTP standard – transfers between 2 servers mediated by a third-party client • GSSAPI operations authenticate the third party to the source and destination machines of data transfer
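To make the idea of parallel and partial transfers concrete, the sketch below splits a file into contiguous byte ranges, one per data channel. It is only an illustration of dividing the work; it is not GridFTP's actual block scheduling, and `split_ranges` is a hypothetical helper.

```python
def split_ranges(file_size: int, streams: int) -> list:
    """Return (offset, length) pairs, one per data channel, covering the file exactly."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` channels carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


ranges = split_ranges(10, 4)
print(ranges)
```

Each (offset, length) pair is itself a partial file transfer, so the same description covers both features on the slide.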
GridFTP contd… • Authenticated data channels - both GSI and Kerberos security • Reusable data channels • Striped data transfers • Plugin mechanisms for fault tolerance, performance monitoring, and extended data processing
Globus Replica Management Architecture • Replica management • For better access performance or availability • Mainly for access to “published” resources: a read-only model • Functions (listed on the next slide) • Architecture: • Lower level replica catalog API • Higher level replica management API
Replica management service - functions • Registration of files with the replica management service • Creation and deletion of replicas of previously registered files • Enquiries concerning the location and performance characteristics of replicas. • Replica selection based on performance characteristics
Steps in Replica Management • Application queries metadata expressing desired characteristics of logical files • A logical file is returned • Application queries replica catalog for replica instances for the logical file • Storage broker helps to choose a particular replica • Files transferred using GridFTP
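The lookup steps above can be sketched with in-memory dictionaries standing in for the metadata catalog, the replica catalog, and the storage broker's performance data. All names, the "lfn://" naming, the hosts, and the bandwidth figures are hypothetical; the real catalogs are remote services, and the final GridFTP transfer is not shown.

```python
# Hypothetical catalogs; real Globus catalogs are remote services, not dicts.
METADATA_CATALOG = {
    ("climate", "run7"): "lfn://climate/run7.dat",   # attributes -> logical file
}
REPLICA_CATALOG = {
    "lfn://climate/run7.dat": [
        "gsiftp://site-a.example.org/data/run7.dat",
        "gsiftp://site-b.example.org/store/run7.dat",
    ],
}
# Assumed performance estimates (MB/s) that a storage broker might consult.
BANDWIDTH_MBPS = {
    "gsiftp://site-a.example.org/data/run7.dat": 12.0,
    "gsiftp://site-b.example.org/store/run7.dat": 45.0,
}


def find_logical_file(project: str, dataset: str) -> str:
    """Query the metadata catalog with the desired characteristics."""
    return METADATA_CATALOG[(project, dataset)]


def list_replicas(logical_file: str) -> list:
    """Query the replica catalog for physical instances of a logical file."""
    return REPLICA_CATALOG.get(logical_file, [])


def choose_replica(replicas: list) -> str:
    """Storage broker: pick the replica with the best estimated bandwidth."""
    return max(replicas, key=lambda url: BANDWIDTH_MBPS.get(url, 0.0))


logical = find_logical_file("climate", "run7")
best = choose_replica(list_replicas(logical))
print(logical, "->", best)
```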
Condor + Globus • Ideas?
Condor-G • Condor – Condor Job management protocols + Condor resource management protocols • Condor-G – Condor job management protocols + Globus resource management protocols • For easy job management instead of managing through Globus mechanisms
Condor job manager in Globus • To allow external users to use Globus mechanisms to access a Condor pool
Glidein • Temporarily add a Globus resource to a Condor pool • Globus is used to run Condor software • Allows checkpointing, migration and remote system calls
Globus References / sources / credits • A Resource Management Architecture for Metacomputing Systems. K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, S. Tuecke. Proc. IPPS/SPDP '98 Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62-82, 1998. Describes the resource management architecture implemented as part of the Globus system. • A Distributed Resource Management Architecture that Supports Advance Reservations and Co-Allocation. I. Foster, C. Kesselman, C. Lee, R. Lindell, K. Nahrstedt, A. Roy. Intl Workshop on Quality of Service, 1999. Describes the new Globus Architecture for Reservation and Allocation, which integrates CPU and network QoS.
Globus References / sources / credits • A Security Architecture for Computational Grids. I. Foster, C. Kesselman, G. Tsudik, S. Tuecke. Proc. 5th ACM Conference on Computer and Communications Security, pp. 83-92, 1998. Describes techniques for authentication in wide area computing environments. • http://www.globus.org/Security/papers/pki04-welch-proxy-cert-final.pdf • A National-Scale Authentication Infrastructure. R. Butler, D. Engert, I. Foster, C. Kesselman, S. Tuecke, J. Volmer, V. Welch. IEEE Computer, 33(12):60-66, 2000. Describes our experience designing, developing, and deploying the Grid Security Infrastructure.
Globus References / sources / credits • Grid Information Services for Distributed Resource Sharing. K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman. Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE Press, August 2001. • Usage of LDAP in Globus. I. Foster, G. von Laszewski. This short note describes the use of LDAP in the Globus toolkit. It answers three questions: What is LDAP? Where is it used? Why is it used in Globus? • A Directory Service for Configuring High-Performance Distributed Computations. S. Fitzgerald, I. Foster, C. Kesselman, G. von Laszewski, W. Smith, S. Tuecke. Proc. 6th IEEE Symposium on High-Performance Distributed Computing, pp. 365-375, 1997. Describes the Metacomputing Directory Service used to maintain information about Globus components.
Globus References / sources / credits • The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets. A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke. Journal of Network and Computer Applications, 23:187-200, 2001 (based on conference publication from Proceedings of NetStore Conference 1999). • Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke. IEEE Mass Storage Conference, 2001. Presents the design and performance characteristics of two fundamental technologies for data management. • Replica Selection in the Globus Data Grid. S. Vazhkudai, S. Tuecke, I. Foster. Proceedings of the First IEEE/ACM International Conference on Cluster Computing and the Grid (CCGRID 2001), pp. 106-113, IEEE Computer Society Press, May 2001. Discusses a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives.
GSS-API • GSI is implemented using GSS-API • GSS-API provides both transport and mechanism independence • Provides functions for obtaining credentials, performing authentication, signing messages and encrypting messages • GSI builds on X.509 public key certificates, public key infrastructure, the SSL protocol, and X.509 proxy certificates
X.509 Proxy Certificates • To allow users to: • Create identities for new entities dynamically and in a lightweight manner • Delegate privileges to those entities dynamically • Perform single sign-on • Proxy certificate • Subject name (identity): scoped by the subject name of the issuer • Public key: different from the issuer's public key • PCI (Proxy Certificate Information): policy method identifier + policy field
DUROC • Dynamically-Updated Request Online Coallocator • coallocator is used to coordinate transactions with each of the RMs and bring up the distributed pieces of the job
RSL spec. • E.g.: Multi-request
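As an illustration of what a multi-request looks like, the sketch below builds an RSL string in which `+` combines several `&` (conjunction) requests, one per resource manager. The helper functions, host names, and the choice of attributes are hypothetical, and the sketch assumes attribute values contain no quote characters.

```python
def rsl_value(value) -> str:
    """Render integers bare and everything else as a quoted string."""
    if isinstance(value, int):
        return str(value)
    return '"' + str(value) + '"'


def rsl_request(**attributes) -> str:
    """Build one conjunction request: &(name=value)(name=value)..."""
    relations = "".join(
        f"({name}={rsl_value(value)})" for name, value in attributes.items()
    )
    return "&" + relations


def rsl_multirequest(*requests: str) -> str:
    """Combine several requests into one multi-request: +(req1)(req2)..."""
    return "+" + "".join(f"({request})" for request in requests)


spec = rsl_multirequest(
    rsl_request(resourceManagerContact="host-a.example.org",
                count=2, executable="/bin/hostname"),
    rsl_request(resourceManagerContact="host-b.example.org",
                count=4, executable="/bin/hostname"),
)
print(spec)
```

Each parenthesized `&` clause names its own resource manager, which is what lets a co-allocator hand the pieces of one job to different sites.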
MDS • Contains entries where each entry is associated with one or more attribute:value pairs • Each entry associated with a distinguished name.
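The entry model above can be sketched as a dictionary keyed by distinguished name, where each entry holds attribute:value pairs and an attribute may carry several values. The names and attributes below are hypothetical and do not reproduce the actual MDS schema; `search` is a simplified stand-in for a directory query.

```python
# Each entry: distinguished name -> {attribute: [values]}.
DIRECTORY = {
    "hn=node1.example.org, o=Grid": {
        "objectclass": ["computer"], "os": ["Linux"], "cpus": ["8"],
    },
    "hn=node2.example.org, o=Grid": {
        "objectclass": ["computer"], "os": ["Solaris"], "cpus": ["4"],
    },
    "queue=batch, hn=node1.example.org, o=Grid": {
        "objectclass": ["queue"], "freeslots": ["3"],
    },
}


def search(base_dn: str, attribute: str, value: str) -> list:
    """Return DNs at or below base_dn whose attribute includes the value."""
    matches = []
    for dn, attrs in DIRECTORY.items():
        in_subtree = dn == base_dn or dn.endswith(", " + base_dn)
        if in_subtree and value in attrs.get(attribute, []):
            matches.append(dn)
    return sorted(matches)


linux_hosts = search("o=Grid", "os", "Linux")
node1_queues = search("hn=node1.example.org, o=Grid", "objectclass", "queue")
print(linux_hosts, node1_queues)
```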
Replica management • Globus Replica Management integrates the Globus Replica Catalog (for keeping track of replicated files) and GridFTP (for moving data) and provides replica management capabilities for data grids. • The globus_replica_management library provides client functions that allow files to be registered with the replica management service, published to replica locations, and moved among multiple locations. • Managing the copying and placement of files in a distributed computing system so as to improve the performance of data analysis