
GridPrimer

GridPrimer: An Introduction to the world of Grid Computing. Jon MacLaren. Monday 18th to Friday 22nd October 2004, GridPrimer Training Course, University of Manchester.


Presentation Transcript


  1. GridPrimer An Introduction to the world of Grid Computing Jon MacLaren Monday 18th to Friday 22nd October 2004 GridPrimer Training Course University of Manchester

  2. Generation Game (adapted from Ian Foster GGF7 Plenary) [Figure labels: Computationally intensive; File access/transfer; Bag of various heterogeneous protocols & toolkits; Monolithic design; Recognised internet, ignored Web; Academic teams; X.509, LDAP, FTP, …; Globus Toolkit; Condor, Unicore; De facto standards: GridFTP, GSI; App-specific Services; Open Grid Services Architecture; Web services; Data and knowledge intensive; Open services-based architecture; Builds on Web services; GGF + OASIS + W3C; Multiple implementations; Global Grid Forum; Industry participation; Custom solutions. Axes: Time; Increased functionality, standardization.] e-Science NorthWest

  3. Grid Security Introduction What is security on the Grid about? What does it do? What are the problems?

  4. What is security about? • It’s about communicating and collaborating securely in an insecure environment! • What do we need to ensure? • The privacy of messages being exchanged • The integrity of messages being exchanged • We know someone is who they say they are • We know what the person is “entitled” to do • We can remember what the person did • Solutions? • Encryption • Signing • Authentication • Authorization • Accounting

  5. Public-key cryptography • “The problems of authentication and large network privacy protection were addressed theoretically in 1976 by Whitfield Diffie and Martin Hellman when they published their concepts for a method of exchanging secret messages without exchanging secret keys. The idea came to fruition in 1977 with the invention of the RSA Public Key Cryptosystem by Ronald Rivest, Adi Shamir, and Len Adleman, then professors at the Massachusetts Institute of Technology.” • So the RSA Algorithm (and others, e.g. DSA) allow you to create two keys with the following properties: • data encrypted with the first key can only be decrypted with the second key • data encrypted with the second key can only be decrypted with the first key • Given one of the keys, it is extremely difficult to work out what the other key is. e-Science NorthWest

  6. How difficult is extremely difficult? • Can still attack key pairs using brute force methods • On August 22, 1999,  a group of researchers completed the factorization of the 155 digit (512 bit) RSA Challenge Number.  The work was accomplished with the General Number Field Sieve.  The sieving software used two different sieve techniques: line sieving and lattice sieving.  The sieving was accomplished as follows: • Sieving: 35.7 CPU-years in total on... • 160 175-400 MHz SGI and Sun workstations • 8 250 MHz SGI Origin 2000 processors • 120 300-450 MHz Pentium II PCs • 4 500 MHz Digital/Compaq boxes e-Science NorthWest

  7. Should we be worried? • For RSA, 576-bit is the longest to be broken (Dec 2003) • 1024 bit still far off (maybe?). If you can do it, they’ll give you $100,000! • See RSA Challenge Numbers: • http://www.rsasecurity.com/rsalabs/node.asp?id=2093 • Many Certificate Authorities (more later) recommend 2048-bit key-pairs, which can be easily generated • New algorithms are being developed based on the mathematics of elliptic curves (Elliptic Curve Cryptography). • Longest key-pair broken is a 109 bit key in 2003, using the “birthday attack”, with over 10,000 Pentium class PCs running continuously for over 540 days. • The minimum recommended key size for ECC, 163 bits, is currently estimated to require 10^8 times the computing resources required for the 109 bit problem.

  8. Watch those random numbers! • To generate “good” key-pairs, you need good random numbers! • In 1995, Goldberg and Wagner from Berkeley discovered that random numbers in Netscape were being generated using only three pieces of information: • time of day • process ID • parent process ID • This resulted in some easy attacks, particularly on shared machines. See: • http://www.cs.berkeley.edu/~daw/netscape-randomness.html • Can use /dev/random, or PRNGD (Pseudo-Random Number Generator Daemon), Entropy Gathering Daemon, etc.
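To make the Netscape problem concrete, here is a minimal sketch of why three guessable inputs are not enough. It is not Netscape's actual algorithm: Python's `random.Random` stands in for the weak generator, and the names `weak_key`, `recover_seed` and `strong_key`, as well as the way the three values are packed into a seed, are assumptions made for illustration.

```python
import random
import secrets


def weak_key(time_of_day: int, pid: int, ppid: int) -> int:
    """Derive a 128-bit 'key' from three guessable values (illustrative only)."""
    # Assumed packing: pid and ppid are taken to fit in 16 bits each.
    seed = (time_of_day << 32) | (pid << 16) | ppid
    return random.Random(seed).getrandbits(128)


def recover_seed(key: int, time_window: range, max_pid: int):
    """Brute-force every plausible (time, pid, ppid) triple until the key matches."""
    for t in time_window:
        for pid in range(1, max_pid):
            for ppid in range(1, max_pid):
                if weak_key(t, pid, ppid) == key:
                    return t, pid, ppid
    return None


def strong_key() -> int:
    """Draw 128 bits from the operating system's entropy source instead."""
    return secrets.randbits(128)
```

An attacker on a shared machine who knows roughly when the key was made and can list process IDs only has to search a few thousand seeds in this toy setting, whereas `strong_key()` offers no such shortcut.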

  9. Public Key Cryptography: Encryption (Scary Maths!) • Private Key: [d and N] • where $d$ satisfies $ed \equiv 1 \pmod{(p-1)(q-1)}$ • Public Key: [e and N] • where $N = pq$ is the product of two large primes • and $e$ is relatively prime to $(p-1)(q-1)$ • To encrypt/decrypt with Public Key: $c = m^{e} \bmod N$ • To decrypt/encrypt with Private Key: $m = c^{d} \bmod N$
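The arithmetic on this slide can be tried out with deliberately tiny numbers. The sketch below is textbook RSA without padding, so it is only a teaching aid; the function names `make_keypair` and `apply_key` and the primes 61 and 53 are illustrative choices, not something taken from the course.

```python
from math import gcd


def make_keypair(p: int, q: int, e: int):
    """Return ((e, N), (d, N)) for primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p-1)(q-1)")
    d = pow(e, -1, phi)  # modular inverse of e; needs Python 3.8+
    return (e, n), (d, n)


def apply_key(value: int, key) -> int:
    """Raise value to the key's exponent modulo N (used for both directions)."""
    exponent, n = key
    return pow(value, exponent, n)


public, private = make_keypair(61, 53, 17)
ciphertext = apply_key(65, public)          # encrypt with the public key
plaintext = apply_key(ciphertext, private)  # decrypt with the private key
```

The same `apply_key` call works in the other direction too (private key first, public key second), which is the property that signing relies on.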

  10. Signing • Goal of signing is to leave something readable by everyone, but allow the signature to be verified • So, we don’t encrypt the message • Instead we create a hash and encrypt that with the sender’s private key. • Hash is a one way digest of the message by a specific algorithm (e.g. SHA1 or MD5) • The encrypted hash is then sent with the message • Verify the signature by making the hash and comparing this with the signature as decrypted using the sender’s public key
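A rough sketch of the hash-then-encrypt idea follows. It makes several assumptions: SHA-256 is swapped in for the SHA1/MD5 named on the slide, the key is a toy RSA key built from two small Mersenne primes, and the digest is simply reduced modulo N rather than padded as a real signature scheme would do. The names `digest`, `sign` and `verify` are illustrative.

```python
import hashlib

# Toy key material: 2**61 - 1 and 2**89 - 1 are both prime (Mersenne primes).
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def digest(message: bytes) -> int:
    """One-way digest of the message, reduced so it fits below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the digest with the private key; the message stays readable."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recompute the digest and compare it with the signature decrypted using the public key."""
    return pow(signature, E, N) == digest(message)
```

For example, `verify(b"run job 42", sign(b"run job 42"))` holds, while changing a single byte of the message makes verification fail.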

  11. PKI – Public Key Infrastructure • Typically based on X509 certificates • Supports different key-pair algorithms • We rely on ourselves to get true public keys • Chain of trust rules • A public key may be digitally signed by many people • some of whom you may trust. • CA method (Certificate Authority) • CA has a “root certificate” and a Policy and Practice document called CP/CPS: http://www.grid-support.ac.uk/ca/cps • You choose to trust on the basis of CP/CPS. • CA signs your certificate (your public key). • Large scale CAs are difficult and costly (~£220 per cert)

  12. Getting Certificates • Create a private and public key pair • Send public key to CA • Identify yourself to the CA (as specified in CPS) • CA signs your public key. • CA sends you a digital certificate which contains your public key and the CA's digital signature • Can be done two ways: • in your browser Netscape/IE certificate request • on the command line: e.g. grid-cert-request

  13. The UK eScience Certificate Authority • Read CPS • Get CA cert • Get CRL • Request a certificate • CertDB • Export Certs Gets you an x509 cert

  14. x509 Certificates Certificate: Data: Version: 3 (0x2) Serial Number: 127 (0x7f) Signature Algorithm: md5WithRSAEncryption Issuer: C=UK, O=eScience, OU=Authority, CN=CA/Email=ca-operator@grid-support.ac.uk Validity Not Before: Oct 31 15:50:59 2002 GMT Not After : Oct 31 15:50:59 2003 GMT Subject: C=UK, O=eScience, OU=Manchester, L=MC, CN=michael jones Subject Public Key Info: Public Key Algorithm: rsaEncryption RSA Public Key: (1024 bit) Modulus (1024 bit): 00:c6:96:fd:7a:e0:fa:f1:e6:43:9d:c1:cb:72:38: e1:4e:44:86:da:a7:8a:ed:8a:fc:f3:64:d8:9e:bd: af:ce:7c:55:39:cd:61:74:a8:1d:6d:60:6e:65:91: dc:2c:c2:64:80:f6:f9:1a:3c:fe:d4:d2:1c:52:fa: c6:47:ea:a6:4e:92:b5:c9:1d:93:dd:48:61:54:40: b5:17:84:3f:5c:47:48:29:2b:83:82:c7:d6:ad:d3: 60:5d:6d:5d:f7:08:25:17:d2:14:e2:8e:af:37:3b: e4:3b:63:f7:31:24:b4:66:78:8e:06:93:c6:8d:b6: fe:50:79:3a:4a:f8:59:58:3d Exponent: 65537 (0x10001) X509v3 extensions: X509v3 Basic Constraints: CA:FALSE Netscape Cert Type: SSL Client, S/MIME X509v3 Key Usage: Digital Signature, Non Repudiation, Key Encipherment, Key Agreement Netscape Comment: UK e-Science User Certificate X509v3 Subject Key Identifier: BF:00:02:4B:3A:45:A6:B8:EB:66:E4:F2:EE:CA:60:9D:B8:D1:B2:0D X509v3 Authority Key Identifier: keyid:02:38:AB:11:A3:96:80:8B:0D:D3:15:2B:08:A5:8E:30:DA:B2:DA:A8 DirName:/C=UK/O=eScience/OU=Authority/CN=CA/Email=ca-operator@grid-support.ac.uk serial:00 X509v3 Issuer Alternative Name: email:ca-operator@grid-support.ac.uk Netscape CA Revocation Url: http://ca.grid-support.ac.uk/cgi-bin/importCRL Netscape Revocation Url: http://ca.grid-support.ac.uk/cgi-bin/importCRL Netscape Renewal Url: http://ca.grid-support.ac.uk/cgi-bin/renewURL X509v3 CRL Distribution Points: URI:http://ca.grid-support.ac.uk/cgi-bin/importCRL Signature Algorithm: md5WithRSAEncryption 3a:1f:81:a8:1a:83:ff:2c:0f:7b:b6:1e:2a:87:31:13:d9:ca: 9e:c1:9e:e4:42:b5:22:56:7b:01:98:11:13:29:a3:d8:d2:37: 80:58:ac:7f:44:f7:1e:ba:00:f4:8b:c8:34:00:ff:44:27:c2: 
2a:54:8b:95:e9:a0:00:f8:3d:60:92:c4:99:2b:72:d4:b7:dd: 78:bd:c9:4a:01:d7:14:1d:3c:d9:6f:60:7b:23:90:8e:d6:3a: 2d:45:39:5e:bc:fd:6d:77:7b:1e:cf:43:8c:e4:05:4c:1b:91: e5:bb:da:3d:cd:9d:05:6b:be:21:b0:e8:43:b2:4b:4e:c4:4f: 6b:4e:23:9e:03:d2:03:86:1b:44:68:60:41:5d:64:ae:2d:52: e2:7d:9b:99:60:71:7f:4a:00:1e:5d:9d:14:59:4f:4b:d7:9a: ee:e0:01:3d:87:36:16:bf:24:b3:84:fd:62:d1:d6:21:ae:3b: f7:e1:e5:52:ec:ef:68:f4:73:4f:1b:62:a6:f4:47:0b:6c:1e: 28:23:6b:25:d3:a1:f7:37:f6:55:d6:82:7c:49:a9:1d:71:57: e6:bc:74:71:94:0d:df:fc:21:63:16:54:c9:0f:51:1c:7a:bf: 5c:ef:7d:28:23:73:64:84:eb:f2:b6:52:89:ca:48:78:31:e8: dd:b9:91:3f -----BEGIN CERTIFICATE----- MIIFBDCCA+ygAwIBAgIBfzANBgkqhkiG9w0BAQQFADBwMQswCQYDVQQGEwJVSzER MA8GA1UEChMIZVNjaWVuY2UxEjAQBgNVBAsTCUF1dGhvcml0eTELMAkGA1UEAxMC Q0ExLTArBgkqhkiG9w0BCQEWHmNhLW9wZXJhdG9yQGdyaWQtc3VwcG9ydC5hYy51 azAeFw0wMjEwMzExNTUwNTlaFw0wMzEwMzExNTUwNTlaMFoxCzAJBgNVBAYTAlVL MREwDwYDVQQKEwhlU2NpZW5jZTETMBEGA1UECxMKTWFuY2hlc3RlcjELMAkGA1UE BxMCTUMxFjAUBgNVBAMTDW1pY2hhZWwgam9uZXMwgZ8wDQYJKoZIhvcNAQEBBQAD gY0AMIGJAoGBAMaW/Xrg+vHmQ53By3I44U5Ehtqniu2K/PNk2J69r858VTnNYXSo HW1gbmWR3CzCZID2+Ro8/tTSHFL6xkfqpk6Stckdk91IYVRAtReEP1xHSCkrg4LH 1q3TYF1tXfcIJRfSFOKOrzc75Dtj9zEktGZ4jgaTxo22/lB5Okr4WVg9AgMBAAGj ggJBMIICPTAJBgNVHRMEAjAAMBEGCWCGSAGG+EIBAQQEAwIFoDALBgNVHQ8EBAMC A+gwLAYJYIZIAYb4QgENBB8WHVVLIGUtU2NpZW5jZSBVc2VyIENlcnRpZmljYXRl MB0GA1UdDgQWBBS/AAJLOkWmuOtm5PLuymCduNGyDTCBmgYDVR0jBIGSMIGPgBQC OKsRo5aAiw3TFSsIpY4w2rLaqKF0pHIwcDELMAkGA1UEBhMCVUsxETAPBgNVBAoT CGVTY2llbmNlMRIwEAYDVQQLEwlBdXRob3JpdHkxCzAJBgNVBAMTAkNBMS0wKwYJ KoZIhvcNAQkBFh5jYS1vcGVyYXRvckBncmlkLXN1cHBvcnQuYWMudWuCAQAwKQYD VR0SBCIwIIEeY2Etb3BlcmF0b3JAZ3JpZC1zdXBwb3J0LmFjLnVrMD0GCWCGSAGG +EIBBAQwFi5odHRwOi8vY2EuZ3JpZC1zdXBwb3J0LmFjLnVrL2NnaS1iaW4vaW1w b3J0Q1JMMD0GCWCGSAGG+EIBAwQwFi5odHRwOi8vY2EuZ3JpZC1zdXBwb3J0LmFj LnVrL2NnaS1iaW4vaW1wb3J0Q1JMMDwGCWCGSAGG+EIBBwQvFi1odHRwOi8vY2Eu Z3JpZC1zdXBwb3J0LmFjLnVrL2NnaS1iaW4vcmVuZXdVUkwwPwYDVR0fBDgwNjA0 
oDKgMIYuaHR0cDovL2NhLmdyaWQtc3VwcG9ydC5hYy51ay9jZ2ktYmluL2ltcG9y dENSTDANBgkqhkiG9w0BAQQFAAOCAQEAOh+BqBqD/ywPe7YeKocxE9nKnsGe5EK1 IlZ7AZgREymj2NI3gFisf0T3HroA9IvINAD/RCfCKlSLlemgAPg9YJLEmSty1Lfd eL3JSgHXFB082W9geyOQjtY6LUU5Xrz9bXd7Hs9DjOQFTBuR5bvaPc2dBWu+IbDo Q7JLTsRPa04jngPSA4YbRGhgQV1kri1S4n2bmWBxf0oAHl2dFFlPS9ea7uABPYc2 Fr8ks4T9YtHWIa479+HlUuzvaPRzTxtipvRHC2weKCNrJdOh9zf2VdaCfEmpHXFX 5rx0cZQN3/whYxZUyQ9RHHq/XO99KCNzZITr8rZSicpIeDHo3bmRPw== -----END CERTIFICATE----- • Version • Serial Number • Issuer • Times of Validity • Subject • Public Key • Extensions • Constraints • Type and Use • Thumbprint • CRL

  15. Authentication • Your identity is seen as being trusted if your public key has been signed by a Certificate Authority which you have decided to trust. • Result is that you are verified as being the entity identified in the Distinguished Name (DN) of the certificate, e.g.: • C=UK, O=eScience, OU=Manchester, L=MC, CN=jon maclaren • This DN will then be used in later Authorization stages. e-Science NorthWest

  16. Authorization • Still lots of ad-hoc techniques being used • Most common one is the “Grid mapfile”, which maps each DN to a single user name • Bad for many reasons: • You have to add each new user in (no scalability) • There are no standard mechanisms for distributing a new version of the file within a virtual organisation • Things tend to get out of step, so a user will get authorization failures at particular sites/services e-Science NorthWest
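As an illustration of how little a grid mapfile does, here is a minimal parser sketch. It assumes the common layout of one quoted DN followed by a local user name per line, with the DN written in slash-separated form and `#` starting a comment; the function names and the sample account name are hypothetical.

```python
import shlex


def parse_gridmap(text: str) -> dict:
    """Map each quoted DN to its single local user name."""
    mapping = {}
    for line in text.splitlines():
        fields = shlex.split(line, comments=True)  # handles quotes and '#' comments
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"malformed grid-mapfile line: {line!r}")
        dn, user = fields
        mapping[dn] = user
    return mapping


def local_user(mapping: dict, dn: str):
    """Return the mapped account, or None if the DN is not listed."""
    return mapping.get(dn)


SAMPLE = '''
# hypothetical entry
"/C=UK/O=eScience/OU=Manchester/L=MC/CN=jon maclaren" jon
'''
```

Every site keeps its own copy of such a file, so a DN missing from one copy produces exactly the kind of site-specific authorization failure described above.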

  17. Beyond the grid-mapfile • Best known alternative is the Community Authorization Scheme (CAS). • Authorization is deferred from a resource to an authorization server. • Server contains, in a single place, information on who is allowed to do what, and where, within the VO. • The introduction of a third-party is common to most sophisticated authorization schemes. • Other schemes include: • Akenti • PERMIS (David Chadwick from Salford) • Shibboleth (targeting HE sector, to replace ATHENS?) • All about expression of policy e-Science NorthWest

  18. Community Authorization (Prototype shown August 2001) [Figure: message flow between User, CAS and Resource] 1. CAS request, with resource names and operations. The CAS, which holds user/group membership, resource/collective membership and collective policy information, asks: does the collective policy authorize this request for this user? 2. CAS reply, with capability and resource CA info. 3. Resource request, authenticated with capability. The Resource, which holds local policy information, asks: is this request authorized for the CAS? Is this request authorized by the capability? 4. Resource reply. Intro to Grid Computing and Globus Toolkit™

  19. Community Authorization Service • CAS provides user community with information needed to authenticate resources • Sent with capability credential, used on connection with resource • Resource identity (DN), CA • This allows new resources/users (and their CAs) to be made available to a community through the CAS without action on the other user’s/resource’s part Intro to Grid Computing and Globus Toolkit™

  20. Authorization Scheme Taxonomy • Authorization Push Sequence • Authorization Pull Sequence • Authorization Agent Sequence

  21. For those who want to know (lots) more • “Conceptual Grid Authorization Framework and Classification” • GGF Informational (Draft) Document • By the GGF Working Group on Authorization Frameworks and Mechanisms (AuthZ-WG) • Document draft: http://tinyurl.com/6nuh6 • Public comment period over • Should be appearing in its final form at: • http://www.ggf.org/documents/final.htm

  22. Accounting? • Much, much worse – almost all ad-hoc. • Most middleware, including GT2/GT3, only provide some logfiles. • There is a draft standard for recording resource usage, at least for computational jobs. • Developed by the GGF Usage Record Working Group (UR-WG) • Current drafts at: http://www.psc.edu/~lfm/Grid/UR-WG/ • PBS Scheduler can generate records in this format • Also, there is a group working on a service for holding this data. • The GGF OGSA Resource Usage Service Working Group (RUS-WG) • Was following OGSI technology, and is currently inactive • May publish a “pure web services” specification in 2005. e-Science NorthWest

  23. Anything else? • Most advanced work in this area is the requirements that have been gathered by the GGF Site Authentication Authorization and Accounting Research Group (SA3-RG or AAA-RG) • “Site Requirements for Grid Authentication, Authorization and Accounting” • GGF Informational (Final) Document • http://www.ggf.org/documents/GWD-I-E/GFD-I.032.txt • The main problem with accounting is that it’s seen as very, very dull... e-Science NorthWest

  24. Pre-XML Grids What is “Grid” anyway?

  25. Generation Game (adapted from Ian Foster GGF7 Plenary) [Figure labels: Computationally intensive; File access/transfer; Bag of various heterogeneous protocols & toolkits; Monolithic design; Recognised internet, ignored Web; Academic teams; X.509, LDAP, FTP, …; Globus Toolkit; Condor, Unicore; De facto standards: GridFTP, GSI; App-specific Services; Open Grid Services Architecture; Web services; Data and knowledge intensive; Open services-based architecture; Builds on Web services; GGF + OASIS + W3C; Multiple implementations; Global Grid Forum; Industry participation; Custom solutions. Axes: Time; Increased functionality, standardization.]

  26. What do I have to choose from? • Globus Toolkit • version 2 is widely deployed; nearest thing to a de facto standard • horizontally integrated bag of tools • suits grid application developers better than end users • UNICORE • less widely deployed; few UK deployments • vertically integrated • suits end users better than application developers • Condor • high throughput computing • great for cycle harvesting • Web Services? • wait or roll your own using Web Services tools • Others • yes, there are others e-Science NorthWest

  27. Globus Toolkit version 2: An Overview [Figure: layered stack of Applications, Diverse global services, Core services, Local OS] • "Single sign-on" through Grid Security Infrastructure (GSI) • Remote execution of jobs • GRAM, job-managers, Resource Specification Language (RSL) • GridFTP • Efficient, reliable file transfer; third-party file transfers • MDS (Metacomputing Directory Service) • Resource discovery (GRIS and GIIS) • Co-allocation (DUROC) • Limited by support from scheduling infrastructure • Other GSI-enabled utilities • gsi-ssh, grid-cvs, etc. • Low-level APIs and command-line interfaces • Commodity Grid Kits (CoG-kits), Java, Perl, Python • Widespread deployment, lots of projects

  28. The Globus Toolkit™: Security Services. The Globus Project™, Argonne National Laboratory, USC Information Sciences Institute. http://www.globus.org

  29. Public Key Based Authentication • User sends certificate over the wire. • Other end sends user a challenge string. • User encodes the challenge string with private key • Possession of private key means you can authenticate as subject in certificate • Public key is used to decode the challenge. • If you can decode it, you know the subject • Treat your private key carefully!! • Private key is stored only in well-guarded places, and only in encrypted form Intro to Grid Computing and Globus Toolkit™
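The challenge step can be sketched in a few lines. This is not the GSI wire protocol: it is a toy exchange using unpadded RSA with a key built from two small Mersenne primes, and the names `new_challenge`, `respond` and `check` are assumptions for illustration.

```python
import secrets

# Toy key pair: 2**61 - 1 and 2**89 - 1 are both prime.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q                          # public modulus, as found in the certificate
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the user


def new_challenge() -> int:
    """Verifier side: pick a fresh, unpredictable challenge below N."""
    return secrets.randbelow(N - 2) + 2


def respond(challenge: int, private_exponent: int) -> int:
    """User side: encode the challenge with the private key."""
    return pow(challenge, private_exponent, N)


def check(challenge: int, response: int) -> bool:
    """Verifier side: decode with the public key and compare with what was sent."""
    return pow(response, E, N) == challenge
```

Because the challenge is fresh each time, replaying an old response does not help an eavesdropper, and the private key itself never crosses the wire.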

  30. X.509 Proxy Certificate • Defines how a short term, restricted credential can be created from a normal, long-term X.509 credential • A “proxy certificate” is a special type of X.509 certificate that is signed by the normal end entity cert, or by another proxy • Supports single sign-on & delegation through “impersonation” • Currently an IETF draft Intro to Grid Computing and Globus Toolkit™

  31. User Proxies • Minimize exposure of user’s private key • A temporary, X.509 proxy credential for use by our computations • We call this a user proxy certificate • Allows process to act on behalf of user • User-signed user proxy cert stored in local file • Created via “grid-proxy-init” command • Proxy’s private key is not encrypted • Rely on file system security, proxy certificate file must be readable only by the owner Intro to Grid Computing and Globus Toolkit™
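Since the proxy's private key is unencrypted, the file permissions are the only protection. A small check along these lines could be run before trusting a proxy file; it assumes a POSIX file system, and the function name `is_private_to_owner` is illustrative rather than part of any Globus API.

```python
import os
import stat


def is_private_to_owner(path: str) -> bool:
    """True if neither group nor other has any access bits set on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A file with mode 0600 passes this check, while one with mode 0644 does not. Note that the check says nothing about the root user, who can read the file regardless.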

  32. Delegation • Remote creation of a user proxy • Results in a new private key and X.509 proxy certificate, signed by the original key • Allows remote process to act on behalf of the user • Avoids sending passwords or private keys across the network Intro to Grid Computing and Globus Toolkit™

  33. Globus Security APIs • Generic Security Service (GSS) API • IETF standard • Provides functions for authentication, delegation, message protection • Decoupled from any particular communication method • But GSS-API is somewhat complicated, so we also provide the easier-to-use globus_gss_assist API. • GSI-enabled SASL is also provided Intro to Grid Computing and Globus Toolkit™

  34. GSI Applications • Globus Toolkit™ uses GSI for authentication • Many Grid tools, directly or indirectly, e.g. • Condor-G, SRB, MPICH-G2, Cactus, GDMP, … • Commercial and open source tools, e.g. • ssh, ftp, cvs, OpenLDAP, OpenAFS • SecureCRT (Win32 ssh client) • And since we use standard X.509 certificates, they can also be used for • Web access, LDAP server access, etc. Intro to Grid Computing and Globus Toolkit™

  35. Ongoing and Future GSI Work • Protection against compromised resources • Restricted delegation, smartcards • Standardization • Scalability in numbers of users & resources • Credential management • Online credential repositories (“MyProxy”) • Account management • Authorization • Policy languages • Community authorization Intro to Grid Computing and Globus Toolkit™

  36. Restricted Proxies • Q: How to restrict rights of delegated proxy to a subset of those associated with the issuer? • A: Embed restriction policy in proxy cert • Policy is evaluated by resource upon proxy use • Reduces rights available to the proxy to a subset of those held by the user • But how to avoid policy language wars? • Proxy cert just contains a container for a policy specification, without defining the language • Container = OID + blob • Can evolve policy languages over time Intro to Grid Computing and Globus Toolkit™

  37. Problems with GSI • There are problems with GSI, and uptake is not as widespread as the Globus Alliance would like you to believe: • Creates proxy certificate on remote machine • Proxy contains private key, unencrypted, stored in /tmp • Only protected with UNIX filesystem rules • These are time-limited. But people can set this limit to be very long. • So you have to trust the root user of any machine you use • Mostly, restricted proxies are created, but some applications, e.g. GSI-SSH, create full proxies • It is easy to design attacks whereby the full private key can be stolen • In part, these problems are fundamental to generic delegation schemes, which are powerful, but also dangerous • Proxy certificates mean that people from some application domains simply will not use Globus

  38. The Globus Toolkit™: Resource Management Services. The Globus Project™, Argonne National Laboratory, USC Information Sciences Institute. http://www.globus.org

  39. The Challenge • Enabling secure, controlled remote access to heterogeneous computational resources and management of remote computation • Authentication and authorization • Resource discovery & characterization • Reservation and allocation • Computation monitoring and control • Addressed by new protocols & services • GRAM protocol as a basic building block • Resource brokering & co-allocation services • GSI for security, MDS for discovery Intro to Grid Computing and Globus Toolkit™

  40. Resource Management • The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started on remote resources, despite local heterogeneity • Resource Specification Language (RSL) is used to communicate requirements • A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services • Integrated with Condor, PBS, MPICH-G2, …

  41. Resource Management Architecture [Figure: the Application passes RSL to a Broker, which performs RSL specialization using Queries & Info exchanged with the Information Service; the resulting ground RSL goes to a Co-allocator, which sends simple ground RSL to the GRAM instances in front of the local resource managers (LSF, Condor, NQE).]

  42. Resource Specification Language • Common notation for exchange of information between components • Syntax similar to MDS/LDAP filters • RSL provides two types of information: • Resource requirements: Machine type, number of nodes, memory, etc. • Job configuration: Directory, executable, args, environment • Globus Toolkit provides an API/SDK for manipulating RSL Intro to Grid Computing and Globus Toolkit™

  43. RSL Syntax • Elementary form: parenthesis clauses • (attribute op value [ value … ] ) • Operators Supported: • <, <=, =, >=, >, != • Some supported attributes: • executable, arguments, environment, stdin, stdout, stderr, resourceManagerContact, resourceManagerName • Unknown attributes are passed through • May be handled by subsequent tools
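The elementary clause form is regular enough to pick apart with a regular expression. The sketch below handles only flat, unquoted `(attribute op value …)` clauses; it is not the Globus RSL library, and `Relation` and `parse_relations` are names invented for this example.

```python
import re
from typing import NamedTuple


class Relation(NamedTuple):
    attribute: str
    op: str
    values: tuple


# Two-character operators are listed first so ">=" is not read as ">" then "=".
_RELATION = re.compile(r"\(\s*(\w+)\s*(<=|>=|!=|<|>|=)\s*([^()]*?)\s*\)")


def parse_relations(rsl: str) -> list:
    """Extract every (attribute op value [value ...]) clause from a flat RSL string."""
    return [
        Relation(attribute, op, tuple(values.split()))
        for attribute, op, values in _RELATION.findall(rsl)
    ]
```

Attributes the parser does not recognise are returned like any other, which mirrors the pass-through behaviour described on the slide.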

  44. Constraints: “&” • For example: & (count>=5) (count<=10) (max_time=240) (memory>=64) (executable=myprog) • “Create 5-10 instances of myprog, each on a machine with at least 64 MB memory that is available to me for 4 hours” Intro to Grid Computing and Globus Toolkit™

  45. Disjunction: “|” • For example: & (executable=myprog) ( | (&(count=5)(memory>=64)) (&(count=10)(memory>=32))) • Create 5 instances of myprog on a machine that has at least 64MB of memory, or 10 instances on a machine with at least 32MB of memory Intro to Grid Computing and Globus Toolkit™
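To show how "&" and "|" combine, here is a small evaluator over an already-parsed specification. The nested-tuple representation, the `satisfies` name, and the idea of testing a candidate allocation described as a dictionary are all assumptions made for this sketch; real resource brokers interpret attributes such as count in their own way.

```python
import operator

OPS = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    ">=": operator.ge, ">": operator.gt, "!=": operator.ne,
}


def satisfies(spec: tuple, candidate: dict) -> bool:
    """Check a candidate allocation against a nested ('&' | '|' | relation) spec."""
    head = spec[0]
    if head == "&":  # conjunction: every sub-clause must hold
        return all(satisfies(sub, candidate) for sub in spec[1:])
    if head == "|":  # disjunction: at least one sub-clause must hold
        return any(satisfies(sub, candidate) for sub in spec[1:])
    attribute, op, value = spec
    if attribute not in candidate:
        return False
    return OPS[op](candidate[attribute], value)


# The disjunction example from the slide, written as nested tuples.
SPEC = ("&", ("executable", "=", "myprog"),
        ("|",
         ("&", ("count", "=", 5), ("memory", ">=", 64)),
         ("&", ("count", "=", 10), ("memory", ">=", 32))))
```

With this specification, five instances on a 128 MB machine or ten instances on a 48 MB machine are acceptable, but five instances on a 48 MB machine are not.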

  46. GRAM Protocol • GRAM-1: Simple HTTP-based RPC • Job request • Returns a “job contact”: Opaque string that can be passed between clients, for access to job • Job cancel, status, signal • Event notification (callbacks) for state changes • Pending, active, done, failed, suspended • GRAM-1.5 (U Wisconsin contribution) • Add reliability improvements • Once-and-only-once submission • Recoverable job manager service • Reliable termination detection • GRAM-2: Moving to Web Services (SOAP)… Intro to Grid Computing and Globus Toolkit™

  47. Globus Toolkit Implementation • Gatekeeper • Single point of entry • Authenticates user, maps to local security environment, runs service • In essence, a “secure inetd” • Job manager • A gatekeeper service • Layers on top of local resource management system (e.g., PBS, LSF, etc.) • Handles remote interaction with the job Intro to Grid Computing and Globus Toolkit™

  48. GRAM Components [Figure: the Client makes MDS client API calls to locate resources (MDS: Grid Index Info Server) and to get resource info (MDS: Grid Resource Info Server), then makes GRAM client API calls to request resource allocation and process creation. Across the site boundary, and through the Grid Security Infrastructure, the request reaches the Gatekeeper, which creates a Job Manager. The Job Manager parses the request with the RSL Library, asks the Local Resource Manager to allocate & create processes, monitors & controls those processes, and sends state change callbacks back through the GRAM client API. The Grid Resource Info Server queries the current status of the resource.]

  49. Co-allocation • Simultaneous allocation of a resource set • Handled via optimistic co-allocation based on free nodes or queue prediction • In the future, advance reservations will also be supported (already in prototype) • Globus APIs/SDKs support the co-allocation of specific multi-requests • Uses a Globus component called the Dynamically Updated Request Online Co-allocator (DUROC)

  50. Multirequest: “+” • A multirequest allows us to specify multiple resource needs, for example + (& (count=5)(memory>=64) (executable=p1)) (&(network=atm) (executable=p2)) • Execute 5 instances of p1 on a machine with at least 64M of memory • Execute p2 on a machine with an ATM connection • Multirequests are central to co-allocation Intro to Grid Computing and Globus Toolkit™
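A multirequest is just one more level of nesting, so a short recursive-descent parser can split it into the sub-requests that a co-allocator would hand out separately. This sketch covers only the syntax shown on these slides (no quoting, no variable substitution); `tokenize`, `parse` and `subrequests` are illustrative names, not Globus API calls.

```python
import re

_TOKEN = re.compile(r"\s*(<=|>=|!=|[()&|+<>=]|[^\s()&|+<>=!]+)")


def tokenize(text: str) -> list:
    """Split RSL text into parentheses, operators and bare words."""
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _parse_spec(tokens: list, i: int):
    """Parse 'op (spec) (spec) ...' or 'attribute op value ...' starting at index i."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    if tokens[i] in ("&", "|", "+"):
        op, i, children = tokens[i], i + 1, []
        while i < len(tokens) and tokens[i] == "(":
            child, i = _parse_spec(tokens, i + 1)
            if i >= len(tokens) or tokens[i] != ")":
                raise ValueError("expected ')'")
            children.append(child)
            i += 1
        return (op, *children), i
    if i + 1 >= len(tokens):
        raise ValueError("incomplete relation")
    attribute, op, i, values = tokens[i], tokens[i + 1], i + 2, []
    while i < len(tokens) and tokens[i] != ")":
        values.append(tokens[i])
        i += 1
    return (attribute, op, *values), i


def parse(text: str) -> tuple:
    """Parse a complete RSL string into nested tuples."""
    tokens = tokenize(text)
    tree, end = _parse_spec(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing input")
    return tree


def subrequests(tree: tuple) -> list:
    """Return the parts of a '+' multirequest, or the tree itself if it is not one."""
    return list(tree[1:]) if tree[0] == "+" else [tree]
```

Applied to the example above, `subrequests(parse(...))` yields two conjunctions, one for p1 and one for p2, each of which could be sent to a different resource.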
