Architecture and Services of gLite Middleware

Dusan Vudragovic <dusan@scl.rs>, Scientific Computing Laboratory, Institute of Physics Belgrade, Serbia. Introduction to High Performance and Grid Computing.

Presentation Transcript


  1. Architecture and Services of gLite Middleware. Dusan Vudragovic <dusan@scl.rs>, Scientific Computing Laboratory, Institute of Physics Belgrade, Serbia. Introduction to High Performance and Grid Computing.

  2. Set of basic Grid services
     • Job submission/management
     • File transfer (individual, queued)
     • Database access
     • Data management (replication, metadata)
     • Monitoring/indexing of system information

  3. Grid services
     • Authentication (CA)
     • Authorization (VOMS)
     • Information System
     • User Interface (UI)
     • Computing Element (CE)
     • Storage Element (SE)
     • Workload Management System (WMS)

  4. Authentication (1/10): Cryptography
     [Diagram: plaintext M is encrypted with key K1 into ciphertext C, which is decrypted with key K2 back into M]
     • To implement the security infrastructure, cryptography uses mathematical algorithms that provide important building blocks
     • Definitions for the symbols in the diagram:
       • Plaintext: $M$
       • Ciphertext: $C$
       • Encryption with key $K_1$: $E_{K_1}(M) = C$
       • Decryption with key $K_2$: $D_{K_2}(C) = M$
     • Algorithms
       • Symmetric: $K_1 = K_2$
       • Asymmetric: $K_1 \neq K_2$
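
The notation on this slide can be tried out with a toy symmetric cipher. The sketch below is only an illustration: the XOR cipher, the key `b"secret"` and the helper name `xor_cipher` are assumptions made for this example and are not part of the original slides or of any Grid security component.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR every byte of data with the repeating key.

    XOR is its own inverse, so the same function serves as E and D.
    This is NOT secure; it only illustrates the notation.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Symmetric case: K1 == K2 (hypothetical key chosen for the example).
K1 = K2 = b"secret"
M = b"Hi!"

C = xor_cipher(M, K1)          # E_K1(M) = C
recovered = xor_cipher(C, K2)  # D_K2(C) = M
```

Because the two keys are equal, decrypting `C` returns the original plaintext; with an asymmetric algorithm the two keys would differ.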

  5. Authentication (2/10): Cryptography :: Symmetric Algorithms
     [Diagram: A encrypts "Hi!" into "3$r" and sends it to B, who decrypts it back into "Hi!"; the same happens in the other direction]
     • The same key is used for encryption and decryption (no public key, only secret keys available).
     • Advantages
       • Fast
     • Disadvantages
       • Exchange of secret keys needed:
         • how to distribute the keys?
         • the number of keys is $O(n^2)$
     • Examples:
       • DES
       • 3DES
       • AES

  6. Authentication (3/10): Cryptography :: Public Key Algorithms (Asymmetric)
     [Diagram: A and B each hold a public and a private key; "Hi!" is encrypted into "3$r" in one direction and into "cy7" in the other, and decrypted back into "Hi!" by the receiver]
     • Every user has two keys: one private (secret) and one public:
       • it is computationally infeasible to derive the private key from the public one
       • a message encrypted by one key can be decrypted only by the other one.
     • No exchange of private keys is needed.
       • the sender enciphers using the public key of the receiver
       • the receiver decrypts using his own private key
       • the number of keys is $O(n)$.
     • Examples: RSA (1978)
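
The key counts quoted on the symmetric and public-key slides follow from simple counting: with one shared secret per pair of users, n users need n(n-1)/2 keys, while with public-key cryptography each user needs one key pair. A small sketch (the function names are chosen for this example only):

```python
def symmetric_key_count(n: int) -> int:
    """One shared secret key per unordered pair of users: n(n-1)/2, i.e. O(n^2)."""
    return n * (n - 1) // 2


def asymmetric_key_count(n: int) -> int:
    """One public and one private key per user: 2n, i.e. O(n)."""
    return 2 * n


for users in (10, 100, 1000):
    print(users, symmetric_key_count(users), asymmetric_key_count(users))
```

For 1000 users this is 499500 shared secrets against 2000 keys, which is why key distribution is listed as the main disadvantage of symmetric algorithms.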

  7. Authentication (4/10): Cryptography :: Digital Signature
     [Diagram: A hashes the message and encrypts Hash(A) with A's private key to form the digital signature; B receives the message and the signature, computes Hash(B), and compares it with Hash(A) recovered using A's public key]
     • A calculates the hash of the message (with a one-way hash function)
     • A encrypts the hash using his private key: the encrypted hash is the digital signature
     • A sends the signed message to B
     • B calculates the hash of the message and compares it with the hash obtained by deciphering the signature with A's public key
     • If the two hashes are equal, the message was not modified and A cannot repudiate it.
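
The sign-and-verify steps above can be sketched with textbook RSA. Everything specific here is an assumption for illustration: the tiny key (p = 61, q = 53, so n = 3233, e = 17, d = 2753), the use of SHA-256 reduced modulo n, and the function names. Real Grid certificates use far larger keys and padded signature schemes.

```python
import hashlib

# Toy RSA key (assumption for the example): n = 61 * 53, e * d = 1 mod 3120.
N = 3233
PUBLIC_EXP = 17      # A's public key
PRIVATE_EXP = 2753   # A's private key


def digest(message: bytes) -> int:
    """One-way hash of the message, reduced so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """A encrypts the hash with the private key: this is the signature."""
    return pow(digest(message), PRIVATE_EXP, N)


def verify(message: bytes, signature: int) -> bool:
    """B recomputes the hash and compares it with the deciphered signature."""
    return pow(signature, PUBLIC_EXP, N) == digest(message)


message = b"This is some message"
signature = sign(message)
print(verify(message, signature))
```

With this toy modulus the hash space is tiny, so the sketch shows only the mechanics, not the security properties.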

  8. Authentication (5/10): Digital Certificates
     • A's digital signature is safe if:
       • A's private key is not compromised
       • B knows A's public key
     • How can B be sure that A's public key is really A's public key and not someone else's?
       • A third party guarantees the correspondence between public key and owner's identity.
       • Both A and B must trust this third party
     • Two models proposed to build trust:
       • X.509: hierarchical organization (used in Grid)
       • PGP: "web of trust" (person to person)

  9. Authentication (6/10): Certification Authorities
     • The "third party" is called a Certification Authority (CA).
     • Responsibilities of a CA:
       • Issue Digital Certificates (containing the public key and the owner's identity) for users, programs and machines
       • Check the identity and the personal data of the requestor
         • Registration Authorities (RAs) do the actual validation
       • Revoke certificates in case of a compromise
       • Renew certificates in case of expiration
       • Periodically publish a list of revoked certificates through a web repository
         • Certificate Revocation Lists (CRLs) contain all the revoked certificates
     • CA certificates are self-signed

  10. Authentication (7/10): X.509 Certificates
      • An X.509 Certificate contains:
        • owner's public key;
        • identity of the owner (DN);
        • info on the CA;
        • time of validity;
        • serial number;
        • digital signature of the CA
      • Structure of an X.509 certificate (example):
        • Public key
        • Subject: C=RS, O=AEGIS, OU=Institute of Physics Belgrade, CN=Dusan Vudragovic
        • Issuer: C=RS, O=AEGIS, CN=AEGIS-CA
        • Not before: Apr 6 14:08:33 2008 GMT
        • Not after: Apr 6 14:08:33 2009 GMT
        • Serial number: 95 (0x5F)
        • CA Digital signature
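
The fields listed on this slide map naturally onto a small record type. The sketch below holds the example values from the slide and checks the validity window; it does not parse real certificates, and the class name, the helper `parse_dn` and the naive comma-splitting of the DN (which assumes no commas inside values) are assumptions made for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

DATE_FORMAT = "%b %d %H:%M:%S %Y GMT"  # matches e.g. "Apr 6 14:08:33 2008 GMT"


def parse_time(text: str) -> datetime:
    """Parse a validity timestamp as printed on the slide (always GMT)."""
    return datetime.strptime(text, DATE_FORMAT).replace(tzinfo=timezone.utc)


def parse_dn(dn: str) -> list[tuple[str, str]]:
    """Split 'C=RS, O=AEGIS, ...' into ordered (attribute, value) pairs."""
    pairs = []
    for part in dn.split(","):
        key, _, value = part.strip().partition("=")
        pairs.append((key, value))
    return pairs


@dataclass
class CertificateInfo:
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    serial: int

    def is_valid_at(self, when: datetime) -> bool:
        """True if 'when' lies inside the certificate's time of validity."""
        return self.not_before <= when <= self.not_after


cert = CertificateInfo(
    subject="C=RS, O=AEGIS, OU=Institute of Physics Belgrade, CN=Dusan Vudragovic",
    issuer="C=RS, O=AEGIS, CN=AEGIS-CA",
    not_before=parse_time("Apr 6 14:08:33 2008 GMT"),
    not_after=parse_time("Apr 6 14:08:33 2009 GMT"),
    serial=0x5F,
)
```

A relying party would additionally verify the CA's digital signature and consult the CRL; both are left out of this sketch.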

  11. Authentication (8/10): The Grid Security Infrastructure (GSI)
      [Diagram: A sends A's certificate to B; B verifies the CA signature and sends a random phrase; A encrypts it with A's private key; B decrypts the encrypted phrase with A's public key and compares it with the original phrase]
      • Based on X.509 PKI:
        • every user/host/service has an X.509 certificate;
        • certificates are signed by CAs that the local sites trust;
        • every Grid transaction is mutually authenticated:
          • A sends his certificate;
          • B verifies the signature in A's certificate;
          • B sends A a challenge string;
          • A encrypts the challenge string with his private key;
          • A sends the encrypted challenge to B;
          • B uses A's public key to decrypt the challenge;
          • B compares the decrypted string with the original challenge;
          • if they match, B has verified A's identity and A cannot repudiate it.
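
The challenge-response part of this exchange can be sketched with the same kind of textbook RSA arithmetic. This is a toy model under stated assumptions: the key (n = 3233, e = 17, d = 2753) and the function names are invented for the example, the certificate check by B is omitted, and real GSI runs this handshake inside SSL/TLS rather than with raw modular exponentiation.

```python
import secrets

# Toy RSA key for A (assumption for the example); never use such sizes in practice.
N = 3233
A_PUBLIC_EXP = 17
A_PRIVATE_EXP = 2753


def respond(challenge: int, private_exp: int) -> int:
    """A 'encrypts' the challenge with the private key."""
    return pow(challenge, private_exp, N)


def check(challenge: int, response: int, public_exp: int) -> bool:
    """B decrypts the response with A's public key and compares it with the challenge."""
    return pow(response, public_exp, N) == challenge


# B picks a fresh random challenge for every authentication attempt.
challenge = secrets.randbelow(N)
response = respond(challenge, A_PRIVATE_EXP)
authenticated = check(challenge, response, A_PUBLIC_EXP)
```

Only the holder of the matching private key can produce a response that decrypts to the challenge, which is what lets B conclude that it is talking to A.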

  12. Authentication (9/10): X.509 Proxy Certificate
      [Diagram labels: User certificate file, Private Key (Encrypted), Pass Phrase, User Proxy certificate file]
      • Proxy: GSI extension to X.509 Identity Certificates
        • signed by the normal end entity certificate (or by another proxy).
      • It enables single sign-on.
      • It supports some important features:
        • Delegation
        • Mutual authentication
      • It has a limited lifetime (minimizing the risk of "compromised credentials")
      • The user enters a pass phrase, which is used to decrypt the private key
      • The private key is used to sign a proxy certificate, which has its own, new public/private key pair.

  13. Authentication (10/10): Delegation
      • Delegation = remote creation of a (second level) proxy credential
        • New key pair generated remotely on the server
        • Client signs the proxy certificate and returns it
      • Allows a remote process to authenticate on behalf of the user

  14. Authorization (1/7): Multi-institution issues
      [Diagram: Domain A and Domain B each have their own Certification Authority and Policy Authority; a task involves Server X in Sub-Domain A1 and Server Y in Sub-Domain B1. Issues labelled: No Cross-Domain Trust, Trust Mismatch, Mechanism Mismatch]

  15. Authorization (2/7): Grid solution: use of VOs
      [Diagram: Domain A and Domain B, each with a Certification Authority and a Policy Authority and no cross-domain trust; a task involves Server X in Sub-Domain A1 and Server Y in Sub-Domain B1, joined through a Federation Service and GSI into a Virtual Organization Domain]

  16. Authorization (3/7)
      [Diagram labels: Computing Center, Service, Rights, VO, Computing Center]
      • Use delegation to establish a dynamic distributed system

  17. Authorization (4/7): VOMS server
      • Virtual organizations (VOs) are groups of Grid users (authenticated through digital certificates)
      • The VO Management Service (VOMS) serves as a central repository for user authorization information, providing support for sorting users into a general group hierarchy, keeping track of their roles, etc.
      • The VO Manager, according to VO policies and rules, authorizes authenticated users to become VO members
      • Resource centers (RCs) may support one or more VOs, and this is how users are authorized to use computing, storage and other Grid resources
      • VOMS allows a flexible approach to authentication and authorization (A&A) on the Grid

  18. Authorization (5/7): VOMS Ingredients
      • Attribute Certificates: an AC is a PKI container, defined in RFC 3281, capable of containing a set of attributes tied to a specific identity. It is the system used by VOMS to issue its attributes.
      • VOMS groups: /aegis/scl
      • VOMS roles: /Role=VO-Admin
        • Roles can be defined for groups as well
      • FQAN (Fully Qualified Attribute Name) is a compact way to represent a user's membership in a group, along with the role held, if any
        • Syntax: <groupname>/Role=<rolename>/Capability=NULL, where /Capability=NULL may be omitted, since it refers to a deprecated feature of VOMS
        • Example: /aegis/scl/Role=NULL/Capability=NULL
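
The FQAN syntax above is simple enough to split with a few lines of code. The sketch below is a minimal illustration, not the VOMS implementation: the names `FQAN` and `parse_fqan` are invented here, `Role=NULL` is treated as "no role", and the deprecated `Capability` part is ignored.

```python
from typing import NamedTuple, Optional


class FQAN(NamedTuple):
    group: str
    role: Optional[str]


def parse_fqan(fqan: str) -> FQAN:
    """Split '<groupname>/Role=<rolename>/Capability=NULL' into group and role."""
    group_parts = []
    role = None
    for part in fqan.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # deprecated VOMS feature, may be omitted
        else:
            group_parts.append(part)
    return FQAN("/" + "/".join(group_parts), role)


print(parse_fqan("/aegis/scl/Role=NULL/Capability=NULL"))
```

The combination `/aegis/Role=VO-Admin` used in the checks of this sketch is put together from the group and role examples on the slide and is only illustrative.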

  19. Authorization (6/7): Attribute Certificate
      • FQANs are included in an Attribute Certificate
      • Attribute Certificates are used to bind a set of attributes (like membership, roles, authorization info etc.) to an identity
      • ACs are digitally signed
      • VOMS uses an AC to include the attributes of a user in a proxy certificate

  20. Authorization (7/7): VOMS Architecture
      [Diagram labels: VOMS Server containing the VOMS Core Service (vomsd), the Authorization Database and the VOMS Admin Service; clients: voms-proxy-init (GSI), voms-admin CLI (SOAP + SSL), Web browser with Web User Interface (HTTPS)]

  21. Information System (1/10)
      • Collects information about grid resources
        • Discovering newly added resources
        • Monitoring load and health status
      • Publishes this information
        • Periodically updated
        • Well-known data model
      • Used by
        • Users searching for a concrete resource
        • WMS allocating and managing jobs
        • Other monitoring services
      • Basic data model
        • Grid Laboratory Uniform Environment (GLUE) Schema.
      • Two architectures in gLite 3
        • gLite Information System (BDII)
          • BDII over Globus MDS (Monitoring and Discovery System).
          • OpenLDAP interface.
        • Relational Grid Monitoring Architecture (R-GMA)
          • Based on the GMA (Grid Monitoring Architecture) standard from the Global Grid Forum
          • Information in SQL relational databases
          • Web Services.

  22. Information System (2/10): GLUE Schema :: Overview
      • A schema of objects and attributes describing Grid resources and their relationships.
        • Originally an EU-DataTAG and US-iVDGL coordinated effort.
        • Current participants: EGEE, OSG, Globus and NorduGrid.
      • A way to describe Grid info
        • Statically and dynamically supplied
        • Hierarchically represented
        • Independently of the framework (LDAP, XML, SQL…)

  23. Information System (3/10): GLUE Schema :: Site Element

  24. Information System (4/10): GLUE Schema :: Cluster Element

  25. Information System (5/10): GLUE Schema :: Computing Element

  26. Information System (6/10): gLite Information System Levels
      • Resource level: Grid Resource Information Server (GRIS)
        • One GRIS on top of each CE, SE, WMS, MyProxy (no WNs).
        • Sensors and scripts get the status of concrete resources statically (e.g. GlueCEUniqueID) or dynamically (e.g. GlueCEStateWaitingJobs)
      • Site level: Grid Index Information Server (GIIS)
        • Compiles all the information from the different GRISes in a site.
        • gLite recommends using a BDII instead of a GIIS
          • Improves robustness and stability.
          • Called the site BDII.
      • Top level: Berkeley DB Information Index (BDII)
        • Keeps all Grid information about the VOs (generally only one).
        • Stores information from local BDIIs or GIISes in its database.
        • Only queries sites that are included in a configuration file.

  27. Information System (7/10)

  28. Information System (8/10): LDAP Model
      • Way of collecting info
        • Pull model (higher level servers periodically query lower level servers)
      • All servers are based on LDAP
        • They inherit its hierarchical (tree-like) structure
        • LDAP Data Interchange Format (LDIF)
      • Users get info with
        • Generic applications
          • ldapsearch (the BDII listens on port 2170)
          • Graphical UIs (BDII web; LDAP GUIs)
        • Information about specific resources (possibly more up to date) can always be obtained by directly querying the site BDIIs, GIISes or GRISes.
        • Querying VO info with the lcg-infosites or lcg-info tools
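
Query results from an LDAP-based information system come back as LDIF text. The sketch below parses a simplified LDIF fragment into dictionaries. Several things are assumptions made for the example: the parser ignores LDIF line continuations and base64 values, and the sample entry (host name `ce.example.org`, the DN and the values) is invented; only the attribute names GlueCEUniqueID and GlueCEStateWaitingJobs come from the slides.

```python
def parse_ldif(text: str) -> list[dict[str, list[str]]]:
    """Parse simplified LDIF: blank lines separate entries, 'name: value' per line."""
    entries = []
    current: dict[str, list[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line:
            if current:            # a blank line closes the current entry
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):   # LDIF comment
            continue
        name, _, value = line.partition(":")
        current.setdefault(name.strip(), []).append(value.strip())
    if current:
        entries.append(current)
    return entries


# Hypothetical entry, shaped like a GLUE Computing Element record.
SAMPLE = """\
dn: GlueCEUniqueID=ce.example.org:2119/jobmanager-pbs-aegis,mds-vo-name=local,o=grid
GlueCEUniqueID: ce.example.org:2119/jobmanager-pbs-aegis
GlueCEStateWaitingJobs: 3
"""

entries = parse_ldif(SAMPLE)
waiting = int(entries[0]["GlueCEStateWaitingJobs"][0])
```

Each attribute maps to a list because LDAP attributes may be multi-valued.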

  29. Information System (9/10): R-GMA Overview
      • Added from the EDG Project
        • Based on the GMA standard from the GGF
        • Information in SQL relational databases (a DB per VO)
        • Query syntax is a SQL subset
        • Simple consumer-producer model
        • Web Services oriented
        • CLI and Web user interface
        • Allows self-logging applications
      • R-GMA offers a global view of the VO information
        • In one large relational DB: a virtual database.
        • The Registry stores localization tuples (database rows) published by producers:
          • Standard tables: CE state in GLUE Schema (by R-GMA-GIN)
          • Application-specific tables (e.g. self-logging with Log4j)
        • Access by SQL queries through a WS interface.

  30. Information System (10/10)

  31. User Interface (UI)
      • The UI is the user's interface to the Grid: a command-line interface to
        • Attribute/Proxy certificate
        • Job operations
          • Submit a job
          • Monitor its status
          • Retrieve output
        • Data operations
          • Upload file to SE
          • Create replica
          • Discover replicas
        • Other grid services
      • To run a job, the user creates a JDL (Job Description Language) file
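
A JDL file is a list of `attribute = value;` statements. The slides do not show one, so the sketch below is an illustration under assumptions: the attribute names (`Executable`, `StdOutput`, `StdError`, `OutputSandbox`) are ones commonly seen in gLite JDL files, the values are invented, and the helper `render_jdl` is hypothetical rather than part of any gLite tool.

```python
def render_jdl(attributes: dict) -> str:
    """Render a dict as JDL text: strings are quoted, lists become {...} sets."""

    def fmt(value) -> str:
        if isinstance(value, (list, tuple)):
            return "{" + ", ".join(fmt(item) for item in value) + "}"
        return '"' + str(value) + '"'

    return "\n".join(f"{name} = {fmt(value)};" for name, value in attributes.items())


# Hypothetical job: run /bin/hostname and bring back its output files.
job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

jdl_text = render_jdl(job)
print(jdl_text)
```

The resulting text would be saved to a file on the UI and handed to the job submission command.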

  32. Computing Element (CE)
      • A CE is a grid batch queue with a "grid gate" front-end
      [Diagram labels: Job request; Information system; L&B; Grid gate node with Gatekeeper, A&A, Logging and Local Info system; Local resource management system: Condor / PBS / LSF master; Homogeneous set of worker nodes (WNs)]

  33. Storage Element (SE)
      [Diagram labels: File transfer requests; L&B; Info system; GridFTP; Gatekeeper with A&A, Event Logging and Local Info System; Disk arrays or tapes]
      • Storage elements hold files: write once, read many
      • Replica files can be held on different SEs:
        • "close" to the CE; share load on the SE
      • File Catalogue: what replicas exist for a file and where are they?

  34. Workload Management System

  35. Job management requests (submission, cancellation) are expressed via a Job Description Language (JDL)

  36. Keeps submission requests. Requests are kept for a while, waiting to be dispatched, if no matching resource is available.

  37. Repository of resource information. Updated via notifications and/or active polling of sources. Provides the matchmaker with information to decide the best resources for a request.

  38. Finds an appropriate CE or resource for a job request according to the information from the ISM, taking into account job preferences, resource status and policies on resources.

  39. Performs the actual job submission and monitoring. Normally it is Condor.

  40. The Computing Element is the place where your jobs run

  41. Workload Manager Proxy (WMProxy)
      • Provides access to WMS functionality through a Web Services based interface
      • Each job submitted to a WMProxy Service is given the delegated credentials of the user who submitted it.
      • These credentials can then be used to perform operations requiring interactions with other services
      • WMProxy advantages:
        • web service, SOAP
        • job collections, DAG jobs, shared and compressed sandboxes
      • WMProxy caveats:
        • needs delegated credentials
        • delegate once, submit many

  42. Workload Manager (WM)
      • Is responsible for
        • calling the Matchmaker to find the resource which best matches the job requirements
        • interacting with the Information System and the File Catalog
        • calculating the ranking of all the matched resources
      Information Supermarket (ISM)
      • Basically consists of a repository of resource information that is available in read-only mode to the matchmaking engine
      Job Adapter
      • Is responsible for
        • making the final touches to the JDL expression for a job, before it is passed to CondorC for the actual submission
        • creating the job wrapper script that creates the appropriate execution environment on the CE worker node
        • transfer of the input and of the output sandboxes
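
The interplay of the ISM and the Matchmaker can be pictured as "filter, then rank". The sketch below is a simplified model, not the WMS algorithm: the `Resource` fields, the requirement (VO support and a minimum number of free CPUs) and the ranking rule (fewest waiting jobs first, then most free CPUs) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    ce_id: str
    free_cpus: int
    waiting_jobs: int
    supported_vos: frozenset


def matchmake(ism: list[Resource], vo: str, min_free_cpus: int = 1) -> list[Resource]:
    """Return the resources matching the requirements, best-ranked first."""
    # Step 1: keep only resources that satisfy the job requirements.
    matched = [
        r for r in ism
        if vo in r.supported_vos and r.free_cpus >= min_free_cpus
    ]
    # Step 2: rank the matched resources (hypothetical rule for this sketch).
    return sorted(matched, key=lambda r: (r.waiting_jobs, -r.free_cpus))


# Hypothetical read-only snapshot of the Information Supermarket.
ism_snapshot = [
    Resource("ce-a", free_cpus=4, waiting_jobs=10, supported_vos=frozenset({"aegis"})),
    Resource("ce-b", free_cpus=2, waiting_jobs=0, supported_vos=frozenset({"aegis"})),
    Resource("ce-c", free_cpus=8, waiting_jobs=0, supported_vos=frozenset({"other"})),
]

ranking = matchmake(ism_snapshot, vo="aegis")
```

In the real system both the requirements and the rank are expressed in the job's JDL and evaluated against the published resource attributes.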

  43. Job Controller (JC)
      • Is responsible for
        • converting the Condor submit file into a ClassAd
        • handing the job over to CondorC
      Condor
      • Is responsible for
        • performing the actual job management operations: job submission, removal
      Log Monitor
      • Is responsible for
        • watching the Condor log file
        • intercepting interesting events concerning active jobs
          • events affecting the job state machine
        • triggering appropriate actions

  44. Task Queue
      • Makes it possible to keep track of requests if no resources are immediately available
        • Non-matching requests will be retried periodically (eager scheduling)
        • Or wait for notification of available resources (lazy scheduling)
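
The difference between eager and lazy scheduling is only in what triggers the retry. The sketch below models that with a small queue class; the class, its method names and the `match` callback are assumptions made for this illustration and do not reflect the WMS internals.

```python
from collections import deque
from typing import Callable, Optional


class TaskQueue:
    """Holds requests for which no matching resource was available."""

    def __init__(self, match: Callable[[str], Optional[str]]):
        self._match = match      # returns a resource name, or None if nothing matches
        self._pending = deque()

    @property
    def pending(self) -> list:
        return list(self._pending)

    def submit(self, request: str) -> Optional[str]:
        """Try to match immediately; otherwise keep the request in the queue."""
        resource = self._match(request)
        if resource is None:
            self._pending.append(request)
        return resource

    def retry_all(self) -> list:
        """Eager scheduling: called periodically, retries every pending request."""
        dispatched = []
        for _ in range(len(self._pending)):
            request = self._pending.popleft()
            resource = self._match(request)
            if resource is None:
                self._pending.append(request)   # still no match, keep waiting
            else:
                dispatched.append((request, resource))
        return dispatched

    def on_resource_available(self, resource: str) -> list:
        """Lazy scheduling: retries only when a resource announces availability."""
        return self.retry_all()


free_resources = set()
queue = TaskQueue(lambda request: "ce-a" if "ce-a" in free_resources else None)
first_try = queue.submit("job1")          # nothing free yet, so job1 is queued
periodic = queue.retry_all()              # eager retry, still nothing free
free_resources.add("ce-a")
notified = queue.on_resource_available("ce-a")
```

In this toy run the request waits in the queue until the notification arrives, at which point it is dispatched to the newly available resource.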

  45. The Computing Element is built on a homogeneous farm of computing nodes (called Worker Nodes). There are also many components inside the CE, such as the gatekeeper, globus-jobmanager, ...

  46. Gatekeeper: grants access to the CE and maps the grid user to a local user ID.
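
The mapping step can be pictured as a lookup from the certificate subject to a local account. The slides do not describe how the mapping is stored, so the sketch below assumes a plain text file with one quoted DN and one local user name per line, loosely modelled on a Globus-style grid-mapfile; the function names, the DN and the account name `aegis001` are invented for the example.

```python
import shlex
from typing import Optional


def parse_mapfile(text: str) -> dict[str, str]:
    """Parse lines of the form: "<certificate subject DN>" <local user>."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue                       # skip blanks and comments
        dn, local_user = shlex.split(line)  # shlex honours the quoted DN
        mapping[dn] = local_user
    return mapping


def authorize(mapping: dict[str, str], dn: str) -> Optional[str]:
    """Return the local user ID for a grid user, or None to deny access."""
    return mapping.get(dn)


# Hypothetical map file content.
MAPFILE = '''
# subject DN                                  local account
"/C=RS/O=AEGIS/CN=Dusan Vudragovic" aegis001
'''

mapping = parse_mapfile(MAPFILE)
local_user = authorize(mapping, "/C=RS/O=AEGIS/CN=Dusan Vudragovic")
```

A grid user whose DN has no entry gets `None` back, which corresponds to the gatekeeper refusing access to the CE.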

  47. Batch System: a cluster of compute nodes controlled by a head node, which handles the job execution. Examples: Torque (Open PBS), PBS

  48. [Diagram labels: UI; LFC (location of files); Inform. Service (characteristics of resources; CE characteristics and status; SE characteristics and status); WMS with Network Daemon, Workload Manager and Job Contr. - CondorG; Computing Element; Storage Element]

  49. Network Daemon: daemon responsible for accepting incoming requests.
      [Same diagram, with additional labels: job states submitted, waiting; glite-wms-job-submit myjob.jdl issued on the UI; JDL and Input Sandbox files; RB storage]

  50. WM: responsible for taking the appropriate actions to satisfy the request.
      [Same diagram, with additional labels: job states submitted, waiting; Job Status at the UI; Job passed to the Workload Manager; RB storage]