gLite Grid Services
Abderrahman El Kharrim, elkharrim@cnrst.ma
Joint EPIKH/EUMEDGRID Support event in Rabat, Morocco, 30.05.2011
Grid overview
• A genuinely new concept in distributed computing
• Could bring radical changes in the way people do computing
• Share the computing power of many countries for your needs
• Decentralizes the placement of the computing resources
• The basic concept is simple:
  • I. Foster: "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations."
  • End user:
    • I want to be able to use computing resources as I need them
    • I don't care who owns the resources, or where they are
    • It has to be secure
    • My programs have to run there
• NO centralized control of resources or users
Rabat, Joint EPIKH/EUMEDGRID Support Site Admin 30.05.2011
Grid overview
• If the Web is able to share information, the Grid is intended to share computing power and storage
Grid overview - VOs
• Members of the grid can be dynamically organized into multiple virtual organizations (VOs) with common purposes.
• The resources shared among VOs may be data, special hardware, processing capability, software, and licenses.
• Members of a grid can be part of multiple VOs at the same time.
• Examples of VOs: eumed, biomed, atlas, …
Grid overview - The big picture
[Diagram. Components shown: User Interface (UI); Workload Management and Logging & Bookkeeping (WMS); File and Replica Catalogs; Authorization Service; Information System (BDII); Site X with a Computing Element (CE) and a Storage Element (SE). Arrow labels: submit, query, retrieve, discover services, update credential, publish state.]
gLite overview
• gLite is a grid middleware providing services that sit between the user applications and the underlying computing and storage resources.
• The grid middleware services should:
  • Find convenient places for the application to be run
  • Optimize the use of the resources
  • Organize efficient access to data
  • Deal with security
  • Run the job and monitor its progress
  • Recover from problems
  • Transfer the result back to the user
gLite overview - Terminology
• Computational Resource: the physical machines on which users want to run their programs and to store/access data files.
• Job: a computational task (a binary application or script) that a user wants to run on the Grid and whose results the user wants to retrieve.
• Job Submission: the action of delegating the application to the Grid middleware for execution.
gLite overview - Services
[Diagram of gLite services. Group headings: Access; Security; Information & Monitoring; Data Management; Workload Management. Services shown: CLI, API, Authentication, Authorization, Auditing, Information & Monitoring, Application Monitoring, Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement, Job Provenance, Package Manager, Accounting, Computing Element, Workload Management, Site Proxy.]
gLite overview - Services/Clients
• Authentication and authorization
• User Interface
• Workload Management System
• Computing Element
• Worker Node
• Storage Element
• LCG File Catalog
• Information Systems
Authentication/Authorization
• Authentication is based on an X.509 public key infrastructure (PKI)
  • Certificate Authorities (CAs) issue (long-lived) certificates identifying individuals
  • Commonly used in web browsers to authenticate to sites
  • Trust between CAs and sites is established (offline)
  • To reduce vulnerability, user identification on the Grid is done with (short-lived) proxies of the users' certificates
• Proxies can:
  • Be delegated to a service so that it can act on the user's behalf
  • Be stored in an external proxy store (MyProxy)
  • Be renewed (in case they are about to expire)
  • Include additional attributes
• The Virtual Organization Membership Service (VOMS) is a service that keeps track of the members of a VO
  • Supports MyProxy (stored proxies)
  • Grants users authorization to access resources at VO level
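The idea that short-lived proxies are renewed shortly before they expire can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Proxy` class, `needs_renewal`, `renew`, the subject string, and the lifetimes are all hypothetical, and no real X.509, VOMS, or MyProxy call is made.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Proxy:
    """Hypothetical stand-in for a short-lived proxy credential."""
    subject: str
    not_after: datetime  # expiry time of the proxy


def needs_renewal(proxy: Proxy, now: datetime, margin: timedelta) -> bool:
    """Return True if the proxy has expired or expires within `margin`."""
    return proxy.not_after - now <= margin


def renew(proxy: Proxy, now: datetime, lifetime: timedelta) -> Proxy:
    """Return a fresh proxy for the same subject (renewal is simulated)."""
    return Proxy(subject=proxy.subject, not_after=now + lifetime)


# Made-up values: a proxy with 20 minutes left and a 1-hour safety margin.
now = datetime(2011, 5, 30, 12, 0)
proxy = Proxy(subject="/O=Example/CN=Some User",
              not_after=now + timedelta(minutes=20))

if needs_renewal(proxy, now, margin=timedelta(hours=1)):
    proxy = renew(proxy, now, lifetime=timedelta(hours=12))

print(proxy.not_after)
```

The margin keeps a long-running job from being caught with an expired credential; how large it should be is a deployment choice, not something the slides specify.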
User Interface
• Contains clients for:
  • Job management
  • Data management
  • Access to the Information System
  • Authentication
• Installation in user space (tarball) or RPM-based
Workload Management System
• WMS: resource brokering, workflow management, I/O data management
• Web Service interface: WMProxy
• Task Queue: keeps jobs that have not yet been matched
• Information SuperMarket: optimized cache of the information system
• Match Maker: assigns jobs to resources according to user requirements
• Job submission & monitoring
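The interplay between the Match Maker and the Task Queue can be illustrated with a small Python sketch. It assumes a deliberately simple policy (the job's VO must be allowed and enough CPUs must be free, with the most free CPUs winning); the class names, fields, and ranking rule are hypothetical and are not the actual WMS algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    name: str
    free_cpus: int
    vos: set  # VOs allowed to use this resource


@dataclass
class Job:
    job_id: str
    vo: str
    cpus: int


def match(job: Job, resources: list) -> Optional[Resource]:
    """Pick the eligible resource with the most free CPUs, or None."""
    eligible = [r for r in resources
                if job.vo in r.vos and r.free_cpus >= job.cpus]
    return max(eligible, key=lambda r: r.free_cpus, default=None)


def schedule(jobs: list, resources: list):
    """Assign each job to a resource; unmatched jobs go to the task queue."""
    assignments = {}
    task_queue = []
    for job in jobs:
        chosen = match(job, resources)
        if chosen is None:
            task_queue.append(job)         # kept for a later matching attempt
        else:
            chosen.free_cpus -= job.cpus   # reserve the CPUs
            assignments[job.job_id] = chosen.name
    return assignments, task_queue


# Made-up resources and jobs.
resources = [Resource("ce-a", 4, {"eumed"}),
             Resource("ce-b", 8, {"eumed", "biomed"})]
jobs = [Job("j1", "eumed", 2), Job("j2", "biomed", 8), Job("j3", "atlas", 1)]

assignments, task_queue = schedule(jobs, resources)
print(assignments, [j.job_id for j in task_queue])
```

Here `resources` plays the role of a cached snapshot of resource state, loosely in the spirit of the Information SuperMarket.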
Computing Element
• Contains the CREAM (Computing Resource Execution And Management) service.
• CREAM can be used by:
  • A generic client: an end user interacting directly with the Computing Element.
  • The Workload Manager, which submits a given job to an appropriate CE found by the matchmaking process.
[Diagram: two submission paths to the CREAM services, labelled "Submission through the WMS" and "Direct Job Submission".]
Computing Element
• Job management through the WMS provides many benefits compared to direct job submission to the CE:
  • The WMS can manage multiple CEs and is able to forward jobs to the one that best satisfies a set of requirements, which can be specified as part of the job description.
  • The WMS can be instructed to handle job failures: if a job aborts due to problems related to the execution host, the WMS can automatically resubmit it to a different CE.
  • The WMS provides a global job tracking facility using the LB (Logging & Bookkeeping) service.
  • The WMS supports complex job types (job collections, jobs with dependencies) which cannot be handled directly by the CEs.
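Automatic resubmission to a different CE can be sketched as a retry loop over a ranked list of CEs. This is a simplified illustration, not the WMS implementation: `run_with_resubmission`, the CE names, the attempt limit, and the simulated failure are all assumptions.

```python
from typing import Callable, List, Optional, Tuple


def run_with_resubmission(
    ranked_ces: List[str],
    run_on: Callable[[str], bool],
    max_attempts: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Try CEs in ranked order until one succeeds or attempts run out.

    Returns the CE on which the job succeeded (or None) and the list of
    CEs that were tried, which stands in for a job-tracking history.
    """
    tried = []
    for ce in ranked_ces[:max_attempts]:
        tried.append(ce)
        if run_on(ce):
            return ce, tried
    return None, tried


# Simulated execution: pretend the first CE has a broken execution host.
broken = {"ce-a"}
succeeded_on, history = run_with_resubmission(
    ["ce-a", "ce-b", "ce-c"],
    run_on=lambda ce: ce not in broken,
)
print(succeeded_on, history)
```

With direct submission to a single CE, the user would have to notice the failure and pick another CE by hand.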
Worker Node
• This is where the jobs are run
• Contains clients for:
  • Data management
• Has a mechanism to install/manage VO-specific software
• Installed from a tarball or RPMs
Storage Element
• The Storage Element is the service which allows a user or an application to store data for future retrieval.
• To define a storage element, we need to know:
  • Storage Resource Manager (SRM).
  • Storage Resource Types.
  • Transfer Protocol.
Storage Element
• Storage Resource Manager (SRM):
  • It is a middleware interface that makes standard data management operations between SEs of different resource types transparent to the user.
  • These data management operations include:
    • File transfer.
    • Space reservation.
    • Renaming of files.
    • File directory creation.
Storage Element
• Storage Resource Types:
  • For relatively small SEs:
    • A disk-based storage implementation is employed together with the Disk Pool Manager (DPM) as the SRM.
  • For bigger SEs:
    • The mass storage system (MSS) is implemented with CASTOR (CERN Advanced STORage Manager) as the SRM.
  • For hybrids between disk pool storage and MSS, we have dCache as the SRM.
• Transfer Protocol:
  • Used to transfer files in and out of the SE.
  • Globus GridFTP is mandatory.
  • Others if available (https, ftp, etc.).
LFC File Catalog
• LFC = LCG File Catalog
• Keeps track of the location of files and organizes them in a logical way so that they are accessible from anywhere.
• Resolves logical file names (LFNs) to the physical location of files (URLs understood by the SRM) and storage elements.
• Files on the storage elements are identified through different identifiers:
  • Logical File Name (LFN)
  • Globally Unique Identifier (GUID)
  • Storage URL (SURL), also called Physical File Name (PFN)
  • Transport URL (TURL)
• While LFNs and GUIDs are used to identify files, SURLs and TURLs provide the information needed to access and retrieve them.
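The relationship between the four identifiers can be sketched with two lookup tables. Everything below is made up for illustration (file names, GUIDs, hosts, and the choice of `gsiftp` as the transport scheme), and the functions are hypothetical: they mimic the resolution steps, not the LFC or SRM client APIs.

```python
# Hypothetical catalog contents: every name, GUID, and URL is made up.
lfn_to_guid = {
    "/grid/eumed/user/data.txt": "guid-0001",
}
guid_to_surls = {
    "guid-0001": [
        "srm://se1.example.org/eumed/file-0001",
        "srm://se2.example.org/eumed/file-0001",
    ],
}


def resolve_replicas(lfn: str) -> list:
    """Resolve a logical file name to the storage URLs of its copies."""
    guid = lfn_to_guid[lfn]        # LFN -> GUID (identification)
    return guid_to_surls[guid]     # GUID -> SURLs (location on the SEs)


def to_turl(surl: str, protocol: str = "gsiftp") -> str:
    """Simulate turning a SURL into a transport URL for one protocol."""
    host_and_path = surl.split("://", 1)[1]
    return f"{protocol}://{host_and_path}"


replicas = resolve_replicas("/grid/eumed/user/data.txt")
turl = to_turl(replicas[0])
print(turl)
```

The user only ever handles the LFN; which SURL is picked and which protocol ends up in the TURL are decided at access time.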
Information System
• What?
  • A system to collect information on the state of resources.
• Why?
  • To discover the resources of the grid and their nature.
  • To check the health status of resources.
  • To provide data in order to manage the workload more efficiently.
• How?
  • By monitoring and publishing fresh data on the state of resources.
• Who?
  • Users searching for specific resources for their activity.
  • The Workload Management System.
  • Other monitoring systems.
Information Systems
• The IS architecture used in gLite has three levels:
  • BDII (Berkeley Database Information Index):
    • Stores information at VO level.
  • Site GIIS (Grid Index Information Server):
    • Stores information at site level.
  • GRIS (Grid Resource Information Server):
    • Stores information at resource level.
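The three levels can be pictured as successive aggregations of the same records. The Python sketch below assumes a made-up record layout (`site`, `resource`, `vo`, `free_cpus`) and uses plain dictionaries; the real services publish their data differently, so this only shows how resource-level data rolls up into site-level and VO-level views.

```python
from collections import defaultdict

# Resource-level records (GRIS level); all values are made up.
resource_records = [
    {"site": "site-x", "resource": "ce01", "vo": "eumed", "free_cpus": 4},
    {"site": "site-x", "resource": "ce02", "vo": "eumed", "free_cpus": 2},
    {"site": "site-y", "resource": "ce01", "vo": "biomed", "free_cpus": 8},
]


def site_level(records: list) -> dict:
    """Group resource records by site (the site GIIS view)."""
    by_site = defaultdict(list)
    for rec in records:
        by_site[rec["site"]].append(rec)
    return dict(by_site)


def vo_level(site_view: dict) -> dict:
    """Merge all site views and group the records by VO (the BDII view)."""
    by_vo = defaultdict(list)
    for records in site_view.values():
        for rec in records:
            by_vo[rec["vo"]].append(rec)
    return dict(by_vo)


def free_cpus(vo_view: dict, vo: str) -> int:
    """Answer a typical query: how many CPUs are free for a given VO?"""
    return sum(rec["free_cpus"] for rec in vo_view.get(vo, []))


site_view = site_level(resource_records)
vo_view = vo_level(site_view)
print(free_cpus(vo_view, "eumed"))
```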
Thank you.