Grid middleware services in CoreGRID
Norbert Meyer
Poznań Supercomputing and Networking Center
www.coregrid.net
meyer@man.poznan.pl
AGENDA
• Institute on Grid Information, Resource and Workflow Monitoring Services - presentation
• Integration of services:
  • User Management in multi-domain environment
  • Grid Checkpointing Architecture
• Summary
CSS’06, Bonn, Germany, July 24th, 2006
Institute on Grid Information, Resource and Workflow Monitoring Services - IRWM
• Several services are necessary for establishing a complex Grid architecture, including support for resource management and scheduling systems
• All services must be designed for fault-tolerant and flexible behaviour in a large-scale, heterogeneous, multi-domain environment
• Current models do not scale to Grid level or focus only on specific aspects
IRWM Objectives
• Providing multi-grain and dynamic monitoring for Grid resources and services
• Providing a Grid resource management framework based on the monitoring infrastructure
• Monitoring the progress of job workflows
• Support for extraction and representation of job workflows from programming models
• Realizing middleware support for complex job workflow execution
• Investigating accounting services
• User account management for production-based Grids
• Checkpoint/restart functionality in Grid environments
Middleware Services
• Information and Monitoring Services
• Checkpointing Services
• Workflow Services
• Accounting and User Management Services
Participants
• 15 partners from 10 countries:
  • universities
  • academy of science
  • HPC centres
Partners named on the slide: FhG, FZJ, UMUE, UNI DO, U. Cambridge, U. Westminster, PSNC, Masaryk U., INRIA, SZTAKI, U. Coimbra, UPC, INFN, U. Calabria, ICS-FORTH, U. Cyprus
Positioning vis à vis the NGG Reports
[Diagram labels: Monitoring; Accounting & User Accounts Management; Checkpointing; Workflows]
• Properties:
  • pervasive with mobility
  • self-managing
  • resilient
  • flexible (various types of infrastructure)
  • easy to program with a programming interface reusing the existing software
  • flexible in trust
  • secure to assure confidence
• The main goal of future R&D is a dynamic and reconfigurable environment that meets the requirements of industry and scientific communities
Information and Monitoring Services
• Subtasks:
  • Establishing a definition for 'Grid performance'. Investigating the relationship of 'application performance' and 'infrastructure performance'
  • Elaborating Grid performance metrics
  • Designing a framework for performance steering and management
  • Scalability to a large number of resources and a multitude of events produced by these resources
  • Exploitation of monitoring data for fault and security management
  • Integration of passive and active monitoring for Grid infrastructure
Checkpointing Services
• Subtasks:
  • Analysis of the overall Grid architecture to determine where checkpointing fits within the Grid context
  • Defining API interfaces to layers:
    • interfaces to higher and equivalent layers will be based on OGSA services
    • interfaces to lower layers will use general techniques for referencing and accessing existing and future checkpointing systems
  • Definition of the Grid Checkpointing Architecture, integrated with the Grid architecture
  • Integrating kernel and application level checkpointing systems
Workflow Services
• Subtasks:
  • Analysis of workflow approaches in Grid computing
  • Workflow description languages: coloured Petri nets for Grid workflows
  • Collaborative workflow-oriented portals related to collaborative workflow management
  • Fault-tolerant workflow manager service
  • Investigating the problem of workflow management over different Grid middleware systems
  • Identifying where transactions appear in Grid environments and the roles of transaction managers in Grids
  • Compatibility and conversion of different Grid workflow description languages
Accounting and User Management Services
• Subtasks:
  • Analysis of requirements (based on the current state of development)
  • Definition of an interface for exchanging information between different user account management systems from different Grids
  • Attempt to standardize the above interface
  • Definition of an accounting structure that allows storing additional and non-standard accounting data
  • Design of a model architecture for the accounting system that will comply with different resources from different Grids and will take into account local and VO policies
  • Authentication, authorization, job encapsulation, accounting and logging issues
User Management in Virtual Organizations
Based on work done by Jiří Denemark, Michał Jankowski, Ludek Matyska, Norbert Meyer, Miroslav Ruda, Paweł Wolniewicz
Introduction
• Aims of User Management
  • Provides controlled and secure access to Grid resources
  • Provides an effective way of introducing/removing users and granting/revoking privileges
  • Accounting
Virtual organization
• A virtual organization (VO) is a set of individuals and/or institutions whose members share resources in a controlled manner, so that they may collaborate to achieve a shared goal
• VOs may form hierarchies
• The hierarchy forms a Directed Acyclic Graph (DAG) where the VOs are vertices and the edges represent relations between them
• A user may be a member of many VOs
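The VO hierarchy described on this slide can be sketched as a small directed acyclic graph. This is only an illustration, not code from any CoreGRID tool: the class `VOGraph`, its method names, and the rule that membership in a sub-VO implies membership in its parent VOs are all assumptions made for the example.

```python
from collections import defaultdict


class VOGraph:
    """Hypothetical model of a VO hierarchy as a directed acyclic graph."""

    def __init__(self):
        self.parents = defaultdict(set)   # VO -> set of parent VOs
        self.members = defaultdict(set)   # VO -> users who joined it directly

    def _ancestors(self, vo):
        """Return every VO reachable by following parent edges from vo."""
        seen, stack = set(), [vo]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def add_relation(self, parent, child):
        """Add an edge parent -> child, refusing edges that would create a cycle."""
        if parent == child or child in self._ancestors(parent):
            raise ValueError(f"edge {parent} -> {child} would create a cycle")
        self.parents[child].add(parent)

    def add_member(self, vo, user):
        """Register a user as a direct member of a VO."""
        self.members[vo].add(user)

    def vos_of(self, user):
        """VOs the user belongs to, directly or (by assumption) via a sub-VO."""
        direct = {vo for vo, users in self.members.items() if user in users}
        result = set(direct)
        for vo in direct:
            result |= self._ancestors(vo)
        return result
```

Because a VO may have several parents, the structure is a DAG rather than a tree, which is why the cycle check walks all ancestors instead of a single parent chain.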
User roles
• The privileges the organization wants to grant the user, related to the tasks the user is supposed to perform, are connected to user roles
• The roles are defined across the hierarchy of VOs and are managed in an independent structure
• The authorities of VOs are responsible for defining roles
• One user may have multiple roles and is responsible for selecting the required role when accessing a resource
Capabilities
• Any special rights to resources, expressed e.g. by an ACL, are called capabilities
• Capabilities may be used to express any rights of a specific user, e.g. some file is writable only by the owner
[Figure: VOs, roles and capabilities]
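The interplay of roles and capabilities from the preceding slides can be sketched as a single access check. The names `Resource`, `ROLE_RIGHTS` and `is_allowed`, the example rights, and the rule that either the selected role or an ACL entry is enough to grant access are assumptions for illustration; the slides do not specify how the two are combined.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical resource with an ACL mapping user -> set of rights."""
    name: str
    owner: str
    acl: dict = field(default_factory=dict)


# Hypothetical role definitions: role name -> rights granted by the VO authority.
ROLE_RIGHTS = {
    "reader": {"read"},
    "operator": {"read", "execute"},
}


def is_allowed(user, selected_role, user_roles, resource, right):
    """Grant access if the selected role or a capability (ACL entry) allows it."""
    # The user must actually hold the role selected for this access.
    if selected_role in user_roles.get(user, set()):
        if right in ROLE_RIGHTS.get(selected_role, set()):
            return True
    # Capabilities express rights of one specific user, e.g. the file owner.
    return right in resource.acl.get(user, set())
```

In this sketch the user picks one role per access, mirroring the statement that the user is responsible for selecting the required role, while the ACL covers per-user exceptions such as a file writable only by its owner.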
Virtual environment
• By a virtual environment we understand an encapsulation of user jobs that both guarantees a limited set of privileges and supports identification of the user and the organization on whose behalf he/she acts
• Virtual accounts, sandboxes, and virtual machines are examples of different approaches to the creation of virtual environments
Security - requirements
• Authentication
  • single sign-on
  • credential delegation
  • integration with local security solutions
• Fine-grained authorization (maximum security for resources with minimum limitations for the users)
  • Membership in a Virtual Organization
  • User role in VO
  • Capabilities
• Combined security policies of VO and resource owner (delegation of some administrative privileges and work from node administrator to VO)
• Possibility of logging user activities for audit
Accounting - requirements
• Proper level of job isolation
• Context (VO, role...)
• Collecting data from many locations
Effective and scalable Management
• The administrative burden must be divided between VO managers and resource managers
• Avoid duplication of administrative work (e.g. creating user accounts on each node)
• Take into account the dynamic structure of the Grid, the many administrative domains, heterogeneity, and various local policies
Examples of approaches
• Virtual User System – VUS
• Perun
• VOMS, LCAS, LCMAPS
• Virtual Workspaces, Dynamic Virtual Environments
Virtual User System
• VUS is an extension of the system that runs users' jobs (e.g. Globus GRAM) that allows running jobs without having a user account on a node.
• The user is authenticated, authorized and then logged on to a 'virtual' account (one user per account at a time).
• The history of user-account mapping is stored, so that accounting and tracking of user activities is possible.
• The authorization is plugin-based. Using a VO-membership plugin it is possible to combine the security policies of the VO and the resource owner.
• VUS has been used or will be used in a number of national and international projects: SGIgrid, Clusterix, GridLab, CoreGRID, BalticGrid.
• Virtual environments other than virtual accounts are not supported.
• Authorization based on roles or capabilities is not supported, but is easy to add.
• Non-WS approach
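The account mapping described for VUS (one user per virtual account at a time, with a stored mapping history) can be illustrated as follows. This is a sketch under assumptions, not the VUS implementation: the class `VirtualAccountPool`, its method names, and the in-memory history list are hypothetical.

```python
import time


class VirtualAccountPool:
    """Hypothetical pool mapping Grid users to local 'virtual' accounts."""

    def __init__(self, accounts):
        self.free = list(accounts)   # accounts not currently in use
        self.active = {}             # grid user -> account in use
        self.history = []            # [user, account, start, end] records

    def acquire(self, user, now=None):
        """Map an already authorized user to a free account (one user per account)."""
        if user in self.active:
            return self.active[user]
        if not self.free:
            raise RuntimeError("no free virtual account")
        account = self.free.pop(0)
        self.active[user] = account
        start = now if now is not None else time.time()
        self.history.append([user, account, start, None])
        return account

    def release(self, user, now=None):
        """Return the account to the pool and close the open history record."""
        account = self.active.pop(user)
        for record in reversed(self.history):
            if record[0] == user and record[1] == account and record[3] is None:
                record[3] = now if now is not None else time.time()
                break
        self.free.append(account)

    def who_used(self, account, at):
        """Accounting query: which user held this account at time `at`?"""
        for user, acc, start, end in self.history:
            if acc == account and start <= at and (end is None or at < end):
                return user
        return None
```

Keeping the history separate from the live mapping is what lets an account be reused by a different user later while activity on it can still be attributed to the right person.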
Perun
• Central Configuration Database with resource configuration information.
• Normalized data, integrity constraints enforced.
• Changes in the database are watched by database triggers; a data change starts an automatic service update.
• "Configuration files/database" of managed services are changed, no run-time dependency on Perun.
• Support for service dependencies, application-specific plugins.
• Failures detected, services re-planned eventually.
• Deployed in projects: MetaCentre, SITOLA, GridLab.
• Accounting and logging not supported.
• Limited control of the resource user.
• Virtual environments not supported.
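The trigger-driven update idea can be illustrated with SQLite from the Python standard library. This is a sketch of the general technique only: the schema (`account`, `pending_update`), the trigger names, and the `propagate` function are invented for the example and do not reflect Perun's actual database or code.

```python
import sqlite3


def make_db():
    """Create an in-memory configuration database with change-watching triggers."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE account (login TEXT PRIMARY KEY, host TEXT NOT NULL);
        CREATE TABLE pending_update (host TEXT PRIMARY KEY);
        -- Any change to accounts flags the affected host for an update.
        CREATE TRIGGER account_added AFTER INSERT ON account
        BEGIN
            INSERT OR IGNORE INTO pending_update (host) VALUES (NEW.host);
        END;
        CREATE TRIGGER account_removed AFTER DELETE ON account
        BEGIN
            INSERT OR IGNORE INTO pending_update (host) VALUES (OLD.host);
        END;
    """)
    return db


def propagate(db):
    """Regenerate the account list for each flagged host, then clear the flags."""
    generated = {}
    hosts = [row[0] for row in
             db.execute("SELECT host FROM pending_update ORDER BY host")]
    for host in hosts:
        rows = db.execute(
            "SELECT login FROM account WHERE host = ? ORDER BY login", (host,))
        generated[host] = [row[0] for row in rows]
    db.execute("DELETE FROM pending_update")
    return generated
```

The generated lists stand in for the configuration files of managed services: once written, a service would read only its own file, which matches the point that there is no run-time dependency on the central database.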
Virtual Workspaces, Runtime Environments, Dynamic Virtual Environments
• The Workspace Management Service allows running user jobs in a virtual environment, using different technologies (GRAM Gatekeeper, OGSI, WSRF)
• The virtual environments are implemented as dynamically created Unix accounts and virtual machines
• The authorization and accounting issues are not addressed directly
Virtual Environment Management Service
• web service responsible for managing virtual environments
  • creating and destroying them
  • running jobs in virtual environments
Virtual Environment Information Service
• collecting data on the virtual environments concerning
  • time of creation and destruction
  • users mapped to the environment
  • accounting and logging information
• available to different players on the scene
  • users, VO managers, resource owners, ...
[Figure: Virtual Environment Management Service]
Virtual Environment Management Service
• Authentication and Authorization module
  • Performs authentication
    • may be based on existing solutions, e.g. Globus GSI
  • The authorization is done by querying a set of authorization plugins
    • plugins for the most frequently used authorization mechanisms and services like grid-mapfile
    • Local policy compatible
    • A special authorization plugin is VE mapping
• Virtual environment module
  • The virtual environment module is responsible for the creation, deletion and communication with virtual environments, implemented as
  • The module is pluggable
    • possible to integrate different existing solutions, e.g. Virtual User System, Virtual Machine
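Authorization by querying a set of plugins might look like the sketch below. Everything here is hypothetical: the plugin factory names, the request dictionary with `subject` and `vo` keys, and in particular the rule that every plugin must agree. The slides say only that a set of plugins is queried, not how their answers are combined.

```python
def gridmap_plugin(gridmap_subjects):
    """Plugin factory: accept subjects listed in a grid-mapfile-like set."""
    def check(request):
        return request["subject"] in gridmap_subjects
    return check


def vo_membership_plugin(vo_members):
    """Plugin factory: accept subjects that belong to the VO they claim."""
    def check(request):
        return request["subject"] in vo_members.get(request["vo"], set())
    return check


def local_ban_plugin(banned_subjects):
    """Plugin factory: the resource owner's local policy, here a ban list."""
    def check(request):
        return request["subject"] not in banned_subjects
    return check


def authorize(request, plugins):
    """Query every plugin; grant access only if all agree (assumed rule)."""
    return all(plugin(request) for plugin in plugins)
```

Pairing a VO-membership plugin with a local-policy plugin is one way to combine the security policies of the VO and the resource owner: the VO decides who its members are, and the node administrator keeps a veto.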
[Figure: Virtual Environment Information Service]
Summary
• The list of requirements for user management is long and may vary depending on the system
• There are a number of tools that cover at least part of the mentioned requirements
• The tools are used in many projects, although none of them fulfills all the requirements
• The proposed solution is a framework that allows combining these tools
Grid Checkpointing Architecture
Based on work done by
G. Jankowski, R. Januszewski, R. Mikołajczak, N. Meyer, PSNC (Poland)
J. Kovacs, MTA SZTAKI (Hungary)
The background
• Checkpointing is the process of preserving the application state in a way that allows an interrupted computation to continue from the point in time when the checkpoint was created
• What is it used for?
  • migration
  • reliability and fault tolerance
  • scalability
  • load balancing
  • ...
• Existing approaches (in the past): Cray (Unicos), SGI (Irix), some open source solutions for IA-32
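A minimal application-level illustration of this definition: the program saves its own state after each step and, when restarted, continues from the last saved state. The function names, the state layout, and the use of `pickle` are choices made for this sketch; kernel-level and library-level checkpointers work quite differently and are not shown.

```python
import os
import pickle
import tempfile


def run(total_steps, ckpt_path, stop_after=None):
    """Sum the integers 0..total_steps-1, checkpointing after every step."""
    # Restart: load the saved state if a checkpoint exists, else start fresh.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    else:
        state = {"step": 0, "acc": 0}

    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed == stop_after:
            return None  # simulated interruption
        state["acc"] += state["step"]
        state["step"] += 1
        executed += 1
        # Write to a temporary file, then rename, so an interruption during
        # the write never leaves a half-written checkpoint behind.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, ckpt_path)
    return state["acc"]


def demo():
    """Interrupt a run after four steps, then resume it from the checkpoint."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "job.ckpt")
        run(10, path, stop_after=4)
        return run(10, path)
```

The second call to `run` does not repeat the first four steps; it picks up the saved counter and accumulator, which is the property that makes migration and fault tolerance possible.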
The Background (cont.)
• Three levels of checkpointing:
  • kernel, integrated library, application
• Requirements:
  • Single Data Center
  • High Performance Computing
  • High Throughput Computing
  • Cluster Computing
  • Grid Computing
• Experience (based on national R&D projects)
  • NASA, Leibniz, PSNC – only application level checkpointing
• A general checkpoint/restart (C/R) service is missing
Solution
• Why not integrate all existing (and future) checkpointers into a Grid checkpointing service?
Grid Checkpointing Architecture
• Integrates existing and future C/R solutions on different levels
• Provides a service to the Grid broker and/or a single scheduler system
• Better utilisation of resources (hardware provider)
• Supports the user in daily work
A set of services and interfaces allowing legacy and future low-level checkpointing packages to be deployed in Grid environments
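The idea of exposing different low-level checkpointers through one service can be sketched with a common interface and a registry. The class names, the image format, and the rule that an image may only be restarted by a checkpointer of the same level are assumptions for this illustration and are not taken from the Grid Checkpointing Architecture itself.

```python
from abc import ABC, abstractmethod


class LowLevelCheckpointer(ABC):
    """Hypothetical uniform interface wrapped around one checkpointing package."""

    level = "unknown"  # e.g. "kernel", "library" or "application"

    @abstractmethod
    def checkpoint(self, job_id):
        """Save the job state and return an opaque image."""

    @abstractmethod
    def restart(self, image):
        """Resume a job from an image and return its job id."""


class FakeCheckpointer(LowLevelCheckpointer):
    """Stand-in for a real package; it only records the job id in the image."""

    def __init__(self, level):
        self.level = level

    def checkpoint(self, job_id):
        return {"job_id": job_id}

    def restart(self, image):
        return image["job_id"]


class CheckpointService:
    """Hypothetical Grid-level service that a broker or scheduler would call."""

    def __init__(self):
        self._by_node = {}

    def register(self, node, checkpointer):
        """Announce which low-level checkpointer is installed on a node."""
        self._by_node[node] = checkpointer

    def checkpoint(self, node, job_id):
        """Checkpoint a job and tag the image with the checkpointer level."""
        ckpt = self._by_node[node]
        return {"level": ckpt.level, "data": ckpt.checkpoint(job_id)}

    def restart(self, node, image):
        """Restart on a node whose checkpointer matches the image level."""
        ckpt = self._by_node[node]
        if ckpt.level != image["level"]:
            raise ValueError("image is incompatible with this node")
        return ckpt.restart(image["data"])
```

With such a layer, a broker could checkpoint a job on one node and restart it on another compatible node without knowing which package does the actual work.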
Grid Checkpointing Architecture - Summary
• Major features:
  • simple
  • flexible
  • open
  • scalable
• Prototyping
  • SZTAKI developed a framework that is able to checkpoint PVM applications by utilizing a third-party low-level checkpointer.
  • Low-level checkpointers developed in PSNC:
    • psncLibCkpt (user-level, Solaris 8, UltraSparc)
    • AltixC/R (kernel-level, SGI ProPack, IA64)
    • psncC/R (kernel-level, Solaris 8 and 9, UltraSparc)
Conclusions
• Basic core middleware services
• Able to serve different grid environments
  • computational grid, data grid
  • computing on demand
• Scalable
• The proposed solution is a framework that allows combining several modules/tools/services
• Towards a production/invisible grid: the way is evolution, not revolution