Peer-to-Peer and Grid Systems: Grid computing
Michael Welzl, http://www.welzl.at
DPS NSG Team, http://dps.uibk.ac.at/nsg
Institute of Computer Science, University of Innsbruck, Austria
TU Darmstadt, Darmstadt, Germany, 2-4 January 2007
Complexity
[Figure labels: cluster with dedicated network links, massively parallel systems, systems with distributed memory, heterogeneous distributed systems, systems with shared memory, SETI@home, Grid]
• Grid poses difficult problems
  • Heterogeneity and dynamicity of resources
  • Secure access to resources with different users in various roles, belonging to VTs which belong to VOs
    • “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
  • Efficient assignment of data and tasks to machines (“scheduling”)
Grid requirements
• Computer scientists can tackle these problems
• Grid application users and programmers are often not computer scientists
• Important goal: ease of use
  • Programmer should not worry (too much) about the Grid
  • User should worry even less
  • Ultimate goal: write and use an application as if using a single computer (power grid metaphor)
• How do computer scientists simplify?
  • Abstraction.
  • We build layers.
  • In a Grid, we typically have middleware.
Grid computing without middleware
• Example: manual Grid application execution
  1. scp code to 10 machines
  2. Log in to the 10 machines via ssh and start “application > result” everywhere
  3. Estimate the running time, or let the application tell you that it’s done (e.g. via TCP/IP communication in the app code)
  4. Retrieve the result files via scp
• Tedious process, so write a script file
  • Do this again for every application / environment?
  • What if your colleagues need something similar?
• Standards needed, tools introduced
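The manual procedure above could be scripted roughly as follows. This is a minimal sketch, not a recommended tool: the host names, the file names, and the `build_commands` helper are hypothetical, and the script only builds and prints the `scp`/`ssh` command lines instead of running them.

```python
from typing import List


def build_commands(hosts: List[str], code: str, result: str = "result") -> List[List[str]]:
    """Return the scp/ssh command lines for the manual Grid run (hypothetical helper)."""
    cmds: List[List[str]] = []
    for host in hosts:
        # Copy the code to the machine.
        cmds.append(["scp", code, f"{host}:{code}"])
        # Start the application remotely, redirecting stdout to a result file.
        cmds.append(["ssh", host, f"./{code} > {result}"])
    for host in hosts:
        # Fetch each result file; waiting for completion is left to the operator.
        cmds.append(["scp", f"{host}:{result}", f"{result}.{host}"])
    return cmds


if __name__ == "__main__":
    for cmd in build_commands(["node1.example.org", "node2.example.org"], "application"):
        print(" ".join(cmd))
```

Even this small script hard-codes the host list and the login method, which is exactly the kind of per-environment detail that middleware is meant to hide.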
Toolkits
• Most famous: Globus Toolkit
  • Evolution from GT2 via GT3 to GT4 influenced the whole Grid community
  • Reference implementation of Open Grid Forum (OGF) standards
• Other well-known examples
  • Condor
    • Has existed since the mid-1980s
    • No Grid back then; the system gradually evolved towards it
    • Traditional goal: harvest CPU power of normal user workstations, so many Grid issues always had to be addressed anyway
    • Special interfaces now enable Condor-Globus communication (“Condor-G”)
  • Unicore (used in D-Grid)
  • gLite (used in EGEE)
• Issues that these middlewares (should) address
  • Load balancing, error management
  • Authentication, Authorization and Accounting (AAA)
  • Resource discovery, naming
  • Resource access and monitoring
  • Resource reservation and QoS management
How the tools are applied in practice
[Figure components: ComputeServer (2x), SimulationTool, WebBrowser, WebPortal, RegistrationService, Camera (2x), TelepresenceMonitor, DataViewerTool, Database service (3x), ChatTool, DataCatalog, CredentialRepository, Certificate authority]
• Users work with client applications
• Application services organize VOs & enable access to other services
• Collective services aggregate &/or virtualize resources
• Resources implement standard access & management interfaces
Source: Globus presentation by Ian Foster
Example: Globus Toolkit version 4 (GT4)
[Figure: GT4 component diagram. Columns: Security, Data Mgmt, Execution Mgmt, Info Services, Common Runtime. Legend: Core, Contrib/Preview, Deprecated.]
• Web Services components: Authentication Authorization, Delegation, Community Authorization, Reliable File Transfer, Data Access & Integration, Data Replication, Grid Resource Allocation & Management, Grid Telecontrol Protocol, Community Scheduling Framework, Workspace Management, Index, Trigger, WebMDS, Java WS Core, C WS Core, Python WS Core
• Non-WS components: Pre-WS Authentication Authorization, Credential Mgmt, GridFTP, Replica Location, Pre-WS Grid Resource Alloc. & Mgmt, Pre-WS Monitoring & Discovery, C Common Libraries, eXtensible IO (XIO)
Source: Globus presentation by Ian Foster
Grid Resource Allocation Manager (GRAM)
• Globus tool for job execution
• Unified, resource-independent replacement for the steps in the “manual Grid” example
• Unified way to set environment variables: Resource Specification Language (RSL) (stdout = x, arguments = y, ..)
• Steps 1-4 become
  • Blocking: “globus-job-run -stage hostname applicationname”
    • -stage option copies code to remote machine
    • Different architectures: recompilation needed, but not supported!
  • Nonblocking: scp code, then “globus-job-submit hostname applicationname” (staging not yet supported)
    • Obtain unique URL, continuously use it to query job status
    • When done, use “globus-job-get-output URL stdout” to retrieve stdout
• More complex systems are built on top of GRAM
  • E.g. Message Passing Interface (MPI) for the Grid: MPICH-G2
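The nonblocking pattern (submit, keep the returned URL, poll it until the job is done) can be sketched as a small polling loop. This is an illustration only: `wait_for_job`, the state names `PENDING`/`ACTIVE`/`DONE`/`FAILED`, and the example URL are assumptions, and the status query is passed in as a function rather than calling any real Globus command.

```python
import time
from typing import Callable


def wait_for_job(job_url: str,
                 query_status: Callable[[str], str],
                 poll_seconds: float = 5.0,
                 max_polls: int = 1000,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll the job contact URL until the job reports a final state."""
    final_states = {"DONE", "FAILED"}  # assumed state names
    for _ in range(max_polls):
        status = query_status(job_url)
        if status in final_states:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_url} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Fake status source standing in for a real status query against the job URL.
    answers = iter(["PENDING", "ACTIVE", "DONE"])
    final = wait_for_job("https://host.example.org:1234/job/1",
                         lambda url: next(answers),
                         sleep=lambda seconds: None)
    print(final)
```

Once the loop returns, the output would be fetched in a separate step, as the slide describes.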
GRAM /2
• GRAM leaves a lot of questions unanswered
  • How to recompile the application for different architectures? (automatically + in a unified way)
  • What if your computer’s IP address changes?
  • What if the IP addresses of the 10 accessed computers change?
  • What if two of the computers become unavailable?
  • What if 3 other users start to work with 5 of the 10 computers?
• A tool for each problem...
  • General-purpose Architecture for Reservation and Allocation (GARA): integrated QoS via “advance reservation” of resources (CPU, disk, network)
  • Monitoring and Discovery System (MDS) for locating and monitoring resources
  • Resource Broker (Globus: do it yourself; Condor: “matchmaker”) translates a requirement specification (CPU, memory, ..) into an IP address
• Diversity of complex tools standardized + available in Globus, addressing some but not all of the issues, hence the need for an architecture
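The resource broker idea (turning a requirement specification into an address) can be illustrated with a toy matchmaker. This is a sketch under stated assumptions: the `Machine` record, the `match` function, and the "smallest machine that fits" selection rule are all hypothetical and do not reproduce how Condor or Globus actually match jobs to resources.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    address: str
    cpus: int
    memory_mb: int
    available: bool = True


def match(machines: List[Machine], min_cpus: int, min_memory_mb: int) -> Optional[str]:
    """Return the address of the smallest available machine that satisfies the request."""
    candidates = [m for m in machines
                  if m.available and m.cpus >= min_cpus and m.memory_mb >= min_memory_mb]
    if not candidates:
        return None
    # Assumed policy: prefer the smallest sufficient machine to keep big ones free.
    best = min(candidates, key=lambda m: (m.cpus, m.memory_mb))
    return best.address


if __name__ == "__main__":
    pool = [Machine("10.0.0.1", 2, 2048), Machine("10.0.0.2", 8, 16384)]
    print(match(pool, min_cpus=4, min_memory_mb=4096))
```

Because the caller only states requirements, machines can change address or drop out of the pool without the application noticing.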
Open Grid Services Architecture (OGSA)
• OGSA supports:
  • Creation,
  • Maintenance, and
  • Application of services
• OGSA is based on a layered architecture
• Two core components:
  • OGSI provides base infrastructure
  • OGSA Platform provides core services
    • Policy, logging, etc.
• Applications built on top of platform services
[Figure: layer stack: Domain-specific Services, OGSA Platform Services, Open Grid Service Infrastructure (OGSI), Hosting Environment, Protocol]
Resource vs. Service
• Resource is something sharable
• Resource has several interfaces for accessing it
• Service realizes the interface
  • With all message binding etc. for use by the client
[Figure labels: Client, Service Interface, Resource Interface]
OGSA Goals
• Identify use cases for OGSA platform components
• Identify and define OGSA platform components
• Define hosting and platform-specific bindings
• Define interoperable resource models
• Standardization in Open Grid Forum (OGF)
  • formerly Global Grid Forum (GGF)
  • modelled after IETF
• Lots of smaller goals under the above large goals:
  • Distributed resource management, seamless QoS, management solutions, open interfaces, integration of existing solutions (e.g. web services), …
OGSA Core Components
• Core: OGSA Platform and OGSI
• Independent of layers above and below
[Figure labels: Domain-specific Services; OGSA Domain Services; OGSA Meta-OS Services; CMM, Logging, Policy, Data Access Integration, Provisioning, Service Collection; OGSI: Notification, HandleMapper, Factory, Manageability, Registry, Lifecycle, Discovery; Hosting Platform (= Native OS, …)]
OGSI
• OGSI layer is based on web services
  • Services called grid services
  • Every grid service is also a web service
  • Converse is not necessarily true
• OGSI specifies
  • How grid service instances are named
  • Common interfaces and behavior
  • How to specify additional interfaces
• Focus is on message-level interoperability
  • Common XML format
Grid Services vs. Web Services
• Web services already widely used and well specified
  • WSDL for descriptions, XML for message formats
• Web services typically stateless
• Grid services typically long-running processes with state
• Two kinds of stateful services
  • Service maintains state about itself
    • In grid services, service state = state of the resource
  • Interaction pattern between client and service is stateful
• OGSI tackles the first problem
• Need to separate physical state from logical state
  • Logical state maintained by service
  • Service might not contain all of physical state
OGSI Model
• Grid services layered on web services
• Grid services contain state
• Both communicate with XML messages
• Grid services described in GWSDL
  • Extension of WSDL
• Client programming is the same
• Transport selected at runtime
  • HTTP, SOAP, ...
[Figure: a client exchanges XML messages with a web service (WS, described in WSDL) and with a grid service (WS + OGSI, described in GWSDL, holding state as service data)]
Terminology
• Web service
  • Software component identified by a URI and with an XML interface
• Stateful web service
  • Web service that maintains state between interactions
• Grid service
  • Stateful web service that conforms to OGSI
• Grid service description
  • Description of interface and behavior of a grid service
  • Defined in combination of WSDL and GWSDL
• Grid service instance
  • Instance of a grid service, identified with a grid service handle (GSH)
• Grid service reference
  • Description of a grid service endpoint
• Service data element
  • Publicly accessible state of a service
Common Management Model (CMM)
• CMM is an abstract representation of real resources
• Key terminology:
  • Manageable resource
    • Entity that has some state to which management operations can be applied
    • For example, hardware, software, ...
  • Manageability
    • Concept where resource defines information that can be used to manage it
  • Management
    • Actual process of managing a resource: monitoring, modifying, making decisions, ...
CMM
• Each manageable resource is a grid service
• Manageability Interface
  • Interfaces and behavior common to all CMM services
• Domain-specific Interface
  • Additional, resource- and domain-specific interfaces for managing the resource
• Three aspects of manageability
  • XML schema for modeling the resource
  • Collection of manageability portTypes
  • Guidelines for modeling the resource
[Figure: a grid service (identified by a GSH) acts as a facade to the managed resource and offers a Manageability Interface and a Domain-specific Interface]
Resource Modeling
• Resources defined with CMM typically coarse-grained services
  • Services are self-contained, few relationships to other services
• Resource lifecycle modeling
  • Lifecycle is a set of states that a resource can have
  • How to define a generic lifecycle model?
• CMM defines a lifecycle model with 5 states
  • Down
  • Starting
  • Up
  • Stopping
  • Failed
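A five-state lifecycle like the one listed above can be modeled as a small state machine. The state names come from the slide; the allowed transitions, the `ManagedResource` class, and its `transition` method are assumptions made for this sketch, since the slide does not say which transitions CMM permits.

```python
from enum import Enum


class State(Enum):
    DOWN = "Down"
    STARTING = "Starting"
    UP = "Up"
    STOPPING = "Stopping"
    FAILED = "Failed"


# Assumed transition table; the source only names the states.
ALLOWED = {
    State.DOWN: {State.STARTING},
    State.STARTING: {State.UP, State.FAILED},
    State.UP: {State.STOPPING, State.FAILED},
    State.STOPPING: {State.DOWN, State.FAILED},
    State.FAILED: {State.DOWN},
}


class ManagedResource:
    """Hypothetical manageable resource that tracks its lifecycle state."""

    def __init__(self) -> None:
        self.state = State.DOWN

    def transition(self, new_state: State) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state


if __name__ == "__main__":
    resource = ManagedResource()
    resource.transition(State.STARTING)
    resource.transition(State.UP)
    print(resource.state.value)
```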
Policy
• No clear definition; OGSA defines policy as “Definitive goal, course, or method of action based on a set of conditions to guide and determine present and future decisions”
• Policies are implemented and used in context
  • For example, security, workload, networking, etc. policies
• OGSA defines a framework for policies
• OGSA policy model is a collection of rules based on conditions and actions
  • In general, policy is: if <condition> then <action>
  • For example: if <customer is boss> then <provide best QoS>
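The “if <condition> then <action>” form maps directly onto a list of condition/action pairs. The sketch below is illustrative only: the `Rule` class, the `evaluate` function, the context dictionary, and the second (default QoS) rule are assumptions, not part of the OGSA policy framework.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass
class Rule:
    condition: Callable[[Context], bool]
    action: Callable[[Context], str]


def evaluate(rules: List[Rule], context: Context) -> List[str]:
    """Run the action of every rule whose condition holds in the given context."""
    return [rule.action(context) for rule in rules if rule.condition(context)]


EXAMPLE_RULES = [
    # if <customer is boss> then <provide best QoS>
    Rule(lambda c: c.get("customer") == "boss", lambda c: "provide best QoS"),
    # Assumed fallback rule, added only to show a non-matching case.
    Rule(lambda c: c.get("customer") != "boss", lambda c: "provide default QoS"),
]

if __name__ == "__main__":
    print(evaluate(EXAMPLE_RULES, {"customer": "boss"}))
```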
Levels of Policy
• Policies can be defined on several levels
• Business level
  • Typically SLA between institutions
• Domain level
  • Canonical form defined by OGSA policy framework
• Device level
  • Device-specific format
  • Used at enforcement points
[Figure: Business Level, Domain Level, Device Level]
Policy Framework
• OGSA Policy Service Core
[Figure labels: Producer of Policies (Policy Admin, Policy Tools, Policy Autonomic Manager); canonical policies; Policy Service Core (Policy Service Manager, Policy Transformation Service, Repository Service, Policy Validation Service, Policy Resolution Service); Consumer of Policies (Policy Enforcement Point, Policy Service Agent); canonical and non-canonical policies (changes to resource); Common Management Model]
Policy Framework Components
• Policy Manager
  • Responsible for controlling access to policy repository
  • Expects policies in canonical format
• Policy Repository
  • Stores policy documents
  • Provides abstract interface
  • Any storage, accessed through Data Access Interface Services
• Policy Enforcement Points
  • Execute policy
  • Work with Policy Service Agent to resolve conflicts
• Policy Service Agent
  • Policy decision maker agents
• Policy Transformation Service
  • Transforms business objectives and canonical policies to device-level configurations
• Policy Validation Service
  • Validates policy changes
• Policy Resolution Service
  • Evaluates policies in context of business-level SLA
• Policy Tools and Autonomic Managers
  • Creation of policy documents
Logging
• OGSA defines distributed logging system
• Provides facilities for
  • Decoupling of log producers and consumers
  • Transforming logs to a common representation
    • Uses XML and XSLT
  • Filtering and aggregation
  • Configurable persistency
  • Consumption patterns
  • Secure logging
• Requirements result in a publish/subscribe-type logging system
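The publish/subscribe style that these requirements lead to can be sketched with an in-memory broker. Everything here is hypothetical (the `LogBroker` class, topic names, and record fields); it shows only the decoupling of producers from consumers and per-consumer filtering, not the OGSA logging interfaces.

```python
from typing import Callable, Dict, List, Optional, Tuple

LogRecord = Dict[str, str]
Consumer = Callable[[LogRecord], None]
Filter = Callable[[LogRecord], bool]


class LogBroker:
    """Decouples log producers from consumers (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, List[Tuple[Consumer, Optional[Filter]]]] = {}

    def subscribe(self, topic: str, consumer: Consumer,
                  record_filter: Optional[Filter] = None) -> None:
        # A consumer may register a filter so it only sees matching records.
        self._subscriptions.setdefault(topic, []).append((consumer, record_filter))

    def publish(self, topic: str, record: LogRecord) -> None:
        # The producer does not know who, if anyone, consumes the record.
        for consumer, record_filter in self._subscriptions.get(topic, []):
            if record_filter is None or record_filter(record):
                consumer(record)


if __name__ == "__main__":
    broker = LogBroker()
    broker.subscribe("jobs", print, lambda r: r["level"] == "ERROR")
    broker.publish("jobs", {"level": "INFO", "msg": "job started"})
    broker.publish("jobs", {"level": "ERROR", "msg": "job crashed"})
```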
Data Access and Replication
• Data Access and Integration Services (DAIS)
  • DAIS group is working on a common data management and interface solution to data access
• OGSA requirements on data management:
  • Data access service
    • Uniform access regardless of heterogeneity
  • Data replication
    • Allows local copies of data for improved performance
  • Data caching service
  • Metadata catalog and services
  • Schema transformation services
  • Storage services
Conceptual Model
• Two kinds of resources
  • Resources external to OGSI
  • OGSI-compliant logical counterparts of the above
[Figure labels: DRM, DR, EDRM, EDR]
Conceptual Model
• External Data Resource Manager (EDRM) and Data Resource Manager (DRM)
  • EDRM is a data management system, e.g., a file system
  • DRM represents EDRM
• External Data Resource (EDR) and Data Resource (DR)
  • EDR is data managed by EDRM, e.g. a directory in a file system
  • DR is a grid service that binds to EDR
  • DR is the contact point to data and exposes metadata
Why Grid Security is hard
• Resources may be valuable and the problems sensitive
• Resources often located in distinct administrative domains
  • Each resource has its own policies and procedures
• Set of resources used by a single computation may be large, dynamic, and unpredictable
  • Not just client/server
  • Requires delegation?
• Security must be broadly available and applicable
  • Standard, well-tested, well-understood protocols
  • Integrated with wide variety of tools
Grid Security Requirements
• User view
  • Easy to use
  • Single sign-on
  • Run applications: ftp, ssh, Web, …
  • User-based trust model
  • Proxies/agents (delegation)
• Resource owner view
  • Specify local access control
  • Auditing, accounting, etc.
  • Integration with local system: Kerberos, AFS, license manager
  • Protection from compromised resources
• Developer view
  • API/SDK with authentication, flexible message protection, flexible communication, delegation, ...
  • Direct calls to various security functions
  • Security integrated into higher-level SDKs
• How to meet all these requirements?
Grid Security Infrastructure (GSI)
• Extensions to standard protocols & APIs
  • Standards: SSL/TLS, X.509 & CA, GSS-API
  • Extensions for single sign-on and delegation
• Globus Toolkit reference implementation of GSI
  • SSLeay/OpenSSL + GSS-API + SSO/delegation
  • Tools and services to interface to local security
  • Tools for credential management
    • Login, logout, etc.
    • Smartcards
    • MyProxy: Web portal login and delegation
    • K5cert: Automatic X.509 certificate creation
Community Authorization Service
• Question: How does a large community grant its users access to a large set of resources?
  • Should minimize burden on both the users and resource providers
• Community Authorization Service (CAS)
  • Community negotiates access to resources
  • Resource outsources some authorization to CAS
  • CAS handles user registration, group membership…
  • User who wants access to resource asks CAS for a capability credential
  • Resources can also do local access control
Security Summary
• GSI successfully addresses wide variety of Grid security issues
  • Broad acceptance, deployment, integration with tools
• Standardization ongoing in OGF
• Community Authorization Service to address community-based allocation of resources
• Continuing development
Architectural evolution of the Grid
• OGSI / OGSA: Open Grid Service Infrastructure / Architecture
• OGSI failed: too complex, not compliant with Web Service standards
  • There are different (standards-compliant) ways to achieve the same
• Move to WSRF is a general move towards service orientation (SOA)
[Figure: evolution from GT3 to GT4]
Source: Globus presentation by Ian Foster
Resource Virtualisation
• Handle heterogeneous and low-level resources in a uniform and high-level manner using Web technologies
[Figure labels: Service Orientation, Resource Orientation]
Compute Resource Virtualisation
[Figure: Web Service-based GRAM in front of PBS, Posix Fork, ssh]
Service Oriented Architecture
[Figure: the Service Provider registers its service contract (WSDL) with the Service Registry (UDDI); the Service Consumer finds it there and binds to the provider; client and service then communicate via SOAP]
Web Services Standards
[Figure: the UDDI registry points to the WSDL description; WSDL describes the service and points to it; the Service Consumer finds the service via UDDI and communicates with the Web Service using XML messages over SOAP]
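The register / find / bind triangle can be mimicked in a few lines. This is a toy model under stated assumptions: `Registry` stands in for a UDDI registry, `ServiceDescription` for a WSDL document, and a plain Python function for the SOAP endpoint; none of these names come from a real Web services library.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServiceDescription:
    """Stands in for a WSDL document: names the operation and points to the service."""
    operation: str
    endpoint: Callable[[str], str]


class Registry:
    """Stands in for a UDDI registry: maps service names to descriptions."""

    def __init__(self) -> None:
        self._descriptions: Dict[str, ServiceDescription] = {}

    def register(self, name: str, description: ServiceDescription) -> None:
        self._descriptions[name] = description

    def find(self, name: str) -> ServiceDescription:
        return self._descriptions[name]


def echo_service(message: str) -> str:
    return f"echo: {message}"


registry = Registry()
registry.register("echo", ServiceDescription("echo", echo_service))  # provider registers
description = registry.find("echo")                                   # consumer finds
reply = description.endpoint("hello")                                 # consumer binds and calls

if __name__ == "__main__":
    print(reply)
```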
Web Services and the Grid
• Standards are important
  • Stability
  • Interoperability
  • Tool support
  • Implementation re-use
• Web standards are present worldwide
• But very few standards in Web Services
  • Young technology
  • SOAP, WSDL, UDDI
• Are these three standards enough to build a service-oriented Grid infrastructure?
Business Applications
• Web services are designed for business applications
• Short lifetime
  • State is not really needed
• Reliability is first priority
  • Statelessness ensures easy recovery upon crashes
  • Transaction-based
• Persistent
  • Never shut down
  • E.g. Amazon, Google
  • Lifetime is not an issue
Grid Resources
• Jobs are a typical example of Grid resources
  • One implementation
  • Multiple instances that start and end
• Resources are transient
  • Limited lifetime (start and end times, not persistent)
  • Web Services implementations support multiple and shared instances, but this is not portable
• Resources have state
  • Queued
  • Running
  • Terminated
  • Failed
Web Services Lifetime Management
• Web services are persistent
• Lifetime management is supported by different Web services implementations, but
  • This is not standardised
  • Implementation-specific
  • No portability and no interoperability
• Sample Web services lifecycle events
  • Deployment, Creation, Execution, Destruction, Undeployment, Failure
• All have different names and APIs
  • Axis
  • WSDP
  • ETTK
  • Systinet
  • …
Web Services State
• A Web service is a remote procedure call over the Internet using XML messages
  • No words about state
• A stateless service implements message exchanges with no access to or use of information not contained in the input message
  • Cannot remember information
  • Serves “don’t care” requests
  • E.g., compress and uncompress a file
• Requirements
  • Manage internal data and attributes across multiple
    • Invocations
    • Clients
    • Other dependent services
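The difference between the two kinds of service can be shown side by side. In this sketch the stateless operations use Python's standard `zlib` module, so each call depends only on its input; the `CounterService` class is a hypothetical stateful service that remembers data across invocations and clients.

```python
import zlib
from typing import Dict


def compress(data: bytes) -> bytes:
    """Stateless: the result depends only on the input message."""
    return zlib.compress(data)


def uncompress(data: bytes) -> bytes:
    """Stateless inverse of compress."""
    return zlib.decompress(data)


class CounterService:
    """Stateful (hypothetical): keeps one counter per client across invocations."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def increment(self, client_id: str) -> int:
        self._counts[client_id] = self._counts.get(client_id, 0) + 1
        return self._counts[client_id]


if __name__ == "__main__":
    payload = b"grid " * 100
    print(uncompress(compress(payload)) == payload)
    service = CounterService()
    service.increment("alice")
    print(service.increment("alice"))
```

If the stateless service crashes and restarts, no request is affected; the counter service loses its data unless the state is managed explicitly, which is the gap the following slides address.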
Web Services Resource Framework
[Figure: layer stack, top to bottom: Domain-Specific Applications / Services; Job Management (WS-GRAM) and Data Services (GridFTP); OGSA Core Services (OGF); WS-Resource Framework (OASIS); Web Services (W3C)]
WSRF Scope
• Models stateful resources
  • Using existing Web services standards (unmodified SOAP & WSDL)
• Defines new thin standards
  • WS-Resource Properties
  • WS-Resource Lifetime
  • WS-Notification
  • WS-Base Faults
  • WS-Service Group
  • WS-Renewable References
[Figure caption: Modeling Stateful Resources with Web Services]
Web Service Standards
• Service Composition: BPEL4WS, WS-Notification, WS-Service Group
• Quality of Experience (QoX): WS-Reliable Messaging, WS-Transaction, WS-Security, WS-Resource Lifetime
• Description: WS-Base Faults, WS-Resource Properties, WSDL, XSD, WS-Policy, WS-Metadata Exchange
• Messaging: SOAP, XML, WS-Addressing, WS-Renewable References
• Transports: JMS, RMI / IIOP, SMTP, HTTP/HTTPS
WSRF Standards
• WS-Addressing
  • Reference and identification of stateful resources in a Web services context
• WS-ResourceProperties
  • Modeling of state as an XML document
  • Accessing state through WSDL-defined interfaces
• WS-ResourceLifetime
  • Management of leases on resource access
  • Create, destroy, expire
• WS-ServiceGroup
  • Creating and managing aggregations of Web services
• WS-BaseFaults
  • Baseline for extensible fault framework
  • Ability to reproduce exception hierarchies, as in Java
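Two of these ideas, state exposed as an XML document and lifetime managed through leases, can be sketched with the standard library. The `JobResource` class and the element names (`JobProperties`, `JobId`, `Status`, `TerminationTime`) are invented for illustration and are not the schemas defined by the WSRF specifications; times are plain numbers supplied by the caller.

```python
import xml.etree.ElementTree as ET


class JobResource:
    """Hypothetical stateful resource with XML-visible state and a lease."""

    def __init__(self, job_id: str, now: float, lifetime_seconds: float) -> None:
        self.job_id = job_id
        self.status = "Queued"
        self.termination_time = now + lifetime_seconds  # lease end

    def properties_xml(self) -> str:
        # Expose the resource state as one XML document (assumed element names).
        root = ET.Element("JobProperties")
        ET.SubElement(root, "JobId").text = self.job_id
        ET.SubElement(root, "Status").text = self.status
        ET.SubElement(root, "TerminationTime").text = str(self.termination_time)
        return ET.tostring(root, encoding="unicode")

    def extend_lease(self, now: float, lifetime_seconds: float) -> None:
        # A client that still needs the resource renews the lease.
        self.termination_time = now + lifetime_seconds

    def expired(self, now: float) -> bool:
        # Once the lease runs out the resource may be destroyed.
        return now >= self.termination_time


if __name__ == "__main__":
    job = JobResource("job-1", now=100.0, lifetime_seconds=60.0)
    job.status = "Running"
    print(job.properties_xml())
    print(job.expired(now=200.0))
```

Leases mean that a resource whose client has vanished is cleaned up automatically when its termination time passes.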