Resource Brokering: the EuroGrid/GRIP approach. Donal Fellows, John Brooke, Jon MacLaren, E-Science NorthWest @ University of Manchester, UK, http://www.esnw.ac.uk
Grid Interoperability • In European and Japanese Grid projects there are two major middleware systems deployed: Globus (US) and Unicore (Europe/Japan). • Globus is mainly deployed in cluster-based Grids, and Unicore in projects with complex heterogeneous architectures (e.g. specialist HPC architectures). • The FP5 project GRIP began looking at the question of how resource requests could be passed from Unicore to Globus, and the FP6 project takes this work forward into the world of service-based architectures (e.g. OGSA).
Starting point - GRIP • EU-funded FP5 project, part of the Information Society Technologies Programme, IST 2001-32257 • http://www.grid-interoperability.org/
A Dual Job-Space Thus we have a space of “requests”, defined as a vector space of the computational needs of users over a Grid. For many jobs most of the entries in the vector will be null. We have another space of “services”, which can produce “cost vectors” for costing the user jobs (provided they can accommodate them). This is an example of a dual vector space. A strictly defined dual space is probably too rigid but can provide a basis for simulations. The abstract job requirements will need to be agreed. It may be a task for a broker to translate a job specification into a “user job” for a given Grid node.
Dual Space [Diagram labels: job vector, cost vector, user job, scalar cost in tokens.]
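The dual-space idea can be sketched in a few lines: a job is a vector of resource amounts, a provider answers with a cost vector, and pairing the two gives a scalar cost in tokens. This is only an illustration under stated assumptions: the three dimensions, the integer token prices and the function name `scalar_cost` are hypothetical and do not come from the slides.

```python
from typing import Optional, Sequence

# Hypothetical basis for the job space; the slides leave the agreed
# abstract job requirements open.
DIMENSIONS = ("cpu_hours", "memory_gb", "disk_gb")


def scalar_cost(job: Sequence[Optional[int]],
                cost_vector: Sequence[Optional[int]]) -> Optional[int]:
    """Pair a job vector with a provider's cost vector.

    A None entry in the job vector is a null entry (resource not requested).
    A None entry in the cost vector means the provider cannot supply that
    resource. Returns the total cost in tokens, or None if the provider
    cannot accommodate the job.
    """
    if len(job) != len(cost_vector):
        raise ValueError("job and cost vectors must share the same dimensions")
    total = 0
    for amount, price in zip(job, cost_vector):
        if amount is None:
            continue          # null entry: nothing requested on this axis
        if price is None:
            return None       # requested resource is not on offer
        total += amount * price
    return total


job = [10, None, 100]                    # 10 CPU hours, no memory request, 100 GB disk
print(scalar_cost(job, [2, 1, 3]))       # 320
print(scalar_cost(job, [2, 1, None]))    # None: this provider offers no disk
```

Treating the cost vector as a linear function of the job vector is the strict dual-space reading; a real broker would probably need non-linear pricing, which is one reason the strict definition may be too rigid.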
Computational resource Computational jobs ask questions about the internal structure of the provider of computational power in a manner that an electrically powered device does not. For example, do we require specific compilers, libraries, disk resource, visualization servers? What if it goes wrong: do we get compensation? If we transfer data and methods of analysis over the Internet, is it secure? A resource broker for high performance computation is of a different order of complexity from a broker for an electricity supplier.
Resource Requestor and Provider Spaces • Resource Requestor space (RR), in terms of what the user wants: e.g. Relocatable Weather Model, 10^6 points, 24 hours, full topography. • Resource Provider space (RP), in terms of what the provider offers: e.g. 128 processors, Origin 3000 architecture, 40 Gigabytes memory, 1000 Gigabytes disk space, 100 Mb/s connection. • We may even forward requests from one resource provider to another: recasting an O3000 job in terms of an IA64 cluster gives a different resource set. • Linkage and staging of different stages of a workflow require environmental support, i.e. a hosting environment. • We can have multiple offers in RP space for the same RR values.
Abstract Functions for a resource broker • Resource discovery, for workflows as well as single jobs. • Resource capability checking: do the offering sites have ALL the necessary capability and environmental support for instantiating the workflow? • Inclusion of Quality of Service policies in the offers. • Information necessary for the negotiation between client and provider, and mechanisms for ensuring contract compliance. • Document submitted to GPA-RG group of GGF.
Brokers as Virtual Organizations [Diagram labels: Users, Virtual Organisation Brokers, Organization Firewalls, System Brokers, Compute Resources.]
Federated Brokering [Diagram labels: Client, Broker Service, Resource, Replication. Layers: VO Layer, Specialist Layer, Site Layer.]
Brokering and OGSA Services [Diagram labels: Persistent Virtual Environments, Clients, Other Brokers, Banking Services, Metascheduling Service, Broker, Site Feedback, Policy Manager, Chargeable Schedulable Grid Services, Workflow Manager, Resource Usage Monitor.]
Possible OGSA Broker • Interoperating OGSA services [Diagram labels: Resource Broker, Resource Database, TSI, Network Job Supervisor, Unicore Gateway, Unicore Client, OGSA Server A, GT3/4, User Database.]
Site Configuration [Diagram labels: Gateway, NJS, Broker, LRC, IDB, TSI.] • Users contact NJSes or the Broker (for site-wide brokering). • Delegate (site-wide brokering only). • Potential to share (partial?) IDBs between NJSes (CSAR config?). • TSI supplies dynamic data to the IDB.
UoM Broker architecture
[Diagram; recoverable labels and annotations follow.]
• Components: IDB, NJS, UUDB, AbstractBroker, TicketManager, HostVsite Map, SingleVsiteBroker, WholeUsiteBroker, R-GMA, LocalResourceChecker, ExpertBroker, UnicoreRC, Globus2RC, Globus3RC, DWDLMExpert, ICMExpert, Other, SimpleTranslator, OntologicalTranslator, Ontology, Compute Resource, TSI, GRAM, MDS, GT3.
• Annotations: Broker hosted in NJS; To outside world; Look up signing identity; Verify delegated identities; Look up configuration; Look up static resources; Get & check signed tickets (contracts); Use R-GMA to provide information for all Usite component hosts; Delegate to Grid architecture-specific engine for local resource check; Delegate to application-domain expert code; Experts may use LRC; Pass untranslatable resources to Unicore resource checker; Look up dynamic resources; Look up resources; Delegate resource domain translation; Look up translations appropriate to target Globus resource schema.
• GT3: converts UNICORE resource requests to XPath search terms for the GT3 Index Service, plus a set of untranslatable resources to use UNICORE standard techniques upon.
• GT2: SimpleTranslator converts delegated UNICORE resource requests into LDAP search terms for GT2 MDS, plus a set of untranslatable resources to use UNICORE standard techniques upon.
• Key: UNICORE Components, EUROGRID Broker, Globus Components, GRIP Broker, Whole-Site Broker, Inheritance relation.
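The translator's contract in the architecture above (search terms for the target system plus a set of untranslatable resources handed back to the UNICORE checker) can be sketched as follows. This is not the GRIP code: the resource names, the LDAP attribute names in `ATTRIBUTE_MAP` and the function `translate` are all hypothetical, and neither side reflects the real AJO or MDS schemas. Only the LDAP filter syntax itself is standard.

```python
from typing import Dict, Set, Tuple

# Hypothetical mapping from UNICORE-style resource names to LDAP attribute
# names. Real AJO resource names and MDS attributes will differ.
ATTRIBUTE_MAP: Dict[str, str] = {
    "memory_mb": "example-memory-mb",
    "processors": "example-cpu-count",
}


def translate(requests: Dict[str, int]) -> Tuple[str, Set[str]]:
    """Split minimum-value requests into an LDAP filter and a leftover set.

    Requests with a known mapping become ">=" clauses joined with "&".
    Requests without a mapping are returned as untranslatable, to be
    checked by other means. An empty filter string means nothing could
    be translated.
    """
    clauses = []
    untranslatable: Set[str] = set()
    for name, minimum in sorted(requests.items()):
        attribute = ATTRIBUTE_MAP.get(name)
        if attribute is None:
            untranslatable.add(name)
        else:
            clauses.append(f"({attribute}>={minimum})")
    ldap_filter = "(&" + "".join(clauses) + ")" if clauses else ""
    return ldap_filter, untranslatable


search, leftover = translate({"memory_mb": 2048, "processors": 16, "gaussian98": 1})
print(search)    # (&(example-memory-mb>=2048)(example-cpu-count>=16))
print(leftover)  # {'gaussian98'}
```

Returning the leftover set explicitly, rather than silently dropping unmapped requests, keeps the information loss in translation visible to the caller.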
Broker functions • A simple Resource Check request: “Can this job run here?” It checks static qualities like software resources (e.g. Gaussian98) as well as capacity resources like quotas (disk quotas, CPU, etc.). • A Quality of Service request: returns a range of turnaround time, and cost, as part of a Ticket. If the Ticket is presented (within its lifetime) with the job, the turnaround and cost estimates should be met.
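The two broker functions can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Site` and `Ticket` classes, their field names, the one-hour default lifetime and the placeholder estimates are hypothetical and are not taken from the EUROGRID/GRIP broker.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class Site:
    software: Set[str]          # static qualities, e.g. installed packages
    capacity: Dict[str, int]    # capacity resources, e.g. quotas


@dataclass(frozen=True)
class Ticket:
    min_turnaround_s: int
    max_turnaround_s: int
    cost_tokens: int
    expires_at: float           # same clock as the `now` arguments below

    def is_valid(self, now: float) -> bool:
        """A ticket only binds the site while it is within its lifetime."""
        return now <= self.expires_at


def resource_check(site: Site, software: Iterable[str],
                   capacity_needed: Dict[str, int]) -> bool:
    """Answer "can this job run here?" from static and capacity resources."""
    has_software = set(software) <= site.software
    has_capacity = all(site.capacity.get(name, 0) >= amount
                       for name, amount in capacity_needed.items())
    return has_software and has_capacity


def quality_of_service(site: Site, software: Iterable[str],
                       capacity_needed: Dict[str, int], now: float,
                       lifetime_s: float = 3600.0) -> Optional[Ticket]:
    """Return a Ticket with estimates, or None if the job cannot run."""
    if not resource_check(site, software, capacity_needed):
        return None
    # Placeholder estimates; a real broker would consult its scheduler
    # and charging policy here.
    return Ticket(min_turnaround_s=600, max_turnaround_s=1800,
                  cost_tokens=sum(capacity_needed.values()),
                  expires_at=now + lifetime_s)


site = Site(software={"Gaussian98"}, capacity={"disk_gb": 100, "cpu_hours": 50})
ticket = quality_of_service(site, ["Gaussian98"], {"disk_gb": 10}, now=0.0)
print(ticket.is_valid(now=1800.0))   # True: presented within its lifetime
print(ticket.is_valid(now=7200.0))   # False: ticket has expired
```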
Grid Resource Description Problem • Two independent Grid systems: • Unicore (http://www.unicore.org/) • Globus (http://www.globus.org/) • Both need to describe systems that run compute jobs • Very different description languages: • Unicore’s Resource model, part of the AJO Framework • Globus’s GLUE Schema (DataTAG, iVDGL) for GT2 and GT3 • For interoperability, we want to take a Unicore job and run it on Globus resources • Therefore, we need to translate the job’s resource requirements between the two systems
Methodology for translation service • Address data transformation issues for translating attributes • Find a technology that has these characteristics: • can model the two ontologies • has support for linking abstract concepts to code fragments • easily allows someone to update mappings • is appropriate for a video conferencing setting • writes modelling information to a file format that can be used by other applications • Use the data files created by the application to run the translator service.
Conclusions • Interoperability of grid resource requests is at the heart of the abstract idea of a computational resource that can cross Grid domain boundaries. • We wish to provide application users with seamless access to resources; they should not need to know details of the machines on which they run. • High level abstractions do not yet exist as standards, so we have to create ontologies that can translate differing modelling abstractions for Grid resources. • Our current translations lose much information in crossing between current middleware systems (e.g. Globus and Unicore).
Continuation of interoperability research • Research Centre Jülich (Project manager) • Consorzio Interuniversitario per il Calcolo Automatico dell’Italia Nord Orientale • Fujitsu Laboratories of Europe • University of Warsaw • Intel GmbH • University of Manchester • T-Systems SfR • http://www.unigrids.org
GLUE: Container Classes • GLUE has container classes that include “Computing Element”, “Cluster”, “Subcluster” and “Host”. Under the heading “Representing Information”, the GLUE document indicates: “…hosts are composed into sub-clusters, sub-clusters are grouped into clusters, and then computing elements refer to one or more clusters.” • These container objects may hold any number of optional auxiliary classes that actually describe the Grid features.
GLUE: Auxiliary Classes • The documentation provides few details about the nature of a Host other than that it is a “physical computing element”. Much of the meaning for Host has to be derived from what it might contain. Consider the following two valid definitions: • A Host is a physical computing element characterized by Main Memory, a Benchmark, a Network Adapter and an Operating System • A Host is a physical computing element characterized by an Architecture, a Processor and an Operating System.
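The container hierarchy and the optional auxiliary classes can be pictured with a small data-structure sketch. It is illustrative only: the Python class and field names, the host names and the auxiliary values are hypothetical and are not GLUE schema attribute names. The two hosts mirror the two valid definitions given above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Host:
    name: str
    # Optional auxiliary descriptions, keyed by auxiliary class name.
    auxiliary: Dict[str, str] = field(default_factory=dict)


@dataclass
class SubCluster:
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Cluster:
    subclusters: List[SubCluster] = field(default_factory=list)


@dataclass
class ComputingElement:
    clusters: List[Cluster] = field(default_factory=list)


def hosts_of(element: ComputingElement) -> List[Host]:
    """Flatten computing element -> clusters -> sub-clusters -> hosts."""
    return [host
            for cluster in element.clusters
            for subcluster in cluster.subclusters
            for host in subcluster.hosts]


# Two equally valid hosts described by different auxiliary classes.
host_a = Host("host-a", {"MainMemory": "40 GB", "Benchmark": "n/a",
                         "NetworkAdapter": "100 Mb/s", "OperatingSystem": "IRIX"})
host_b = Host("host-b", {"Architecture": "IA64", "Processor": "Itanium",
                         "OperatingSystem": "Linux"})
element = ComputingElement([Cluster([SubCluster([host_a, host_b])])])

shared = set(host_a.auxiliary) & set(host_b.auxiliary)
print(len(hosts_of(element)))  # 2
print(shared)                  # {'OperatingSystem'}
```

Because only one auxiliary class is shared between the two hosts, a translator cannot assume that any particular description is present on a given host.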
Map concepts between ontologies • Unicore and GLUE have different philosophies for describing resources :-( • In Unicore, resources are described in terms of resource requests. • In GLUE, resources are described in terms of the availability of resources.
Local Brokering Configurations [Diagram: two configurations, Normal EUROGRID/GRIP Brokering and Site-Wide Brokering. Labels: Client, Gateway, Broker, NJS, R-GMA, IDB, TSI/Host, GT3 Host, Host.]
RR and RP Spaces [Diagram: nodes A, B, C and D; arrows labelled request, request referral and sync; regions labelled RR space and RP space.] Figure 1: Request from RR space at A mapped into resource providers at B and C, with C forwarding a request formulated in RR space to RP space at D. B and D synchronize at the end of the workflow before results are returned to the initiator A.