Service-Based Distributed Query Processing on the Grid
Declarative Grid Service Orchestration with OGSA-DQP
Alvaro A A Fernandes
Department of Computer Science, University of Manchester
Grids and Applied Language Theory: Declarative Grid Service Orchestration with OGSA-DQP (A A A Fernandes)
places, people, funding, projects
• Manchester: M Nedim Alpdemir, Anastasios Gounaris, Norman W Paton, Alvaro A A Fernandes, Rizos Sakellariou
• Newcastle upon Tyne: Arijit Mukherjee, Jim Smith, Paul Watson
motivation
• Pull by applications: overwhelming amounts of semantically complex data, held in very diverse, structurally dissimilar, autonomous and geographically dispersed data sources, and requiring computationally demanding analysis.
• Push from context and infrastructure: the Web service impetus, combined with Grid abstractions and protocols that enable not just dynamic resource discovery but also dynamic resource allocation and use.
context
1. High-level data access and integration services are needed if applications that have data with complex structure and complex semantics are to benefit from the Grid.
2. Standards for data access are emerging, and middleware products that are reference implementations of such standards are already available.
3. Distributed query processing technology is one approach to delivering (1) given the availability of (2).
4. Declarative service orchestration falls out as a consequence.
OGSA-DQP: approach
• OGSA-DQP uses a middleware approach.
• It can be seen as a mediator over OGSA-DAI wrappers.
• It promises bottom lines regarding:
  - efficiency: “leave it to OGSA-DQP to schedule in parallel”;
  - effectiveness: “leave it to OGSA-DQP to orchestrate your services”;
  - usability: “use it as a Grid data service”.
[Figure: a query is submitted to OGSA-DQP, which returns results; OGSA-DQP sits above two OGSA-DAI wrappers, each in front of a DBMS and its data.]
OGSA-DQP: example
Given two DBMSs and one analysis tool (e.g., a WS):
• proteinTerm, from a GO Gene Ontology database running as a remote MySQL DB;
• protein, from a GIMS Genome Warehouse running as a remote ODMG-compliant DB;
• Blast (sequence alignment scoring).
We can obtain alignment scores for a sequence against proteins of a certain kind:

    select p.proteinId, Blast(p.sequence)
    from protein p, proteinTerm t
    where t.termId = 'GO:0005942' and p.proteinId = t.proteinId

Then, OGSA-DQP acts as an enactor of a declarative orchestration of services on the Grid:
[Figure: query plan, top to bottom: reduce; op_call(Blast); exchange; hash_join(proteinId); two exchange operators, each over a reduce; the leaves are index_scan(proteinTerm, termId=GO:0005942) and table_scan(protein). The diagram also carries the labels 1, 2, "3,4" and 5, whose placement is not recoverable.]
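The plan on the example slide can be read as a pipeline of operators. The following is a minimal single-process sketch of that pipeline, not OGSA-DQP code: the rows are invented toy data, `blast` is a hypothetical stand-in for the Blast web service, `reduce` is assumed to behave as a projection, and the exchange operators are omitted because nothing is shipped between hosts here.

```python
# Toy stand-ins for the two remote sources (invented rows, not real GO/GIMS data).
PROTEIN = [
    {"proteinId": "P1", "sequence": "MKTAYIAK"},
    {"proteinId": "P2", "sequence": "GAVLIMKT"},
    {"proteinId": "P3", "sequence": "MKTWWYAG"},
]
PROTEIN_TERM = [
    {"proteinId": "P1", "termId": "GO:0005942"},
    {"proteinId": "P2", "termId": "GO:0000001"},
    {"proteinId": "P3", "termId": "GO:0005942"},
]


def blast(sequence):
    """Hypothetical stand-in for the Blast service: counts 'K' residues."""
    return sequence.count("K")


def index_scan(rows, term_id):
    """Select the proteinTerm rows with the given termId."""
    return (r for r in rows if r["termId"] == term_id)


def table_scan(rows):
    """Return every row of the table."""
    return iter(rows)


def reduce_(rows, columns):
    """Assumed meaning of 'reduce': keep only the named columns."""
    for r in rows:
        yield {c: r[c] for c in columns}


def hash_join(build, probe, key):
    """Build a hash table on one input, then probe it with the other."""
    table = {}
    for b in build:
        table.setdefault(b[key], []).append(b)
    for p in probe:
        for b in table.get(p[key], ()):
            yield {**b, **p}


def op_call(rows, fn, arg, out):
    """Call an external operation once per row and attach its result."""
    for r in rows:
        yield {**r, out: fn(r[arg])}


def run_query(term_id="GO:0005942"):
    terms = reduce_(index_scan(PROTEIN_TERM, term_id), ["proteinId"])
    proteins = reduce_(table_scan(PROTEIN), ["proteinId", "sequence"])
    joined = hash_join(terms, proteins, "proteinId")
    scored = op_call(joined, blast, "sequence", "score")
    return list(reduce_(scored, ["proteinId", "score"]))
```

Calling `run_query()` evaluates the same select/join/call shape as the query on the slide, over the toy rows only.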
OGSA-DQP: extends/depends on
extends
• Leonidas Fegaras’s λ-DB system and OPTGEN optimiser generator. [1997-2000]
• Polar: a parallel query processing engine. [1998-2001]
• Polar*: an MPICH-G distributed extension of Polar. [2002]
depends on
• OGSA/OGSI/GT3 Grid Services (GSs).
• OGSA-DAI Grid Data Services (GDSs).
• Leonidas Fegaras and David Maier’s work on a formal semantics for OQL. [TODS 25(4), 2000]
OGSA-DQP: manages/provides
provides Grid Distributed Query Services (GDQSs) that:
• interact with clients;
• find and retrieve service descriptions;
• parse, compile, partition and schedule the query execution over a union of distributed data sources. The query plan is an orchestration of GQESs.
manages Grid Query Evaluation Services (GQESs) that:
• implement the physical query algebra;
• implement the query execution model and semantics;
• run a partition of a query execution plan generated by a GDQS;
• interact with other GQESs/GDSs/WSs but not with clients.
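One way to picture how a plan is split into the partitions that GQESs run is to cut the operator tree at its exchange operators. The sketch below does exactly that for the plan on the example slide. It is an illustration under assumptions: the `Op` class and the `partition` function are hypothetical names, and the slides do not describe how the OGSA-DQP compiler actually partitions a plan.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """A node in a query plan: an operator name and its input operators."""
    name: str
    children: list = field(default_factory=list)


def partition(root):
    """Cut a plan at every 'exchange' operator.

    Returns a list of fragments, each a list of operator names in
    top-down order. The exchange operators themselves are dropped,
    since they mark the boundaries between fragments.
    """
    fragments = []

    def walk(op, current):
        if op.name == "exchange":
            # Everything below an exchange starts a new fragment.
            for child in op.children:
                new_fragment = []
                fragments.append(new_fragment)
                walk(child, new_fragment)
        else:
            current.append(op.name)
            for child in op.children:
                walk(child, current)

    top = []
    fragments.append(top)
    walk(root, top)
    return fragments


# The plan from the example slide, written out as a tree.
EXAMPLE_PLAN = Op("reduce", [
    Op("op_call(Blast)", [
        Op("exchange", [
            Op("hash_join(proteinId)", [
                Op("exchange", [Op("reduce", [Op("index_scan(proteinTerm)")])]),
                Op("exchange", [Op("reduce", [Op("table_scan(protein)")])]),
            ]),
        ]),
    ]),
])
```

Under this reading, `partition(EXAMPLE_PLAN)` yields four fragments: the Blast call with its final reduce, the hash join, and one fragment per scan.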
OGSA-DQP: a brief tour (1)
• It builds upon GDSs, which build upon GSs.
• A GDS is a leaf in a query execution plan, up from which data ultimately flows. Data resources are thereby virtualised.
• Since they are GSs, they can be dynamically created by dynamically discovered factories and then disposed of.
• A GDQS is a GDS capable of integration and of distributed retrieval and analysis of data.
• To perform a request, a GDQS spawns as many GQESs on as many hosts as its partitioning and scheduling policies recommend for that request.
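The last point can be made concrete with a toy placement routine. This is only a sketch: the round-robin policy, the `schedule` function, the fragment names and the host names are all assumptions made for illustration, and the slides do not say which scheduling policy OGSA-DQP applies.

```python
from itertools import cycle


def schedule(fragments, hosts):
    """Place evaluator instances on hosts, round-robin (an assumed policy).

    fragments: dict mapping a fragment name to its degree of parallelism.
    hosts: non-empty list of host names.
    Returns a list of (fragment, instance_number, host) tuples, one per
    evaluator instance to be spawned.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    next_host = cycle(hosts)
    placement = []
    for name, degree in fragments.items():
        for instance in range(degree):
            placement.append((name, instance, next(next_host)))
    return placement


# Hypothetical request: the Blast call is run by two instances in parallel.
PLACEMENT = schedule(
    {"scan_protein": 1, "join": 1, "blast": 2},
    ["hostA", "hostB", "hostC"],
)
```

Each tuple in `PLACEMENT` corresponds to one evaluator instance that a coordinator would ask a host to create.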
OGSA-DQP: a brief tour (2)
To obtain an execution plan, a GDQS:
• interacts with registries to fetch information about the data and computational services deemed of interest by the requestor;
• interacts with GDSs and (in future) Index Services to acquire relevant metadata;
• compiles, optimises, partitions and schedules the query execution.
OGSA-DQP: a brief tour (3)
Given a distributed query plan, a GDQS:
• interacts with GDS factories to create the leaf services in the plan;
• interacts with WSs that front-end analysis capabilities;
• commands the creation of GQESs as stipulated by the partitioning and scheduling decided on by the compiler;
• coordinates the GQESs into executing the plan.
what is going on behind the scenes (1)
what is going on behind the scenes (2)
the Khalaf-Leymann taxonomy for web services aggregation
[Figure: taxonomy diagram with the node labels aggregation, unconstrained, constrained, grouping, recursive, wiring, choreography, service domains, agreements.]
OGSA-DQP: various kinds of service aggregation
• There is interface inheritance from GSs and GDSs.
• The execution plan can be seen as encapsulating a wiring of GQESs, but one that is constrained and constructed on the fly, as in an orchestration.
• As in service domains, GQESs compete for a role to play in the orchestration.
• As in agreements, the orchestration is opportunistic, responsive to the resource levels that obtain, and short-lived.
summary
• OGSA-DQP is a service-based distributed query processor for the Grid that is:
  - exposed as a service;
  - implemented as an orchestration of services.
• OGSA-DQP is an enactor of declarative Grid service orchestrations that:
  - improves on Grid portals when only retrieval and analysis are involved;
  - fills the gap left by the lack of a service orchestration framework in the OGSA.
where to find out more: papers
• M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. An Experience Report on Designing and Building OGSA-DQP: A Service Based Distributed Query Processor for the Grid. GGF9 Workshop on Designing and Building Grid Services, 2003.
• M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. Service-Based Distributed Querying on the Grid. 1st Int. Conf. on Service Oriented Computing, 2003. LNCS, to appear.
• M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. OGSA-DQP: A Service-Based Distributed Query Processor for the Grid. 2nd UK e-Science All Hands Meeting, 2003.
• J Smith, A Gounaris, P Watson, N W Paton, A A A Fernandes, R Sakellariou. Distributed Query Processing on the Grid. GRID 2002, LNCS 2536.
(papers available from http://www.cs.man.ac.uk/~alvaro/publications.html )
where to find out more: software
• OGSA-DQP: Grid middleware to query distributed data sources. www.ogsadai.org.uk/dqp
• OGSA-DAI: Grid middleware to interface with data(bases). www.ogsadai.org.uk/
• Globus Toolkit: open-source implementation of OGSA/OGSI. www.globustoolkit.org/