The Grid versus O.G.C.: What happens next? Bryan Lawrence, Director of Environmental Data Archival and Associated Research, CCLRC; Head of the British Atmospheric Data Centre, NCAS
Outline Clearly I don’t know what happens next! However, I intend to discuss: • Where I’m coming from • A brief comparison of Grid and OGC paradigms • Some Strengths and Weaknesses of the paradigms • Some crystal ball gazing
Where I’m coming from The British Atmospheric Data Centre, The NERC Earth Observation Data Centre The NERC DataGrid
British Atmospheric Data Centre • Mission Statement includes Curation and Facilitation • Curation: Preserves digital atmospheric science data for posterity (including ALL NERC-funded atmospheric science data and much else). Catalogues and documents data. Holds ~50 TB of data! • Facilitation: Provides effective access to data and information. Supports field campaigns and model development. Supports knowledge transfer from atmospheric science data to user knowledge in other domains. • Knowledge Transfer: Half of the BADC users are from other fields: BADC data has been used to study bird feeding habits, radio communication modelling, A&E influenza cases, wind power research …
NERC Earth Observation Data Centre • Will be part of the new National Centre for Earth Observation (NCEO) • NERC commercial satellite data: LANDSAT, SPOT, IKONOS • NERC airborne data: NERC ARSF, NEXTMap Britain • Dedicated UK archive: ATSR-1/2 (~40 TB), AATSR (~80 TB) • Strong liaison with ESA: working in ESA grid projects • “To deliver effective services to the NERC community in locating, accessing, interpreting and exploiting Earth Observation data and information, and to ensure the long-term integrity of EO datasets produced and acquired by NERC projects and programmes”
NCAR Grid Complexity: Size and Heterogeneity British Atmospheric Data Centre http://ndg.nerc.ac.uk British Oceanographic Data Centre
NDG Architecture [Figure labels: Users; NDG “Portal” Interface(s); NDG Core Services; Vocab Services; Data Providers]
A geospatial dataset… …consists of features and related objects… …in a defined logical structure… …delivered through services… …and described by metadata. Standards: ISO 19101: Geographic Information Reference model
ISO19101 Features • Geographic ‘features’ • “abstraction of real world phenomena” [ISO 19101] • Type or instance • Encapsulate important semantics in universe of discourse • “Something you can name” • Application schema • Defines semantic content and logical structure • ISO standards provide toolkit: • spatial/temporal referencing • geometry (1-, 2-, 3-D) • topology • dictionaries (phenomena, units, etc.) • GML – canonical encoding [from ISO 19109 “Geographic information – Rules for Application Schema”]
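To make the type-or-instance distinction concrete, here is a minimal Python sketch. It is not an ISO or GML API: the `FeatureType` and `Feature` classes, the "Profile" feature type and all its property names and values are hypothetical, chosen only to show an application schema constraining what a feature instance must carry.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class FeatureType:
    """The 'type': a named abstraction with the properties its schema requires."""
    name: str
    properties: Tuple[str, ...]

    def create(self, **values: Any) -> "Feature":
        # Refuse to build an instance that omits a required property.
        missing = [p for p in self.properties if p not in values]
        if missing:
            raise ValueError(f"{self.name} is missing properties: {missing}")
        return Feature(self, dict(values))


@dataclass
class Feature:
    """The 'instance': one concrete thing you can name, tied to its type."""
    feature_type: FeatureType
    values: Dict[str, Any]


# Hypothetical feature type and instance (illustrative values only).
PROFILE = FeatureType(
    "Profile", ("location", "time", "phenomenon", "units", "values")
)
sounding = PROFILE.create(
    location=(51.0, -1.0),
    time="2006-01-01T12:00:00Z",
    phenomenon="air_temperature",
    units="K",
    values=[285.1, 280.4, 272.9],
)
```

The point of the sketch is that the semantics (which properties exist, and that units and phenomenon travel with the values) live in the type, not in whichever service happens to deliver the instance.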
A brief Comparison of Philosophy Grids The OGC
The Grid The Grid as defined by Foster [1] has three main characteristics: • standard open protocols, • no centralised control, and • non-trivial quality of service. The lack of central control requires secure access to grid resources, typically implemented through public key infrastructures. In practice, nearly all implementations of grids are themselves Virtual Organisations, with some central control over authorisation and authentication! [1] http://www.gridtoday.com/02/0722/100136.html
The OGC web services Existing and planned OGC web services are open protocols designed for the situation where there is no centralised control, and are intended to provide non-trivial quality of service. The OGC has been developing Web Service interfaces over a number of years. I would assert that the OGC has been developing GRID components over a number of years. So what’s the difference?
Directions Grid World • “Interesting” access control paradigms coupled with Virtual Organisations • “Complex” workflows coupled with workflow management tools • “Simple” data objects, or if not simple, homogeneous. • Tight-binding in strongly-typed service descriptions OGC World • Thus far little access control (but GeoDRM looming) • Negligible orchestration of workflow (WPS is “internal” workflow). Registry formalism may change things. • Standards support complex data objects, but implementations support simple data objects and relatively-weak binding.
The Collision In recent years, the OGC has commissioned work on producing SOAP-based implementations of the existing OGC web services as well as further developing the suite to include processing services. In practice then, the geospatial community may regard the development of “grid” principles as essentially the re-development and wider deployment of concepts already prevalent in their community. Of course the risk is also that the geospatial community is busy reinventing wheels that the wider community has already refined …
Strengths and Weaknesses Or the essential trade off between the rapid development of functionality versus interoperability … in each world!
Metadata in OGC One of the major identifying characteristics of OGC web services is that they (will) provide identifiers to datasets and services, identified using the citation element of standardised metadata constructs (ISO19115). One of the major weaknesses of that approach is that ISO19139 (the XML implementation of ISO19115) is immature. Communities have not grasped the concept of profiles properly, and the tooling is even more immature. Fortunately, one can get a long way without ISO19139, but orchestrating services is going to require service description metadata!
Metadata in the Grid No real concept of metadata for data objects beyond “we can use OGSA/DAI for that”! A plethora of methods of describing services, and a plethora of methods for orchestrating services, but no real way of describing the characteristics of the data objects which services might manipulate. The goal of “agent-based” service negotiation is still a pipe dream – particularly if you want the service to do something with real data!
Security in the OGC-space • We can use https to secure the transport layer … • We can secure access to servers, but access-control granularity behind OGC services is difficult to engineer in an interoperable and scalable way! • Without something extra (e.g. NDG security) the BADC could not deploy the OGC services! • It turns out that OWS4 is doing something very similar!
Security in the Grid-world Strong access control based around virtual-organisations, but mostly need common authentication/authorisation governance within the V.O. • Not really good enough for true interoperability, and not yet enough to deal with NDG requirements • We’re bound to deploy Shibboleth and probably OpenID as well, but that’s only authentication. • Role based authorisation requires new infrastructure. PERMIS and other projects lead the way, but are not yet easily deployable. Significantly better than what’s available in the OGC-space, and on the right track …
On Service Description One of the key areas where both OGC and the Grid world have been active is in service description. I’m not that familiar with what the OGC has done [1], for the following reasons: • The rest of the IT world is doing SOA. • The rest of the IT world will describe services. • It’s unlikely that the OGC will develop best practice which is well supported by software vendor tooling … • At the moment WSDL2 is where we are investing our thinking time … [1] ISO19119 is a meta-model for services rather than an SDL
On religious wars (1) SOAP versus REST or POX (plain old XML)! • OGC web services are POX. • Most of the grid is SOAPacious. • Most of the “Enterprise” world thinks they should use SOAP. • Most successful “Web 2.0” services use REST. • Nearly everyone thinks their way is the “one true way” (although see Ian Foster’s blog on “Fundamentalism”) I suspect that much of the argument is based around different requirements!
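To illustrate how little machinery the OGC style needs, here is a minimal Python sketch that builds a WMS 1.1.1 GetMap request using the key-value-pair GET binding. The endpoint URL and layer name are hypothetical; the parameter names are the standard GetMap ones. No envelope, no generated stubs: the whole request is a URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wms_getmap_url(endpoint, layer, bbox, width, height):
    """Build a WMS 1.1.1 GetMap URL from plain key-value pairs."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string asks for the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = wms_getmap_url(
    "http://example.org/wms", "air_temperature", (-10, 49, 3, 60), 512, 512
)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
```

A SOAP binding of the same operation would wrap equivalent information in an envelope described by WSDL, which buys tool support at the cost of tighter coupling between client and server.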
On Religious Wars (2) Fielding (who invented REST): “Some architectural styles are often portrayed as “silver bullet” solutions for all forms of software. However, a good designer should select a style that matches the needs of the particular problem being solved. Choosing the right architectural style for a network-based application requires an understanding of the problem domain and thereby the communication needs of the application, an awareness of the variety of architectural styles and the particular concerns they address, and the ability to anticipate the sensitivity of each interaction style to the characteristics of network-based communication.” Minar (implementor of the Google SOAP interface): “The deeper problem with SOAP is strong typing …” Gregorio (co-inventor of the Atom Protocol): “This backs up my experience; if you don't have control of both ends of the wire then loosely typed documents beat strongly typed data-structure serializations.”
On religious wars (3) And for the grid/OGC collision? The Grid World: By and large they are building things where the virtual organisation’s members are in each other’s pockets. They do have control of both ends of the wire. They do have complex situations, where automatically parsed service description languages do help. The tooling does mostly work. The OGC world: They don’t have control of both ends of the wire. Loose typing helps. REST and POX make sense. Minar again: “Truly, none of this protocol fiddling matters. Just do something that works” My collision perspective? OGC services have grown SOAP bindings. Grid services can grow REST bindings. WSDL2 (and other service description languages) can describe both! We will use what we need to get the job done: the information model, including the interface descriptions, is what matters, and the bindings will follow. This means domain modellers who “think GML” will have to go back to ISO first principles to think about objects as more than just data types, as they are things with interfaces! It also means the Grid people need to think harder about the implication of the datatypes on their interface descriptions!
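The “control of both ends of the wire” argument can be sketched in a few lines of Python. The document, element names and both readers below are hypothetical: the loose reader takes only what it needs and ignores the rest, while the strict reader stands in for a strongly typed binding that rejects anything outside its expected schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the provider has added a <humidity> element
# that the consumer was never told about.
DOC = """<observation>
  <station>ExampleStation</station>
  <temperature units="K">285.1</temperature>
  <humidity units="percent">71</humidity>
</observation>"""

EXPECTED = {"station", "temperature"}


def read_loose(xml_text):
    """Loose typing: pick out the one element we need, ignore the rest."""
    root = ET.fromstring(xml_text)
    return float(root.find("temperature").text)


def read_strict(xml_text):
    """Strict typing: fail if the document differs from the agreed shape."""
    root = ET.fromstring(xml_text)
    unexpected = [child.tag for child in root if child.tag not in EXPECTED]
    if unexpected:
        raise ValueError(f"unexpected elements: {unexpected}")
    return float(root.find("temperature").text)
```

With one governing body at both ends, the strict reader catches real errors early. Without it, the provider's harmless extension breaks every strict client, which is why loose typing suits the OGC situation.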
Crystal Ball Gazing The JISC-OGC projects Whither NDG Whither everyone else?
JISC-CALL 2006 call for projects to work in the area of OGC-Grid Collision. Two major initiatives: • Bringing grid security to the OGC world. • Bringing grid workflow concepts to the OGC world. (Implicitly bringing complex data-types aka feature-types into the thinking of the grid-world) Funded • SEE-GEO: SEcurE access to GEOspatial services • SAW-GEO: Semantically Aware Workflow Engines for GEOspatial Web Service Orchestration
SEE-GEO Aiming for access to geographic information via the National Grid Service Exact project detail not yet clear, but needing to address: • The role of Shibboleth (it is becoming the UK standard) • How WS-Security can be exploited • How to interface with the Globus-2 era NGS. • How the OGC web services are involved. Main deliverables will be a report and 3 demonstrators: • National datacentre • Social Science (NCeSS) • Orchestration (Newcastle) Project will involve integrating the WCS into OGSA/DAI SEE-GEO Slide adapted from material from Chris Higgins
SAW-GEO Proposed Architecture • Chaining multiple web services together • Semantically informed workflow management system and workflow engine • Workflow engine deployable onto Apache Tomcat • Web portal into the workflow engine • Use of OGSA-DAI wrappers and the Globus toolkit Workflow engines • Wide range of possible workflow engines will be examined • e.g. TAVERNA, Apache Agila, ActiveBPEL • Newcastle has experience with SCUFL (TAVERNA), but it is application-specific, so the project is likely to use BPEL (supported by ActiveBPEL, Agila etc.), • which has a graphical capability and a web-based console running on Apache Tomcat, and so meets the objectives above. SAW-GEO Slides adapted from material from David Fairbairn
SAW-GEO [Architecture figure labels: Clients; Workflow Management System; Globus Toolkit; OGSA-DAI WCS, OGSA-DAI WFS, OGSA-DAI WMS; OGC WCS, OGC WFS, OGC WMS; MapServer or GeoServer; EDINA; Newcastle] BNL: Not obviously the right methodology!! SAW-GEO Slides adapted from material from David Fairbairn
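As a rough illustration of what “chaining multiple web services together” asks of a workflow engine, here is a minimal Python sketch. It is not BPEL and not the SAW-GEO design: the `run_workflow` function and the three stand-in services are hypothetical, and they run locally rather than calling real WCS or WMS endpoints.

```python
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_workflow(steps: List[Step], initial: Any):
    """Run the steps in order, feeding each output into the next step.

    Returns the final value and the list of step names that were executed.
    """
    log = []
    value = initial
    for name, service in steps:
        value = service(value)
        log.append(name)
    return value, log


# Local stand-ins for remote services (hypothetical behaviour).
def fetch_coverage(bbox):
    return {"bbox": bbox, "values": [1.0, 2.0, 3.0]}


def subset(coverage):
    return {**coverage, "values": coverage["values"][:2]}


def render_map(coverage):
    return f"map of {len(coverage['values'])} values over {coverage['bbox']}"


result, log = run_workflow(
    [("WCS", fetch_coverage), ("subset", subset), ("WMS", render_map)],
    (-10, 49, 3, 60),
)
```

Even in this toy form, each step has to understand the data object the previous step produced, which is where the feature-type perspective meets the workflow world.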
NDG Futures NDG2 is doing nothing “new” now (project due to end Sep07). Consolidating what we know: • No role for OGSA/DAI • Until the feature-type perspective is integral it is, for our purposes, little more than a WS-JDBC! • Will be exploiting OGC feature-type perspective significantly • Development of CSML (an application schema of GML, along with some required standards enhancements) • Services based on CSML (WCS, WFS, WMS) • Have rolled our own security paradigm • Based on PKI, WS-Security, a local attribute-syntax for XML attribute certificates and similar concepts to Shibboleth. Role-Mapping. Works now!
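The idea behind role-mapping can be sketched in Python as follows. This is not the NDG security implementation: the issuer name, the role names and the `ROLE_MAP` table are all hypothetical, and a real system would first verify a signed attribute certificate before trusting any remote role.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical bilateral agreement: (issuing organisation, remote role)
# maps to a role that the local data centre understands.
ROLE_MAP: Dict[Tuple[str, str], str] = {
    ("partner-centre", "staff"): "registered-user",
    ("partner-centre", "campaign-member"): "campaign-data-reader",
}


def map_roles(issuer: str, remote_roles: Iterable[str]) -> Set[str]:
    """Translate roles asserted by a trusted issuer into local roles."""
    return {
        ROLE_MAP[(issuer, role)]
        for role in remote_roles
        if (issuer, role) in ROLE_MAP
    }


def is_authorised(required_role: str, issuer: str,
                  remote_roles: Iterable[str]) -> bool:
    """Grant access only if a mapped role matches the dataset's requirement."""
    return required_role in map_roles(issuer, remote_roles)
```

Unmapped roles and unknown issuers yield no local roles at all, so access is denied by default and each data provider keeps control over its own authorisation policy.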
What would NDG3 consider? Our CSML2 domain modelling does include interface descriptions, so exploiting those in service registries will be important. Definitely want to support orchestration, and late-binding of services! Key technologies: • WSDL2 • ebRIM registries • SAW-GEO outputs in terms of BPEL (not at all convinced about OGSA/DAI!) • Migration to Shibboleth, XACML.
The next three years? What is everyone else thinking? The questions still remain, • Can we exploit the OGC Feature-type perspective, while • Keeping simple interfaces, but • Managing complex orchestration, of • Well described services, in a • Secure Manner Requires: • Semantic Tools, • Confronting implications of architectural paradigms (SOAP/REST) • Tooling (e.g. BPEL example from SAW-GEO) • Service and Data Metadata • Simpler Security • WS-Security equivalent for POX? Tighter coupling of security concepts into WSDL • Explicit recognition of access control beyond http authentication in OWS-common.
Summary • OGC knows about data typing! • Feature-type concepts are not limited to geospatial! • Grid community needs to understand implications of semantic data typing on service descriptions. • GRID has more sophisticated service binding, access control and authentication, workflow! • OGC community should not reinvent tooling! • Architectural decisions need to be based on pragmatic decisions about necessity for strong/weak typing and governance: • It is RIGHT to use strong-typing and tools (e.g. WSRF, SOAP) when the problem will benefit from doing so. • It is RIGHT to use weak-typing, late-binding, and REST when the problem will benefit from doing so. • Neither is a silver bullet!