
The Challenge of Interoperability: the Grid, Semantic Web and Categories

This seminar explores the challenges of achieving interoperability between systems, covering integration, communication, and extensibility. It discusses the motivations, the semantic problems, and the more complex issues that arise with heterogeneous models, and then sets out a foundation for heterogeneous interoperability.

Presentation Transcript


  1. The Challenge of Interoperability: the Grid, Semantic Web and Categories Nick Rossiter Seminar -- Northumbria University, 13th November 2002 http://computing.unn.ac.uk/staff/CGNR1/ nick.rossiter@unn.ac.uk

  2. Interoperability 1 • Interoperability: the ability to request and receive services between various systems and use their functionality. • More than data exchange. • Implies a close integration

  3. Interoperability 2 • Features: • exchange of messages and requests • use of each other’s functionality • client-server abilities • distribution • operate multiple systems as single unit • communication despite incompatibilities • extensibility and evolution

  4. Motivations 1 • Diversity of modelling techniques • Distributed businesses may exercise local autonomy in platforms • Data warehousing requires heterogeneous systems to be connected • Data mining enables new dependencies to be derived from heterogeneous collections

  5. Motivations 2 • Pervasive Computing : networks supporting many diverse nodes to be driven by users specifying policy and function. • Policy: statements governing how a solution will be achieved. Statements are derived from requirements • Function: mechanism for achieving objectives. • Mobile Computing: wireless networks

  6. Motivations 3 • E-Science • Distribution of functionality transparently across many different platforms • Grid • Layered components available in network: • Computational, information and knowledge layers and probably more.

  7. Motivations 4 • Semantic Web • Correlating web resources • Use of Resource Description Framework (RDF) and XML • Ontologies (giving ‘meanings’ of terms)

  8. Definition: Model (as it varies!) • Model: a representation of policies in a structured form according to some perceived view of reality e.g. • Relational model – world is tabular • Hierarchical model – world is tree-like • Security model – world is task-based • Object model – world is based on o-o paradigm

  9. Semantic Problems in Interoperability 1 • People call different properties by different names • People classify properties differently: • Different contexts e.g. colour is property in describing a car but a table to the paint shop. • Different normalization priorities e.g. many tables optimised for updates versus few tables optimised for searching.

  10. Semantic Problems in Interoperability 2 • People make use of facilities in different ways. For instance: • In SQL-92 can achieve uniqueness in tables by: • Defining keys • or Modifying table storage method on various properties • or Defining a unique index • and in SQL-1999 we also have: • object identifiers (oids) - So many legacy problems

  11. Constraints and Types • May differ between systems: • e.g. student ids may be held as: • integers (leading zeros removed) 65275 • integers (padded out with so many leading zeros) 0065275 • strings (fixed length) ‘0065275’ • Ids may have checksum function or not
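
The three id representations above can be reconciled by converting each to one canonical form before comparison. The following is a minimal sketch, not part of the original slides: the function name `canonical_student_id` and the default width of 7 (taken from the example '0065275') are assumptions, and checksum handling is left out.

```python
def canonical_student_id(raw, width=7):
    """Return a zero-padded, fixed-width string for an id held as int or str.

    The width of 7 is an assumption based on the example '0065275'.
    """
    if isinstance(raw, bool):  # bool is a subclass of int, so reject it explicitly
        raise TypeError("student id must be an int or a str")
    if isinstance(raw, int):
        if raw < 0:
            raise ValueError("student id cannot be negative")
        digits = str(raw)
    elif isinstance(raw, str):
        digits = raw.strip()
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError(f"not a numeric id: {raw!r}")
    else:
        raise TypeError("student id must be an int or a str")

    # Drop any existing padding so that differently padded inputs agree.
    digits = digits.lstrip("0") or "0"
    if len(digits) > width:
        raise ValueError(f"id {raw!r} does not fit in {width} digits")
    return digits.zfill(width)


# The integer form and the fixed-length string form map to the same value.
print(canonical_student_id(65275))
print(canonical_student_id("0065275"))
```

Choosing the string form as canonical keeps the leading zeros, which an integer column cannot store.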

  12. Semantic Problems in Interoperability 3 • Structural problems are bad enough. But also: • Functionality can be applied in many different ways: • Procedures or functions; • different module layout. • Rules can be in: • Model structures, model coding, procedures or application programs.

  13. Simple Problem in Interoperability 1 • Two schemas in SQL-1999:
      A                           B
      author char(50)             author_surname char(50)
                                  author_initials char(10)
      title varchar(300)          title varchar(200)
      keyword set(char(30))       keywd array(8) (char(30))
      Note: homogeneous model -- both SQL-1999 -- but difficulties.
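
Even with both schemas in SQL-1999, moving a record from A to B can lose information. The sketch below illustrates this under stated assumptions: records are plain Python dictionaries, the A-side author is written as 'Surname, Initials' (a convention not given in the slides), and the function name `a_to_b` is hypothetical.

```python
def a_to_b(rec_a):
    """Map a schema-A record to schema B, reporting every lossy step."""
    warnings = []

    # Assumed convention: author is held as 'Surname, Initials'.
    surname, _, initials = rec_a["author"].partition(",")
    surname, initials = surname.strip(), initials.strip()
    if len(initials) > 10:  # author_initials is char(10) in B
        warnings.append("initials truncated to 10 characters")
        initials = initials[:10]

    title = rec_a["title"]
    if len(title) > 200:  # varchar(300) in A, varchar(200) in B
        warnings.append("title truncated to 200 characters")
        title = title[:200]

    # A holds an unordered set; B holds an array of at most 8 entries.
    keywords = sorted(rec_a["keyword"])
    if len(keywords) > 8:
        warnings.append("keywords beyond the first 8 dropped")
        keywords = keywords[:8]

    rec_b = {
        "author_surname": surname,
        "author_initials": initials,
        "title": title,
        "keywd": keywords,
    }
    return rec_b, warnings


record = {"author": "Rossiter, B N", "title": "Databases", "keyword": {"sql", "schema"}}
print(a_to_b(record))
```

Returning the warnings alongside the record makes the mismatches in length and collection type visible instead of hiding them in the conversion.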

  14. Different Standards • For example -- Names: • Person(surname, first_name, ..) • or Person(first_name, surname, …) • or Person(name, …) • First two may easily be made equivalent but convention in third needs to be understood. • Note also possibilities of A.N.Other, AN Other, A N Other.
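
The variants A.N.Other, AN Other and A N Other can be brought to a single form once a convention is fixed. This is a rough sketch only: it assumes the last token is the surname and every earlier token consists of initials, so a full first name such as 'Ann' would be misread. The function name `normalise_name` is hypothetical.

```python
import re


def normalise_name(text):
    """Return (surname, initials) for names written as initials then surname.

    Assumption: the final alphabetic token is the surname and all earlier
    tokens are initials. Hyphenated or apostrophised surnames are not handled.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    if len(tokens) < 2:
        raise ValueError(f"cannot split {text!r} into initials and surname")
    surname = tokens[-1]
    initials = "".join(tokens[:-1]).upper()
    return surname, initials


for variant in ("A.N.Other", "AN Other", "A N Other"):
    print(variant, "->", normalise_name(variant))
```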

  15. Simple Problem in Interoperability 2 • Homogeneous Models • the same information may be held as attribute name, relation name or a value in different databases • e.g. fines in library; • could be held in a dedicated relation Fine(amount, borrowed_id) • or as an attribute Loan(id, isbn, date_out, fine) • or as a value Charge(1.25, ‘fine’)
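
The three ways of holding a library fine can be read through one function that knows where each design keeps the information. This is a minimal sketch, assuming rows arrive as dictionaries; the column names `amount` and `kind` for Charge are invented for illustration, as is the function name `fine_amounts`.

```python
from decimal import Decimal


def fine_amounts(relation_name, rows):
    """Extract fine amounts from any of the three designs on the slide."""
    if relation_name == "Fine":
        # 'fine' is carried by the relation name: Fine(amount, borrowed_id)
        return [Decimal(r["amount"]) for r in rows]
    if relation_name == "Loan":
        # 'fine' is carried by an attribute name: Loan(id, isbn, date_out, fine)
        return [Decimal(r["fine"]) for r in rows if r.get("fine") is not None]
    if relation_name == "Charge":
        # 'fine' is carried by a value: Charge(1.25, 'fine')
        return [Decimal(r["amount"]) for r in rows if r["kind"] == "fine"]
    raise ValueError(f"unknown relation: {relation_name}")


print(fine_amounts("Fine", [{"amount": "1.25", "borrowed_id": 7}]))
print(fine_amounts("Loan", [{"id": 7, "isbn": "x", "date_out": "2002-11-13", "fine": "1.25"}]))
print(fine_amounts("Charge", [{"amount": "1.25", "kind": "fine"}, {"amount": "3.00", "kind": "fee"}]))
```

The branching on relation name is exactly the kind of schema knowledge that has to be supplied from outside the data.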

  16. Complex Problems in Interoperability • Heterogeneous models • Need to relate model constructions to one another, for example: • relate classes in object-oriented to user-defined types in object-relational • All problems are magnified at this level.

  17. Handling Naming, Classifying, Meta and MetaMeta Levels A Foundation for Heterogeneous Interoperability

  18. Foundation Work • Rather neglected in database area recently • Moves by Date/Darwen/Codd to introduce ‘pure’ relational model handling objects in an orthogonal manner • Open Database Group at UNN (now 8 members) looking at object-relational foundations • Common objectives with work described here

  19. Meta Data • Meta means ‘about’ • The basis of schema integration • Sometimes treated as an object (MOF - Meta Object Facility) • Better viewed as a relationship: • Name (data files) • Classify (database classes) • Meta (data dictionary) • MetaMeta (classify data dictionary )

  20. Bottom Level -- Data Values (tuples) • Student information: • <123423, ‘John Smith’> • <234568, ‘Fred Jones’> • Module Information: • <‘CG156’, ‘Database Design’> • <‘CG065’, ‘Databases’> • Which students take which modules: • <123423, ‘CG156’> • <234568, ‘CG065’>

  21. Naming Data Values • Student<id, name> Name {<123423, ‘John Smith’>, <234568, ‘Fred Jones’>} • Module<code, title> Name {<‘CG156’, ‘Database Design’>, <‘CG065’, ‘Databases’>} • Modules_taken<id, code>: Name {<123423, ‘CG156’>, <234568, ‘CG065’>} Name associates each value with a name

  22. Relating Named Values to Types -- Classification • Named application data: Student<id, name> Name {<123423, ‘John Smith’>, <234568, ‘Fred Jones’>} • Schema data (types including constraints, rules): Student (id integer, name char(50), primary key id, constraint ‘id format’) • Classify associates each name with a type in the schema

  23. Relating Types to Constructs - Data Dictionary • Classified schema data (types): Student (id integer, name char(50), primary key id, constraint ‘id format’) • Constructions (e.g. table, column, function, key): {Table, Column, Function, Primary Key, ...} • Meta associates each schema type with a construction e.g. Student to Table, id to Column, id to Primary Key

  24. Relating Constructs to Abstractions • Constructions (e.g. table): {Table, Column, Function, Foreign Key, ..} • Abstractions (of real-world, open-ended, concepts): {Aggregation, inheritance, association, …} • MetaMeta associates each construction with an abstraction e.g. Table to Aggregation, Foreign Key to Association
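
The four mappings Name, Classify, Meta and MetaMeta can be followed from a single data value up to its abstraction. The sketch below models each mapping as a Python dictionary, which is a simplification: on the slides Meta is really a relation (id maps to both Column and Primary Key), and the function name `climb` is hypothetical.

```python
# Each dictionary stands in for one mapping from the slides (simplified to functions).
NAME = {
    (123423, "John Smith"): "Student<id, name>",
    ("CG156", "Database Design"): "Module<code, title>",
}
CLASSIFY = {
    "Student<id, name>": "Student",
    "Module<code, title>": "Module",
}
META = {"Student": "Table", "Module": "Table"}
METAMETA = {"Table": "Aggregation", "Foreign Key": "Association"}


def climb(value):
    """Follow a data value upward through all four levels."""
    name = NAME[value]              # data value -> named value
    schema_type = CLASSIFY[name]    # named value -> schema type
    construct = META[schema_type]   # schema type -> construction
    concept = METAMETA[construct]   # construction -> abstraction
    return [value, name, schema_type, construct, concept]


print(climb((123423, "John Smith")))
```

Composing the lookups in one function mirrors the idea that a path from bottom to top is itself a single mapping.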

  25. Mappings are two-way • Concepts ⇄ Constructs: Policy (down), MetaMeta (up) • Constructs ⇄ Schema Types: Organize (down), Meta (up) • Schema Types ⇄ Named Data Values: Instantiate (down), Classify (up) • Downward arrows are intension-extension pairs

  26. Formalising the Architecture • Requirements: • mappings within levels and across levels • bidirectional mappings • closure at top level • open-ended logic • relationships (product and coproduct) • Candidate: category theory as used in mathematics as a workspace for relating different constructions

  27. Choice: category theory • Requirements: • mappings within levels and across levels • arrows: function, functor, natural transformation • bidirectional mappings • adjunctions • closure at top level • four levels of arrow, closed by natural transformation • open-ended logic • Heyting intuitionism • relationships (product and coproduct) • Cartesian-closed categories (like 2NF): pullback and pushout

  28. Categories • Each level is represented by a category: • Named data values by DATA (DT) • value name • Schema types by SCHEMA (SM) • Constructions by CONSTRUCTS (CS) • Concepts by CONCEPTS (CC) Red font -- categories

  29. Functors • Relationships between categories at adjacent levels are given by a functor • For example: • Meta: SCHEMA → CONSTRUCTS • Meta is a functor Blue font -- functors

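  30. Levels in Functorial Terms • CONCEPTS ⇄ CONSTRUCTS: Policy (down), MetaMeta (up) • CONSTRUCTS ⇄ SCHEMA: Organize (down), Meta (up) • SCHEMA ⇄ DATA: Instantiate (down), Classify (up) • The diagram also carries the labels System and Model • Green font -- composed functors: System = MetaMeta o Meta o Classify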

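  31. Composition of Adjoint Functors • Classify -- C • Meta -- M • MetaMeta -- A • Policy -- P • Organise -- O • Instantiate -- I • CC ⇄ CS ⇄ SM ⇄ DT, with P, O, I running left to right (CC → CS → SM → DT) and A, M, C running right to left (DT → SM → CS → CC) • Composed adjunction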

  32. Adjunctions • The adjointness between two functors is given by a 4-tuple e.g. for CC ⇄ CS (P: CC → CS, A: CS → CC): • <P, A, η, ε> • η, the unit of adjunction, measures change from initial cc to cc obtained by following P and A (1_CC → AP(cc)) • ε, the counit of adjunction, measures PA(cs) → 1_CS • Unit and counit give measure of creativity of arrows and preservation of style in mapping by functors. • If complete preservation of style (ε = 1) and no creativity (η = 0) -- isomorphism.
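
For reference, the unit and counit of an adjunction are tied together by the two triangle identities. The block below states them for the slide's functors P: CC → CS and A: CS → CC, with η for the unit and ε for the counit; it is the standard textbook formulation (as in Mac Lane) and is not taken from the slides.

```latex
\[
  \eta : 1_{CC} \Rightarrow A \circ P,
  \qquad
  \varepsilon : P \circ A \Rightarrow 1_{CS}
\]
\[
  (\varepsilon P) \circ (P \eta) = 1_{P},
  \qquad
  (A \varepsilon) \circ (\eta A) = 1_{A}
\]
```

The first identity says that applying P, going round through A and P, and coming back via the counit changes nothing; the second says the same for A.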

  33. Composed Adjunction for Four Levels Represents complex mappings across the levels of the system

  34. Benefits of Approach • Can represent relationships between levels, either: • abstractly with one relationship from top to bottom levels • in much more detail with all combinations of adjoints expressed.

  35. Comparing one System with Another • CC --P--> CS --O--> SM --I--> DT • CC --P´--> CS´ --O´--> SM´ --I´--> DT´ • α: P → P´, β: O → O´, γ: I → I´ • α, β, γ are natural transformations (comparing functors)

  36. Godement Calculus • Rules showing: • composition of functors and natural transformations is associative • natural transformations can be composed with each other • For example: • (I´O´)α = I´(O´α); γ(OP) = (γO)P • γβ = (γO) o (I´β); βα = (βP) o (O´α)

  37. Four Levels are Sufficient • In category theory: • objects are identity arrows • categories are arrows from object to object • functors are arrows from category to category • natural transformations are arrows from functor to functor • An arrow between natural transformations is a composition of natural transformations, not a new level

  38. Analogous Levels for Interoperability

  39. The Grid • Use of distributed computer power • Analogy with national power generation • Processing performed where appropriate • Requires local definitions to be used interoperably • Early stage of development

  40. Grid in Levels

  41. Databases on the Grid • Nearly all data sources are not formal database files yet (Paul Watson, 2002) • Data is in operating system files perhaps readable only by particular programs which understand the format • So level achieved is really only 3, that is program specification (no organization or policy).

  42. Semantic Web • Levels: • data values to be marked up with tags • XML document -- data with XML tags • DTD -- document structure defining tags • RDF -- triples <subject, predicate, object> • components are typically URIs, could also be literals, collections of URIs or another RDF • Semantic Web -- semantic interoperability between data sources using agents to explore RDFs and ontologies
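
An RDF-style triple store can be sketched with plain tuples, which shows how an agent might follow predicates from one resource to another. Everything named here is hypothetical: the `http://example.org/...` URIs, the predicates `takes` and `title`, and the helper functions are illustrative and do not come from the slides or from any RDF library.

```python
# Each triple is <subject, predicate, object>; objects may be URIs or literals.
EX = "http://example.org/"  # hypothetical namespace for this sketch

TRIPLES = [
    (EX + "student/123423", EX + "terms/takes", EX + "module/CG156"),
    (EX + "student/234568", EX + "terms/takes", EX + "module/CG065"),
    (EX + "module/CG156", EX + "terms/title", "Database Design"),
    (EX + "module/CG065", EX + "terms/title", "Databases"),
]


def objects(triples, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def module_titles(triples, student_uri):
    """Follow 'takes' and then 'title' to list a student's module titles."""
    titles = []
    for module_uri in objects(triples, student_uri, EX + "terms/takes"):
        titles.extend(objects(triples, module_uri, EX + "terms/title"))
    return titles


print(module_titles(TRIPLES, EX + "student/123423"))
```

The two-step traversal works only because both sources agree on what `takes` and `title` mean, which is the role the slides assign to ontologies.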

  43. Semantic Web as Formal Levels

  44. Semantic Web is rather Flat • Only handles two bottom levels of architecture • Aims to achieve comparison of mappings ( ) through: • an agent exploring RDFs • each RDF relates one URI to another URI in the context of a predicate • ontologies are used to explain and synchronise meanings of terms • An AI approach rather than a data-driven one

  45. Discussion 1 • Category theory shows that: • four levels are ideal for interoperability • more than four yields no benefits • fewer than four gives only local interoperability • Categorical approach provides: • an architecture for universal interoperability • a calculus (Godement) for composing mappings at any level • adjunctions for evaluating two-way mappings

  46. Discussion 2 • In industry: • Approach seems to be getting flatter • In MOF (Meta-Object Facility) which is a four-level system (top-level ‘hard-wired’), drive to bypass top levels • IRDS (Information Resource Dictionary System) – four levels but one-way mapping – not used as much as expected • ISO though are trying a four-level approach for mapping modelling languages

  47. Discussion 3 • Lack of adequate foundations may explain some of the reticence • Also some suppliers perhaps think that market dominance is the answer, making interoperability irrelevant. But legacies remain! • Categorical framework provides a coherent basis for taking this area forward, even if the framework is translated into a more familiar notation.

  48. References 1 • MOF • Bezivin, J, From Object Composition to Model Transformation with the MDA, 3rd International Conference on Enterprise Information Systems (ICEIS), Setubal, invited paper (2001). Plus OMG web pages. • IRDS • Gradwell, D J L, The Arrival of IRDS Standards, 8th BNCOD, York 1990, Pitman 196-209 (1990). • GRID • Watson, P, Databases and the Grid, Computing Science Technical Report no.755, University of Newcastle upon Tyne (2002), (16pp). • SEMANTIC WEB: W3C web pages • Our work (available from NR’s home page) • Heather, M A, & Rossiter, B N, The Anticipatory and Systemic Adjointness of E-Science Computation on the Grid, Computing Anticipatory Systems, Proceedings CASYS`01, Liège, Dubois, D M, (ed.), AIP Conference Proceedings 627 565-574 (2002). • Rossiter, B N, Heather, M A, & Nelson, D A, A Universal Technique for Relating Heterogeneous Data Models, 3rd International Conference on Enterprise Information Systems (ICEIS), Setúbal, I 96-103 (2001). • Heather, M A, & Rossiter, B N, Constructing Standards for Cross-Platform Operation, Software Quality Journal, 7(2) 131-140 (1998).

  49. References 2 • ISO modelling: • Study Report on the Feasibility of Mapping Modelling Languages for Analysis and Design Models. ISO/IEC JTC1/SC7 N2112, 1999/04/19, http://www.info.uqam.ca/Labo_Recherche/Lrgl/sc7/N2101-N2150/07n2112.pdf (1999). • Category Theory and Computing Science: • Barr, M, & Wells, C, Category Theory for Computing Science, Prentice-Hall (1990). • Mac Lane, S, Categories for the Working Mathematician, Springer, 2nd ed (1998). • Category Theory and Information Systems: some other workers • Zinovy Diskin (USA, formerly Latvia) • Boris Cadish (Latvia) • Robert Rosebrugh (Canada) • Michael Johnson (Australia) • Christopher Dampney (Australia) • Michael Heather (Northumbria) • David Nelson (Sunderland) • Arthur ter Hofstede (Australia, formerly Holland) • Many other workers on category theory and program semantics
