490 likes | 655 Views
The Challenge of Interoperability: the Grid, Semantic Web and Categories. Nick Rossiter Seminar -- Northumbria University, 13th November 2002 http://computing.unn.ac.uk/staff/CGNR1/ nick.rossiter@unn.ac.uk. Interoperability 1. Interoperability:
E N D
The Challenge of Interoperability: the Grid, Semantic Web and Categories Nick Rossiter Seminar -- Northumbria University, 13th November 2002 http://computing.unn.ac.uk/staff/CGNR1/ nick.rossiter@unn.ac.uk
Interoperability 1 • Interoperability: the ability to request and receive services between various systems and use their functionality. • More than data exchange. • Implies a close integration
Interoperability 2 • Features: • exchange of messages and requests • use of each other’s functionality • client-server abilities • distribution • operate multiple systems as single unit • communication despite incompatibilities • extensibility and evolution
Motivations 1 • Diversity of modelling techniques • Distributed businesses may exercise local autonomy in platforms • Data warehousing requires heterogeneous systems to be connected • Data mining enables new dependencies to be derived from heterogeneous collections
Motivations 2 • Pervasive Computing : networks supporting many diverse nodes to be driven by users specifying policy and function. • Policy: statements governing how a solution will be achieved. Statements are derived from requirements • Function: mechanism for achieving objectives. • Mobile Computing: wireless networks
Motivations 3 • E-Science • Distribution of functionality transparently across many different platforms • Grid • Layered components available in network: • Computational, information and knowledge layers and probably more.
Motivations 4 • Semantic Web • Correlating web resources • Use of Resource Description Framework (RDF) and XML • Ontologies (giving ‘meanings’ of terms)
Definition: Model (as it varies!) • Model: a representation of policies in a structured form according to some perceived view of reality e.g. • Relational model – world is tabular • Hierarchical model – world is tree-like • Security model – world is task-based • Object model – world is based on o-o paradigm
Semantic Problems in Interoperability 1 • People call different properties by different names • People classify properties differently: • Different contexts e.g. colour is property in describing a car but a table to the paint shop. • Different normalization priorities e.g. many tables optimised for updates versus few tables optimised for searching.
Semantic Problems in Interoperability 2 • People make use of facilities in different ways. For instance: • In SQL-92 can achieve uniqueness in tables by: • Defining keys • or Modifying table storage method on various properties • or Defining a unique index • and in SQL-1999 we also have: • object identifiers (oids) - So many legacy problems
Constraints and Types • May differ between systems: • e.g. student ids may be held as: • integers (leading zeros removed) 65275 • integers (padded out with so many leading zeros) 0065275 • strings (fixed length) ‘0065275’ • Ids may have checksum function or not
Semantic Problems in Interoperability 3 • Structural problems are bad enough. But also: • Functionality can be applied in many different ways: • Procedures or functions; • different module layout. • Rules can be in: • Model structures, model coding, procedures or application programs.
Simple Problem in Interoperability 1 • Two schemas in SQL-1999 AB author char(50) author_surname char(50) author_initials char(10) title varchar(300) title varchar(200) keyword set(char(30)) keywd array(8) (char(30)) Note: homogeneous model -- both SQL-1999 -- but difficulties.
Different Standards • For example -- Names: • Person(surname, first_name, ..) • or Person(first_name, surname, …) • or Person(name, …) • First two may easily be made equivalent but convention in third needs to be understood. • Note also possibilities of A.N.Other, AN Other, A N Other.
Simple Problem in Interoperability 2 • Homogeneous Models • the same information may be held as attribute name, relation name or a value in different databases • e.g. fines in library; • could be held in a dedicated relation Fine(amount, borrowed_id) • or as an attribute Loan(id, isbn, date_out, fine) • or as a value Charge(1.25, ‘fine’)
Complex Problems in Interoperability • Heterogeneous models • Need to relate model constructions to one another, for example: • relate classes in object-oriented to user-defined types in object-relational • All problems are magnified at this level.
Handling Naming, Classifying, Meta and MetaMeta Levels A Foundation for Heterogeneous Interoperability
Foundation Work • Rather neglected in database area recently • Moves by Date/Darwen/Codd to introduce ‘pure’ relational model handling objects in an orthogonal manner • Open Database Group at UNN (now 8 members) looking at object-relational foundations • Common objectives with work described here
Meta Data • Meta means ‘about’ • The basis of schema integration • Sometimes treated as an object (MOF - Meta Object Facility) • Better viewed as a relationship: • Name (data files) • Classify (database classes) • Meta (data dictionary) • MetaMeta (classify data dictionary )
Bottom Level -- Data Values (tuples) • Student information: • <123423, ‘John Smith’> • <234568, ‘Fred Jones’> • Module Information: • <‘CG156’, ‘Database Design’> • <‘CG065’, ‘Databases’> • Which students take which modules: • <123423, ‘CG156’> • <234568, ‘CG065’>
Naming Data Values • Student<id, name> Name {<123423, ‘John Smith’>, <234568, ‘Fred Jones’>} • Module<code, title> Name {<‘CG156’, ‘Database Design’>, <‘CG065’, ‘Databases’>} • Modules_taken<id, code>: Name {<123423, ‘CG156’>, <234568, ‘CG065’>} Name associates each value with a name
Relating Named Values to Types -- Classification • Named application data: • Schema data (types including constraints, rules) • Student<id, name> • Name • {<123423, ‘John Smith’>, <234568, ‘Fred Jones’>} Classify Student ( id integer, name char(50), primary key id, constraint ‘id format’) Classify associates each name with a type in the schema
Relating Types to Constructs - Data Dictionary • Classified schema data (types): • Constructions (e.g. table, column, function, key): Student ( id integer, name char(50), primary key id, constraint ‘id format’) Meta {Table, Column, Function, Primary Key, ...} Meta associates each schema type with a construction e.g. Student to Table, id to Column, id to Primary Key
Relating Constructs to Abstractions • Constructions (e.g. table) • Abstractions (of real-world, open-ended, concepts) {Table, Column, Function, Foreign Key, ..} MetaMeta {Aggregation, inheritance, association, …} MetaMeta associates each construction with an abstraction e.g. Table to Aggregation, Foreign Key to Association
Mappings are two-way MetaMetaPolicy Meta Organize Classify Instantiate Concepts Constructs Schema Types Named Data Values Downward arrows are intension-extension pairs
Formalising the Architecture • Requirements: • mappings within levels and across levels • bidirectional mappings • closure at top level • open-ended logic • relationships (product and coproduct) • Candidate: category theory as used in mathematics as a workspace for relating different constructions
Choice: category theory • Requirements: • mappings within levels and across levels • arrows: function, functor, natural transformation • bidirectional mappings • adjunctions • closure at top level • four levels of arrow, closed by natural transformation • open-ended logic • Heyting intuitionism • relationships (product and coproduct) • Cartesian-closed categories (like 2NF): pullback and pushout
Categories • Each level is represented by a category: • Named data values by DATA (DT) • value name • Schema types by SCHEMA (SM) • Constructions by CONSTRUCTS (CS) • Concepts by CONCEPTS (CC) Red font -- categories
Functors • Relationships between categories at adjacent levels are given by a functor • For example: • Meta: SCHEMACONSTRUCTS • Meta is afunctor Blue font -- functors
Levels in Functorial Terms MetaMeta • CONCEPTSCONSTRUCTS SystemPolicy ModelMeta InstantiateOrganize • DATASCHEMA Classify Green font - composed functors: System = MetaMeta o Meta o Classify
Composition of Adjoint Functors • Classify -- C Meta -- M • MetaMeta -- A • Policy -- P Organise -- O • Instantiate -- I • CCCSSMDT P O I A M C Composed adjunction
Adjunctions • The adjointness between two functors is given by a 4-tuple e.g. for • CCCS • <P, A, , > • unit of adjunction measures change from initial cc to cc obtained by following P and A (1CCAP(cc) ) • counit of adjunction measures PA(cs) 1cs • Unit and counit give measure of creativity of arrows and preservation of style in mapping by functors. • If complete preservation of style ( =1) and no creativity (=0) -- isomorphism. P A
Composed Adjunction for Four Levels Represents complex mappings across the levels of the system
Benefits of Approach • Can represent relationships between levels, either: • abstractly with one relationship from top to bottom levels • in much more detail with all combinations of adjoints expressed.
Comparing one System with Another CCCSSMDT CCCS´SM´DT´ P O I P´ O ´ I ´ ,, are natural transformations (comparing functors)
Godement Calculus • Rules showing: • composition of functors and natural transformations is associative • natural transformations can be composed with each other • For example: • (I´O´) = I´(O´); (OP) = (O)P • = (O) o (I´ ); = P o (O´)
Four Levels are Sufficient • In category theory: • objects are identity arrows • categories are arrows from object to object • functors are arrows from category to category • natural transformations are arrows from functor to functor • An arrow between natural transformations is a composition of natural transformations, not a new level
The Grid • Use of distributed computer power • Analogy with national power generation • Processing performed where appropriate • Requires local definitions to be used interoperably • Early stage of development
Databases on the Grid • Nearly all data sources are not formal database files yet (Paul Watson, 2002) • Data is in operating system files perhaps readable only by particular programs which understand the format • So level achieved is really only 3, that is program specification (no organization or policy).
Semantic Web • Levels: • data values to be marked up with tags • XML document -- data with XML tags • DTD -- document structure defining tags • RDF -- triples <subject, predicate, object> • components are typically URIs, could also be literals, collections of URIs or another RDF • Semantic Web -- semantic interoperability between data sources using agents to explore RDFs and ontologies
Semantic Web is rather Flat • Only handles two bottom levels of architecture • Aims to achieve comparison of mappings ( ) through: • an agent exploring RDFs • each RDF relates one URI to another URI in the context of a predicate • ontologies are used to explain and synchronise meanings of terms • An AI approach rather than a data-driven one
Discussion 1 • Category theory shows that: • four levels are ideal for interoperability • more than four yields no benefits • less than four gives only local interoperability • Categorical approach provides: • an architecture for universal interoperability • a calculus (Godement) for composing mappings at any level • adjunctions for evaluating two-way mappings
Discussion 2 • In industry: • Approach seems to be getting flatter • In MOF (Meta-Object Facility) which is a four-level system (top-level ‘hard-wired’), drive to bypass top levels • IRDS (Information Resource Dictionary System) – four levels but one-way mapping – not used as much as expected • ISO though are trying a four-level approach for mapping modelling languages
Discussion 3 • Lack of adequate foundations may explain some of the reticence • Also some suppliers perhaps think that market dominance is the answer, making interoperability irrelevant. But legacies remain! • Categorical framework provides a coherent basis for taking this area forward. Even if the framework is translated into a more familiar notation.
References 1 • MOF • Bezivin, J, From Object Composition to Model Transformation with the MDA, 3rd International Conference on Enterprise Information Systems (ICEIS), Setubal, invited paper (2001). Plus OMG web pages. • IRDS • Gradwell, D J L, The Arrival of IRDS Standards, 8th BNCOD, York 1990, Pitman 196-209 (1990). • GRID • Watson, P, Databases and the Grid, Computing Science Technical Report no.755, University of Newcastle upon Tyne (2002), (16pp). • SEMANTIC WEB: W3C web pages • Our work (available from NR’s home page) • Heather, M A, & Rossiter, B N, The Anticipatory and Systemic Adjointness of E-Science Computation on the Grid, Computing Anticipatory Systems, Proceedings CASYS`01, Liège, Dubois, D M, (ed.), AIP Conference Proceedings 627 565-574 (2002). • Rossiter, B N, Heather, M A, & Nelson, D A, A Universal Technique for Relating Heterogeneous Data Models, 3rd International Conference on Enterprise Information Systems (ICEIS), Setúbal, I 96-103 (2001). • Heather, M A, & Rossiter, B N, Constructing Standards for Cross-Platform Operation, Software Quality Journal, 7(2) 131-140 (1998).
References 2 • ISO modelling: • Study Report on the Feasibility of Mapping Modelling Languages for Analysis and Design Models.ISO/IEC JTC1/SC7 N2112, 1999/04/19, http://www.info.uqam.ca/Labo\_Recherche/Lrgl/sc7/N2101-N2150/07n2112.pdf (1999). • Category Theory and Computing Science: • Barr, M, & Wells, C, Category Theory for Computing Science, Prentice-Hall (1990). • Mac Lane, S, Categories for the Working Mathematician, Springer, 2nd ed (1998). • Category Theory and Information Systems: some other workers • Zinovy Diskin (USA, formerly Latvia) • Boris Cadish (Latvia) • Robert Rosebrugh (Canada) • Michael Johnson (Australia) • Christopher Dampney (Australia) • Michael Heather (Northumbria) • David Nelson (Sunderland) • Arthur ter Hofstede (Australia, formerly Holland) • Many other workers on category theory and program semantics