490 likes | 499 Views
This seminar explores the challenges and solutions for achieving interoperability between various systems through integration, communication, and extensibility. It discusses the motivations, semantic problems, and complex issues in heterogeneous models. The foundation for heterogenous interoperability is also addressed.
E N D
The Challenge of Interoperability: the Grid, Semantic Web and Categories Nick Rossiter Seminar -- Northumbria University, 13th November 2002 http://computing.unn.ac.uk/staff/CGNR1/ nick.rossiter@unn.ac.uk
Interoperability 1 • Interoperability: the ability to request and receive services between various systems and use their functionality. • More than data exchange. • Implies a close integration
Interoperability 2 • Features: • exchange of messages and requests • use of each other’s functionality • client-server abilities • distribution • operate multiple systems as single unit • communication despite incompatibilities • extensibility and evolution
Motivations 1 • Diversity of modelling techniques • Distributed businesses may exercise local autonomy in platforms • Data warehousing requires heterogeneous systems to be connected • Data mining enables new dependencies to be derived from heterogeneous collections
Motivations 2 • Pervasive Computing : networks supporting many diverse nodes to be driven by users specifying policy and function. • Policy: statements governing how a solution will be achieved. Statements are derived from requirements • Function: mechanism for achieving objectives. • Mobile Computing: wireless networks
Motivations 3 • E-Science • Distribution of functionality transparently across many different platforms • Grid • Layered components available in network: • Computational, information and knowledge layers and probably more.
Motivations 4 • Semantic Web • Correlating web resources • Use of Resource Description Framework (RDF) and XML • Ontologies (giving ‘meanings’ of terms)
Definition: Model (as it varies!) • Model: a representation of policies in a structured form according to some perceived view of reality e.g. • Relational model – world is tabular • Hierarchical model – world is tree-like • Security model – world is task-based • Object model – world is based on o-o paradigm
Semantic Problems in Interoperability 1 • People call different properties by different names • People classify properties differently: • Different contexts e.g. colour is property in describing a car but a table to the paint shop. • Different normalization priorities e.g. many tables optimised for updates versus few tables optimised for searching.
Semantic Problems in Interoperability 2 • People make use of facilities in different ways. For instance: • In SQL-92 can achieve uniqueness in tables by: • Defining keys • or Modifying table storage method on various properties • or Defining a unique index • and in SQL-1999 we also have: • object identifiers (oids) - So many legacy problems
Constraints and Types • May differ between systems: • e.g. student ids may be held as: • integers (leading zeros removed) 65275 • integers (padded out with so many leading zeros) 0065275 • strings (fixed length) ‘0065275’ • Ids may have checksum function or not
Semantic Problems in Interoperability 3 • Structural problems are bad enough. But also: • Functionality can be applied in many different ways: • Procedures or functions; • different module layout. • Rules can be in: • Model structures, model coding, procedures or application programs.
Simple Problem in Interoperability 1 • Two schemas in SQL-1999 AB author char(50) author_surname char(50) author_initials char(10) title varchar(300) title varchar(200) keyword set(char(30)) keywd array(8) (char(30)) Note: homogeneous model -- both SQL-1999 -- but difficulties.
Different Standards • For example -- Names: • Person(surname, first_name, ..) • or Person(first_name, surname, …) • or Person(name, …) • First two may easily be made equivalent but convention in third needs to be understood. • Note also possibilities of A.N.Other, AN Other, A N Other.
Simple Problem in Interoperability 2 • Homogeneous Models • the same information may be held as attribute name, relation name or a value in different databases • e.g. fines in library; • could be held in a dedicated relation Fine(amount, borrowed_id) • or as an attribute Loan(id, isbn, date_out, fine) • or as a value Charge(1.25, ‘fine’)
Complex Problems in Interoperability • Heterogeneous models • Need to relate model constructions to one another, for example: • relate classes in object-oriented to user-defined types in object-relational • All problems are magnified at this level.
Handling Naming, Classifying, Meta and MetaMeta Levels A Foundation for Heterogeneous Interoperability
Foundation Work • Rather neglected in database area recently • Moves by Date/Darwen/Codd to introduce ‘pure’ relational model handling objects in an orthogonal manner • Open Database Group at UNN (now 8 members) looking at object-relational foundations • Common objectives with work described here
Meta Data • Meta means ‘about’ • The basis of schema integration • Sometimes treated as an object (MOF - Meta Object Facility) • Better viewed as a relationship: • Name (data files) • Classify (database classes) • Meta (data dictionary) • MetaMeta (classify data dictionary )
Bottom Level -- Data Values (tuples) • Student information: • <123423, ‘John Smith’> • <234568, ‘Fred Jones’> • Module Information: • <‘CG156’, ‘Database Design’> • <‘CG065’, ‘Databases’> • Which students take which modules: • <123423, ‘CG156’> • <234568, ‘CG065’>
Naming Data Values • Student<id, name> Name {<123423, ‘John Smith’>, <234568, ‘Fred Jones’>} • Module<code, title> Name {<‘CG156’, ‘Database Design’>, <‘CG065’, ‘Databases’>} • Modules_taken<id, code>: Name {<123423, ‘CG156’>, <234568, ‘CG065’>} Name associates each value with a name
Relating Named Values to Types -- Classification • Named application data: • Schema data (types including constraints, rules) • Student<id, name> • Name • {<123423, ‘John Smith’>, <234568, ‘Fred Jones’>} Classify Student ( id integer, name char(50), primary key id, constraint ‘id format’) Classify associates each name with a type in the schema
Relating Types to Constructs - Data Dictionary • Classified schema data (types): • Constructions (e.g. table, column, function, key): Student ( id integer, name char(50), primary key id, constraint ‘id format’) Meta {Table, Column, Function, Primary Key, ...} Meta associates each schema type with a construction e.g. Student to Table, id to Column, id to Primary Key
Relating Constructs to Abstractions • Constructions (e.g. table) • Abstractions (of real-world, open-ended, concepts) {Table, Column, Function, Foreign Key, ..} MetaMeta {Aggregation, inheritance, association, …} MetaMeta associates each construction with an abstraction e.g. Table to Aggregation, Foreign Key to Association
Mappings are two-way MetaMetaPolicy Meta Organize Classify Instantiate Concepts Constructs Schema Types Named Data Values Downward arrows are intension-extension pairs
Formalising the Architecture • Requirements: • mappings within levels and across levels • bidirectional mappings • closure at top level • open-ended logic • relationships (product and coproduct) • Candidate: category theory as used in mathematics as a workspace for relating different constructions
Choice: category theory • Requirements: • mappings within levels and across levels • arrows: function, functor, natural transformation • bidirectional mappings • adjunctions • closure at top level • four levels of arrow, closed by natural transformation • open-ended logic • Heyting intuitionism • relationships (product and coproduct) • Cartesian-closed categories (like 2NF): pullback and pushout
Categories • Each level is represented by a category: • Named data values by DATA (DT) • value name • Schema types by SCHEMA (SM) • Constructions by CONSTRUCTS (CS) • Concepts by CONCEPTS (CC) Red font -- categories
Functors • Relationships between categories at adjacent levels are given by a functor • For example: • Meta: SCHEMACONSTRUCTS • Meta is afunctor Blue font -- functors
Levels in Functorial Terms MetaMeta • CONCEPTSCONSTRUCTS SystemPolicy ModelMeta InstantiateOrganize • DATASCHEMA Classify Green font - composed functors: System = MetaMeta o Meta o Classify
Composition of Adjoint Functors • Classify -- C Meta -- M • MetaMeta -- A • Policy -- P Organise -- O • Instantiate -- I • CCCSSMDT P O I A M C Composed adjunction
Adjunctions • The adjointness between two functors is given by a 4-tuple e.g. for • CCCS • <P, A, , > • unit of adjunction measures change from initial cc to cc obtained by following P and A (1CCAP(cc) ) • counit of adjunction measures PA(cs) 1cs • Unit and counit give measure of creativity of arrows and preservation of style in mapping by functors. • If complete preservation of style ( =1) and no creativity (=0) -- isomorphism. P A
Composed Adjunction for Four Levels Represents complex mappings across the levels of the system
Benefits of Approach • Can represent relationships between levels, either: • abstractly with one relationship from top to bottom levels • in much more detail with all combinations of adjoints expressed.
Comparing one System with Another CCCSSMDT CCCS´SM´DT´ P O I P´ O ´ I ´ ,, are natural transformations (comparing functors)
Godement Calculus • Rules showing: • composition of functors and natural transformations is associative • natural transformations can be composed with each other • For example: • (I´O´) = I´(O´); (OP) = (O)P • = (O) o (I´ ); = P o (O´)
Four Levels are Sufficient • In category theory: • objects are identity arrows • categories are arrows from object to object • functors are arrows from category to category • natural transformations are arrows from functor to functor • An arrow between natural transformations is a composition of natural transformations, not a new level
The Grid • Use of distributed computer power • Analogy with national power generation • Processing performed where appropriate • Requires local definitions to be used interoperably • Early stage of development
Databases on the Grid • Nearly all data sources are not formal database files yet (Paul Watson, 2002) • Data is in operating system files perhaps readable only by particular programs which understand the format • So level achieved is really only 3, that is program specification (no organization or policy).
Semantic Web • Levels: • data values to be marked up with tags • XML document -- data with XML tags • DTD -- document structure defining tags • RDF -- triples <subject, predicate, object> • components are typically URIs, could also be literals, collections of URIs or another RDF • Semantic Web -- semantic interoperability between data sources using agents to explore RDFs and ontologies
Semantic Web is rather Flat • Only handles two bottom levels of architecture • Aims to achieve comparison of mappings ( ) through: • an agent exploring RDFs • each RDF relates one URI to another URI in the context of a predicate • ontologies are used to explain and synchronise meanings of terms • An AI approach rather than a data-driven one
Discussion 1 • Category theory shows that: • four levels are ideal for interoperability • more than four yields no benefits • less than four gives only local interoperability • Categorical approach provides: • an architecture for universal interoperability • a calculus (Godement) for composing mappings at any level • adjunctions for evaluating two-way mappings
Discussion 2 • In industry: • Approach seems to be getting flatter • In MOF (Meta-Object Facility) which is a four-level system (top-level ‘hard-wired’), drive to bypass top levels • IRDS (Information Resource Dictionary System) – four levels but one-way mapping – not used as much as expected • ISO though are trying a four-level approach for mapping modelling languages
Discussion 3 • Lack of adequate foundations may explain some of the reticence • Also some suppliers perhaps think that market dominance is the answer, making interoperability irrelevant. But legacies remain! • Categorical framework provides a coherent basis for taking this area forward. Even if the framework is translated into a more familiar notation.
References 1 • MOF • Bezivin, J, From Object Composition to Model Transformation with the MDA, 3rd International Conference on Enterprise Information Systems (ICEIS), Setubal, invited paper (2001). Plus OMG web pages. • IRDS • Gradwell, D J L, The Arrival of IRDS Standards, 8th BNCOD, York 1990, Pitman 196-209 (1990). • GRID • Watson, P, Databases and the Grid, Computing Science Technical Report no.755, University of Newcastle upon Tyne (2002), (16pp). • SEMANTIC WEB: W3C web pages • Our work (available from NR’s home page) • Heather, M A, & Rossiter, B N, The Anticipatory and Systemic Adjointness of E-Science Computation on the Grid, Computing Anticipatory Systems, Proceedings CASYS`01, Liège, Dubois, D M, (ed.), AIP Conference Proceedings 627 565-574 (2002). • Rossiter, B N, Heather, M A, & Nelson, D A, A Universal Technique for Relating Heterogeneous Data Models, 3rd International Conference on Enterprise Information Systems (ICEIS), Setúbal, I 96-103 (2001). • Heather, M A, & Rossiter, B N, Constructing Standards for Cross-Platform Operation, Software Quality Journal, 7(2) 131-140 (1998).
References 2 • ISO modelling: • Study Report on the Feasibility of Mapping Modelling Languages for Analysis and Design Models.ISO/IEC JTC1/SC7 N2112, 1999/04/19, http://www.info.uqam.ca/Labo\_Recherche/Lrgl/sc7/N2101-N2150/07n2112.pdf (1999). • Category Theory and Computing Science: • Barr, M, & Wells, C, Category Theory for Computing Science, Prentice-Hall (1990). • Mac Lane, S, Categories for the Working Mathematician, Springer, 2nd ed (1998). • Category Theory and Information Systems: some other workers • Zinovy Diskin (USA, formerly Latvia) • Boris Cadish (Latvia) • Robert Rosebrugh (Canada) • Michael Johnson (Australia) • Christopher Dampney (Australia) • Michael Heather (Northumbria) • David Nelson (Sunderland) • Arthur ter Hofstede (Australia, formerly Holland) • Many other workers on category theory and program semantics