CURRENT RESEARCH INFORMATION SYSTEMS & TECHNOLOGIES
Introduction • Who am I? • Geert Van Grootel • Senior researcher: Science division, Ministry of the Flemish Community • IWETO: Flemish CRIS • CERIF task group member • euroCRIS treasurer
Structure of presentation • Introduction: Terminology • Past & present technologies • Examples of implementations • CERIF & Technology
Context & Integration of CRISs • Different organisational levels & geography • a scientific discipline • intra-institutional • between institutions • Different levels of government • regional, national, international, global • Different levels of system integration • integrated system (ERP) • intra-process • data capture & collection • extra-process
Technologies behind CRISs (Current Research Information Systems) • Document stores • Relational Database Management Systems (RDBMS) • Object-Oriented Database Management Systems (OODBMS) • Information Retrieval systems (IR)
Document stores • Document systems • Based on markup languages (SGML, XML) • In existence since the 1980s • Rise in popularity driven by XML used as a semi-structured database • Querying is usually poor • query language is procedural and navigational, as opposed to declarative predicates • Difficult to maintain • updating is slow when changes affect several entity instances, but fast when they involve only one document • Variable reporting capabilities: group, sum, average, ...
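The contrast between navigational and predicate-style querying can be illustrated with a minimal sketch. It assumes a hypothetical XML document of projects (element and attribute names are invented for illustration) and uses Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical document store content: one XML document holding two projects.
DOC = """
<projects>
  <project id="p1"><title>Grid computing</title><budget>1000</budget></project>
  <project id="p2"><title>Semantic web</title><budget>250</budget></project>
</projects>
"""

root = ET.fromstring(DOC)


def titles_navigational(root, min_budget):
    """Procedural, navigational style: walk the tree element by element."""
    found = []
    for project in root:  # visit each child of <projects> in document order
        budget = int(project.find("budget").text)
        if budget >= min_budget:
            found.append(project.find("title").text)
    return found


def titles_by_id(root, project_id):
    """Path expression with an attribute predicate: closer to a declarative query."""
    matches = root.findall(f"./project[@id='{project_id}']")
    return [p.find("title").text for p in matches]
```

The first function spells out how to reach the data; the second states which elements are wanted and leaves the traversal to the library.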
Information Retrieval Systems • Advantages for databases with many textual attributes • via full inverted index • very fast retrieval • very slow update • little or no structural capability (relations between entities) • little or no reporting capability • group, sum, average, ...
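A full inverted index can be sketched in a few lines. This is a toy illustration under simple assumptions (whitespace tokenisation, lowercase terms, invented sample documents), not how a production IR system is built. It also hints at why updates are costly: changing one document means touching the posting set of every term it contains.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())  # intersect posting sets
    return result


# Invented sample documents, for illustration only.
docs = {
    1: "Flemish research projects",
    2: "research information systems",
    3: "information retrieval",
}
index = build_inverted_index(docs)
```

Retrieval is a handful of set lookups and intersections, which is why it is fast; nothing in the index records relations between entities or supports grouping and aggregation.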
OODBMS • Crucial to OODBMS is the concept of objects • Data (structure view) • Methods (process view) • Messages (event view) • Any process has to be coded specifically for each object • the solution is inheritance, which helps reduce coding effort • Disadvantages • performance worse than RDBMS • poorer data representational capabilities
RDBMS • Flexible linking relations between business objects • Pros • Mathematically formal • easy to understand • standard query language (SQL) • mature technology • Cons • hard to represent complex objects • High performance needs expert knowledge
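A linking relation between business objects can be sketched with Python's built-in `sqlite3` module. The table and column names (`person`, `project`, `person_project`, `role`) and the sample rows are assumptions made for this illustration; they are not taken from CERIF or any specific CRIS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Linking relation: one row per (person, project, role) combination.
CREATE TABLE person_project (
    person_id  INTEGER NOT NULL REFERENCES person(id),
    project_id INTEGER NOT NULL REFERENCES project(id),
    role       TEXT NOT NULL,
    PRIMARY KEY (person_id, project_id, role)
);
INSERT INTO person  VALUES (1, 'Researcher A'), (2, 'Researcher B');
INSERT INTO project VALUES (10, 'Project X');
INSERT INTO person_project VALUES (1, 10, 'promotor'), (2, 10, 'member');
""")


def people_on(conn, title):
    """Declarative SQL query: who is linked to a project, and in which role."""
    rows = conn.execute(
        """SELECT p.name, pp.role
           FROM person p
           JOIN person_project pp ON pp.person_id = p.id
           JOIN project pr ON pr.id = pp.project_id
           WHERE pr.title = ?
           ORDER BY p.name""",
        (title,),
    )
    return rows.fetchall()
```

New kinds of links (for example a person in a second role on the same project) need only new rows in the linking table, not a schema change to the business objects themselves.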
Technology for CRISs • Essential Building blocks • Metadata • Dictionaries, Thesauri & Ontologies • Keys & Binary Relations
Data & Metadata • An incredible amount of data exists, but much of it is inaccessible • What we need: • Find relevant data as information • Understand it: syntax, semantics • Understand any restrictions on its use • The key to this is METADATA
Metadata • Importance • Integrity control • Access control • Support of data • Classification, valid terms • Interoperability • Data exchange • Data access • Benefits • Data quality • Access • Understanding answers • Improving queries • Interoperability • other CRISs • other systems: MIS, RMS, bibliographic systems, scientific data
Three Kinds of Metadata • SCHEMA: constrain it • NAVIGATIONAL: how to get it • ASSOCIATIVE: view to users • [Diagram: the three kinds of metadata arranged around the data (document)]
SCHEMA METADATA • [Diagram repeated: schema, navigational and associative metadata around the data (document)]
Metadata Kinds: Schema (constrain it) • intensional description of extensional instances • database: • name • size • security authorisations • attributes: • name • type • constraints • formal logic relationship to data instances
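Schema metadata of this kind (attribute name, type, constraints) can be read back from a relational database itself. The sketch below uses SQLite's `PRAGMA table_info` through Python's `sqlite3` module; the `project` table is a hypothetical example, and other database systems expose the same information through their own catalogs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, defined only to have some schema to inspect.
conn.execute(
    "CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT NOT NULL, budget REAL)"
)


def attribute_metadata(conn, table):
    """Return name, type and constraints for each attribute of a table.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so pass trusted names only.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "name": row[1],
            "type": row[2],
            "not_null": bool(row[3]),
            "primary_key": bool(row[5]),
        }
        for row in rows
    ]


schema = attribute_metadata(conn, "project")
```

The result describes the table intensionally: it says what any row must look like without listing a single row.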
ASSOCIATIVE METADATA • [Diagram repeated: schema, navigational and associative metadata around the data (document)]
Associative Metadata (view to users) • information for application assistance • catalog record (e.g. Dublin Core) - descriptive • content rating (e.g. PICS) - restrictive • security, privacy (cryptography, digital signatures) - restrictive • information from dictionaries, thesauri, hyperglossaries, domain ontologies - supportive • no formal logic relationship to data instances
NAVIGATIONAL METADATA • [Diagram repeated: schema, navigational and associative metadata around the data (document)]
NAVIGATIONAL METADATA (how to get it) • How to get to the information resource directly • filename • DB name + navigational algorithm • DB name + predicate (query) • URL • URL + predicate (query) • or any of the above via • web indexing system (e.g. AltaVista, Excite…) • local indexing system (bookmarks or proxy server)
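The "URL + predicate (query)" form can be sketched with Python's standard `urllib.parse`. The base URL and the parameter names `discipline` and `year` are hypothetical placeholders, not an interface of any real CRIS.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def resource_locator(base_url, **predicate):
    """Combine a URL with a predicate, encoded as a query string."""
    if not predicate:
        return base_url  # plain URL: direct access, no predicate
    return f"{base_url}?{urlencode(predicate)}"


# Hypothetical endpoint and predicate, for illustration only.
locator = resource_locator(
    "https://cris.example.org/projects", discipline="physics", year=2003
)

# The receiving system can recover the predicate from the locator.
predicate = parse_qs(urlsplit(locator).query)
```

The locator alone is enough to reach the resource; it carries no description of the content and imposes no constraint on it.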
Metadata • Collecting observed facts: DATA • Structuring in context: INFORMATION • Inducing commonly accepted belief: KNOWLEDGE • INSIGHT
Technology for CRISs • Essential Building blocks • Metadata • Dictionaries, Thesauri & Ontologies • Keys & Binary Relations
ONTOLOGY • What is an ontology? • A specification of a conceptualization. • A formal description of the concepts and relationships that can exist for an agent or a community of agents. • The knowledge of a domain defined in a formal declarative language. • The collection of semantic definitions for a domain. • In practice: a resource of terms, their definitions and their logical inter-relationships.
DOMAIN ONTOLOGY • Domain Ontology • An ontology covering a specific subject area of interest (a domain). • The set of objects that can be represented is called the “universe of discourse”. • E.g. for a project to exist it must have a startdate, a subject, a goal, a promotor and a budget • Project <- [startdate AND subject AND goal AND promotor AND budget > 0]
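The rule for Project above can be checked mechanically. The sketch below is one possible reading, under the assumption that "has a startdate, subject, goal and promotor" means each value is present and non-empty; the class name `Candidate` and the sample values are invented for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Something that may or may not qualify as a Project."""
    startdate: Optional[str] = None
    subject: Optional[str] = None
    goal: Optional[str] = None
    promotor: Optional[str] = None
    budget: float = 0.0


def is_project(c: Candidate) -> bool:
    """Project <- [startdate AND subject AND goal AND promotor AND budget > 0]."""
    required = [c.startdate, c.subject, c.goal, c.promotor]
    # Assumption: a missing (None) or empty value counts as absent.
    return all(required) and c.budget > 0


complete = Candidate("2004-01-01", "CRIS", "interoperability", "A. Promotor", 5000.0)
no_budget = Candidate("2004-01-01", "CRIS", "interoperability", "A. Promotor", 0.0)
no_goal = Candidate("2004-01-01", "CRIS", None, "A. Promotor", 5000.0)
```

Only `complete` satisfies every conjunct of the rule; each of the other two fails exactly one.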
DOMAIN ONTOLOGY • Domain Ontologies in IT • A representation in first order logic allowing • Facts to be expressed • Relationships to be expressed • Constraints to be expressed • New facts and relationships to be deduced or induced
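The four capabilities listed above (facts, relationships, constraints, deduction) can be illustrated with a toy example. Facts are plain tuples, and the single rule and single constraint are assumptions invented for this sketch; a real system would use a logic language or reasoner rather than hand-written loops.

```python
# Facts and relationships as (relation, subject, object) triples (invented data).
facts = {
    ("promotor_of", "alice", "p1"),
    ("part_of", "p1", "programme_a"),
    ("part_of", "p2", "programme_a"),
}


def deduce(facts):
    """Apply one assumed rule until no new fact appears:
    promotor_of(X, P) AND part_of(P, G) => involved_in(X, G).
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for rel1, person, project in list(derived):
            if rel1 != "promotor_of":
                continue
            for rel2, part, whole in list(derived):
                if rel2 == "part_of" and part == project:
                    new_fact = ("involved_in", person, whole)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived


def projects_without_promotor(facts):
    """Assumed constraint: every project in a programme must have a promotor."""
    projects = {s for rel, s, _ in facts if rel == "part_of"}
    promoted = {o for rel, _, o in facts if rel == "promotor_of"}
    return sorted(projects - promoted)
```

Calling `deduce(facts)` adds the fact that alice is involved in programme_a, and `projects_without_promotor(facts)` reports p2 as violating the constraint.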
And so…. • Metadata is the key to • GRIDs • SEMANTIC WEB