Outline • What is CERIF? • Grounding Explanations: Model, Metadata, Data-centric, Research Information, CRIS • The Conceptual (Logical) CERIF Model: Entities, Relationships, Structure • CERIF Examples and Related Activities
What is CERIF ? Common European Research Information Format • Specification (Conceptual Level): a Concept about Research Entities and their Relationships • Model (Logical Level): a Description of Research Entities and their Relationships • Database Scripts (Physical Level): a Formalization of Research Entities and their Relationships • SQL Script: CREATE TABLE Person, CREATE TABLE Project, CREATE TABLE OrgUnit
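The physical level named on this slide can be sketched with Python's built-in sqlite3 module. Only the three table names (Person, Project, OrgUnit) come from the slide; the column names and the in-memory database are assumptions for illustration and are not the official CERIF database scripts.

```python
import sqlite3

# Illustrative only: table names come from the slide, columns are assumed.
SCHEMA = """
CREATE TABLE Person  (persID TEXT PRIMARY KEY, familyNames TEXT);
CREATE TABLE Project (projID TEXT PRIMARY KEY, acronym TEXT);
CREATE TABLE OrgUnit (orgID  TEXT PRIMARY KEY, acronym TEXT);
"""


def create_physical_schema() -> sqlite3.Connection:
    """Create the three base tables in an in-memory SQLite database."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(SCHEMA)
    return connection


conn = create_physical_schema()
# List the tables that now exist, sorted by name.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The same conceptual entities could equally be realised in another DBMS; SQLite is used here only because it needs no setup.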
What is CERIF ? Common European Research Information Format • data model (data-centric) • allows for a (metadata) representation of research entities, their activities / interconnections (research) and their output (results) • allows for high flexibility with formal (semantic) relationships • enables quality maintenance, archiving, access and interchange of research information • supports knowledge transfer to decision makers, for research evaluation, research managers, strategists, researchers, editors, the general public
What is CERIF ? Common European Research Information Format • CERIF is an EU Recommendation to Member States: http://cordis.europa.eu/cerif/ • The European Commission (EC) has authorised euroCRIS to maintain and develop CERIF and its usage: http://www.eurocris.org/cerif/cerif-releases/
CERIF: A Template • A CRIS can be implemented using a subset or superset of the full CERIF model: • for projects • for people • for organisations • for publications, patents, products • for services • for facilities, particular equipment • with role-based, temporally-bound relationships
CERIF: The Advantages • Neutral Architecture • Data Model can be implemented: • relational • object-oriented • information retrieval (including WWW) • Process model can be implemented: • DBMS and query; centralised or distributed • HTML web / harvesting / IR-query • advanced knowledge-based technology • But interoperation requires structured schema
CERIF developed beyond 2000 • Revisions to CERIF2000 standard • CERIF2002, CERIF2004, 2006, 2008 • Issues • Publications • Classification (& semantics) • Custodians of the model • Required some organisation • EC handed responsibility to euroCRIS (2002) • euroCRIS set up CERIF Task Group
The CERIF Evolution CERIF 2006 / 2008 Model Similar Ideas UN/UNESCO OECD CODATA CORE Link Semantics Language 2nd Level EU Working Group on Research Databases Workshop CERIF 2000 Model Roles EXPERTISE OrgUnit PERSON CERIF 91 PROJECT RESULTS EQUIPMENT PROJECT CLASSIFICATION Acronym: ERGO Participant: Keith Jeffery, Anne Asserson, many more Organisations: Rutherford Appleton, University of Bergen, … • Data Model (RDBMS, OO, IR) • Model Normalization • Robust Structure • Extensible Structure • Consistent Structure • Semantic Layer • XML Exchange Specification • Connectivity to Repositories (Elaboration on Publication) • Data Model (RDBMS, OO, IR) • Multilinguality • Controlled Vocabulary • Roles / Types • User-driven • EC Recommendation to Member States • Networking of DBs • Exchange of Records • Recommendation to Member States [Timeline: 1987, 1991, 2000, 2006, 2008]
CERIF The use today • As a model for an implemented standalone CRIS • But interoperation ready • As a model to define the wrapper around a legacy non-CERIF CRIS • To allow homogeneous access to heterogeneous systems • As a definition of a data exchange format • To create a common data warehouse from several CRIS
Outline • What is CERIF? • Grounding Explanations: Model, Metadata, Data-centric, Research Information, CRIS • The Conceptual (Logical) CERIF Model: Entities, Relationships, Structure • The CERIF (XML) Interchange Format • The CERIF 2008 Release • CERIF and Related Activities
What is a model ? • … is a simplified view to describe a particular area of interest • … allows for better communication between parties (mutual understanding) • … supports (re-)design decisions • … supports workflow identification • … supports documentation • … can be exchanged, re-used, iterated, extended [Diagram: example models, with nodes A, B, C, D linked by "is part of" and "informs"; an SQL script (CREATE TABLE Person, CREATE TABLE Project, CREATE TABLE OrgUnit); nodes X, Z linked by "depends on"; nodes F, G linked by "waits for"]
CERIF: The Key • The key to the CERIF data model is that it is • Structured (syntax) • First order logic (semantics) • This allows not only construction of a new interoperation-ready CRIS • but also wrapper-interoperation by generating CERIF from a legacy CRIS [Diagram: New CRIS; Legacy CRIS with wrapper]
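The wrapper idea on this slide can be illustrated with a small field-mapping function. The legacy field names and the mapping table are hypothetical; only the target attribute names (Acronym, Title, StartDate, EndDate) are taken from the Project entity shown later in the deck. This is a sketch of the principle, not a CERIF-defined interface.

```python
# Hypothetical legacy field names mapped to CERIF-style Project attributes.
LEGACY_TO_CERIF = {
    "proj_short": "Acronym",
    "proj_name": "Title",
    "begin": "StartDate",
    "finish": "EndDate",
}


def wrap_legacy_project(legacy: dict) -> dict:
    """Translate one legacy project record into CERIF-style attribute names.

    Fields without a mapping are dropped rather than guessed.
    """
    cerif = {"entity": "Project"}
    for legacy_key, cerif_key in LEGACY_TO_CERIF.items():
        if legacy_key in legacy:
            cerif[cerif_key] = legacy[legacy_key]
    return cerif


# Invented sample record from an imaginary legacy CRIS.
legacy_row = {"proj_short": "DEMO", "begin": "2008-01-01", "internal_flag": 1}
wrapped = wrap_legacy_project(legacy_row)
print(wrapped)
```

Keeping the mapping in a plain table means the legacy system itself stays untouched; only the wrapper needs to change when the legacy layout does.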
The CERIF Model: Common European Research Information Format [Diagram: the entities Project, Person, Organisation, Funding Programme, Service, Skills, Publication, Equipment, CV, Patent, Product, Event and Classification (Semantics)]
What is Metadata ? Support for a wide range of operations … • “Metadata is structured data which describes the characteristics of a resource.” (An Introduction to Metadata, by Chris Taylor, University of Queensland) • “Metadata is sometimes defined literally as 'data about data,' but the term is normally understood to mean structured data about resources that can be used to help support a wide range of operations. These might include, for example, resource description and discovery, the management of information resources and their long-term preservation.” (Metadata in a Nutshell, by Michael Day, UKOLN)
What is Metadata ? Data about Data • Structure: Type of Resource; Title; Description; Source; Date; Author, Creator, … • Book: Title: The Hitchhiker's Guide to the Galaxy; Date of Publication: 1979 • Radio Series: Title: The Hitchhiker's Guide to the Galaxy; Description: is a science fiction comedy series created by Douglas Adams. Originally a radio comedy broadcast on BBC Radio 4 in 1978, […]; Source: Wikipedia; Date of Query: May 30, 2008 • Series of five Books: Title: The Hitchhiker's Guide to the Galaxy; Between: 1979 - 1982 • TV Series: Title: The Hitchhiker's Guide to the Galaxy; Screened: 1981 • Game Cover Image: The Hitchhiker's Guide to the Galaxy; Source: http://egotron.com/; Retrieved: May 30, 2008 • Computer Game: Title: The Hitchhiker's Guide to the Galaxy; Released: 1984 • Links: http://www.bbc.co.uk/cult/hitchhikers/ (HTML-Title: Cult – The Hitchhiker's Guide to the Galaxy); http://en.wikipedia.org/wiki/The_Hitchhiker's_Guide_to_the_Galaxy (HTML-Title: The Hitchhiker's Guide to the Galaxy) • Comic Book Adaptations: Title: The Hitchhiker's Guide to the Galaxy; Between: 1993 – 1996
What is Metadata ? Support for a wide range of operations … Metadata Categories • Descriptive Metadata [intellectual contents] • Administrative Metadata: Technical [file formats ...]; Rights Management [permissions ...]; Provenance [creation, subsequent treatment, ...]; ... • Structural Metadata [internal structure of items: page order ...] • Contextual Metadata: Project Context [funding programme, participating organisations …]; Publication Context [number of authors, external authors, first …]; Usage Context [downloads, requests, …]; ... • See also: JISC Report from April 2008 “Metadata for digital libraries: state of the art and future directions” by Richard Gartner, http://www.jisc.ac.uk/media/documents/techwatch/tsw_0801pdf.pdf
What is Formal Metadata ? Support for a wide range of operations … Formalization = based on a Model • Metadata Categories • Descriptive Metadata [intellectual contents] • Administrative Metadata: Technical [file formats ...]; Rights Management [permissions ...]; Provenance [creation, subsequent treatment, ...]; ... • Structural Metadata [internal structure of items: page order ...] • Contextual Metadata: Project Context [funding programme, participating organisations …]; Publication Context [number of authors, external authors, first …]; Usage Context [downloads, requests, …]; ...
What is Data-centric ? Data / Metadata • Publication: URI; Type; Title; PartOf; PublDate • Organisation: URI; Name; Abbreviation; Publications; Academic Staff • Journal Publications 2007: Institute A = 4; Institute B = 10; Institute C = 9 • Article Requests 2007: Journal X = 4; Journal Y = 0; Journal Z = 15 • PhD Students 2008: Computer Science = 200; Physics = 50; Social Sciences = 9 • Journal Subscriptions: Journal X = 1990 - 2000; Journal Y = 2005 - 2010; Journal Z = 2001 - 2010 • Organisation: URI; Name; hasAccess; EndOfAccess; ContactPerson • Ends in 2010: Journals Y, Z • First Author / No of Papers: Person H = 10/35; Person I = 4/12; Person J = 1/10 • Citations in 2007: Paper M (publish 2007) = 20; Paper N (publish 2004) = 100; Paper O (publish 2001) = 0 • CitationTypes: Type; Description
What is Data-centric ? • Metadata / Data in the center • Data Maintenance, Curation, Preservation and Quality are a major Interest • Enabling added-value Services based on high-quality Data • Enabling requested views for various stakeholders based on high-quality Data
What is Research Information ? Data/Metadata or Information about: • Scientists • Project Managers • Ongoing and Completed Projects • Research Departments • Funding Organisations and Programmes • Research Results • Publications • Equipment • their time-bound Relationships (Semantics) • ...
What is a CRIS? Current Research Information System = CRIS • … that means • Timeliness • Vitality • … information about • People + • Organisations + • Projects + • Funding Programmes + • Research Results + • … • … driven by • A Concept • A Model • … incorporated as • an Implementation (ICT) • an integrated approach towards managing research information
What is a CRIS? Current Research Information System = CRIS • CERIF Metadata • … that means • Timeliness • Vitality • … information about • People + • Organisations + • Projects + • Funding Programmes + • Research Results + • … • … driven by • A Concept • A Model • … incorporated as • an Implementation (ICT) • heterogeneous entities • changing relationships • Integration • an integrated approach towards managing research information
Users of CRISs ? Researchers (find partners, track competitors, form collaborations) Research Managers (assess performance, assess research output, find reviewers for evaluation of proposals) Research Strategists (decide on priorities and resourcing, compare with other countries) Publication Editors (find potential authors, find reviewers for proposed papers) Intermediaries / Brokers (find research products, identify ideas to be carried forward) Media (communicate results) General Public (for interest)
Users of CRISs ? • Research is International • Research Information involves various Entities
What kind of Questions do we want to answer? How many articles has author X published in 2007 as a first author? How often have articles by author X been cited? Did author X publish with institutionally external authors? In how many FP7 projects does organisation Z participate? How many publications have resulted from project Y? How many people have been employed in the course of FP6 projects from the 1st call in the NMS? How many PhD students have participated in FP6 projects? How many women have been involved in FP6 projects? How often have articles in journal A been requested in 2007? How many articles have been published in the field of B? …
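The first question on this slide can be answered with a join between a result entity and a link entity. The sketch below uses Python's built-in sqlite3 module; the table and column names follow the entity names used elsewhere in the deck, but the exact columns, the reading of role "author1" as "first author", and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal layout: one result table and one person-publication link table.
conn.executescript("""
CREATE TABLE ResultPublication (publID TEXT PRIMARY KEY, publicationDate TEXT);
CREATE TABLE Person_ResultPublication (persID TEXT, publID TEXT, role TEXT);
INSERT INTO ResultPublication VALUES
  ('p1', '2007-03-01'), ('p2', '2007-09-15'), ('p3', '2006-05-20');
INSERT INTO Person_ResultPublication VALUES
  ('X', 'p1', 'author1'), ('X', 'p2', 'author'), ('X', 'p3', 'author1');
""")


def first_author_count(connection, pers_id, year):
    """Count publications of one person in one year where the role is 'author1'."""
    row = connection.execute(
        """
        SELECT COUNT(*)
        FROM Person_ResultPublication AS link
        JOIN ResultPublication AS pub ON pub.publID = link.publID
        WHERE link.persID = ?
          AND link.role = 'author1'
          AND substr(pub.publicationDate, 1, 4) = ?
        """,
        (pers_id, str(year)),
    ).fetchone()
    return row[0]


print(first_author_count(conn, "X", 2007))
```

The other questions follow the same pattern: pick the entities involved, join through the matching link entity, and filter on role and date.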
Outline • What is CERIF? • Grounding Explanations: Model, Metadata, Data-centric, Research Information, CRIS • The Conceptual (Logical) CERIF Model: Entities, Relationships, Structure • The CERIF (XML) Interchange Format • The CERIF 2008 Release • CERIF Examples and Related Activities
CERIF: Common European Research Information Format [Diagram: the entities Project, Person, Organisation, Funding Programme, Service, Skills, Publication, Equipment, CV, Patent, Product, Event and Classification (Semantics)]
Concept of the CERIF Model CERIF: A model to manage Research Information Research Entities Project, Person, Organisation Funding Programme, Service, Equipment, Publication, Patent, Product, … Activities / Interconnections in their Context Relationships Semantics / Roles / Types -> for Exchange -> for Interoperability -> for Implementation of CRISs (Current Research Information Systems)
CERIF Characteristics • extensible while preserving backward continuity, to allow guaranteed interoperation between CERIF-CRIS • by adding new base entities and then link entities to integrate them with the structure • links to any other system • using the link entities • normalized to avoid replication of data and consequent update integrity problems, and to improve performance
CERIF Characteristics • implementable using any technology from hypermedia to information retrieval (semi-structured) and on to knowledge-based systems • formally follows first-order logic • and so is available for deduction and induction, leading to greater potential utilization of the data • is scalable because it is machine-understandable as well as machine-readable • includes lookup tables (used also as classification tables) • improves data integrity by validation at input/update time • permits intelligent user interfaces to utilise the information to provide user assistance
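Validation against a lookup table at input/update time can be sketched as follows. The lookup table contents and the function name are hypothetical sample values, not an official CERIF code list; the attribute name CountryCode is the one used in the entity slides.

```python
# Hypothetical lookup table; the codes are sample values only.
COUNTRY_LOOKUP = {"NO": "Norway", "GB": "United Kingdom", "AT": "Austria"}


def validate_country_code(record: dict) -> dict:
    """Reject a record at input time if its CountryCode is not in the lookup table."""
    code = record.get("CountryCode")
    if code not in COUNTRY_LOOKUP:
        raise ValueError(f"Unknown CountryCode: {code!r}")
    return record


accepted = validate_country_code({"Name": "Sample event", "CountryCode": "NO"})
print(accepted["CountryCode"], "=", COUNTRY_LOOKUP[accepted["CountryCode"]])
```

The same table can drive a user interface, for instance as the source of a drop-down list, which is one way to read the "user assistance" point above.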
Concept of the CERIF Model - Structure • CERIF Entity Types: Core Entities; Result Entities; 2nd Level Entities; Link Entities • CERIF Features: Multiple Language; Semantics
Core CERIF Entities in Detail • Person: ID, URI, Sex, FirstNames, OtherNames, FamilyNames, NameVariants, ResearchInterest, Keywords • Project: ID, URI, Acronym, StartDate, EndDate, Title, Abstract, Keywords • OrganisationUnit: ID, URI, Acronym, Name, HeadCount, CurrencyCode, Turnover, ResearchActivity, Keywords
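The three core entities can be sketched as plain Python dataclasses. The attribute names follow the slide (converted to snake_case); the types, defaults, and the decision to show only a subset of attributes are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    id: str
    uri: Optional[str] = None
    first_names: str = ""
    family_names: str = ""
    keywords: List[str] = field(default_factory=list)


@dataclass
class Project:
    id: str
    uri: Optional[str] = None
    acronym: str = ""
    title: str = ""
    start_date: Optional[str] = None  # assumed ISO 8601 date string
    end_date: Optional[str] = None


@dataclass
class OrganisationUnit:
    id: str
    uri: Optional[str] = None
    acronym: str = ""
    name: str = ""
    head_count: Optional[int] = None


# Identifiers are invented; the acronym ERGO appears on an earlier slide.
project = Project(id="proj-1", acronym="ERGO")
print(project)
```

Note that the core entities carry no references to each other; relationships are kept in separate link entities.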
CERIF Result Entities in Detail • ResultPublication: ID, URI, Title, Subtitle, Abstract, Bibl. Note, PublicationDate, TotalPages, StartPage, EndPage, Keywords • ResultPatent: ID, URI, PatentNumber, Title, CountryCode, RegistrationDate, ApprovalDate, Description, Keywords • ResultProduct: ID, URI, InternationalID
CERIF 2nd Level Entities • Facility • Equipment • Funding • ExpertiseAndSkills • Service • Qualification • ElectronicAddress • Prize • PostalAddress • CV • Country • Citation • Currency • Metrics • Event • Language
Some CERIF 2nd Level Entities in Detail • FundingProgramme: ID, URI, Name, CurrencyCode, Budget, StartDate, EndDate, Description, Keywords • ResultPatent: ID, URI, PatentNumber, Title, CountryCode, RegistrationDate, ApprovalDate, Description, Keywords • Event: ID, URI, Name, FeeOrFree, StartDate, EndDate, CityTown, CountryCode, Description, Keywords • Facility: ID, URI, Name, Description, Keywords • Service: ID, URI, Name, Description, Keywords • All 2nd Level Entities: Facility, Equipment, Funding, ExpertiseAndSkills, Service, Qualification, ElectronicAddress, Prize, PostalAddress, CV, Country, Citation, Currency, Metrics, Event, Language
Some CERIF Multiple Language Features in Detail • OrganisationUnit: Name [language], ResearchActivity [language], Keywords [language] • ResultPublication: Title [language], Abstract [language], Keywords [language] • ResultPatent: Name [language], Description [language], Keywords [language] • ResultProduct: Name [language], Description [language], Keywords [language] • Service: Name [language], Description [language], Keywords [language] • Facility: Name [language], Description [language], Keywords [language] • Person: ResearchInterest [language], Keywords [language] • Project: Title [language], Abstract [language], Keywords [language] • Multiple Language Features are associated with Core, Result, 2nd Level, Classification Entities
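A language-tagged attribute such as Title [language] can be sketched as a small container keyed by language code. The class name, the fallback behaviour, and the sample titles are assumptions for illustration; the slide only states that these attributes exist per language.

```python
from typing import Dict, Optional


class MultilingualText:
    """Holds one attribute value (e.g. a Title) per language code."""

    def __init__(self) -> None:
        self._by_language: Dict[str, str] = {}

    def set(self, language: str, text: str) -> None:
        self._by_language[language] = text

    def get(self, language: str, fallback: str = "en") -> Optional[str]:
        # Fall back to a default language when no translation is stored.
        if language in self._by_language:
            return self._by_language[language]
        return self._by_language.get(fallback)


title = MultilingualText()
title.set("en", "Research information systems")
title.set("de", "Forschungsinformationssysteme")
print(title.get("de"))
print(title.get("fr"))  # no French entry, so the English text is returned
```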
Some CERIF Semantic Features • role=author1 institute • role=author • role=deliverable1.2 • role=CEO • role=funder • role=coordinator • Semantic Features are associated with Link Entities
Associated Semantic Features in more Detail • OrganisationUnit_ResultPublication: orgID, publID, Classification, ClassificationScheme, StartDate, EndDate • Person_ResultPublication: persID, publID, Classification, ClassificationScheme, StartDate, EndDate • Project_ResultPublication: projID, publID, Classification, ClassificationScheme, StartDate, EndDate • Project_FundingProgramme: projID, fundProgID, Classification, ClassificationScheme, StartDate, EndDate • Project_Person: projID, persID, Classification, ClassificationScheme, StartDate, EndDate • Person_OrganisationUnit: persID, orgID, Classification, ClassificationScheme, StartDate, EndDate • Project_Organisation: projID, orgID, Classification, ClassificationScheme, StartDate, EndDate • Example roles: role=author, role=author1 institute, role=originator, role=co-funder, role=coordinator, role=investigatedBy, role=affiliation
Associated Formal Semantic Features in more Detail (CERIF Model) • OrganisationUnit_ResultPublication: orgID, publID, Classification, ClassificationScheme, StartDate, EndDate • Person_ResultPublication: persID, publID, Classification, ClassificationScheme, StartDate, EndDate • Project_ResultPublication: projID, publID, Classification, ClassificationScheme, StartDate, EndDate • Project_FundingProgramme: projID, fundProgID, Classification, ClassificationScheme, StartDate, EndDate • Project_Person: projID, persID, Classification, ClassificationScheme, StartDate, EndDate • Person_OrganisationUnit: persID, orgID, Classification, ClassificationScheme, StartDate, EndDate • Project_Organisation: projID, orgID, Classification, ClassificationScheme, StartDate, EndDate • Example roles: role=author, role=author1 institute, role=originator, role=co-funder, role=coordinator, role=investigatedBy, role=affiliation
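A link entity with a role and a validity period can be sketched as a frozen dataclass. The fields mirror the Person_OrganisationUnit attributes on the slide; the class name, the use of None for an open end date, the scheme name "demo-roles" and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PersonOrgUnitLink:
    pers_id: str
    org_id: str
    classification: str          # role term, e.g. "affiliation"
    classification_scheme: str   # scheme the role term belongs to
    start_date: date
    end_date: Optional[date] = None  # None is used here to mean "still open"

    def active_on(self, day: date) -> bool:
        """Return True if the relationship holds on the given day (inclusive bounds)."""
        if day < self.start_date:
            return False
        return self.end_date is None or day <= self.end_date


# Invented sample link: person pers-1 affiliated with org-1 from 2005 to 2007.
link = PersonOrgUnitLink(
    "pers-1", "org-1", "affiliation", "demo-roles",
    date(2005, 1, 1), date(2007, 12, 31),
)
print(link.active_on(date(2006, 6, 1)))
print(link.active_on(date(2008, 1, 1)))
```

Because the period sits on the link and not on the person or organisation, a later affiliation is simply one more link record; earlier ones stay available for archiving.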
CERIF Semantic Layer • ClassificationScheme: ClassSchemeID, Description [language], URI • Classification: ClassID, ClassSchemeID, Term [language], Description [language], StartDate, EndDate, URI • Classification_Classification: ClassID1 (Term1), ClassID2 (Term2), ClassSchemeID1 (Schema1), ClassSchemeID2 (Schema1), ClassID (Role), ClassSchemeID (RoleSchema), StartDate, EndDate • ClassScheme_ClassScheme: ClassSchemeID1, ClassSchemeID2, ClassID (Role), ClassSchemeID (RoleSchema), StartDate, EndDate
CERIF Semantic Layer • Allows capturing any Schema or Structure: Flat Lists, Taxonomies, Ontologies • Open / Extensible in all directions: New Schemas, New Concepts / Terms, New Relationships • Enables management of: Roles / Types, Semantics, Subject Headings, Archiving (Time component) • Allows for simple Mappings between Schemas (Interchange) • Allows for efficient (independent) Maintenance
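A mapping between two schemes can be sketched with a term-to-term link that carries its own role, in the spirit of the Classification and Classification_Classification structures of the semantic layer. The class names, the scheme identifiers, the role "sameAs" and the sample terms are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Classification:
    class_id: str
    scheme_id: str
    term: str


@dataclass(frozen=True)
class ClassificationLink:
    source: Classification
    target: Classification
    role: str  # the kind of relationship; itself a term in some role scheme


def map_term(
    term: Classification,
    links: List[ClassificationLink],
    target_scheme: str,
    role: str = "sameAs",
) -> Optional[Classification]:
    """Find the term in target_scheme that is linked to `term` with the given role."""
    for link in links:
        if (
            link.source == term
            and link.target.scheme_id == target_scheme
            and link.role == role
        ):
            return link.target
    return None


# Invented example: the same role expressed in two local vocabularies.
local = Classification("c1", "local-roles", "coordinator")
partner = Classification("c9", "partner-roles", "project lead")
links = [ClassificationLink(local, partner, "sameAs")]
mapped = map_term(local, links, "partner-roles")
print(mapped.term if mapped else None)
```

Since mappings are stored as data, a new scheme or a new mapping can be added without touching the entities that use the terms.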
CERIF 2nd Level Entities (ERM View) • Facility • Equipment • Funding • ExpertiseAndSkills • Service • Qualification • ElectronicAddress • Prize • PostalAddress • CV • Country • Citation • Currency • Metrics • Event • Language