The CERIF-2000 and Vocabularies Andrei Lopatenko Vienna University of Technology http://derpi.tuwien.ac.at/~andrei
Problems. Examples. The CERIF-2000 standard publication types • 'abstract', 'bibliography', 'biography', 'book', 'book chapter', 'conference paper', 'conference proceedings', 'correspondence', 'dictionary', 'directory', 'dissertation', 'duplicate publication', 'editorial', 'encyclopedia', 'errata', 'guideline', 'index', 'interview', 'journal article', 'lecture', 'meta-analysis', 'miscellaneous', 'monograph', 'multimedia', 'news', 'overall', 'patent', 'report', 'review', 'standard', 'textbook', 'translation', 'twin study'
Problems. Examples. UiB (a real CRIS) publication types • http://www.ub.uib.no/avdeling/fdok/demo/UK2001/classifikasjon.htm • Two-level hierarchy • First level: seven terms • Second level: about 50 terms • The situation is the same in Austria • The vocabularies are dictated by the real requirements that researchers, policy-makers, and university administration place on the CRIS
Research topics, expertise, skills • CERIF-2000 suggests using ORTELIUS and other vocabularies • Researchers actually need AMS, national vocabularies, and others • So what should be done?
Why are there different vocabularies for the same data? • Different societies and policies focus on different aspects or are interested in different details • Different states use different vocabularies for historical and other reasons
When is a vocabulary used? • In information retrieval operations • Search by vocabulary terms • Browse by hierarchy • In reports, analysis, evaluation, and visualization (usually involving some statistical calculations) • During information input
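The retrieval and reporting uses listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-level vocabulary and the publication records are hypothetical and are not taken from any real CRIS.

```python
from collections import Counter

# Hypothetical two-level vocabulary: second-level term -> first-level term.
PARENT = {
    "refereed article": "articles",
    "popular article": "articles",
    "textbook": "books",
    "monograph": "books",
}


def search(publications, term):
    """Return publications classified with `term` or with a term directly beneath it."""
    wanted = {term} | {child for child, parent in PARENT.items() if parent == term}
    return [p for p in publications if p["type"] in wanted]


def count_by_first_level(publications):
    """Count publications per first-level term (a simple statistical report)."""
    return Counter(PARENT.get(p["type"], p["type"]) for p in publications)


# Hypothetical records for illustration only.
publications = [
    {"title": "A", "type": "refereed article"},
    {"title": "B", "type": "popular article"},
    {"title": "C", "type": "monograph"},
]

hits = search(publications, "articles")
report = count_by_first_level(publications)
```

Searching by a first-level term also returns records classified with its second-level terms, which is what browsing by hierarchy relies on.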
What we need • Every CRIS usually has its own target audience, and that audience defines the vocabulary used • But some CRIS must be compatible: data must be exportable to EU bodies, so the problem of “domains” (in this case, vocabularies) has to be solved • CERIF-2000 should specify how to use a custom vocabulary for the needs of a CRIS audience while remaining compatible with other CERIF CRIS (in the future, an ERIS network?)
Other advantages • Such a framework will also be useful for CRIS that share data • Some CRIS have several different audiences, each using its own vocabulary for the same data. A multi-vocabulary CRIS design can serve as an implementation guideline for CRIS developers • Example: publication types for researchers versus university administration (which needs only statistical weights of publications), or national and international audiences for the same CRIS
So what we need • A framework and application that: • Build, evolve, and store different vocabularies of different types • Specify the meaning of each vocabulary, i.e. which information it can be used to classify • Specify inter-vocabulary relations (mappings)
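A minimal Python sketch of the three requirements above: storing vocabularies, recording what each one classifies, and recording mappings between them. All class names, fields, and the local terms are assumptions made for illustration; only 'journal article' comes from the CERIF-2000 list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Vocabulary:
    """A named set of terms; each term may have one broader (parent) term."""
    name: str
    classifies: str  # which kind of information this vocabulary applies to
    parents: Dict[str, Optional[str]] = field(default_factory=dict)

    def add_term(self, term: str, parent: Optional[str] = None) -> None:
        if parent is not None and parent not in self.parents:
            raise KeyError(f"unknown parent term: {parent}")
        self.parents[term] = parent

    def broader_terms(self, term: str) -> List[str]:
        """Return the chain of broader terms, nearest first."""
        chain = []
        current = self.parents[term]
        while current is not None:
            chain.append(current)
            current = self.parents[current]
        return chain


@dataclass
class Mapping:
    """A directed term-to-term mapping between two vocabularies."""
    source: Vocabulary
    target: Vocabulary
    pairs: Dict[str, str] = field(default_factory=dict)

    def add(self, source_term: str, target_term: str) -> None:
        if source_term not in self.source.parents:
            raise KeyError(f"unknown source term: {source_term}")
        if target_term not in self.target.parents:
            raise KeyError(f"unknown target term: {target_term}")
        self.pairs[source_term] = target_term

    def translate(self, source_term: str) -> Optional[str]:
        """Map a term, falling back to its broader terms if it has no direct mapping."""
        for candidate in [source_term] + self.source.broader_terms(source_term):
            if candidate in self.pairs:
                return self.pairs[candidate]
        return None


# Hypothetical local vocabulary mapped to one CERIF-2000 publication type.
local = Vocabulary("local", classifies="publication")
local.add_term("articles")
local.add_term("refereed article", parent="articles")

cerif = Vocabulary("CERIF-2000", classifies="publication")
cerif.add_term("journal article")

to_cerif = Mapping(local, cerif)
to_cerif.add("articles", "journal article")
```

Falling back to a broader term is one possible design choice: it loses detail but lets a fine-grained local term still be exported.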
What we need • A framework and application that (at the operational stage): • Perform domain transformation (vocabulary mapping) in export/import operations (short-term goal) • Perform domain transformation (vocabulary mapping) in information retrieval operations
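The two operational uses above might look as follows in Python. This is a sketch under assumptions: the local terms, the mapping table, and the choice of 'miscellaneous' as a fallback are hypothetical; the target terms are taken from the CERIF-2000 publication type list.

```python
# Hypothetical mapping: local publication type -> CERIF-2000 publication type.
TO_CERIF = {
    "refereed article": "journal article",
    "popular article": "journal article",
    "PhD thesis": "dissertation",
}


def export_records(records, mapping, key="type", fallback="miscellaneous"):
    """Translate the vocabulary term of each record for export.

    Returns the translated records and the sorted list of local terms
    that had no mapping and were replaced by `fallback`.
    """
    exported = []
    unmapped = set()
    for record in records:
        local_term = record[key]
        target_term = mapping.get(local_term)
        if target_term is None:
            unmapped.add(local_term)
            target_term = fallback
        exported.append({**record, key: target_term})
    return exported, sorted(unmapped)


def local_terms_for(target_term, mapping):
    """Expand a query in the target vocabulary into the local terms mapped to it."""
    return sorted(local for local, target in mapping.items() if target == target_term)


# Hypothetical records for illustration only.
records = [
    {"title": "A", "type": "refereed article"},
    {"title": "B", "type": "PhD thesis"},
    {"title": "C", "type": "poster"},
]

exported, unmapped = export_records(records, TO_CERIF)
query_terms = local_terms_for("journal article", TO_CERIF)
```

Reporting the unmapped terms gives a rough indication of how much detail is lost in the transformation.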
How to implement • Several possible ways • Simple database descriptions of vocabularies and their mappings • More complicated descriptions of the vocabularies and mappings, for example in Description Logics
Database implementation • Easy to implement • Easy to develop tools for vocabulary creation, maintenance, and mapping definition • Fairly easy to implement transformation tools • But very poor at expressing the meaning of terms and term definitions
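One way the database approach could be laid out is sketched below with Python's built-in sqlite3 module. The table and column names are assumptions for illustration and do not describe the existing implementation; the local terms are hypothetical.

```python
import sqlite3

# Hypothetical schema: vocabularies, their terms (with optional parent), and mappings.
SCHEMA = """
CREATE TABLE vocabulary (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    id            INTEGER PRIMARY KEY,
    vocabulary_id INTEGER NOT NULL REFERENCES vocabulary(id),
    label         TEXT NOT NULL,
    parent_id     INTEGER REFERENCES term(id),
    UNIQUE (vocabulary_id, label)
);
CREATE TABLE term_mapping (
    source_term_id INTEGER NOT NULL REFERENCES term(id),
    target_term_id INTEGER NOT NULL REFERENCES term(id),
    PRIMARY KEY (source_term_id, target_term_id)
);
"""


def add_vocabulary(conn, name):
    return conn.execute("INSERT INTO vocabulary (name) VALUES (?)", (name,)).lastrowid


def add_term(conn, vocabulary_id, label, parent_id=None):
    return conn.execute(
        "INSERT INTO term (vocabulary_id, label, parent_id) VALUES (?, ?, ?)",
        (vocabulary_id, label, parent_id),
    ).lastrowid


def translate(conn, source_vocabulary, label, target_vocabulary):
    """Return the target-vocabulary labels mapped to a source-vocabulary term."""
    rows = conn.execute(
        """
        SELECT t.label
        FROM term AS s
        JOIN vocabulary AS sv ON sv.id = s.vocabulary_id
        JOIN term_mapping AS m ON m.source_term_id = s.id
        JOIN term AS t ON t.id = m.target_term_id
        JOIN vocabulary AS tv ON tv.id = t.vocabulary_id
        WHERE sv.name = ? AND s.label = ? AND tv.name = ?
        ORDER BY t.label
        """,
        (source_vocabulary, label, target_vocabulary),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

local = add_vocabulary(conn, "local")
cerif = add_vocabulary(conn, "CERIF-2000")
articles = add_term(conn, local, "articles")
refereed = add_term(conn, local, "refereed article", parent_id=articles)
journal_article = add_term(conn, cerif, "journal article")
conn.execute(
    "INSERT INTO term_mapping (source_term_id, target_term_id) VALUES (?, ?)",
    (refereed, journal_article),
)
```

Note that the schema stores only labels and links between terms; nothing in it says what a term means, which is the weakness named in the last bullet.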
Database implementation • Impossible to define terms through their attributes • Example: an “EU Project” is a “Project” whose “funding organization” is a “fund” of the EU or the European Commission • Hard to measure information loss…
Database implementation • But it is already implemented and can be part of a free CERIF-2000 implementation • Currently an MS ASP application; JSP or a more open standard would be better
Description Logic • Much more powerful for describing taxonomies and mappings • But requires very specialized experience and knowledge (some logical theories and languages) • Not directly integrated with the database; usually the terms are not stored in a database
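As an illustration of that expressive power, the “EU Project” example from the database slides could be written as a concept definition in standard description logic notation. The concept and role names are assumptions; in particular, EUFund stands for "a fund of the EU or the European Commission" and would need its own definition.

```latex
\mathit{EUProject} \;\equiv\; \mathit{Project} \;\sqcap\; \exists\, \mathit{fundingOrganization}.\mathit{EUFund}
```

Read: an EU project is exactly a project that has at least one funding organization that is an EU fund. A reasoner could then classify a project as an EU project from its funding data alone.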
Description Logics • Can DAML+OIL make the situation easier? • Possibly yes; this should be investigated in an EU project • Very important for a Europe-wide CRIS
Summary • We can offer free tools to CERIF CRIS developers • Once we get feedback, the situation will become clearer • An EU project is needed to solve the vocabulary problems