SKOS and ISO11179: just enough semantics?

caveats
• SKOS is still in working draft
• ‘Concept’ = combination of an identifier with either a definition or a term
• we will need to extend the standard to capture all we want to say about administered item relationships
• comments based upon my experience from my circumstances

Agenda
• what is SKOS?
• SKOS as a classification scheme format
• SKOS instead of your default domain model
• SKOS Class = Terminological_Entry. Discuss…
• formal and informal semantics
• conclusions

SKOS
• Simple Knowledge Organisation System
• ‘an area of work developing specifications and standards to support the use of knowledge organisation systems (KOS) such as thesauri, classification schemes, subject heading systems and taxonomies within the framework of the Semantic Web’
• a standard, low-cost migration path for porting existing knowledge organization systems to the Semantic Web
• a lightweight, intuitive language for developing and sharing new knowledge organization systems
• used on its own, or in combination with formal knowledge representation languages such as the Web Ontology Language (OWL)

SKOS has informal semantics
• meaning derives from the association of an identifier with a label and a definition
• relationships between concepts add little to the meaning of each concept – concepts are portable between concept systems
• weak semantics reduce cost
• weak semantics reduce unexpected effects

expressed in RDF

    <rdf:Description rdf:about="&UDEF;#_1.10.10.7">
      <rdf:type rdf:resource="&SKOS;#Concept"/>
      <skos:inScheme rdf:resource="&UDEF;"/>
      <skos:prefLabel xml:lang="en">Product.Finish.Impact.Indicator</skos:prefLabel>
      <skos:broader rdf:resource="&UDEF;#_10.10.7"/>
      <softeng:inDocument>en_pr7.rdfs</softeng:inDocument>
    </rdf:Description>

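One way to see how little machinery a SKOS concept needs is to read it back with a generic XML parser. The following is a minimal sketch, assuming a hypothetical base URI (`http://example.org/udef`) in place of the `&UDEF;` entity, which the slide does not expand; the `softeng:inDocument` property is left out because its namespace is not given. The function name `read_concepts` is illustrative only.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"
# Assumption: hypothetical base URI standing in for the &UDEF; entity.
UDEF = "http://example.org/udef"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <rdf:Description rdf:about="{UDEF}#_1.10.10.7">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <skos:inScheme rdf:resource="{UDEF}"/>
    <skos:prefLabel xml:lang="en">Product.Finish.Impact.Indicator</skos:prefLabel>
    <skos:broader rdf:resource="{UDEF}#_10.10.7"/>
  </rdf:Description>
</rdf:RDF>"""


def read_concepts(xml_text):
    """Map each skos:Concept URI to its preferred labels and broader concepts."""
    root = ET.fromstring(xml_text)
    concepts = {}
    for node in root.findall(f"{{{RDF}}}Description"):
        types = [t.get(f"{{{RDF}}}resource") for t in node.findall(f"{{{RDF}}}type")]
        if f"{SKOS}Concept" not in types:
            continue  # only typed SKOS concepts are of interest here
        labels = {
            lab.get(f"{{{XML}}}lang"): (lab.text or "").strip()
            for lab in node.findall(f"{{{SKOS}}}prefLabel")
        }
        broader = [b.get(f"{{{RDF}}}resource") for b in node.findall(f"{{{SKOS}}}broader")]
        concepts[node.get(f"{{{RDF}}}about")] = {"prefLabel": labels, "broader": broader}
    return concepts


concepts = read_concepts(DOC)
```

The result is just an identifier, a label and a pointer to a broader concept, which is the whole of the ‘informal semantics’ described above.
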
SKOS Vocabulary
• concept schemes: skos:Concept, skos:ConceptScheme, skos:inScheme, skos:hasTopConcept
• lexical labels: skos:prefLabel, skos:altLabel, skos:hiddenLabel
• semantic relations: skos:broader, skos:narrower, skos:related
• documentation properties: skos:note, skos:scopeNote, skos:definition, skos:example, skos:historyNote, skos:changeNote, skos:editorialNote
• collections: skos:Collection, skos:OrderedCollection, skos:member, skos:memberList

ISO/IEC 11179-2:2000 Introduction
There are several purposes for applying classification to data elements. Classification assists users to find a single data element from among many data elements, facilitates data administration analysis of data elements and, through inheritance, conveys semantic content that is often only incompletely specified by other attributes, such as names and definitions.
The classification schemes accommodated in this part have utility for
• deriving and formulating abstract and application data elements
• ensuring appropriate attribute and attribute-value inheritance
• deriving names from a controlled vocabulary
• disambiguating
• recognizing super-ordinate, coordinate, and subordinate data element concepts
• recognizing relationships among data element concepts and data elements
• assisting in the development of modularly designed names and definitions

missing/problematic
• no specific metadata schema – but RDF is easily extensible
• identifiers?

    <rdf:Description rdf:about="&cancergrid-mdr-classification;">
      <cgMDR:administrationRecord rdf:resource="GB-CANCERGRID-000026-1.0"/>
      <skos:hasTopConcept rdf:resource="&cancergrid-mdr-classification;#root"/>
      <rdf:type rdf:resource="&skosCore;#ConceptScheme"/>
    </rdf:Description>

so what?
• simplification of development: reuse Semantic Web machinery
• third party SKOS editors
• Jena, SPARQL, Protégé, Pellet, existing terminologies/ontologies…
• XQuery libraries, Java libraries
• classification is a community thing
• offer users a personal classification facility
• ingest personal classifications into the officially supported schemes
• exchange schemes between components of your knowledge framework

Linnaean Classification of Pigs
Kingdom Animalia
• Phylum Chordata
  • Class Mammalia
    • ORDER Artiodactyla: Even-toed ungulates
      • SUBORDER Suiformes: pigs, hippos
        • Family Suidae
          • Sus scrofa – Wild boar
          • Sus babyrousa – Indonesian pig
          • Sus hylochoerus – Giant forest hog
          • ….
      • SUBORDER Tylopoda: camels, llamas
      • SUBORDER Ruminantia: ruminants

Borgesian Classification of Pigs
• those that belong to the Emperor
• embalmed ones
• those that are trained
• suckling pigs
• mermaids
• fabulous ones
• stray dogs
• those that are included in this classification
• those that tremble as if they were mad
• innumerable ones
• those drawn with a very fine camel’s hair brush
• others
• those that have just broken a flower vase
• those that resemble flies from a distance
Source: Jorge Luis Borges, The Analytical Language of John Wilkins, via Lincoln D Stein, SOFG 2003

SKOS in the cancergrid MDR
• demo of MDR
• Excel designer access to classification scheme
• exchange of classification schemes

modelling
• object class, property, value meaning, and conceptual domain can be used to construct an internal registry model
• further ordering of data elements
• associate each class with an expression from a terminology, ontology or UML domain model
• analyse annotations to determine similar/matching data elements
• a privileged classification scheme with specific semantics

examples of ontologies
• OCRe core
• OCRe clinical study type
• UK IPSV
• some easier for a registrar to use…
• some more suited to modelling than others…

SKOS can do that
• detecting data element equivalence arises from consistent annotation, not the rigour of the modelling language
• good ontologies can be difficult to use
• opaque modelling constructions
• remodelled terminology
• limited relationship types/highly specialised relationship types
• equivalence cannot be automatically processed: so strong semantics are not required
• SKOS relationships organise rather than define
• require less deliberation
• designed to support annotation (hidden labels!)

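The point about consistent annotation and hidden labels can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular registry does it: the concept identifiers, labels and data element names are all hypothetical, and matching is a naive word lookup. Elements that receive the same set of annotations are flagged as candidates for a curator to review, not asserted as equivalent.

```python
from collections import defaultdict

# Hypothetical concepts. hiddenLabel holds variants and misspellings that
# should match during annotation but never be displayed.
CONCEPTS = {
    "c1": {"prefLabel": "tumour", "altLabel": ["neoplasm"], "hiddenLabel": ["tumor"]},
    "c2": {"prefLabel": "grade", "altLabel": [], "hiddenLabel": ["grading"]},
}


def build_label_index(concepts):
    """Map every label (preferred, alternative, hidden) to its concept id."""
    index = {}
    for cid, labels in concepts.items():
        for text in [labels["prefLabel"], *labels["altLabel"], *labels["hiddenLabel"]]:
            index[text.lower()] = cid
    return index


def annotate(text, index):
    """Return the set of concept ids whose labels occur as words in text."""
    return frozenset(index[w] for w in text.lower().split() if w in index)


def candidate_matches(element_names, index):
    """Group data element names that received identical annotations."""
    groups = defaultdict(list)
    for name in element_names:
        key = annotate(name, index)
        if key:  # unannotated elements cannot be compared
            groups[key].append(name)
    return [names for names in groups.values() if len(names) > 1]


index = build_label_index(CONCEPTS)
matches = candidate_matches(["tumour grade", "Tumor grading", "patient age"], index)
```

Nothing here depends on the rigour of the relationships between concepts: the grouping comes entirely from two elements having been annotated the same way.
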
UDEF
• an ontology of object classes and properties
• relationships expressed as rdfs:subClassOf
• intended to facilitate (at least to some degree) discovery and interoperability across registries

3.4 rdfs:subClassOf
The property rdfs:subClassOf is an instance of rdf:Property that is used to state that all the instances of one class are instances of another.
A triple of the form: C1 rdfs:subClassOf C2 states that C1 is an instance of rdfs:Class, C2 is an instance of rdfs:Class and C1 is a subclass of C2. The rdfs:subClassOf property is transitive.
The rdfs:domain of rdfs:subClassOf is rdfs:Class. The rdfs:range of rdfs:subClassOf is rdfs:Class.

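What the quoted definition commits you to can be made concrete with a small sketch of the two entailments it describes: subClassOf is transitive, and every instance of a subclass is also an instance of its superclasses. The class and instance names below are hypothetical, and this is a toy closure, not a real RDFS reasoner.

```python
def superclasses(cls, sub_class_of):
    """All classes reachable from cls via rdfs:subClassOf (transitive)."""
    found, stack = set(), [cls]
    while stack:
        for parent in sub_class_of.get(stack.pop(), ()):
            if parent not in found:  # also protects against cycles
                found.add(parent)
                stack.append(parent)
    return found


def inferred_types(instance, rdf_type, sub_class_of):
    """Asserted types of instance plus every type entailed by subClassOf."""
    types = set(rdf_type.get(instance, ()))
    for cls in list(types):
        types |= superclasses(cls, sub_class_of)
    return types


# Hypothetical statements: two subClassOf triples and one rdf:type triple.
SUB_CLASS_OF = {"ImpactIndicator": {"Indicator"}, "Indicator": {"Property"}}
RDF_TYPE = {"item42": {"ImpactIndicator"}}

types = inferred_types("item42", RDF_TYPE, SUB_CLASS_OF)
```

One asserted type becomes three once the entailments are applied. A skos:broader link between the same three names would organise them without entailing anything about their instances.
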
observations
• OC to OC relationships are an extension to the standard
• the relationships between data, DE, DEC, OC and Property have no standard rdf/rdfs semantics
• surely a data element rdfs:isDefinedBy the combination of the conceptual and representational elements
• how much value does the precision of this relationship add?
• … if there are no clear rdf semantics for ISO11179
• did you really mean it? – because your reasoner will believe you…
• limited support for the annotator (mitigated by close scope)
• why not implement UDEF in SKOS?

• a skos:Concept is an identifier and either a label or a definition
• a terminological entry is a name or definition within a language associated with an administration record which contains an identifier
• a value meaning is an identifier with a definition
spot the difference

SKOS for administered item relationship semantics
• default semantics for ‘administered_item_relationship_type_description’?
• certainly appropriate for the conceptual domain and the data element concept relationships because they’re just privileged classifications anyway
• one might use value domain relationships for inheritance cf. xs:simpleType
• look very similar to the suggested classification scheme relationships

a SKOS view of your MDR?
• if you have SKOS relationships between components of your MDR, then you could
• extract/exchange the terminology of your MDR: particularly for the CD, DEC and VM
• visualise relationships between classes using SKOS editors, improving your curation – possibly round tripping the changes

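Extracting the terminology of a registry as SKOS could look something like the following sketch. It assumes registry items are available as plain records with an identifier, a name, a definition and optional broader links; the record layout, the `http://example.org/mdr` URIs and the function name `to_skos` are all hypothetical and not taken from any MDR implementation.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)

# Hypothetical conceptual domain records exported from a registry.
ITEMS = [
    {"id": "http://example.org/mdr#CD-1", "name": "Tumour site",
     "definition": "Anatomical location of a tumour."},
    {"id": "http://example.org/mdr#CD-2", "name": "Breast tumour site",
     "definition": "Location of a tumour within the breast.",
     "broader": ["http://example.org/mdr#CD-1"]},
]


def to_skos(items, scheme_uri):
    """Serialise registry items as skos:Concept elements in RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for item in items:
        node = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": item["id"]})
        ET.SubElement(node, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": scheme_uri})
        label = ET.SubElement(node, f"{{{SKOS}}}prefLabel", {f"{{{XML}}}lang": "en"})
        label.text = item["name"]
        definition = ET.SubElement(node, f"{{{SKOS}}}definition", {f"{{{XML}}}lang": "en"})
        definition.text = item["definition"]
        for parent in item.get("broader", []):
            ET.SubElement(node, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": parent})
    return ET.tostring(root, encoding="unicode")


skos_xml = to_skos(ITEMS, "http://example.org/mdr")
```

The output is ordinary RDF/XML, so in principle it could be opened in a third party SKOS editor; whether edits can be round tripped depends on how the registry re-imports them.
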
polemic
• you can’t rely on users to consult formal, logical definitions of data they are collecting: the definitions of their activities are informal
• if this is true, the formal relationships can only ever be asserted after the fact – they analyse and explain the data users have created from a particular perspective
• reality is always right – formal relationships are not primary semantics
• the more useful the model, the more limited its application
• models are expensive and divisive
• no one model is enough
• SKOS: just enough semantics to get the job done in the current version of ISO11179-3