WI 4 (CWA1): Guidelines for machine-processable representation of Dublin Core Application Profiles. Pete Johnston, UKOLN, University of Bath; Thomas Baker, Fraunhofer-Gesellschaft. CEN/ISSS MMI-DC Meeting, Brussels, 22-23 September 2004. http://www.ukoln.ac.uk/
Machine-processable representation of Dublin Core Application Profiles
• Context
• Conceptual model for DCAP
• Suggested representation using RDF
Context
• Metadata “application profile”
  • Recognition that implementers adapt metadata standards to context
  • Use terms from multiple metadata vocabularies in combination
• CEN CWA 14855
  • Guidelines for human-readable representation of DCAP
• Current draft
  • Make information available in structured form, usable by applications
• Influenced by
  • DCMI practice (“Grammatical Principles”, Namespace Policy, declaration of metadata vocabularies, “DCMI Abstract Model”)
  • W3C Semantic Web activity
  • Research projects on metadata schema registries
DCMI Abstract Model
• Working Draft of DC Architecture WG
• Seeks to make explicit the DC "meta-model"
  • what are the component parts of any DC metadata description
  • what information these components convey about the resources described by the DC metadata description
• independent of form in which DC metadata description is represented
• closely aligned with RDF meta-model
  • adopts class hierarchy/property specialisation semantics of RDFS
DCMI Abstract Model
• Description as set of statements about a subject resource
• Each statement describes a relationship between the subject resource and a second resource (value)
[Diagram: a Description, containing a Ref to resource and a Statement; the Statement contains a Ref to property and a Ref to value]
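The description/statement structure on this slide can be sketched as plain data structures. This is a minimal illustration only, not part of the DCMI Abstract Model draft: the class names `Statement` and `Description`, the method names, and the `example.org` URIs are all assumptions made for the example.

```python
from dataclasses import dataclass, field

DC = "http://purl.org/dc/elements/1.1/"


@dataclass(frozen=True)
class Statement:
    """One property/value pair about the described resource (hypothetical class)."""
    property_uri: str  # reference to the property
    value_uri: str     # reference to the value (a second resource)


@dataclass
class Description:
    """A set of statements about a single subject resource (hypothetical class)."""
    resource_uri: str  # reference to the described resource
    statements: list = field(default_factory=list)

    def add(self, property_uri, value_uri):
        self.statements.append(Statement(property_uri, value_uri))

    def values_of(self, property_uri):
        """Return every value related to the resource by the given property."""
        return [s.value_uri for s in self.statements if s.property_uri == property_uri]


# Illustrative data; the example.org URIs are placeholders.
desc = Description("http://example.org/doc/1")
desc.add(DC + "creator", "http://example.org/person/42")
desc.add(DC + "language", "http://example.org/lang/en")
```

Keeping the statement independent of any serialisation mirrors the point that the model is independent of the form in which a description is represented.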
Fundamentals of DCAP
• DCAP does not define new terms
• DCAP references ("uses") terms already defined elsewhere
  • Terms may be from multiple independently-created sources
• DCAP may describe how use of terms is “constrained”, adapted, contextualised
  • N.B. CWA 14855 employed "term usage"; current doc employs "property usage"
• DC differentiates different types of “term”
  • Only use of properties is “constrained”
  • Other types of term are referenced
    • Only (or at least primarily) as part of constraints on property
Conceptual model for DCAP
• What is a DCAP?
• What are the component entities? What are the related entities?
• What are the attributes of a DCAP? And of these component and related entities?
• What types of relationship exist between these entities?
[Diagram: entity-relationship model for DCAP. Entities: Agency, DCAP, SchemaDocument, BindingSchema, MetadataVocabulary, Property, PropertyUsage, Class. Relationships: administers, isDescribedIn, hasPropertyUsage, isExpressedBy, hasTerm, usesProperty, usesAsEncodingScheme.]
[Diagram: the Metadata Vocabulary part of the model. Entities: Agency, SchemaDocument, MetadataVocabulary, Property, Class, Instance. Relationships: administers, isDescribedIn, hasTerm, subprop, subclass, type.]
Metadata Vocabulary
A set of metadata terms (Properties, Classes, and Instances of those classes) managed as a coherent unit by an Agency.
Examples: the DCMES, the DC Terms Vocabulary, the DCMI Type Vocabulary
Property
A Property is a type of relationship between two Resources. A Property is declared as a term within exactly one Metadata Vocabulary. A Property may be related to another property by a sub-property relationship: this states that all resources related by the first property are also related by the second property.
Examples: dc:creator, dcterms:modified, dcterms:audience
(All DCMI elements and element refinements are properties.)
Class
A Class is a group of resources. A Class is declared as a term within exactly one Metadata Vocabulary. A Class may be related to another class by a sub-class relationship: this states that all instances of the first Class are also instances of the second Class. A Resource is related to one or more Classes by a type relationship, and is said to be an Instance of those classes.
Examples: dcterms:LCSH, dcterms:W3CDTF, dcmitype:Text, dcmitype:Collection
(All DCMI "encoding schemes" and type vocabulary terms are classes.)
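The sub-property and sub-class semantics stated on the Property and Class slides can be sketched as a small inference step. This is an illustration under stated assumptions: the function names are hypothetical, terms are written as prefixed names rather than URIs, and the two relationships used as sample data (dcterms:modified refining dc:date, dcmitype:StillImage as a sub-class of dcmitype:Image) are chosen for the example and are not taken from the slides.

```python
def closure(term, broader):
    """Return the term plus every term reachable through the broader-term mapping."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(broader.get(current, ()))
    return found


def infer_statements(statements, sub_property_of):
    """If s is related to v by a property, it is also related by each super-property."""
    return {(s, p2, v) for s, p, v in statements for p2 in closure(p, sub_property_of)}


def infer_types(direct_types, sub_class_of):
    """An instance of a class is also an instance of each of its super-classes."""
    return set().union(*(closure(c, sub_class_of) for c in direct_types))


# Sample relationships, assumed for illustration.
SUB_PROPERTY_OF = {"dcterms:modified": {"dc:date"}}
SUB_CLASS_OF = {"dcmitype:StillImage": {"dcmitype:Image"}}

stated = {("doc1", "dcterms:modified", "2004-09-22")}
inferred = infer_statements(stated, SUB_PROPERTY_OF)
types = infer_types({"dcmitype:StillImage"}, SUB_CLASS_OF)
```

Both rules only ever add statements, which is the inferencing behaviour of RDFS rather than a validation check.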
[Diagram: the DCAP part of the model. Entities: Agency, SchemaDocument, DCAP, PropertyUsage, BindingSchema, Class, Property. Relationships: administers, isDescribedIn, hasPropertyUsage, isExpressedBy, usesAsEncodingScheme, usesProperty.]
DC Application Profile (DCAP)
A set of Property Usages, created to meet the functional requirements of an application or context, and managed as a coherent unit by an Agency.
Examples: the Simple Dublin Core DCAP, the RDN-DC DCAP, the Renardus DCAP
Property Usage
A Property Usage is a description of how a previously declared Property from a Metadata Vocabulary is deployed in the context of an application.
A Property Usage
• must reference ("use") exactly one Property
• may provide additional documentation on how the property is interpreted in the context of this application
• may provide an application-specific label for the property
• may specify obligation for the use of statements referring to the property (whether it is mandatory, optional, conditional)
• may specify constraints on the occurrence of statements referring to the property
• may specify constraints on the permitted values of the property, by specifying that they are instances of specified classes (i.e. may specify "encoding schemes" for the property)
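The attributes listed for a Property Usage map naturally onto a record type. The sketch below is one possible shape, not a representation endorsed by the draft: the class and field names are hypothetical, and the sample values are taken from the RDN-DC usage of dc:language that appears later in the deck.

```python
from dataclasses import dataclass, field
from typing import Optional

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"


@dataclass
class PropertyUsage:
    """How one existing property is deployed in an application (hypothetical class)."""
    uses: str                          # exactly one property: the only required field
    label: Optional[str] = None        # application-specific label
    description: Optional[str] = None  # additional documentation
    obligation: Optional[str] = None   # e.g. "mandatory", "optional", "conditional"
    max_occurs: Optional[int] = None   # None is read here as "unbounded"
    encoding_schemes: tuple = ()       # classes that permitted values must be instances of


@dataclass
class DCAP:
    """A set of property usages managed as a unit (hypothetical class)."""
    identifier: str
    property_usages: list = field(default_factory=list)

    def usages_for(self, property_uri):
        """Return the usages in this profile that reference the given property."""
        return [u for u in self.property_usages if u.uses == property_uri]


language_usage = PropertyUsage(
    uses=DC + "language",
    description="Use the language codes defined in RFC 3066.",
    obligation="recommended",
    encoding_schemes=(DCTERMS + "RFC3066",),
)
rdn_dc = DCAP("http://www.rdn.ac.uk/ap/rdn_dc", [language_usage])
```

Making `uses` the only required field reflects the single "must" in the list; everything else is optional in the model.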
Property Usage
Examples: the usage of dc:title in the Simple Dublin Core DCAP, the usage of dc:title in RDN-DC DCAP, the usage of dc:title in Renardus DCAP
Representation of DCAP: XML?
• Could provide an XML DTD or XML Schema to define an XML format for a DCAP
• But the property usages in a DCAP reference existing terms
  • Term descriptions already available, using RDF/RDFS (in some cases, at least!)
• Would require
  • re-describing terms that are already described (or mapping existing data to new format); or
  • using separate format/model for DCAP (DCAP-XML) and for metadata vocabulary (RDFS/RDF)
Representation of DCAP: XML Schema?
• An XML Schema describes constraints on the structure of a (class of) XML document(s)
• Abstract Model
  • Description may be represented as records in multiple syntaxes
  • May be multiple XML formats, each with different XML Schema
• A DCAP specifies the properties/classes used in a description
• So (potentially) one-to-many relation between DCAP and XML Schema
Representation of DCAP: XML Schema?
• However, XML implementers want
  • to constrain structure of DC-in-XML documents during creation
  • to validate structure of DC-in-XML documents post-creation
• … so need XML Schema corresponding to DCAP (for their chosen XML format)
• DCAP model is (probably?!) rich enough to generate XML Schema…
• …but N.B. that generation process requires additional information about each XML format
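A rough sketch of what generating a schema fragment from a property usage might look like. Everything here is an assumption for illustration: the dictionary shape of a usage, the function name, and in particular the `element_names` mapping, which stands in for the "additional information about each XML format" that the DCAP itself does not carry.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def usage_to_element(usage, element_names):
    """Build an xs:element declaration from one property usage (hypothetical helper).

    element_names maps property URIs to the element names used by one
    particular XML format; it is format-specific, not part of the DCAP.
    """
    el = ET.Element("{%s}element" % XS)
    el.set("ref", element_names[usage["uses"]])
    # Only a mandatory usage forces at least one occurrence.
    el.set("minOccurs", "1" if usage.get("obligation") == "mandatory" else "0")
    max_occurs = usage.get("max_occurs")
    el.set("maxOccurs", "unbounded" if max_occurs is None else str(max_occurs))
    return el


# The RDN-DC usage of dc:language: recommended, repeatable.
usage = {
    "uses": "http://purl.org/dc/elements/1.1/language",
    "obligation": "recommended",
    "max_occurs": None,
}
element_names = {"http://purl.org/dc/elements/1.1/language": "dc:language"}
decl = usage_to_element(usage, element_names)
print(ET.tostring(decl, encoding="unicode"))
```

A different XML format for the same DCAP would supply a different `element_names` mapping, which is the one-to-many relation between DCAP and XML Schema noted earlier.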
Representation of DCAP: RDF?
• RDF provides simple meta-model
  • Resource-property-value
• Descriptions of terms in DCMI metadata vocabularies already published using RDF
  • using RDFS and DC vocabularies
• Many other significant vocabularies also available currently or will be available
• By definition DCAP references other terms
• Use of RDF facilitates merging of DCAP description and existing metadata vocabulary descriptions (and resource descriptions)
Representation of DCAP: RDF?
• However, DCAP concept is closely associated with that of document/record/bounded description
  • mandating that statement with specified property is present
  • limiting number of occurrences of statements with specified property
  • mandating that value of specified property is instance of specified class
• Generally, RDF applications tend to adopt "open-world" assumptions
• RDFS, OWL designed to support inferencing, rather than completeness/correctness checks (validation)
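The three record-style constraints above amount to a closed-world check over a bounded description. The sketch below shows one way such a check could look; the function name, the dictionary shape of a usage, and the `types_of` lookup are assumptions for the example, and terms are written as prefixed names for brevity.

```python
def validate(description, usages, types_of):
    """Check a bounded description against property usages (hypothetical helper).

    description: list of (property, value) pairs for one resource
    usages:      list of dicts with "uses" and optional "obligation",
                 "max_occurs", "encoding_schemes"
    types_of:    mapping from value to the set of classes it is known to belong to
    """
    errors = []
    for usage in usages:
        prop = usage["uses"]
        values = [v for p, v in description if p == prop]
        # Constraint 1: a statement with the property must be present.
        if usage.get("obligation") == "mandatory" and not values:
            errors.append(f"missing mandatory property {prop}")
        # Constraint 2: limit on the number of statements with the property.
        max_occurs = usage.get("max_occurs")
        if max_occurs is not None and len(values) > max_occurs:
            errors.append(f"too many occurrences of {prop}: {len(values)} > {max_occurs}")
        # Constraint 3: each value must be an instance of a specified class.
        schemes = set(usage.get("encoding_schemes", ()))
        if schemes:
            for v in values:
                if not (types_of.get(v, set()) & schemes):
                    errors.append(f"value {v!r} of {prop} is not an instance of {sorted(schemes)}")
    return errors


usages = [
    {"uses": "dc:title", "obligation": "mandatory", "max_occurs": 1},
    {"uses": "dc:language", "encoding_schemes": ("dcterms:RFC3066",)},
]
types_of = {"en-GB": {"dcterms:RFC3066"}}
description = [("dc:language", "en-GB"), ("dc:language", "Klingon")]
errors = validate(description, usages, types_of)
```

Note that a value with no recorded type is reported as an error. Under an open-world reading the missing type would simply be unknown, which is why this kind of check sits awkwardly with RDFS and OWL.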
RDF representation
• Specify RDF classes and properties corresponding to the entity types, attributes, & relation types in model
• Use existing RDF vocabularies where possible
• RDF Vocabulary Description Language (RDF Schema) provides
  • a semantics of class hierarchy/property specialisation
  • an RDF vocabulary to represent RDFS semantics
    • i.e. properties and classes to describe Properties, Classes (and Datatypes)
• DCMES/DC terms provide
  • properties for many descriptive attributes
RDF representation
• RDFS has no concepts of application profile, property usage
• RDFS does not provide
  • a class to represent a (Metadata) Vocabulary
• So need to provide additional classes and properties where required
  • The “dcap” vocabulary
  • Should provide RDFS descriptions of dcap: terms
  • N.B. No URIrefs yet assigned for dcap: terms
Example
• RDN-DC
  • DCAP used for record-sharing between partners in Resource Discovery Network (RDN)
  • Sharing over OAI-PMH, so uses XML syntax
• Usage of dc:language
  • Optional (recommended)
  • Repeatable
  • Requires use of RFC3066 encoding scheme
[Diagram: RDF graph for the RDN-DC example. Nodes: rdn:rdn-dc-dcap, dcap:DCAP, the literal "RDN-DC", a dcap:PropertyUsage node, dc:language, rdf:Property, dcterms:RFC3066, rdfs:Class. Arcs: rdf:type, dc:title, dcap:isMemberOf, dcap:uses, dcap:encodingScheme.]
<dcap:PropertyUsage>
  <dcap:uses rdf:resource="&dcns;language"/>
  <dc:description>Use the language codes defined in RFC 3066.</dc:description>
  <dcap:obligation rdf:resource="&dcapns;Obligation/recommended"/>
  <dcap:maxOccurs>Unbounded</dcap:maxOccurs>
  <dcap:encodingScheme rdf:resource="&dctermsns;RFC3066" />
  <dcap:isMemberOf rdf:resource="http://www.rdn.ac.uk/ap/rdn_dc"/>
</dcap:PropertyUsage>
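To show how the fragment above reads as resource-property-value statements, here is a minimal sketch that flattens it into triples using only the Python standard library. Several things are assumptions: the slides do not define the `&dcns;`, `&dctermsns;` and `&dcapns;` entities, so they are expanded here to the usual DC and DC Terms namespaces and to a placeholder `http://example.org/dcap#` (no URIrefs have been assigned for dcap: terms); the blank-node label `_:usage` and the function name are also hypothetical. This is not a general RDF/XML parser and handles only this flat shape.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
DCAP = "http://example.org/dcap#"  # placeholder namespace, assumed for illustration

XML = f"""<dcap:PropertyUsage xmlns:rdf="{RDF}" xmlns:dc="{DC}" xmlns:dcap="{DCAP}">
  <dcap:uses rdf:resource="{DC}language"/>
  <dc:description>Use the language codes defined in RFC 3066.</dc:description>
  <dcap:obligation rdf:resource="{DCAP}Obligation/recommended"/>
  <dcap:maxOccurs>Unbounded</dcap:maxOccurs>
  <dcap:encodingScheme rdf:resource="{DCTERMS}RFC3066"/>
  <dcap:isMemberOf rdf:resource="http://www.rdn.ac.uk/ap/rdn_dc"/>
</dcap:PropertyUsage>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into a plain URI string."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def to_triples(xml_text, subject="_:usage"):
    """Flatten one typed node element with simple property elements into triples."""
    root = ET.fromstring(xml_text)
    triples = [(subject, RDF + "type", expand(root.tag))]
    for child in root:
        resource = child.get("{%s}resource" % RDF)
        # rdf:resource gives a resource value; otherwise use the literal text.
        value = resource if resource is not None else (child.text or "").strip()
        triples.append((subject, expand(child.tag), value))
    return triples


triples = to_triples(XML)
for triple in triples:
    print(triple)
```

Because the result is a list of plain statements, it could be merged with statements taken from the existing vocabulary descriptions of dc:language and dcterms:RFC3066.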
Issues
• Choice of URIrefs for dcap: RDF vocabulary terms
• Currently, no DCMI-endorsed model for DCAP
  • Proposed model is largely untested!
  • But JISC IEMSR registry in development (similar data model)
• DCMI Abstract Model still work-in-progress
  • Literal and non-literal values in DC metadata?
  • Use of literal datatyping for syntax encoding schemes?
  • DCAP for description v DCAP for description set
• CEN CWA 14855
  • more "permissive" view of DCAP?