Possible solutions

Presentation Transcript


  1. Possible solutions Menzo.Windhouwer@mpi.nl

  2. Which data category to use?
     • If it fits your needs, try to pick one that is bound for standardization:
       • Metadata: owned by Peter Wittenburg or Daan Broeder
       • Morphosyntax: owned by Gil Francopoulo
       • Terminology: owned by Sue Ellen Wright
       • If they are close to your needs, you can still contact these owners now to discuss modifications
       • Once standardized, you'll have to issue a Change Request
     • Or pick one published by a group (maybe you can join the group or form a group, so you collectively maintain the DC(S))
     • Create your own
       • Add a narrower/broader relationship to the RR
     CLARIN-NL - Call 1 - ISOcat status

  3. Reminder: Data category types
     • complex:
       • open: writtenForm (string)
       • constrained: email (string; Constraint: .+@.+)
       • closed: grammaticalGender (string; value domain: neuter, feminine, masculine)
     • simple: neuter, feminine, masculine
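The three complex data category types on this slide can be sketched as a small validation model. This is only an illustration, not the ISOcat data model: the class name `ComplexDC` and its fields are assumptions made for the example, while the data category names, the `.+@.+` constraint and the gender values come from the slide.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ComplexDC:
    """Hypothetical stand-in for a complex data category (not an ISOcat API)."""
    name: str
    kind: str                                   # "open", "constrained" or "closed"
    pattern: Optional[str] = None               # only used by constrained DCs
    value_domain: FrozenSet[str] = frozenset()  # simple DCs, only used by closed DCs

    def accepts(self, value: str) -> bool:
        if self.kind == "open":
            # Open: any value of the data type is allowed.
            return True
        if self.kind == "constrained":
            # Constrained: the whole value must match the constraint.
            return re.fullmatch(self.pattern, value) is not None
        if self.kind == "closed":
            # Closed: the value must be one of the listed simple DCs.
            return value in self.value_domain
        raise ValueError(f"unknown data category type: {self.kind}")


written_form = ComplexDC("writtenForm", "open")
email = ComplexDC("email", "constrained", pattern=r".+@.+")
grammatical_gender = ComplexDC(
    "grammaticalGender",
    "closed",
    value_domain=frozenset({"neuter", "feminine", "masculine"}),
)
```

With these definitions, `email.accepts("Menzo.Windhouwer@mpi.nl")` holds, while `grammatical_gender.accepts("common")` does not, because "common" is not in the value domain.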

  4. Which data category types?
     • TDGs give DC types based on some reference model:
       • Metadata: CMDI
       • Morphosyntax: LMF
       • Terminology: TBX
       • Example: the POS field (closed DC) of the lexical entry “walk” gets the value ‘verb’ (simple DC)
     • If the DC type doesn’t fit your needs:
       • Example: the Verb (open DC) feature of a feature structure gets the value “walk”
       • Unfortunately, the DCR data model does not yet have facilities to share a semantic core between various types
       • Create your own and add a sameAs relationship to the RR

  5. Data category value domains
     • Is the value you need not yet part of the value domain?
       • Contact the owner to add your simple DC
       • Or create your own closed DC, reuse existing simple DCs, and add your own simple DC(s)
         • Add a sameAs relationship to the RR
     • It’s easy to create a subset of a value domain, but not to create a superset …
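The subset/superset asymmetry can be illustrated with plain sets. This is a minimal sketch under stated assumptions: the owner's domain reuses the grammaticalGender values from the earlier slide, the extra value "common" is hypothetical, and the helper names are invented for the example.

```python
# Value domain of the owner's closed DC (values taken from the grammaticalGender slide).
OWNER_DOMAIN = frozenset({"neuter", "feminine", "masculine"})


def is_subset_of_owner(own_domain, owner_domain=OWNER_DOMAIN):
    """True if every value you need already exists in the owner's domain."""
    return set(own_domain) <= owner_domain


def missing_simple_dcs(own_domain, owner_domain=OWNER_DOMAIN):
    """Values that would require a new simple DC (ask the owner, or create your own)."""
    return sorted(set(own_domain) - owner_domain)


# A subset needs nothing new: simply use fewer of the existing values.
two_gender_domain = {"feminine", "masculine"}

# A superset needs a value the owner does not have ("common" is a hypothetical example).
extended_domain = {"neuter", "feminine", "masculine", "common"}
```

Here `is_subset_of_owner(two_gender_domain)` is true, while `missing_simple_dcs(extended_domain)` lists the values for which you would have to contact the owner or create your own closed DC and relate it with sameAs.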

  6. Granularity issues
     • How much (application) context to take into account?
     • Generic data categories (context insensitive):
       • Pro: reusable; the data model of your application provides the proper context, i.e., it determines that only a subset of all the possible instances of the data category is interesting for your application domain; this context might be further described in the future by a new type of data category: container data categories
       • Con: if too generic, the data category is no more than a data type, e.g., “this is a date”
     • Specific data categories (context incorporated in the specification):
       • Pro: (semantic) search can be based on a relationship with a specific data category without the need to take the context into account (inference about the context)
       • Con: if too specific, the data category may not be reusable in any other resource and has almost no use for semantic interoperability (although this might be remedied by relationships in the RR with more generic data categories)
     • In search of the balance:
       • Go at least one level above the data type, e.g., /untilDate/
       • If your data category is one of a kind, make it specific; maybe in the future others will ask you to generalize it

  7. Granularity
     • A resource listing actors in a play
     • Which data category to associate with the name of an actor? (ordered from generic to specific)
       • String (data type)
       • Name (data field)
       • Name of a person
       • Name of an actor
       • Name of a stage actor
       • Name of a stage actor in my type of resources
       • Name of a stage actor in my resource

  8. Composite values
     • Some values are actually composites, e.g., “first plural exclusive vernacular”
     • If the composite is the lowest level in your data model, you can’t link the parts to data categories, as:
       • “[a data category is the] result of the specification of a given data field [or its value]” (ISO 12620:2009)
     • However, you can link the composite value in the RR to multiple data categories or concepts, maybe using partOf or subClassOf relationships

  9. Relation Registry
     • The Relation Registry is basically a triple store:
       • Subject: resource 1
       • Predicate: relationship
       • Object: resource 2
     • You can use Turtle (or N3 or RDF/XML or …) to specify these triples:

         @prefix rel: <http://www.isocat.org/rr/relations#> .
         @prefix isocat: <http://www.isocat.org/datcat/> .

         # /first plural exclusive vernacular/ is-a /vernacular/
         isocat:DC-1234 rel:subClassOf isocat:DC-4 .
         # /first person/ part-of /first plural exclusive vernacular/
         isocat:DC-1 rel:partOf isocat:DC-1234 .
         # /plural/ part-of /first plural exclusive vernacular/
         isocat:DC-2 rel:partOf isocat:DC-1234 .
         # /exclusive/ part-of /first plural exclusive vernacular/
         isocat:DC-3 rel:partOf isocat:DC-1234 .
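To make the triple-store idea concrete, the same four statements can be held as plain (subject, predicate, object) tuples and queried with a few lines of code. This is a minimal in-memory sketch, not the Relation Registry's actual interface: the helper names `subjects` and `objects` are assumptions, and the DC identifiers are the illustrative ones used on the slide.

```python
# Namespace prefixes as given on the slide.
REL = "http://www.isocat.org/rr/relations#"
ISOCAT = "http://www.isocat.org/datcat/"

# Each triple is (subject, predicate, object), all expanded to full URIs.
TRIPLES = [
    # /first plural exclusive vernacular/ is-a /vernacular/
    (ISOCAT + "DC-1234", REL + "subClassOf", ISOCAT + "DC-4"),
    # /first person/, /plural/ and /exclusive/ are parts of the composite value
    (ISOCAT + "DC-1", REL + "partOf", ISOCAT + "DC-1234"),
    (ISOCAT + "DC-2", REL + "partOf", ISOCAT + "DC-1234"),
    (ISOCAT + "DC-3", REL + "partOf", ISOCAT + "DC-1234"),
]


def subjects(triples, predicate, obj):
    """All subjects that stand in `predicate` relation to `obj`, sorted."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def objects(triples, subject, predicate):
    """All objects that `subject` is related to via `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Which data categories are parts of the composite DC-1234?
parts = subjects(TRIPLES, REL + "partOf", ISOCAT + "DC-1234")

# Which data categories is DC-1234 a subclass of?
superclasses = objects(TRIPLES, ISOCAT + "DC-1234", REL + "subClassOf")
```

A real deployment would more likely load the Turtle into an RDF library and query it there; the tuples are used here only to keep the sketch free of dependencies.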

  10. Relation types
     • rel:related
     • rel:sameAs
     • (rel:distinct)
     • rel:subClassOf/rel:superClassOf
     • rel:narrower/rel:broader
     • rel:partOf, rel:directPartOf, rel:indirectPartOf
     • …
     Inspired by OWL and SKOS, but maybe other relation types are needed or other sets should be used?

  11. Relation type taxonomy
     • rel:related
     • rel:sameAs
     • rel:narrower
     • rel:superClassOf
     • rel:broader
     • rel:subClassOf
     • rel:partOf
     • rel:directPartOf
     • rel:indirectPartOf
     • …

  12. Relationships to other concepts
     • The Relation Registry accepts all kinds of URIs
     • So relationships can go outside of ISOcat:
       • Dublin Core elements
       • GOLD concepts
       • …
       • or your own public OWL ontology or SKOS taxonomy or …
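Because any URI is accepted, a relationship can point from an ISOcat data category to a concept elsewhere. The sketch below shows one way to tell such external links apart from ISOcat-internal ones. Everything in it is illustrative: the identifier `DC-EXAMPLE` is a hypothetical placeholder, the sameAs link to the Dublin Core `creator` element is an invented example rather than a registered relation, and `external_links` is a helper name made up for this sketch.

```python
ISOCAT = "http://www.isocat.org/datcat/"
REL = "http://www.isocat.org/rr/relations#"
DC_ELEMENTS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

TRIPLES = [
    # Hypothetical: an ISOcat data category declared equivalent to a Dublin Core element.
    (ISOCAT + "DC-EXAMPLE", REL + "sameAs", DC_ELEMENTS + "creator"),
    # ISOcat-internal relationship, using the illustrative IDs from the Relation Registry slide.
    (ISOCAT + "DC-1", REL + "partOf", ISOCAT + "DC-1234"),
]


def external_links(triples, home_namespace=ISOCAT):
    """Triples whose object lies outside the given home namespace."""
    return [t for t in triples if not t[2].startswith(home_namespace)]


outside_isocat = external_links(TRIPLES)
```

In this example `outside_isocat` contains only the sameAs triple, since its object lives in the Dublin Core namespace rather than under the ISOcat prefix.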
