The ISO 12620 Data Category Registry
• ISO 12620:2009 introduces
  • A web-based electronic Data Category Registry (DCR) for simple, complex and (in the future) container Data Categories (DCs)
  • ISO DIS 24619 compliant Persistent IDentifiers (PIDs) for each DC, e.g., http://www.isocat.org/datcat/DC-396
  • The DC Reference schema, a small XML vocabulary, to embed these DC PIDs in XML documents, e.g.,
    <rng:element name="POS" dcr:datcat="http://.../DC-396" />

Standards and Data Category references
• Some standards already provide their own constructs for embedded DC references
• However, these constructs sometimes
  • Use ambiguous DC identifiers instead of PIDs
  • Are not able to handle the current DC PIDs
  • Do not cover all DC types, i.e., container, complex and simple DCs

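The difference between an ambiguous identifier and a PID can be checked mechanically. The sketch below is only an illustration: the URI shape is an assumption inferred from the example PIDs in this deck (such as http://www.isocat.org/datcat/DC-396), not something taken from ISO 24619, and the function names are hypothetical.

```python
import re

# Assumed PID shape, inferred from the example PIDs in this deck
# (e.g. http://www.isocat.org/datcat/DC-396); not taken from ISO 24619.
DC_PID_PATTERN = re.compile(r"https?://[^/\s]+/datcat/DC-(\d+)")


def dc_number(reference):
    """Return the numeric DC key if `reference` looks like a DC PID, else None."""
    match = DC_PID_PATTERN.fullmatch(reference)
    return int(match.group(1)) if match else None


def is_dc_pid(reference):
    """True for a full PID URI, False for a bare name such as 'partOfSpeech'."""
    return dc_number(reference) is not None
```

A bare name such as partOfSpeech fails this check, while the PIDs used in the examples pass it, including those from the test server.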
Improving the current situation
• Use Relax NG, XML Schema or ODD instead of DTD
• Create open schemas, which allow adding attributes and/or elements from foreign namespaces, or embed dcr:datcat or dcr:valueDatcat hooks at the proper places in the schemas
• The DC Reference vocabulary can then be used to embed DC references for various DC types at the right places
• For existing specifications with some support for DC references, make sure all relevant DC types can be covered, and make use of DC PIDs

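Once a schema admits attributes from the dcr namespace, attaching a reference is a small operation. The following is a minimal sketch using Python's standard xml.etree.ElementTree. The namespace URI and the PIDs are the ones shown in the LMF example of this deck; the helper add_dc_reference is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI as declared in the LMF example of this deck.
DCR_NS = "http://www.isocat.org/ns/dcr"
# Ask ElementTree to serialize this namespace with the "dcr" prefix.
ET.register_namespace("dcr", DCR_NS)


def add_dc_reference(element, datcat_pid, value_datcat_pid=None):
    """Attach dcr:datcat (and optionally dcr:valueDatcat) to an element."""
    element.set(f"{{{DCR_NS}}}datcat", datcat_pid)
    if value_datcat_pid is not None:
        element.set(f"{{{DCR_NS}}}valueDatcat", value_datcat_pid)
    return element


# Hypothetical usage: annotate a feature element and its value.
feat = ET.Element("feat", {"att": "partOfSpeech", "val": "commonNoun"})
add_dc_reference(
    feat,
    "http://www.isocat.org/datcat/DC-1345",
    "http://www.isocat.org/datcat/DC-1256",
)
serialized = ET.tostring(feat, encoding="unicode")
print(serialized)
```

Whether the resulting document validates still depends on the schema being open to foreign-namespace attributes, which this sketch does not check.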
References
• Latest version of the DC Reference vocabulary
  • http://www.isocat.org/12620/
• Survey of the support for DC references
  • M.A. Windhouwer, S.E. Wright, M. Kemps-Snijders. Referencing ISOcat data categories. In Proceedings of the LRT Standards Workshop (LREC 2010), Malta, May 18, 2010.
  • http://www.lrec-conf.org/proceedings/lrec2010/workshops/W4.pdf

ODD example
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" module="header" ident="availability">
  <equiv name="availability" uri="http://lux13.mpi.nl/datcat/DC-12094"/>
  …
  <attList>
    <attDef ident="status" usage="opt">
      <equiv name="availabilityStatus" uri="http://lux13.mpi.nl/datcat/DC-12019"/>
      <defaultVal>unknown</defaultVal>
      <valList type="closed">
        <valItem ident="free">
          <equiv name="availabilityStatusFree" uri="http://lux13.mpi.nl/datcat/DC-12020"/>
          <desc>the text is freely available.</desc>
          …
Note: this example uses PIDs from the ISOcat test server.

LMF example
<LexicalResource xmlns:dcr="http://www.isocat.org/ns/dcr">
  …
  <LexicalEntry>
    <feat att="partOfSpeech" dcr:datcat="http://www.isocat.org/datcat/DC-1345" val="commonNoun" dcr:valueDatcat="http://www.isocat.org/datcat/DC-1256"/>
    <Lemma>
      <feat att="writtenForm" dcr:datcat="http://www.isocat.org/datcat/DC-1836" val="clergyman"/>
    </Lemma>
    …
Note: once the DCR supports container data categories, LexicalResource, LexicalEntry and Lemma could also have dcr:datcat attributes.

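A consumer of such a document can gather the embedded references by looking for attributes in the dcr namespace. The sketch below assumes the LMF fragment is closed off so that it parses, with the elided parts simply left out; collect_dc_references is a hypothetical helper, not part of any ISOcat tooling.

```python
import xml.etree.ElementTree as ET

DCR_NS = "http://www.isocat.org/ns/dcr"

# The LMF fragment from the slide, with closing tags added so that it parses.
# Parts elided on the slide are omitted here rather than guessed.
LMF_SAMPLE = """
<LexicalResource xmlns:dcr="http://www.isocat.org/ns/dcr">
  <LexicalEntry>
    <feat att="partOfSpeech" dcr:datcat="http://www.isocat.org/datcat/DC-1345"
          val="commonNoun" dcr:valueDatcat="http://www.isocat.org/datcat/DC-1256"/>
    <Lemma>
      <feat att="writtenForm" dcr:datcat="http://www.isocat.org/datcat/DC-1836"
            val="clergyman"/>
    </Lemma>
  </LexicalEntry>
</LexicalResource>
"""


def collect_dc_references(xml_text):
    """Yield (element tag, attribute local name, PID) for every dcr:* attribute."""
    prefix = f"{{{DCR_NS}}}"
    root = ET.fromstring(xml_text.strip())
    for element in root.iter():
        for name, value in element.attrib.items():
            if name.startswith(prefix):
                yield element.tag, name[len(prefix):], value


references = list(collect_dc_references(LMF_SAMPLE))
for tag, kind, pid in references:
    print(tag, kind, pid)
```

Matching on the namespace URI rather than the dcr prefix keeps the lookup correct even if a document binds the namespace to a different prefix.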
LAF example
<typeDescription loc="http://www.isocat.org/...">
  <name>
  <description>
  <supertypeName>
  <features>
    <featureDescription loc="http://www.isocat.org/...">
      <name>
      <description>
      <values>
        <valueDescription loc="http://www.isocat.org/...">
        …
Note: each value needs its own DC reference, hence the addition of the valueDescription element.