
ISO 16642 - a tutorial Part 2: Representing data categories

TMF - Terminological Markup Framework. Laurent Romary - Laboratoire Loria.



  1. ISO 16642 - a tutorial. Part 2: Representing data categories. TMF - Terminological Markup Framework. Laurent Romary - Laboratoire Loria

  2. Why formalize DatCats? • Systematizing data category description: • Notion of Data Category Registry (DCR) • I need a data category: is it there? • Query by name, definition etc. • Automating processes: • Format control of TMLs • Filters from one TML to GMT

  3. Which model for DatCats? • Using XML: • Coherence with TMF principles • Using stylesheets to generate schemas and filters • Using RDF (Resource Description Framework): • Intended format for representing metadata • The description of a DatCat is metadata with regard to TMF

  4. RDF - a quick presentation Cf. other file

  5. Data Categories: A Formal Description

  6. Data Category Registry [Diagram. Node and property labels: DCRegistry, rdf:about, Description, dcsd:DataCategory, Data Category, dcsd:VersionNumber, VersionNumber]

  7. Data Category description [Diagram. A Data Category node with the following properties and targets: dcsd:DCIdentifier (DCIdentifier), dcsd:DCParent (DCParent), dcsd:DCName (DCName), dcsd:DCDefinition (DCDefinition), dcsd:DCType (DCType: S or C), dcsd:DCExample (DCExample), dcsd:DCAdmin (DCAdmin), dcsd:DCComment (DCComment), dcsd:Content (Content), dcsd:Level (Locus). Slide footer: Salt 2000-11-08/SEW]

  8. Simple and complex DatCats • Complex data categories • shall serve as field identifiers (not names) in databases and can have content. The datatype for this content shall be declared for each data category and can commonly take the form of different categories of text, defined data types (such as dates), and specified data domains, e.g., picklists comprising standardized permissible instances. • Example: /Part of Speech/ • Simple data categories • shall serve as the content of complex data categories. • Example: /Noun/, /Verb/, /Adjective/ etc.
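
The split between complex and simple data categories can be illustrated with a small Python sketch. It is not part of TMF or ISO 12620: the class names `SimpleDatCat` and `ComplexDatCat`, the `accepts` method, and the datatype strings are assumptions made for this example, and only the picklist case is actually checked.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SimpleDatCat:
    """A simple data category: usable only as the content of a complex one."""
    name: str


@dataclass
class ComplexDatCat:
    """A complex data category: a field identifier with a declared datatype."""
    name: str
    data_type: str  # e.g. "plainText" or "picklist" (hypothetical labels)
    picklist: List[SimpleDatCat] = field(default_factory=list)

    def accepts(self, value: str) -> bool:
        """Return True if `value` is permissible content for this category."""
        if self.data_type == "picklist":
            return any(item.name == value for item in self.picklist)
        return True  # other datatypes are not validated in this sketch


# /Part of Speech/ is complex; /Noun/, /Verb/, /Adjective/ are simple.
part_of_speech = ComplexDatCat(
    name="Part of Speech",
    data_type="picklist",
    picklist=[SimpleDatCat("Noun"), SimpleDatCat("Verb"), SimpleDatCat("Adjective")],
)
```

With this model, `part_of_speech.accepts("Noun")` holds, while a value outside the picklist is rejected.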

  9. Levels and content [Diagram. Labels: Content, with properties dcsd:DataType (DataType) and dcsd:TargetType (TargetType); Level/Loci; each list of references is an rdf:Alt whose rdf:li items are references to other datcat(s)]

  10. Administrative properties [Diagram. A Data Category node linked by dcsd:DCAdmin to a DCAdmin node. Property labels and targets: dcsd:Source (Source), dcsd:Status (Status), dcsd:StatusDate (StatusDate), dcsd:StatusNote (StatusNote), dcsd:EditionDate (EditionDate), dcsd:VariantNames (VariantNames), dcsd:ShortForm (ShortForm), dcsd:AdmittedName (AdmittedName), dcsd:ForbiddenName (ForbiddenName)]

  11. RDF Representation

  12. /term/ - RDF description (1)
      <dcsd:DataCategory dcsd:DCIdentifier="ISO12620A01" dcsd:DCName="term" dcsd:position="A.01" dcsd:DCType="C">
        <dcsd:DCDefinition>A verbal designation of a general concept in a specific subject field</dcsd:DCDefinition>
        <dcsd:DCComment>
          <dcsd:sourceComment>For definition of related term, see ISO 1087-1, 3.4.3.</dcsd:sourceComment>
          <dcsd:conceptComment>Terms can consist of single words or be composed of multiword strings…</dcsd:conceptComment>
          <dcsd:Example>"radix" in annex C, figure C.1.</dcsd:Example>
          <dcsd:DictionnaryID>A.1</dcsd:DictionnaryID>
        </dcsd:DCComment>

  13. /term/ - RDF description (2)
        <dcsd:Content dcsd:DataType="plainText"/>
        <dcsd:Level>
          <rdf:Alt>
            <rdf:li>TL</rdf:li>
            <rdf:li>TC</rdf:li>
          </rdf:Alt>
        </dcsd:Level>
        <dcsd:DCAdmin dcsd:OrgSource="ISO TC 37" dcsd:DocSource="ISO12620:1999" dcsd:subDate="2000-10-20 SEW" dcsd:registryComment="Prepared 2000-10-20" dcsd:Status="Accepted"/>
      </dcsd:DataCategory>

  14. /term type/ - RDF description (1)
      <dcsd:DataCategory dcsd:DCIdentifier="ISO12620A0201" dcsd:DCName="term type" dcsd:position="A.02.01" dcsd:DCType="C">
        <dcsd:DCDefinition>An attribute assigned to a term</dcsd:DCDefinition>
        <dcsd:DCComment>
          <dcsd:DictionnaryID>A.2.1</dcsd:DictionnaryID>
        </dcsd:DCComment>
        <dcsd:Content dcsd:DataType="picklist">
          <rdf:Alt>
            <rdf:li>ISO12620A020101</rdf:li>
            <rdf:li>ISO12620A020102</rdf:li>
            <rdf:li>ISO12620A020119</rdf:li>
          </rdf:Alt>
        </dcsd:Content>

  15. /term type/ - RDF description (2)
        <dcsd:Level>
          <rdf:Alt>
            <rdf:li>TL</rdf:li>
            <rdf:li>TC</rdf:li>
          </rdf:Alt>
        </dcsd:Level>
        <dcsd:DCAdmin dcsd:OrgSource="ISO TC 37" dcsd:DocSource="ISO12620:1999" dcsd:subDate="2000-10-20 SEW" dcsd:registryComment="Prepared 2000-10-20" dcsd:Status="Accepted"/>
      </dcsd:DataCategory>
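
A description of this kind can be read back with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on an abridged copy of the /term type/ description. The slides do not give a namespace URI for the `dcsd` prefix, so `DCSD_NS` is a placeholder, and the `summarize` helper is a name invented for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCSD_NS = "http://example.org/dcsd#"  # placeholder: the slides give no URI

# Abridged copy of the /term type/ description, with namespaces declared.
DESCRIPTION = f"""
<dcsd:DataCategory xmlns:dcsd="{DCSD_NS}" xmlns:rdf="{RDF_NS}"
    dcsd:DCIdentifier="ISO12620A0201" dcsd:DCName="term type" dcsd:DCType="C">
  <dcsd:DCDefinition>An attribute assigned to a term</dcsd:DCDefinition>
  <dcsd:Content dcsd:DataType="picklist">
    <rdf:Alt>
      <rdf:li>ISO12620A020101</rdf:li>
      <rdf:li>ISO12620A020102</rdf:li>
      <rdf:li>ISO12620A020119</rdf:li>
    </rdf:Alt>
  </dcsd:Content>
  <dcsd:Level>
    <rdf:Alt><rdf:li>TL</rdf:li><rdf:li>TC</rdf:li></rdf:Alt>
  </dcsd:Level>
</dcsd:DataCategory>
"""


def q(ns, local):
    """Build the '{namespace}local' name that ElementTree uses internally."""
    return f"{{{ns}}}{local}"


def summarize(xml_text):
    """Extract identifier, name, type, definition, content and levels."""
    root = ET.fromstring(xml_text)
    content = root.find(q(DCSD_NS, "Content"))
    level = root.find(q(DCSD_NS, "Level"))
    return {
        "identifier": root.get(q(DCSD_NS, "DCIdentifier")),
        "name": root.get(q(DCSD_NS, "DCName")),
        "type": root.get(q(DCSD_NS, "DCType")),
        "definition": root.findtext(q(DCSD_NS, "DCDefinition")).strip(),
        "data_type": content.get(q(DCSD_NS, "DataType")),
        "values": [li.text for li in content.iter(q(RDF_NS, "li"))],
        "levels": [li.text for li in level.iter(q(RDF_NS, "li"))],
    }


summary = summarize(DESCRIPTION)
```

The `values` entry holds the identifiers of the simple data categories referenced by the picklist, and `levels` holds the levels at which the category may occur.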

  16. Actualizing a DatCat: TMF-specific properties

  17. Styling properties [Diagram. A Data Category node with properties dcsd:Anchor (AnchorInfo), dcsd:Style (Style) and Level. Style properties: dcsd:StyleName (StyleName), dcsd:ElementName (ElementName), dcsd:AttributeName (AttributeName), dcsd:TypeValue (TypeValue), dcsd:Value (Value). Style names: Anchor, Simple, Element, Attribute, TypedElement, ValuedElement, TVElement. Annotation on the slide: For 'Simple']

  18. Attribute style description • dcsd:StyleName="Attribute" • Conditions of use: • Not valid for annotations • Required properties: • dcsd:AttributeName • Example: • dcsd:AttributeName="id" • <anchorElement id="xx54893">…</anchorElement>

  19. Element style description • dcsd:StyleName="Element" • Required properties: • dcsd:ElementName • Example: • dcsd:ElementName="definition" • <definition>…</definition>

  20. TypedElement style description • dcsd:StyleName="TypedElement" • Required properties: • dcsd:ElementName, dcsd:TypeValue • Example: • dcsd:ElementName="termNote" • dcsd:TypeValue="partOfSpeech" • <termNote type="partOfSpeech">N</termNote>

  21. ValuedElement style description • dcsd:StyleName="ValuedElement" • Conditions of use: • Not valid for annotations • Required properties: • dcsd:ElementName • Example: • dcsd:ElementName="pos" • <pos value="noun"/>

  22. TVElement style description • dcsd:StyleName="TVElement" • Conditions of use: • Not valid for annotations • Required properties: • dcsd:ElementName, dcsd:TypeValue • Example: • dcsd:ElementName="free" • dcsd:TypeValue="pos" • <free type="pos" value="noun"/>

  23. Simple style description • dcsd:StyleName="Simple" • Conditions of use: • Expresses the value of simple data categories • Required properties: • dcsd:Value • Example: • dcsd:Value="Nom" • <pos>Nom</pos>
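
The element-producing styles can be summarized as a single serialization function. The following Python sketch is an illustration only, not the SALT tooling: the `render` function and its parameter names are invented here, and the output shapes are inferred from the slide examples. The "Simple" style is not a separate branch because it only supplies the value string (such as "Nom") that one of the other styles then serializes.

```python
from xml.sax.saxutils import escape, quoteattr


def render(style, value, element_name=None, attribute_name=None, type_value=None):
    """Serialize `value` according to one of the style names on the slides."""
    if style == "Attribute":
        # Only an attribute fragment: it is attached to an anchor element.
        return f"{attribute_name}={quoteattr(value)}"
    if style == "Element":
        return f"<{element_name}>{escape(value)}</{element_name}>"
    if style == "TypedElement":
        return (f"<{element_name} type={quoteattr(type_value)}>"
                f"{escape(value)}</{element_name}>")
    if style == "ValuedElement":
        return f"<{element_name} value={quoteattr(value)}/>"
    if style == "TVElement":
        return (f"<{element_name} type={quoteattr(type_value)} "
                f"value={quoteattr(value)}/>")
    raise ValueError(f"unknown style: {style}")


# The same part-of-speech information under three different styles.
typed = render("TypedElement", "N", element_name="termNote", type_value="partOfSpeech")
valued = render("ValuedElement", "noun", element_name="pos")
tv = render("TVElement", "noun", element_name="free", type_value="pos")
```

Keeping the style as data rather than hard-coding element names is what would let one data category specification drive several concrete markup languages.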

  24. Dealing with languages

  25. Two types of languages • Working language • The language used at a given place in a document, inherited along the XML hierarchy • Representation: xml:lang • Object language • The language being described at a given place in the terminological entry (e.g. the language that a Language Section documents) • Representation: as a data category "language", with a narrow scope
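
Because the working language follows the XML hierarchy, an element without its own xml:lang takes the value of its nearest ancestor. A minimal Python sketch of that resolution, assuming `xml.etree.ElementTree` and a made-up document with hypothetical `entry` and `note` elements:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def working_languages(node, inherited=None):
    """Yield (element, working language) pairs, applying xml:lang inheritance."""
    lang = node.get(XML_LANG, inherited)
    yield node, lang
    for child in node:
        yield from working_languages(child, lang)


# Hypothetical document: <note> inherits "fr", <term> overrides it with "en".
doc = ET.fromstring(
    '<entry xml:lang="fr"><note>texte</note>'
    '<term xml:lang="en">alpha smoothing factor</term></entry>'
)
resolved = {el.tag: lang for el, lang in working_languages(doc)}
```

The object language is not resolved this way: it is ordinary content, carried by a "language" data category.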

  26. Example: DXLT
      <langSet lang="en" xml:lang="fr">
        <descrip type="definition">Une valeur entre 0 et 1 utilisée...</descrip>
        <tig>
          <term xml:lang="en">alpha smoothing factor</term>
          <termNote type="termType">fullForm</termNote>
        </tig>
      </langSet>

  27. Example: GMT
      <struct type="LS" xml:lang="fr">
        <feat type="language">en</feat>
        <feat type="definition">Une valeur entre 0 et 1 utilisée...</feat>
        <struct type="TL">
          <feat type="term" xml:lang="en">alpha smoothing factor</feat>
          <feat type="termType">fullForm</feat>
        </struct>
      </struct>
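
The correspondence between the DXLT and GMT examples can be sketched as a small filter. This is an illustration in Python rather than the XSLT filters the project generates, and the mapping rules (the `STRUCT_TYPES` table, the `to_gmt` function) are inferred from these two examples only, not taken from a specification.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DXLT = """
<langSet lang="en" xml:lang="fr">
  <descrip type="definition">Une valeur entre 0 et 1 utilisée...</descrip>
  <tig>
    <term xml:lang="en">alpha smoothing factor</term>
    <termNote type="termType">fullForm</termNote>
  </tig>
</langSet>
"""

# Assumed mapping: structural elements become <struct>, all others <feat>.
STRUCT_TYPES = {"langSet": "LS", "tig": "TL"}


def to_gmt(node):
    """Convert one DXLT element (and its children) to GMT struct/feat form."""
    if node.tag in STRUCT_TYPES:
        out = ET.Element("struct", {"type": STRUCT_TYPES[node.tag]})
        if node.get("lang") is not None:
            # The object language becomes an explicit "language" feature.
            feat = ET.SubElement(out, "feat", {"type": "language"})
            feat.text = node.get("lang")
        for child in node:
            out.append(to_gmt(child))
    else:
        # <term> carries no type attribute, so its element name is used.
        out = ET.Element("feat", {"type": node.get("type", node.tag)})
        out.text = node.text
    if node.get(XML_LANG) is not None:
        # The working language is carried over unchanged as xml:lang.
        out.set(XML_LANG, node.get(XML_LANG))
    return out


gmt = to_gmt(ET.fromstring(DXLT))
```

Calling `ET.tostring(gmt, encoding="unicode")` serializes the result for inspection.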

  28. Conclusion • A general model for analysing and representing terminological data collections • An underlying formalism expressed in XML and RDF • Associated tools (SALT project): • DCSEditor • DCSBrowser • Automatic generation of XSLT filters and XML schemas from a given TML specification

  29. Useful pointers • SALT project • http://www.loria.fr/projets/SALT • http://www.ttt.org/ • The TMF site • http://www.loria.fr/projets/TMF
