SALT XLT Markup and Mapping in Termbases - Empirical Experience -

Klaus-Dirk Schmitz University of Applied Sciences Cologne Institute for Information Science klaus.schmitz@fh-koeln.de


  1. SALT XLT Markup and Mapping in Termbases - Empirical Experience - Klaus-Dirk Schmitz, University of Applied Sciences Cologne, Institute for Information Science, klaus.schmitz@fh-koeln.de

  2. SALT - Work Package 4
     • Analyse in detail the data categories and format structures of existing terminological data collections and formats in order to develop conceptual mapping tables and procedures to and from DXLT.
     • These tables and procedures serve as technical specifications for the development and implementation of filters (converters) from specific databases and formats to DXLT and vice versa.
     • Based on Deliverable 2 / 3.1, which describes existing terminological data formats and the structures of concrete sample data.

  3. One of the formats: Eurodicautom
     • Eurodicautom is the terminological databank of the EU Commission, developed and filled with data since the late 1960s.
     • Eurodicautom is a mainframe application with more than 1 million records; for some years the data have been available to the public via a web interface.

  4. Eurodicautom: Sample Data
     %%BE BTB
     %%TY DAG77
     %%NI 612
     %%CF 3
     %%CM AG4 CH6 GO6
     %%DA
     %%VE C/N kvotient[1];kulstof-kvælstofforhold[2]
     %%RF A.Klougart[VE1,VE2]
     %%EN
     %%VE C-N ratio
     %%RF CILF,Dict.Agriculture,ACCT,1977

  5. Eurodicautom: Sample Data
     %%NL
     %%VE C/N-quotient[1];koolstof-stikstofverhouding[2]
     %%DF (in bodem)verhouding vh totale koolstofgehalte tot het totale stikstofgehalte van organische stoffen...
     %%RF Agr.WP[VE1,VE2];Huitenga,Landbouwwdbk N-E[VE1,VE2]
     %%NT {NTE}(in plant)verhouding van koolstof en stikstof(koolhydraten en eiwitten)...[VE1,VE2]
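As a first step towards a converter, the tagged sample record can be split into (code, value) pairs. The following Python sketch is only an illustration: the function name tokenize and the regular expression are assumptions, not part of the SALT deliverables, and it assumes that every field starts with "%%" followed by an uppercase code and a space.

```python
import re

# Header plus the Danish and English blocks of the sample record shown above.
SAMPLE = (
    "%%BE BTB %%TY DAG77 %%NI 612 %%CF 3 %%CM AG4 CH6 GO6 "
    "%%DA %%VE C/N kvotient[1];kulstof-kvælstofforhold[2] "
    "%%RF A.Klougart[VE1,VE2] "
    "%%EN %%VE C-N ratio %%RF CILF,Dict.Agriculture,ACCT,1977"
)

# Assumption: a field is "%%" + uppercase code, and its value runs up to the
# next "%%" or the end of the record.
FIELD = re.compile(r"%%([A-Z]+)\s*(.*?)\s*(?=%%|$)", re.S)


def tokenize(record):
    """Return the record as a list of (code, value) pairs in source order."""
    return FIELD.findall(record)


if __name__ == "__main__":
    for code, value in tokenize(SAMPLE):
        print(code, repr(value))
```

Language markers such as %%DA and %%EN come out as pairs with an empty value, which is what a later grouping step can use to detect the start of a language block.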

  6. Eurodicautom: Data Structure
     • After a general block of entry-related (concept-related) information, a language block is repeated for each of the EU languages.
     • Every data category can appear only once within a language block, i.e. there is only one field per data category for all terms in one language.
     • The Note field can be "structured" by unique starting tags that can be seen as "virtual" data categories.
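The structure described above (one entry-level block, then one block per language, each data category at most once per language block, and starting tags inside the Note field) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names group_entry and split_note are hypothetical, LANGUAGE_CODES lists only the three codes that occur in the sample data, and the "{TAG}" pattern for virtual note categories is inferred from the single {NTE} example.

```python
import re

# Assumption: only the language codes that occur in the sample data.
LANGUAGE_CODES = {"DA", "EN", "NL"}

# (code, value) pairs taken from the sample record; a language marker has an
# empty value.
TOKENS = [
    ("NI", "612"),
    ("CF", "3"),
    ("DA", ""),
    ("VE", "C/N kvotient[1];kulstof-kvælstofforhold[2]"),
    ("NL", ""),
    ("VE", "C/N-quotient[1];koolstof-stikstofverhouding[2]"),
    ("NT", "{NTE}(in plant)verhouding van koolstof en stikstof"
           "(koolhydraten en eiwitten)...[VE1,VE2]"),
]


def group_entry(tokens):
    """Split (code, value) pairs into an entry header and language blocks."""
    header, blocks, current = {}, {}, None
    for code, value in tokens:
        if code in LANGUAGE_CODES and value == "":
            current = blocks.setdefault(code, {})   # start of a language block
        elif current is None:
            header[code] = value                    # entry-level information
        elif code in current:
            # A data category may appear only once per language block.
            raise ValueError(f"{code} repeated in one language block")
        else:
            current[code] = value
    return header, blocks


# Assumption: a virtual data category is a "{TAG}" at the start of the note.
NOTE_TAG = re.compile(r"^\{([A-Z]+)\}(.*)$", re.S)


def split_note(value):
    """Return (virtual_category, text); the category is None if untagged."""
    match = NOTE_TAG.match(value)
    return (match.group(1), match.group(2)) if match else (None, value)


if __name__ == "__main__":
    header, blocks = group_entry(TOKENS)
    print(header)
    print(split_note(blocks["NL"]["NT"]))
```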

  7. Eurodicautom: Data Categories (part)
     %%BE    (EU) terminology service responsible for the entry (M)
     %%TY    "collection" code (M)
     %%NI    entry number (M)
     %%NX    entry number for updating (R)
     %%NZ    entry number for deleting (R)
     %%CF    reliability code (1 lowest, 5 highest)
     %%AU    author, originator
     %%DATE  date (of last modification)
     %%CM    subject field (Lenoch Code) (M)
     .........
     M = Mandatory / R = Rare or old
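The mandatory markers and the value range of the reliability code lend themselves to a simple consistency check before conversion. The Python sketch below is hypothetical (the function name check_header and the message texts are assumptions). It treats only the categories marked (M) in the list above as mandatory and checks %%CF only when it is present, since the list does not mark it as mandatory.

```python
# Categories marked (M) in the list above.
MANDATORY = ("BE", "TY", "NI", "CM")

# Reliability code: 1 lowest, 5 highest.
VALID_CF = {"1", "2", "3", "4", "5"}


def check_header(header):
    """Return a list of problems found in the entry-level categories."""
    problems = [
        f"missing mandatory category %%{code}"
        for code in MANDATORY
        if not header.get(code)
    ]
    cf = header.get("CF")
    if cf is not None and cf not in VALID_CF:
        problems.append(f"reliability code out of range: {cf}")
    return problems


if __name__ == "__main__":
    sample = {"BE": "BTB", "TY": "DAG77", "NI": "612", "CF": "3",
              "CM": "AG4 CH6 GO6"}
    print(check_header(sample))
```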

  8. Eurodicautom: A first DTD (part)
     <!-- DTD for EURODICAUTOM KDS 30.8.2000 -->
     <!ELEMENT EURODICAUTOM (entry+)>
     <!ELEMENT entry (BE, TY, (NI | NX | NZ), CF, AU?, DATE?, CM*, langSet+)>
     <!ELEMENT BE (#PCDATA)>
     <!ELEMENT TY (#PCDATA)>
     ...
     <!ELEMENT langSet (VE?, AB?, PH?, DF?, RF?, MC?, MC?, NT?)>
     <!ATTLIST langSet lang CDATA #IMPLIED>
     <!ELEMENT VE (term, termID?)*>
     <!ELEMENT AB (term, termID?)*>
     <!ELEMENT PH (term, termID?)*>
     <!ELEMENT RF (text, refID*)*>
     <!ELEMENT term (#PCDATA)>
     ...
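To illustrate how a grouped record could be serialised with the element names of this DTD fragment, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It is not the project's converter. The function build_entry is hypothetical, and several choices are assumptions: the entry number is written as whichever of NI, NX or NZ is present, the %%CM value is split on whitespace into one CM element per code, the bracketed number after a term is used as termID, and only the VE field of each language block is handled.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a term may carry a trailing "[n]" index, used here as termID.
TERM = re.compile(r"^(.*?)(?:\[(\d+)\])?$")


def build_entry(header, blocks):
    """Build an <entry> element from a header dict and language blocks."""
    entry = ET.Element("entry")
    # Entry-level elements in the order given by the content model.
    for code in ("BE", "TY", "NI", "NX", "NZ", "CF", "AU", "DATE"):
        if code in header:
            ET.SubElement(entry, code).text = header[code]
    # Assumption: one CM element per whitespace-separated subject code.
    for subject in header.get("CM", "").split():
        ET.SubElement(entry, "CM").text = subject
    for lang, fields in blocks.items():
        lang_set = ET.SubElement(entry, "langSet", lang=lang)
        if "VE" in fields:
            ve = ET.SubElement(lang_set, "VE")
            for part in fields["VE"].split(";"):
                text, index = TERM.match(part.strip()).groups()
                ET.SubElement(ve, "term").text = text
                if index:
                    ET.SubElement(ve, "termID").text = index
    return entry


if __name__ == "__main__":
    header = {"BE": "BTB", "TY": "DAG77", "NI": "612", "CF": "3",
              "CM": "AG4 CH6 GO6"}
    blocks = {
        "DA": {"VE": "C/N kvotient[1];kulstof-kvælstofforhold[2]"},
        "EN": {"VE": "C-N ratio"},
    }
    root = ET.Element("EURODICAUTOM")
    root.append(build_entry(header, blocks))
    print(ET.tostring(root, encoding="unicode"))
```

The remaining language-level fields (AB, PH, DF, RF, MC, NT) would follow the same pattern once their element declarations, partly elided in the fragment, are known.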

  9. Eurodicautom: Graphical Representation

  10. Eurodicautom: Graphical Representation

  11. Eurodicautom: Mapping Procedure (Part)

  12. Eurodicautom: Mapping Procedure (Part)

  13. Eurodicautom: Mapping Procedure (Part)

  14. Eurodicautom: Mapping Procedure (Part)

  15. Eurodicautom: Hand-coded XML (Part)

  16. Eurodicautom: Hand-coded XML (Part)

  17. Eurodicautom: Hand-coded XML (Part)

  18. Eurodicautom: Hand-coded XML (Part)

  19. Eurodicautom: Hand-coded XML (Part)

