Principles of Lexical Description
Workgroup 1, reported by Gary Simons
EMELD Workshop on Digitizing Lexical Information, 3-5 August 2002
Roadmap to Best Practice
• Reviewed the proposed “Roadmap” document:
  • Questions and discussion to clarify the intent and meaning of the document.
  • The author noted minor editorial changes to make in the document.
  • The group endorsed the principles expressed in the document.
A warning of things to come
• Do we really agree with action point 2: “Recommend one or more markup schemas with best practice characteristics”?
  • Con: Not really possible, since every dictionary will be different in the final analysis.
  • Pro: The average user setting off to begin a project wants an off-the-shelf solution to get started with (including a schema, editor, and stylesheets).
• Day 1: Yes! Day 2: Maybe it’s not so easy.
Exemplary principles
• Reviewed the DTD proposed by Bird & Bell.
• Foundational principles we should follow:
  • Separation of information structure from rendering.
    • Display order belongs in the stylesheet, not the DTD.
    • An A:B link in the information structure can be rendered hierarchically as A>B or B>A.
  • Information within a lexical entry as a triple of form, morphosyntax, and sense.
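
As a minimal sketch of these principles (written in Python rather than as a DTD, with every name hypothetical and none taken from the Bird & Bell proposal), an entry can be held as a form/morphosyntax/sense triple while separate rendering functions play the role of stylesheets. The same A:B links are stored once and can be displayed as either A>B or B>A.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical triple: form, morphosyntax, sense.
    form: str
    morphosyntax: str
    sense: str


# Information structure: A:B links, stored once with no display hierarchy.
links = [("run", "runner"), ("run", "rerun"), ("sing", "singer")]


def render_a_over_b(pairs):
    """'Stylesheet' 1: group each B under its A."""
    tree = defaultdict(list)
    for a, b in pairs:
        tree[a].append(b)
    return {a: sorted(bs) for a, bs in sorted(tree.items())}


def render_b_over_a(pairs):
    """'Stylesheet' 2: group each A under its B by swapping the pair."""
    return render_a_over_b((b, a) for a, b in pairs)


def render_entry(entry, order=("form", "morphosyntax", "sense")):
    """Display order is a rendering parameter, not part of the data."""
    return " | ".join(getattr(entry, name) for name in order)
```

Changing the `order` argument or swapping one rendering function for the other changes the display without touching the stored data, which is the point of keeping display order out of the schema.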
Problems in proposal
• <affix> element isn’t quite right.
• <note> in the <head> seems odd. Is it really <comment>, as elsewhere?
• <orthographicForm> seems an extraneous level in some places and seems to be missing in others.
• Not clear how <msi> and <pos> work.
• Subentry embedding needs to be <lexeme>* (not ‘?’) or by pointer without embedding.
• Subentry should relate to sense, not whole entry.
• Definitions and comments need cross-references to headwords.
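
The two subentry points above can be sketched as follows, again in Python with purely hypothetical class names, ids, and data. Each sense holds zero or more pointers to other lexemes (the analogue of `<lexeme>*` by reference), so the subentry hangs off a sense rather than the whole entry and is not embedded.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    sense_id: str
    gloss: str
    # Pointers (by id) to subentry lexemes: zero or more, never embedded.
    subentry_ids: list = field(default_factory=list)


@dataclass
class Lexeme:
    lexeme_id: str
    headword: str
    senses: list = field(default_factory=list)


def subentries_of(sense, lexicon):
    """Resolve a sense's subentry pointers against a lexicon keyed by id."""
    return [lexicon[i] for i in sense.subentry_ids if i in lexicon]


# Hypothetical data for illustration only.
hot_dog = Lexeme("lx2", "hot dog", [Sense("s2", "sausage in a bun")])
hot = Lexeme("lx1", "hot", [
    Sense("s1a", "high in temperature", subentry_ids=["lx2"]),
    Sense("s1b", "spicy"),
])
lexicon = {lx.lexeme_id: lx for lx in (hot, hot_dog)}
```

Because the subentry is itself a full lexeme reached by id, it can also be listed as a main entry without duplicating its content.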
And if we’d had more time ...
• <pron> generalizes to <form>, and selecting the headword out of the kinds of forms in <form> is done in the stylesheet.
• <aux> is not factored right; <etymology> goes with the whole entry, but other elements go with senses.
• <sense> should be recursive.
• Need <definition> as distinct from <gloss>.
• Needs to incorporate images and sound.
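
A rough Python sketch of three of these suggestions follows: a general form list from which the headword is chosen at rendering time, a recursive sense, and a definition kept separate from the gloss. The field names and the `kind` labels are assumptions made for illustration, not part of any proposed DTD.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Form:
    kind: str   # hypothetical labels, e.g. "orthographic", "phonetic"
    value: str


@dataclass
class Sense:
    gloss: str
    definition: Optional[str] = None               # distinct from the gloss
    subsenses: list = field(default_factory=list)  # recursive nesting


def pick_headword(forms, preferred=("orthographic", "phonetic")):
    """Rendering-time choice of which form to display as the headword."""
    for kind in preferred:
        for form in forms:
            if form.kind == kind:
                return form.value
    return None


def number_senses(senses, prefix=""):
    """Flatten nested senses into (label, gloss) pairs such as 1, 1.1, 2."""
    out = []
    for i, sense in enumerate(senses, start=1):
        label = f"{prefix}{i}"
        out.append((label, sense.gloss))
        out.extend(number_senses(sense.subsenses, prefix=label + "."))
    return out
```

Passing a different `preferred` tuple to `pick_headword` stands in for switching stylesheets: the stored forms stay the same and only the displayed headword changes.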
Conclusions
• The Bird & Bell DTD needs more work before we would be ready to endorse it as recommended best practice.
• Multiple stylesheets need to be released with a DTD in order to illustrate the separation of information and display.
• [It would also be worthwhile to explore what a best practice recommendation based on the TEI tag set (with a stricter DTD) would look like.]