E-MELD Meeting 2006 Report: Working Group 3: 'Lexicon creation'
Steve Abney, Dunstan Brown, Östen Dahl, Sebastian Drude, Susanna Imrie, Marc Kemps-Snijder, Christopher Manning, Mike Maxwell, Vivian Ngai
Tools and Standards: The State of the Art
Lexical databases and tools
• General remarks, ecology
• Comments on the tools page
• Description of needed tools and standards
General
'Lexicon creation' raises many more issues than tools alone once lexical data is considered in its 'environment':
- interoperability with other types of data
- search on lexical data
- presentation and archiving
- ...
Position with respect to documentation / description
Comments on tool pages
Format of presentation:
• Eliminate ratings, add keywords (main functionalities)
• Use basic tasks in the workflow related to lexical databases:
  • Interaction with (interlinear) texts
  • Concordance etc.
  • Consistency control
  • Output / presentation formats
Comments on tool pages
Workflow / tasks:
• Word discovery, creation of entries
• Enrichment of information on lexical units
• Revisions, cleaning up
• Merger with other databases, collaboration
• Queries, data mining / retrieval
• Output + presentation
Comments on tool pages
Potentially useful additions:
• Links to some exemplary on-line dictionaries
• Word lists for elicitation
• Mention general tools
• Mention Wiktionary technology
Needed tools and standards
Tools:
• Consistency control / management (values of data categories, structure)
• Morphological parsers (modules)
• Version control, collaboration
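To make 'consistency control' concrete, here is a minimal sketch of a check on both the values of a data category and the structure of entries. It assumes entries are plain dictionaries. The field names (`lexeme`, `pos`, `gloss`), the part-of-speech vocabulary, and the sample entries are all hypothetical illustrations, not taken from any particular tool.

```python
from typing import Dict, List, Set

# Hypothetical controlled vocabulary for the part-of-speech data category.
ALLOWED_POS: Set[str] = {"n", "v", "adj", "adv"}

# Hypothetical structural rule: fields every entry is expected to carry.
REQUIRED_FIELDS = ("lexeme", "pos", "gloss")


def check_entry(entry: Dict[str, str]) -> List[str]:
    """Return a list of consistency problems found in one entry."""
    problems: List[str] = []
    # Structure check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not entry.get(field, "").strip():
            problems.append(f"missing field: {field}")
    # Value check: the part of speech must come from the controlled vocabulary.
    pos = entry.get("pos", "").strip()
    if pos and pos not in ALLOWED_POS:
        problems.append(f"unknown pos value: {pos}")
    return problems


def check_lexicon(entries: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each problematic entry (by lexeme, or by index) to its problems."""
    report: Dict[str, List[str]] = {}
    for index, entry in enumerate(entries):
        problems = check_entry(entry)
        if problems:
            report[entry.get("lexeme") or f"#{index}"] = problems
    return report


# Illustrative sample data only.
sample_entries = [
    {"lexeme": "kata", "pos": "n", "gloss": "word"},
    {"lexeme": "lari", "pos": "verb", "gloss": "run"},  # value not in vocabulary
    {"lexeme": "besar", "pos": "adj"},                  # gloss missing
]
report = check_lexicon(sample_entries)
```

A real tool would presumably load the vocabulary and the required fields from a project configuration rather than hard-coding them, so that each project can define its own data categories.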
Needed tools and standards
Standards:
• It is difficult to imagine a widely accepted standard for a fixed microstructure for lexical entries
• There should nevertheless be proposals / templates, especially for specific areas
• There can be repositories of terminology / data categories, to choose from or for orientation
• Also, repositories for values (controlled vocabularies -> GOLD, semantic domains...)
Needed tools and standards
There are some proposals for standards that should be referred to for orientation:
• MDF
• OLIF
• LMF (ISO)
Nevertheless, tools should generally allow for customization of the structure
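As a rough illustration of a standard proposal combined with a customizable structure, the sketch below reads backslash-coded records in the MDF style. It assumes the common markers `\lx` (lexeme), `\ps` (part of speech) and `\ge` (English gloss). The function name, the target field names, and the sample text are hypothetical, and the marker-to-field mapping is a parameter so that a project can substitute its own structure.

```python
from typing import Dict, List, Optional

# Assumed default mapping from a few MDF-style markers to field names.
DEFAULT_MARKERS: Dict[str, str] = {"lx": "lexeme", "ps": "pos", "ge": "gloss"}


def parse_records(
    text: str,
    markers: Optional[Dict[str, str]] = None,
    record_marker: str = "lx",
) -> List[Dict[str, List[str]]]:
    """Split backslash-coded text into records, one per record marker.

    Every field holds a list of values, since a marker may repeat
    within a record. Markers not in the mapping are kept under their
    own name, so project-specific fields are not lost.
    """
    mapping = DEFAULT_MARKERS if markers is None else markers
    records: List[Dict[str, List[str]]] = []
    current: Optional[Dict[str, List[str]]] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything that is not a field line
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = {}
            records.append(current)
        if current is None:
            continue  # ignore fields that appear before the first record
        field = mapping.get(marker, marker)
        current.setdefault(field, []).append(value.strip())
    return records


# Illustrative sample text only; \xx stands in for a project-specific marker.
sample = "\\lx kata\n\\ps n\n\\ge word\n\n\\lx lari\n\\ps v\n\\ge run\n\\xx note\n"
records = parse_records(sample)
custom = parse_records(sample, markers={"lx": "headword"})
```

Keeping unknown markers instead of rejecting them is one possible design choice: it lets a tool point to a standard for orientation without forcing a fixed microstructure on every project.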