D-square (D-kwadraat). Digital Databases and Tools for Dutch Dialect Dictionaries. Jos Swanenberg, Folkert de Vriend & Roeland van Hout. Topics. Historical background Overview of project phases Conversion procedures New encoding for data End user access to the data.
D-square (D-kwadraat) Digital Databases and Tools for Dutch Dialect Dictionaries Jos Swanenberg, Folkert de Vriend & Roeland van Hout
Topics • Historical background • Overview of project phases • Conversion procedures • New encoding for data • End user access to the data
Macro structure WBD & WLD Volumes • Agricultural terminology • Other technical or craft terminologies • Common vocabulary
Micro structure WBD & WLD Constituents: • Lexical meaning (title, description of the concept) • Lexical form ('Dutchified' entry) • Phonetic form • Sources • Geographical code (+ map)
WBD & WLD Example of WLD, volume 1:
History of automation • 1960-1980: Filing cards • 1985-1995: Word processor, Genoveva • 1995-2007: Databases + word processing • 2002: Online database WBD • 2003-2007: D-square
WBD & WLD Filing cards:
Online database WBD www.ru.nl/dialect
Data flow (diagram) • Sources: questionnaires Nijmegen and Leuven; questionnaires (chiefly) Meertens; (parts of) Vol. I+II in MS-Word; Vol. I+II in MacWrite; Vol. III in FileM Pro; Vol. III in MS-Word; Vol. III on filing cards; online DB WBD (Polderland) • Data stages and formats: raw data (FileM Pro), edited data (XML), enriched data (XML) • Products: website WBD/WLD with tools for searching and cartography; specialized print editions (dialect atlas or local dictionary); SGV on CD (Polderland)
Overview phases D-square • Conversion to a new format • End user access to data • Enrichment of data • Data management
Reasoning behind new encoding • XML, not relational database • Tailored to WBD and WLD • Flexible enough to be used for other dialect dictionaries • Based on standard: LMF (ISO TC 37/SC 4)
Example XML-encoding
<LEXICON dialect="Brabants">
  <ENTRY>
    <META> … </META>
    <CONCEPT lang="dutch" ontol_id="492">Meikever</CONCEPT>
    <DATA>
      <VARIANT type="heteronym">Bakkertje
        <VARIANT type="lexical">bakkerke
          <VARIANT type="raw" import="diplomatic">bakkərkə
            <LOCATION source1="N83">K 178</LOCATION>
          </VARIANT>
        </VARIANT>
      </VARIANT>
    </DATA>
  </ENTRY>
  …
</LEXICON>
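To illustrate how the nested VARIANT levels in the example could be read by a tool, here is a minimal Python sketch that flattens one entry into a record per location. The element and attribute names come from the slide; the function names `flatten` and `walk`, the record layout, and the omission of the elided META block are assumptions made for illustration, not part of the D-square tooling.

```python
import xml.etree.ElementTree as ET

# Sample copied from the slide, without the elided META block.
SAMPLE = """<LEXICON dialect="Brabants">
  <ENTRY>
    <CONCEPT lang="dutch" ontol_id="492">Meikever</CONCEPT>
    <DATA>
      <VARIANT type="heteronym">Bakkertje
        <VARIANT type="lexical">bakkerke
          <VARIANT type="raw" import="diplomatic">bakkərkə
            <LOCATION source1="N83">K 178</LOCATION>
          </VARIANT>
        </VARIANT>
      </VARIANT>
    </DATA>
  </ENTRY>
</LEXICON>"""


def walk(variant, inherited, rows):
    """Descend through nested VARIANT elements (hypothetical helper).

    Each level adds its own form under its type name; a row is emitted
    for every LOCATION found at the current level.
    """
    record = dict(inherited)
    record[variant.get("type")] = (variant.text or "").strip()
    for loc in variant.findall("LOCATION"):
        rows.append({
            **record,
            "location": (loc.text or "").strip(),
            "source": loc.get("source1"),
        })
    for child in variant.findall("VARIANT"):
        walk(child, record, rows)


def flatten(xml_text):
    """Return one flat dict per attested location (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("ENTRY"):
        base = {
            "dialect": root.get("dialect"),
            "concept": (entry.findtext("CONCEPT", default="") or "").strip(),
        }
        for top in entry.findall("DATA/VARIANT"):
            walk(top, base, rows)
    return rows


if __name__ == "__main__":
    for row in flatten(SAMPLE):
        print(row)
```

Each row carries the concept, the heteronym, the lexical variant, the raw form, and the geographical code together, which is roughly the shape a search or mapping tool would need.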
Small scale survey • Tools: search engine, cartographic tool, format conversions • Enrichment: POS, morphemes (syllables) • Links to other resources: other dictionaries, questionnaires, FAND, MAND
Difficulties to overcome • Search engine: getting from question to query (coaching needed; is SmartMatch (fuzzy matching) helpful in this regard?); speed of XML searching • Cartography: availability of base maps • Links to other resources: differences in interpretation
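As a rough illustration of what fuzzy matching can do for the step from question to query, the sketch below suggests headwords that resemble a user's spelling. It uses Python's standard `difflib` module as a stand-in; it is not SmartMatch, whose workings are not described here. The headword list, the function name `suggest`, and the cutoff value are assumptions chosen for the example.

```python
import difflib

# Hypothetical headword list; a real tool would draw on the dictionary data.
HEADWORDS = ["meikever", "bakkertje", "bakkerke", "molenaar"]


def suggest(query, headwords=HEADWORDS, limit=3, cutoff=0.7):
    """Return up to `limit` headwords that resemble `query`.

    Similarity is difflib's ratio between 0 and 1; candidates scoring
    below `cutoff` are dropped, and the best match comes first.
    """
    return difflib.get_close_matches(
        query.lower(), headwords, n=limit, cutoff=cutoff
    )


if __name__ == "__main__":
    for typed in ["meikefer", "bakkerkje", "xyz"]:
        print(typed, "->", suggest(typed))
```

A misspelled query such as "meikefer" still leads to "meikever", while a query with no resemblance to any headword returns an empty list, so the user can be prompted to rephrase.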
Information about D-square www.ru.nl/dialect