Populating the infrastructure: the case of the Netherlands. Hans Bennis, executive board of CLARIN-NL, Meertens Institute (KNAW). CLARIN Coordinators, Budapest, June 29-30
the start in 2009
• € 9 million for CLARIN-NL for the period 2009-2015 (amount requested: € 25 million)
• concentration on text (language data for humanities research)
• audio and video are left out, in contrast to the original proposal
• social sciences are not included, in contrast to the original proposal
• organizational structure: director, executive board, board, advisory panels (national and international)
• a substantial part of the money will be spent in programmatic form through Calls
• important goal / ambition: create broad support for CLARIN in humanities research in the Netherlands
Projects 2009
• technical projects (centers, metadata, web services, workflow, etc.)
• centers: Max Planck Institute for Psycholinguistics (MPI, Nijmegen), Meertens Institute (Amsterdam), DANS (Den Haag) and Institute for Dutch Lexicology (INL, Leiden)
• user survey
• Call-1 (Demonstrator Projects or Resource Curation projects)
• 12 projects (approx. € 60,000 each)
• demonstrator projects
• data curation projects
Call-1 Projects
• AAM-LR [UNijmegen/MPI] – Automatic annotation of language resources
• Adelheid [UNijmegen/MPI] – Lemmatizer for Historical Dutch
• Adept [UGroningen/Meertens] – Dialect Analysis
• Duelme-LMF [UUtrecht/INL] – Multi-word expressions
• INTER-VIEWS [UNijmegen/DANS] – Life-history interviews with veterans
• MIMORE [UUtrecht/Meertens] – Dialect morphosyntax
• SignLinC [UNijmegen/MPI] – Sign Language
Call-1 (more)
• TDS Curator [UUtrecht/DANS] – Typological Database
• TICCLops [UTilburg/INL] – Text Clean-up
• TQE [UNijmegen/MPI] – Transcription evaluation
• WFT-GTB [Fryske Akademy/INL] – Integration of Dutch and Frisian dictionaries
• CKCC [UUtrecht, Huygens Institute, DANS] – Correspondence of scholars in the 17th century
Demonstration of the Microcomparative Morphosyntactic Research Tool MIMORE Sjef Barbiers, Matthijs Brouwer, Jan Pieter Kunst, Folkert de Vriend Meertens Instituut, 2011
Research question
• The Standard Dutch [non-neuter] relative pronoun and distal demonstrative has the form ‘die’ (that, those).
• We know that there are dialects that have ‘dien’ as a relative pronoun and/or as a distal demonstrative.
• We would like to know whether there is a correlation between ‘dien’ as a relative pronoun, ‘dien’ as a demonstrative preceding a noun, and ‘dien’ as a demonstrative in elliptical constructions.
• The linguistic question behind this search is what the ‘-n’ on ‘die’ is: a case marker, a phonologically determined ending, or something else?
Search 1: DynaSAND with text string and tag constructor: ‘dien’ as relative pronoun
Result of search 3: demonstrative ‘dien’ in elliptical nominal groups in DIDDD
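The correlation asked for in the research question can be approximated by comparing the sets of locations returned by the three searches. The sketch below is only an illustration in Python: the location labels ("A" to "E"), the dictionary `hits`, and the choice of the Jaccard index as overlap measure are assumptions made for this example, not MIMORE output and not the method used in the demonstration.

```python
from itertools import combinations

# Hypothetical location sets per construction: placeholder labels,
# NOT actual search results from the MIMORE tool.
hits = {
    "relative": {"A", "B", "C", "D"},    # 'dien' as relative pronoun
    "adnominal": {"B", "C", "D", "E"},   # 'dien' as demonstrative before a noun
    "elliptical": {"C", "D"},            # 'dien' as demonstrative in ellipsis
}


def jaccard(a, b):
    """Share of locations that two result sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(result_sets):
    """Jaccard overlap for every pair of constructions."""
    return {
        (x, y): jaccard(result_sets[x], result_sets[y])
        for x, y in combinations(sorted(result_sets), 2)
    }


overlap = pairwise_overlap(hits)
# Locations where 'dien' occurs in all three constructions.
all_three = set.intersection(*hits.values())

for pair, score in overlap.items():
    print(pair, round(score, 2))
print("all three:", sorted(all_three))
```

A high overlap between two constructions would suggest that the ‘-n’ has the same source in both; a location set for one construction that is a strict subset of another would point to an implicational relation instead.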
CALL-2 (2011)
1) Arthurian Fiction [UUtrecht] – Curation of two databases for literary research
2) C-DSD [UUtrecht/Meertens] – Curation of Folksong Database
3) COAVA [Meertens] – Bringing together five linguistic databases (language variation/acquisition)
4) INPOLDER [UNijmegen/Meertens] – Syntactic analysis of historical Dutch
5) IPROSLA [UNijmegen/UAmsterdam/MPI] – Sign language databases
CALL-2 (more)
6) NEHOL [UNijmegen] – Curation of Negerhollands database
7) VU-DNC [VU-Amsterdam] – Corpus of Dutch newspapers
8) WAHSP [UUtrecht] – Text mining in large historical databases
9) WIP [NIOD] – Data curation of Dutch Second World War database
developments
• collaboration with the CATCH programme (a programme that finances projects for teams of ICT developers, humanities scholars and cultural heritage institutions)
• CLAVAS – vocabularies
• Persistent Identifiers
• Data Curation Service (>2011)
• Call 3 (call open now; projects in 2012)
• Agreement with the Dutch Science Foundation (NWO) and the Royal Netherlands Academy of Arts and Sciences (KNAW) on a CLARIN norm for databases/tools in the humanities
• CLARIN-NL + DARIAH-NL => CLARIAH – Dutch Roadmap