
Populating the infrastructure: the case of the Netherlands

Explore the development of the CLARIN infrastructure in the Netherlands since 2009, focusing on text data for humanities research, technical projects, and collaborations with various institutions. This includes funded projects like MIMORE for dialect analysis and integration of Dutch and Frisian dictionaries. Learn about ongoing efforts to support research in the humanities through CLARIN and the Dutch Science Foundation.

Presentation Transcript


  1. Populating the infrastructure: the case of the Netherlands. Hans Bennis, executive board of CLARIN-NL, Meertens Institute (KNAW). CLARIN Coordinators, Budapest, June 29-30

  2. the start in 2009
     • 9 million Euro for CLARIN-NL for the period 2009-2015 (requested amount: 25 million Euro)
     • concentration on text (language data for humanities research)
     • audio and video are left out, in contrast to the original proposal
     • social sciences are not included, in contrast to the original proposal
     • organizational structure: director, executive board, board, advisory panels (national and international)
     • a substantial part of the money will be spent in programmatic form through Calls
     • important goal / ambition: create broad support for CLARIN in humanities research in the Netherlands

  3. Projects 2009
     • technical projects (centers, metadata, web services, workflow, etc.)
     • centers: Max Planck Institute for Psycholinguistics (MPI, Nijmegen), Meertens Institute (Amsterdam), DANS (Den Haag) and Institute for Dutch Lexicology (INL, Leiden)
     • user survey
     • Call-1 (Demonstrator Projects or Resource Curation projects)
     • 12 projects (approx. € 60,000 each)
     • demonstrator projects
     • data curation projects

  4. Call-1 Projects
     • AAM-LR [UNijmegen/MPI] – Automatic annotation of language resources
     • Adelheid [UNijmegen/MPI] – Lemmatizer for Historical Dutch
     • Adept [UGroningen/Meertens] – Dialect Analysis
     • Duelme-LMF [UUtrecht/INL] – Multi-word expressions
     • INTER-VIEWS [UNijmegen/DANS] – Life-history interviews with veterans
     • MIMORE [UUtrecht/Meertens] – Dialect morphosyntax
     • SignLinC [UNijmegen/MPI] – Sign Language

  5. Call-1 (more)
     • TDS Curator [UUtrecht/DANS] – Typological Database
     • TICCLops [UTilburg/INL] – Text Clean-up
     • TQE [UNijmegen/MPI] – Transcription evaluation
     • WFT-GTB [Fryske Akademy/INL] – Integration of Dutch and Frisian dictionaries
     • CKCC [UUtrecht, Huygens Institute, DANS] – Correspondence of scholars in the 17th century

  6. Demonstration of the Microcomparative Morphosyntactic Research Tool MIMORE Sjef Barbiers, Matthijs Brouwer, Jan Pieter Kunst, Folkert de Vriend Meertens Instituut, 2011

  7. Opening screen MIMORE

  8. Research question
     • The Standard Dutch [non-neuter] relative pronoun and distal demonstrative has the form ‘die’ (that, those).
     • We know that there are dialects that have ‘dien’ as a relative pronoun and/or as a distal demonstrative.
     • We would like to know if there is a correlation between ‘dien’ as a relative pronoun, ‘dien’ as a demonstrative preceding a noun, and ‘dien’ as a demonstrative in elliptical constructions.
     • The linguistic question behind this search is what the ‘-n’ on ‘die’ is: case, phonologically determined, etc.?

  9. Optional restrictions on the search

  10. Search 1: DynaSAND with text string and tag constructor: ‘dien’ as relative pronoun

  11. Elements of search result

  12. Specification of data resource

  13. Corresponding sound fragment

  14. Search 2: GTRP with demonstrative + N in test item

  15. Elements of search result

  16. Result of search 3: demonstrative ‘dien’ in elliptical nominal groups in DIDDD

  17. Available operations on search results

  18. Map combining three search results

  19. Map combining two search results

  20. Frequency maps

  21. Creating the intersection of two sets of search results
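
      The intersection step can be illustrated with a minimal sketch. It assumes that each search result can be reduced to a mapping from dialect location to number of matching records; the location names, counts, and the `intersect_locations` helper are hypothetical and are not part of MIMORE itself.

      ```python
      # Hypothetical search results: dialect location -> number of matching records.
      # The names and counts are placeholders, not data from the presentation.
      relative_dien = {"Location A": 3, "Location B": 1, "Location C": 2}
      demonstrative_dien = {"Location B": 4, "Location C": 1, "Location D": 2}


      def intersect_locations(*results):
          """Return the sorted locations that occur in every given result set."""
          if not results:
              return []
          common = set(results[0])
          for result in results[1:]:
              common &= set(result)  # keep only locations present in this result too
          return sorted(common)


      both = intersect_locations(relative_dien, demonstrative_dien)
      print(both)
      ```

      Locations that survive the intersection are the ones where both uses of ‘dien’ were found, which is the kind of co-occurrence the research question asks about.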

  22. Export as Excel-file

  23. Data exported

  24. Complex search: more than one database, string of tags

  25. CALL-2 (2011)
     1) Arthurian Fiction [UUtrecht] – Curation of two databases for literary research
     2) C-DSD [UUtrecht/Meertens] – Curation of Folksong Database
     3) COAVA [Meertens] – Bringing together five linguistic databases (language variation/acquisition)
     4) INPOLDER [UNijmegen/Meertens] – Syntactic analysis of historical Dutch
     5) IPROSLA [UNijmegen/UAmsterdam/MPI] – Sign language databases

  26. CALL-2 (more)
     6) NEHOL [UNijmegen] – Curation of Negerhollands database
     7) VU-DNC [VU-Amsterdam] – Corpus of Dutch newspapers
     8) WAHSP [UUtrecht] – Text mining in large historical databases
     9) WIP [NIOD] – Data curation of Dutch Second World War database

  27. developments
     • collaboration with the CATCH programme (a programme to finance projects for teams of ICT developers, humanities scholars and cultural heritage institutions)
     • CLAVAS – vocabularies
     • Persistent Identifiers
     • Data Curation Service (>2011)
     • Call 3 (call open now; projects in 2012)
     • Agreement with the Dutch Science Foundation (NWO) and the Royal Netherlands Academy of Arts and Sciences (KNAW) with respect to a CLARIN norm for databases/tools in the humanities
     • CLARIN-NL + DARIAH-NL => CLARIAH – Dutch Roadmap
