HLT: Current and Recent Research & Development in the Netherlands 2001-2008 Jan Odijk 24 Nov 2008
Overview • Earlier History • Instruments & Programmes • Players and their Projects • Concluding Remarks
Earlier History • MT Projects in the 80’s • Eurotra (EU, 1985- ca.1990) • Distributed MT (BSO, 1984-ca.1990) • Rosetta (Philips, 1985-1992) • Emerging Community
Earlier History • CLIN (Computational Linguistics in the Netherlands) initiated in 1990 • After TIN (Linguistics in the Netherlands) • But no association • Yearly informal conference • no or little pre-selection • Yearly Proceedings • Selection of reviewed articles • http://www.let.rug.nl/~vannoord/clin/clin.html
Earlier History • Community further strengthened by common projects in the 90’s • Priority Programme LST centered around public transportation information services (OVIS) • Corpus Gesproken Nederlands (Spoken Dutch Corpus), together with Flanders
Overview • Earlier History • Instruments & Programmes • Players and their Projects • Concluding Remarks
Instruments • NWO Pionier • NWO Vernieuwingsimpuls • Veni, Vidi, Vici • http://www.nwo.nl/nwohome.nsf/pages/NWOA_4YJDQ3 • EZ Bsik • Increase knowledge and research capacity for 5 selected areas (incl. ICT) • For mixed public/private consortia that bundle knowledge, expertise and innovative capacity • http://www.senternovem.nl/BSIK/ • EC (IST)
Overview • Earlier History • Instruments & Programmes • Players and their Projects • Concluding Remarks
IMIX • Name: Interactive Multimodal Information Extraction • Duration: 2001-2008 • Budget: 2M€ • 4 small programmes, 3 post-doc projects • demonstrator • URL: http://www.nwo.nl/IMIX • Funding: NWO
IMIX Goals • Aims to develop knowledge and technology needed • To find specific answers to specific questions in Dutch-language documents • using multiple modalities at the input and output sides
STEVIN • Name: STEVIN • Duration: 2004-2011 • Budget: 11.4M€ • 19 R&D projects • 14 demonstration projects • Networking activities, educational activities, … • Funding: Netherlands 2/3 & Flanders 1/3 • http://taalunieversum.org/taal/technologie/stevin/
STEVIN Goals • contribute to the further progress of HLT for the Dutch language • realise an appropriate digital language infrastructure for the Dutch language • carry out strategic research in the domains of language and speech technology, in particular in areas for which there is a large demand from specific applications and technologies; • create networks and core research areas; • promote the embedding of research and educate new generations of experts; • encourage demand and knowledge transfer.
CATCH • Name: Continuous Access to Cultural Heritage • Duration: 2005-?? • Budget: 6M€ until 2008 and 3M€ in 2008 • Currently 10 running projects • URL: http://www.nwo.nl/catch • Funding: NWO + OCW. Cultural heritage institutes contribute in kind (2.8M€ so far)
CATCH Goals • Aims to develop generic methods and techniques • cutting across the areas of the humanities and computer science, • aiming to facilitate an interaction with cultural heritage institutions.
IOP MMI • Name: Innovation-oriented Research Programme (IOP) Man Machine Interaction • Duration: 1999-2003 (phase 1); 2004-2007 • Budget: ?? • URL: http://www.senternovem.nl/iopmensmachineinteractie/index.asp • Funding: EZ (Min. of Economic Affairs)
IOP MMI Goals (phase 2) • focus on the Design, Implementation and Evaluation of Intelligent Systems • Which dynamic knowledge (of one another) should systems and users acquire and apply in order to optimally achieve their goals
CLARIN-NL(?) • Name: Common Language Resource and Technology Infrastructure - Netherlands • Duration: 2009-2014 • Budget: 22M€ requested • URL: • Funding: OC&W (ESFRI)
CLARIN-NL Goals • CLARIN-NL aims to design, construct, validate, and exploit • a research infrastructure that is needed to provide a sustainable and persistent eScience working environment • for researchers in the Humanities and Social Sciences (HSS) • who want to make use of language resources and technology.
Overview • Earlier History • Instruments & Programmes • Players and their Projects • Concluding Remarks
Amsterdam (UvA) • Institute: Informatics Institute • Core Topics: • Information Retrieval • Question Answering • Key people • Maarten de Rijke
Amsterdam (UvA) • QASSIR (NWO, 2006-2010) • Question Answering as Semistructured Information Retrieval • EfFoRT (NWO, 2006-2010) • Effective Focused Retrieval Techniques • UvA, (Twente) • MultiMATCH (EU, 2006-2009) • Multilingual/Multimedia Access To Cultural Heritage • 11 partners from Europe incl. UvA
Amsterdam (UvA) • MuNCH (CATCH, 2005-2009) • Multimedia aNalysis for Cultural Heritage. • UvA, Beeld & Geluid (B&G), Digitaal Erfgoed Nederland • MuSeUM (CATCH, 2005-2009) • Multiple-collection Searching Using Metadata. • UvA, Gemeentemuseum Den Haag, Rijksbureau voor Kunsthistorische Documentatie, Municipal Archives Rotterdam • A Model Checking Approach to Query Evaluation on XML Documents (NWO, 2004-2008)
Amsterdam (UvA) • FactMine (IMIX, 2004-2007) • Fact and Ontology Mining for Question Answering • UvA, DFKI, Antwerpen, Erasmus MC • AID (Dutch Government, 2004-2008) • Adaptive Information Disclosure.
Amsterdam (UvA) • ITEQA (NWO, 2004-2007) • Inference for Temporal Question Answering. • DuOMAn (STEVIN, 2008-2011) • Dutch Online Media Analysis • UvA, Groningen, Gent, TrendLight, GridLine • DAESO (STEVIN), Cornetto (STEVIN), KYOTO (EU IST), CLARIN-NL
Amsterdam (UvA) • Institute: ILLC • Topics • Data Oriented Parsing (DOP) • Key Persons: • Remko Scha • Rens Bod • Khalil Sima’an
Amsterdam (UvA) • U-DOP (NWO VICI, 2006-2011) • Unsupervised Learning with the DOP Model • DOP and Unsupervised Grammar Induction (NWO, 2004-2007) • Unsupervised stochastic grammar induction from unlabeled data • DOP and Learning Stochastic Tree-Grammars (NWO, 2003-2006)
Amsterdam (VU) • Department: Language and Communication • Core Topics: • Computational Lexicology • Key people • Piek Vossen
Amsterdam (VU) • Cornetto (STEVIN, 2006-2008) • Combinatorial and Relational Network as Toolkit for Dutch Language Technology • VU, UvA, Leuven, Irion • KYOTO (EU FP7 ICT, 2008-2011) • Knowledge Yielding Ontologies for Transition-based Organization • VU, 8 other European partners • CLARIN
Groningen • Department: Centre for Language and Cognition / Computational Linguistics • Core Topics: • Syntax and Parsing • Key people • John Nerbonne • Gertjan van Noord • Gosse Bouma
Groningen • Alpino (NWO PIONIER, 2000-2005) • Algorithms for Linguistic Processing • QADR (IMIX, 2004-2008) • Question Answering for Dutch using Dependency Relations • Groningen, Spectrum • COREA (STEVIN, 2005-2007) • Coreference Resolution for Extracting Answers • Groningen, Antwerpen, Language and Computing
Groningen • LASSY (STEVIN, 2006-2009) • Large Scale Syntactic Annotation of written Dutch • Groningen, Leuven • SCRATCH (CATCH, ??-??) • SCRipt Analysis Tools for the Cultural Heritage • Groningen, Nationaal Archief • D-COI (STEVIN), IRME (STEVIN), DAISY (STEVIN), DuOMAn (STEVIN), PaCo-MT (STEVIN), CLARIN, CLARIN-NL
Nijmegen • Institute: Centre for Language and Speech Technology • Core Topics: • Speech Processing • Language Resource Development • Key people • Lou Boves, Nelleke Oostdijk, Helmer Strik, Henk van den Heuvel
Nijmegen • NORISC (IMIX, 2004-2007) • Next generatiOn template based Recognition for Interactive man-machine Speech Communication • COMIC (EU IST, 2002-2005) • COnversational Multimodal Interaction with Computers • MATIS (IOP-MMI, ??-??) • Multimodal Access to Transaction and Information Services
Nijmegen • D-COI (STEVIN, 2005-2006) • Dutch Language Corpus Initiative • Nijmegen, Tilburg, Twente, Groningen, Utrecht, Leuven, Polderland • SoNaR (STEVIN, 2008-2011) • STEVIN Nederlandstalig Referentiecorpus • Nijmegen, Tilburg, Twente, Utrecht, Leuven, Gent • JASMIN-CGN (STEVIN, 2005-2007) • Extension of CGN with speech of children, non-natives, elderly and human-machine interaction • Nijmegen, Leuven, Talkinghome
Nijmegen • ACORNS (EU, 2008-2010) • Acquisition of Communication and Recognition Skills • “Intends to […] create an artificial agent that is capable of acquiring human verbal communication behaviour” • Nijmegen, 5 other European partners • Avoiding the ham in hamster (NWO VENI, 2006-2010) • Modelling the use of non-segmental information in human spoken-word recognition
Nijmegen • BATS (ICTRegie, IBBT, 2008-2012) • Topic and Speaker Tracking in Broadcast Archives • Nijmegen, Leuven • SPEX • Speech Processing Expertise Centre • Collection, Annotation & Validation • ELRA Speech Validation Centre
Nijmegen • Autonomata TOO (STEVIN, 2008-2010) • Autonomata Transfer of Output • Nijmegen, Gent, Utrecht, TeleAtlas, Nuance • DISCO (STEVIN, 2008-2011) • Development and Integration of Speech technology into COurseware for language learning • Nijmegen, Linguapolis Antwerpen, Taal- & Communicatiecentrum Nijmegen, Polderland • Autonomata (STEVIN), MIDAS (STEVIN), NBest (STEVIN), Praat (STEVIN), SPRAAK (STEVIN), CLARIN, CLARIN-NL, A Propos (IOP-MMI)
Soesterberg • Institute: TNO Defense & Security • Core Topics: • Speech Processing • Speech Technology Evaluation • Key people • David van Leeuwen
Soesterberg • NBest (STEVIN, 2006-2008) • Northern and Southern Dutch Benchmark Evaluation of Speech recognition Technology • TNO Soesterberg, Nijmegen, Twente, Leuven, Gent, Delft • SPRAAK (STEVIN)
Tilburg • Institute: Tilburg Centre for Creative Computing, Induction of Linguistic Knowledge (ILK) Research group • Core Topics: • Machine-learning • Memory-Based learning • Key people • Antal van den Bosch
Tilburg • ROLAQUAD (IMIX, 2004-2008) • Robust Language Understanding in Question-Answering Dialogue • Tilburg, Textkernel • Implicit Linguistics (NWO VICI, 2005-2009) • Machine Learning of Text-to-Text Processing • A Propos (IOP MMI, 2006-2009?) • Proactive Personalization for Professional Document Writing • Tilburg, Nijmegen, industrial partners
Tilburg • MITCH (CATCH, 2007-2009?) • Mining for Information in Texts from the Cultural Heritage • National Museum of Natural History, Tilburg • D-COI (STEVIN), CLARIN, CLARIN-NL, SoNaR (STEVIN)
Tilburg • Institute: Communication and Cognition • Core Topics: • Communication and Cognition • Language Generation • Multimodality • Key people • Emiel Krahmer
Tilburg • IMOGEN (IMIX, 2004-2008) • Interactive Multimodal Output Generation • Tilburg and Twente • Bridging the gap between psycholinguistics and computational linguistics (NWO VICI, 2007-2011) • Generation of referring expressions • DAESO (STEVIN, 2006-2009) • Detecting and Exploiting Semantic Overlap • Tilburg, Antwerpen, Amsterdam (UvA), Textkernel
Tilburg • TUNA (EPSRC (UK), 2003-2007) • Towards a Unified Algorithm for the Generation of Referring Expressions • Aberdeen, Open University (UK), Tilburg • FOAP (NWO Vidi, 2003-2007) • Functions of Audio-Visual Prosody
Tilburg • Institute: Communication Information Sciences • Core Topics: • Computational Semantics and Pragmatics • Dialogue Theory • Multimodal Interaction • Key people • Harry Bunt
Tilburg • Paradime (IMIX, 2004-2008) • Parallel Agent-based Dialogue Management Engine
Twente • Institute: Human Media Interaction • Core Topics • Multimodality • Speech Recognition • Key people • Franciska de Jong, Anton Nijholt • Arjan van Hessen, Roelof Ordelman
Twente • AMI (EU IST, 2004-2006) • Augmented Multi-party Interaction • IDIAP (CH) and 11 other partners incl. Twente • M4 (EU IST, 2002-2005) • MultiModal Meeting Manager • Sheffield and 8 other partners incl. Twente • DRUID (Telematica ??-2003) • Multimedia Indexing & Retrieval on the basis of Image Processing & Language and Speech Technology • TNO TPD, TNO TM, Twente, CWI
Twente • SAFIR (EU, 2003-2007) • Speech Automatic Friendly Interface Research • 17 European partners incl. Twente • Angelica (NWO Meervoud (?), 2003-2007) • A Natural-Language Generator for Embodied, Lifelike Conversational Agents