CLARIN
Common Language Resources and Technology Infrastructure
Frank Van Eynde
Center for Computational Linguistics, KU Leuven
A Fact about HSS
In the Humanities and the Social Sciences (HSS) much of the research is language-based:
• Linguistics
• Literary studies
• History
• Philosophy
• Theology
• Communication sciences
• Branches of sociology, law and psychology
For the HSS scholar the universe of texts and recorded speech is what the cosmos is for the astronomer and the natural world for the biologist.
A Need of HSS
• Biology has benefited from the microscope.
• Astronomy has benefited from the telescope.
• HSS can benefit from the tools and resources which have been developed in the fields of language and speech technology (LST).
• BUT: the LST products are insufficiently available to HSS researchers and HSS researchers are insufficiently prepared to make good use of them.
• ERGO: the great potential of LST for HSS is not yet realized.
The answer: CLARIN
• The pan-European CLARIN program aims to set up and maintain a persistent research infrastructure that allows HSS researchers to use state-of-the-art LST products and resources.
• How? By turning the existing fragmented technology and resources into an accessible, flexible and stable network of services available from the user's desktop.
• Special attention to user-friendliness: easy to use, also for people with limited knowledge of technology.
• Opening new perspectives for research in the Humanities and the Social Sciences.
HSS Researcher
The user is only confronted with the ‘outside’; the internal organisation of the research infrastructure (RI) is hidden.
CLARIN: scope and status
• All languages spoken and/or studied in the European countries are covered (approximately 100)
• For each country
  • its official language(s), plus
  • languages outside Europe (Inuit languages, native-American languages, African languages, extinct languages, …)
• Identified in 2006 by ESFRI as a major research infrastructure for humanities and social sciences
• 156 members, of which 33 actively participated in the European part of the preparatory phase (2008-2010)
Preparatory Phase
• TTNWW: Language and speech technology for Dutch as a web service
  Ineke Schuurman (CCL, Leuven), Marc Kemps-Snijders (Meertens Instituut, Amsterdam)
• STYLENE: An environment for stylometry and readability research
  Walter Daelemans (CLIPS, Antwerpen), Veronique Hoste (LT3, Gent)
Preparatory Phase
• NEDERBOOMS: Treebank mining for data-based linguistics
  Liesbeth Augustinus (CCL, Leuven), Vincent Vandeghinste (CCL, Leuven)
• SPRAAK2TAAL: A web service for sentiment and opinion analysis in written and spoken discourse
  Sien Moens (Computer Science, Leuven), Patrick Wambacq (ESAT, Leuven)
Construction Phase
• MIMORE: A microcomparative morphosyntactic research tool
  Sjef Barbiers (Meertens Instituut, Amsterdam)
• WAHSP: A web application for historical sentiment mining in public media
  Toine Pieters (Huygens Instituut, Utrecht)
Keynote speaker
Jan Odijk
• Master in Slavonic languages and literature, General linguistics (Utrecht)
• PhD on Compositionality and syntactic generalizations (Tilburg)
• Rosetta (Philips, Eindhoven)
• Director Linguistic Resources (Lernout & Hauspie, Ieper)
• Director Linguistic Resources (Nuance, Merelbeke)
• Professor in Language & Speech Technology (Utrecht)
• Chairman of the Dutch-Flemish STEVIN program committee
THANK YOU!
http://www.ccl.kuleuven.be/CLARIN