Language Technology in European Funding Programmes
Kimmo Rossi
DG Information Society and Media
Unit E1 – Language Technology, Machine Translation
INFSO-E1 12/09/2008
DG INFSO E in Luxembourg
Introducing DG INFSO
• DG Information Society and Media
• 10 directorates
• Not only a (research) funding agency: also policy making and market regulation
• Strategic policy framework: i2010
• Most sizeable implementation task: ICT programme of FP7 (9.1 B€)
Directorate E – Digital Content & Cognitive Systems
• Located in Luxembourg
• 7 units, 130 people
• Main themes:
  • Digital libraries
  • Public sector information
  • Language technology
  • Cognitive systems, robotics
  • Safer Internet
  • Content production and (re-)use
Unit E1 – Language Technology, Machine Translation
• Until 30/6/2008: Interaction & Interfaces
• New name implies more emphasis on (novel) language technology
• Changed political, technological and societal context
  • EU of 23 official languages
  • Breakthrough of data-driven MT
  • Community-based approach to content production and use (Web 2.0)
Brief history (1993-2003)
• MLIS-FP3-FP4-LE-ESPRIT
  • Dedicated Language Technology projects (Machine Translation, translation tools, terminology, standardisation, language resources)
  • Some important seed technology (e.g. Translator’s Workbench, EBMT)
  • About 100 small projects (less than 1 MEUR)
• FP5 (5th framework programme)
  • Applications of Language Technology (cars, mobile services, information retrieval, knowledge management)
  • 90+ projects, 130+ MEUR funding
  • Mid-sized projects: 0.7 – 2.5 MEUR
From history to present
• FP6 (6th framework programme)
  • 28 projects (20 with LT), 120 MEUR
  • Larger projects (up to 15 MEUR)
  • Non-linguistic interaction: about 12 projects: TnD, TAI-CHI, ARTTS, SATIN, …
  • “Pure” language technology: TC-STAR, TALK, EUROMATRIX, SMART, LUNA, …
  • Multimodal interaction: CHIL, AMI, AMIDA, …
• FP7 (7th framework programme)
  • Challenge 2: cognitive systems, robotics, interaction
  • Objective 2.2: language-based interaction
Current projects
• EuroMatrix (www.euromatrix.net)
  • Introducing linguistics into SMT
  • Factored SMT (treelet alignment, morphology)
  • Improvement by various means (combining engines etc.)
  • Inventory of MT systems and language resources for SMT
• SMART (www.smart-project.eu)
  • Optimising SMT (algorithms, parameters, combining SMT engines)
  • Machine learning techniques (e.g. kernel methods)
  • Towards adaptive MT
• ITALK (italkproject.org)
  • iCub robot learns language by doing
  • Integrates action, social and linguistic skills, cognitive development
  • Language learning from scratch, for simple tasks
• ROSSI (www.rossiproject.net)
  • How language development is linked to physical experience
  • Sensori-motor grounding of human conceptualisation and language use
  • Neurological basis underlying the use of verbs and nouns
  • Grounding of object affordances
Current projects
• ALEAR (www.alear.eu)
  • Language evolution in robot populations
  • Baseline: language games (Luc Steels), recruitment theory
  • Self-organisation of conceptual frameworks and communication systems
• POETICON (www.poeticon.eu)
  • Linking sensori-motor representations with linguistic ones
  • Extending Lexicon into Praxicon (grammar and lexicon of action)
• EMIME (www.emime.org)
  • Unified modelling of speech recognition and synthesis
  • Personalised speech synthesis (“your voice speaking Chinese”)
  • Reversibility of statistical modelling techniques
• FlareNet (http://www.ilc.cnr.it/flarenet/)
  • Thematic network on Language Resources
  • What are Language Resources?
  • Promotes interaction of LR stakeholders
  • Inventory and roadmap for action
  • Open for new members – join now!
Trends
• New requirements – new approaches
  • From Web 1.X to Web 2.0 – we are all content producers
  • From static and uni-directional to dynamic, volatile, interactive, collaborative
  • Translations are needed “on the fly”
  • Are language technologies up to the task?
• What happens to online content?
  • Disappearing document
  • What is on the Internet? Who knows? Google?
  • Europa website: 6 million “documents”
  • Disappearing distinction between content and service
  • How to manage (automatically?) the multilingual online “content”
• From service to self-service
  • Travelling, banking, house-buying …
  • Need for language-literate systems
• Multilingualism on the rise
  • In the EU (from 4 to 23 languages) – and the global dimension
  • Online content becomes more multilingual
  • English gains ground – but mother tongues remain
Challenges
• Machine translation – new paradigms
  • Adaptive, self-learning MT systems
  • MT that learns from its mistakes
  • Learning through understanding and vice versa
  • Is there any future for RBMT?
• Bringing scientific communities together
  • Learning a common language among researchers
  • Detecting common interest & mutual benefit
  • Scientific communities can learn from each other
• Language resources
  • Exploit the hidden treasures (e.g. public sector resources)
  • Improve usability of existing resources
  • Identify and address gaps in coverage
  • Reusable, standardised, automated collection (e.g. from Web)
  • Towards automation: harvesting LRs from the Web
Funding opportunities ahead?
• Language-based interaction: priorities will be defined in the next FP7 Work programme (covering call 4): http://cordis.europa.eu/fp7/dc
• Online multilingualism: creation of a single European information space. Watch out for future actions in the ICT-PSP programme: http://ec.europa.eu/information_society/activities/ict_psp
Thank you for your attention
Contact: Kimmo.Rossi@ec.europa.eu