Roadmap for Language Resources and Evaluation in a Multilingual Environment
Minority Languages in the African Context
Justus Roux
Centre for Language and Speech Technology (SU-CLaST)
Stellenbosch University, South Africa
jcr@sun.ac.za
Aim
• Overview of
  • proceedings of the LREC2006 workshop on Networking the development of African Languages
  • resolutions taken at the meeting
• Remarks on future development and co-operation
Background to the LREC workshop
• African Language Association of Southern Africa Special Interest Group for Language and Speech Technology (ALASA-SIG)
• Special Track on HLT at ALASA International Conference in Johannesburg in 2005
• National and international participants
• Proceedings to appear in SA Journal of African Languages
• Decision to interact with the international community via LREC2006
Why?
• UNESCO Year of African Languages (2006)
• Challenges in bridging the digital divide concerning African languages (connecting Africa)
• R&D activities in relative isolation
• Perceived need to develop resources and capacity for HLT R&D in African languages
• Similar activities in NEMLAR project – Language Technology for Arabic
Aims of Workshop
• Develop an academic network for sharing ideas
• Promote co-operation in the development of resources and tools (BLARKs for African languages)
• Facilitate capacity building related to African languages in the context of HLT
Programme
• Area surveys
  • West Africa
  • East Africa
  • Central Africa
  • Southern Africa
• Projects per area
• Larger projects and infrastructures
• Discussion on networking possibilities
West Africa
• Language Documentation paradigm: specific role of the University of Bielefeld
• Doctoral students at various European universities
• ALT-I: African Language Technology Institute in Ibadan
• Local Language Speech Technology Initiative (speech synthesis for Ibibio)
• Initiatives in development of morphological parsers (Cologne)
• West African Linguistics Society
East Africa
• Text corpora on Swahili across Europe
• University of Helsinki
• Tools: Open Swahili Localisation Project (OSLP) – spelling checker for Swahili
• Tagging tools
• Localisation Microsoft Windows XP: Swahili
• Morphological analysers
• SALAMA: Machine Translation
• Centre for Science and New Technologies & CNRS (Avignon)
  • Speech mining in Somali
• University of Nairobi & University of Antwerp
  • Annotated corpora in Gikuyu and applied machine learning
Southern Africa
• Extremely wide range of activities in South Africa, primarily by local researchers (see proceedings)
• University of South Africa
  • Morphological analysers for five African languages
  • Development of machine-readable lexicons
• University of Pretoria
  • Text corpora and spelling checkers
  • Machine-aided Translation / Localisation
• Stellenbosch University Centre for Language and Speech Technology
  • ASR, TTS and Natural Language Understanding in five languages
Southern Africa (Continued)
• University of North West - Centre for Text Technology
  • Localisation, spelling checkers
• University of Limpopo & Cape Town
  • Speech Synthesis
• Meraka Institute (Pretoria)
  • Open source software for language and speech technology applications
• University of the Free State & Province of Flanders
  • Interpreting services, data warehousing
Southern Africa (Continued)
• Standardisation:
  • ISO/TC 37 mirror Committee (StanSA TC37)
  • Terminology training workshops with Termnet
  • Workshop on text annotation (Sept 2006)
  • ISO meetings: Oslo (2004), Warsaw (2005), Beijing (2006)
• AFRILEX:
  • International conferences and workshops
• National Language Service:
  • National Lexicography Units
  • National HLT Resource Centre
Larger Projects
• The African Anaphora Project (Rutgers, USA)
• Building an Infrastructure for Collaborative Development (Taiwan)
Decisions taken
• To consolidate an inventory of tools, resources, etc. available in Africa by using the on-line ELRA BLARK website
• To set up a dedicated website (Wiki) to facilitate networking
• The current Organising Committee will be responsible for the activities above as well as for fundraising for training workshops in Africa
• To organise a similar workshop at LREC2008
Concluding impressions
• European countries are playing an active role in the field in West and East Africa – to be welcomed
• International organisations are becoming increasingly involved in Africa:
  • ISCA International Affairs Committee for Africa
  • ISO
  • ELRA??
• International co-operation in EU projects (FP7)?