New Markets, New Trends
The technology side
Stelios Piperidis, spip@ilsp.gr
Languages & The Media, 5 Nov 2004, Berlin

Language transfer in the media and technology
• subtitling is an extremely complex process
• the task of automatic subtitle generation can only be conceived as an asymptotic goal
• tools to help the human experts can be designed, implemented and integrated in existing subtitling editors
• subtitling professionals need to be in command and interact with tools

Technologies for subtitling (1)
Speech Recognition
• real ASR not yet at the level required for subsequent language processing
• techniques to circumvent the problem
  • alignment of audio with script
  • respeaking
  • appropriate recording conditions and signal manipulation
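
The "alignment of audio with script" workaround can be sketched roughly as follows. This is a minimal illustration, not the method used in any particular subtitling system: it assumes a recogniser that already returns words with start and end times (the `asr_words` tuples and the function name `align_script_to_asr` are hypothetical), and it uses Python's standard `difflib` to transfer those times onto the clean script wherever the two word sequences agree.

```python
from difflib import SequenceMatcher


def align_script_to_asr(script_words, asr_words):
    """Attach ASR timings to script words where the two sequences match.

    script_words: list of words from the (clean) script.
    asr_words: list of (word, start_sec, end_sec) from a recogniser
               (hypothetical input format, assumed for this sketch).
    Returns a list of (script_word, start_sec, end_sec); words with no
    matching ASR word keep None for both times.
    """
    script_tokens = [w.lower() for w in script_words]
    asr_tokens = [w.lower() for w, _, _ in asr_words]
    matcher = SequenceMatcher(a=script_tokens, b=asr_tokens, autojunk=False)

    timed = [(w, None, None) for w in script_words]
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start, end = asr_words[block.b + k]
            timed[block.a + k] = (script_words[block.a + k], start, end)
    return timed


# Invented sample data: the recogniser misheard two words.
script = ["When", "scientists", "first", "flew", "over", "these", "peaks"]
asr = [("when", 0.0, 0.3), ("scientist", 0.3, 0.9), ("first", 0.9, 1.2),
       ("flew", 1.2, 1.5), ("over", 1.5, 1.8), ("the", 1.8, 1.9),
       ("peaks", 2.0, 2.5)]
for word, start, end in align_script_to_asr(script, asr):
    print(word, start, end)
```

Words left without times (here the two misrecognised ones) would still need interpolation or a human check, which is in line with keeping the subtitling professional in command.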

Technologies for subtitling (2)
Subtitle generation
• promising results when good quality ASR output is available
• paraphrasing
• linguistic processing
• learning of transformations through machine learning
• high quality transcripts and subtitles generated by subtitling professionals are necessary for training tools

Technologies for subtitling (3)
Translation
• fully automatic machine translation remains a distant goal
• productivity tools in the range of computer-aided translation tools exist
• in increasing order of complexity and decreasing order of quality they include
  • terminology workbenches
  • translation memory tools
  • machine translation
• high quality bi/multilingual subtitles generated by subtitling professionals are necessary for training the translation tools
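
As a rough illustration of what a translation memory tool does, the sketch below looks up a new source segment against previously translated segments and returns the closest match above a similarity threshold. It is a simplified sketch under stated assumptions, not a description of any real product: the function name `tm_lookup`, the 0.7 threshold, the word-level similarity from Python's `difflib`, and the placeholder target strings are all hypothetical.

```python
from difflib import SequenceMatcher


def tm_lookup(source, memory, threshold=0.7):
    """Return the best fuzzy match for `source` in a translation memory.

    memory: list of (source_segment, target_segment) pairs.
    Returns (score, source_segment, target_segment) for the most similar
    stored segment whose word-level similarity is >= threshold, or None.
    """
    query = source.lower().split()
    best = None
    for src, tgt in memory:
        score = SequenceMatcher(a=query, b=src.lower().split(),
                                autojunk=False).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, src, tgt)
    return best


# Invented memory; target strings are placeholders, not real translations.
memory = [
    ("When scientists first flew over these peaks", "<stored translation 1>"),
    ("Tools help the human expert", "<stored translation 2>"),
]
print(tm_lookup("When scientists flew over these peaks", memory))
print(tm_lookup("completely unrelated sentence", memory))
```

The translator decides whether to accept, edit or discard a proposed match, so the tool stays a productivity aid rather than an automatic translator.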

Current trends in R&D
• multimedia and multilingual information processing appears to be the setting in which we will operate in the future, as it is closest to real-life communicative scenarios (natural dialogue human-to-human / human-to-machine, interactive / digital TV, etc.)
• in the multimedia/multimodal information setting we have to realise the burden put on text and its processing

Current trends (2)
• most modalities are converted to text through conversion technologies (speech recognition, image captions, etc.)
• need for robust infrastructure (resources & tools), monolingual and multilingual, as in most architectures robustness is required from the text processing layer
• significant progress in technologies for other modalities, e.g. image processing
• need to exploit the possibility of fusion of modalities, e.g. vision and language, for real breakthroughs

Current trends (3)
• fusion of modalities will enable the production of subtitles like
  Audio: When scientists first flew over these peaks 60 years ago, …
  Subtitle: When scientists flew over here 60 years ago, …
  Visual context: peaks of mountains being the main object

Human Language Technologies
two focal issues
• real semantic processing, intertwined with the problem of a theory of meaning and language understanding
  • preliminary, simplified technological simulations emerging through the notion of the Semantic Web
• multilinguality – language transfer – cross-lingual applications

Multilinguality
multiple levels of computational tools in aid of the human expert
• terminology extraction and management workbenches
• intelligent translation memory platforms (below-sentence matches, segment fusion, etc.)
• machine translation tools
• statistical approaches to machine translation evolving
user-centredness and user interaction are key issues in an operational setting

Customisation/Tuning
• widely believed that HLT can be of benefit only if customised to particular domains and applications
• masses of resources out there, generic or domain specific
• need for lexical tuning approaches so that existing resources or parts of them are directly usable by systems
• speed and a high degree of automation of lexical tuning are key to a wide range of successful applications (machine translation, information extraction, subtitling)

Sharing resources and networking
• need for a sharable, distributed language resources infrastructure
• well substantiated standards for representing resources (will) play a key role here
• sharability and availability seem to be a prerequisite
• networking among content holders, content providers, human experts and technology developers seems to be a sine qua non