Shared resources, shared values? Ethical implications of sharing translation resources Jo Drugan and Bogdan Babych University of Leeds, UK www.leeds.ac.uk/cts EM+/CNGL workshop
Overview
• Practical challenges to sharing translation resources, but also ethical and legal problems
• Recent collaboration and greater openness, but focus generally on practical issues
• Good reasons for failure to broach ethics
• Yet essential to do so – huge and growing demand for translation can’t be met without sharing
• Questions users and developers should be asking and suggested ways forward
Talk map
• Practical problems in sharing translation resources
• Ethical problems in sharing translation resources
• Case studies
  • Google Translator Toolkit
  • TAUS Language Search Engine (LSE)
• Conclusion
Sharing translation resources: Practical problems
• Exploitation of large parallel corpora to create/populate translation resources hampered by:
  • “Locked-in” data: range of tools
  • Ineffective exchange formats
  • Vashee 2010: ‘Translation tools often trap your data in a silo because the vendors WANT to lock you in and make it painful for you to leave’
  • Client reservations
Recent progress on practical problems
• Large minable multilingual corpora released online since the 1990s
  • Canadian Hansard, UN texts, Europarl corpus
  • Large-scale SMT platforms rely on such parallel corpora
• European Union TM archive, 2007
• Translation Automation User Society (TAUS), 2007
• Shared online Translation Environment Tools (TEnTs), crowdsourced/collaborative translation
Sharing translation resources and MT
• Koehn 2010: SMT is domain-dependent to a much greater degree than RBMT
  • Lower quality of out-of-domain translation
• Sharing translation resources essential for building high-quality SMT systems
  • Range of text types/subject domains
• Requires consideration of ethical and legal issues
And ethics?...
• Conspicuous by its absence: limited to issues of (informed) consent and ‘threats’ to translators
  • Improved MT quality
  • Collaborative translation
• Yet familiar issues
  • Trailblazers (Wikipedia)
  • Legal grey areas (translation as international activity par excellence)
Consequences?
• Two standard reactions:
  • ‘Don’t ask, don’t tell’
    • Risks of burying your head in the sand
    • Legal implications, traceability
  • Excessive caution
    • Passing up potentially valuable data
Consequences - MT?
• ‘What has ethics got to do with MT?’
• Sharing translation resources requires consideration of ethical and legal issues
  • Confidentiality of data
  • Trade, industrial, state secrets
  • Intellectual property rights (moral rights?) of translators, authors, data owners
Engaging with ethics
• Share data confidently, arguing from clearly stated values
• Draw on precedents in related fields/debates
• Essential because sharing is increasingly the norm
  • TAUS: Information Age = ‘insatiable demand for translation services that cannot be met with existing proprietary business models and the capacity of around 300 000 professional translators worldwide’
• One way in: case studies
  • Ethical questions raised by what’s actually happening
1. Google Translator Toolkit
• SMT
  • Since 2005, http://translate.google.com/
  • 58 language pairs in 2010
  • For assimilation, typically not integrated in translation workflow
  • MT post-editing concerns
• Google move to embed MT in an online collaborative translation environment: Google Translator Toolkit
Google Translator Toolkit
• MT integrated with TM and user dictionary functionality
• TM matches/user dictionary entries have priority; where none is available, MT output is offered for post-editing
• Translators collaborate, as for Google Docs
• Stored on ‘cloud’ servers but can be downloaded
• User options, no MT if preferred
• But limiting factors…
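The priority rule on this slide (TM matches and user dictionary entries first, MT output for post-editing only when neither is available, and no MT at all if the user prefers) can be sketched as a simple fallback chain. This is an illustrative sketch only, not the toolkit's implementation: the names `Suggestion`, `suggest`, and `toy_mt` are hypothetical, and matching is reduced to exact whole-segment lookup, whereas real TM tools also use fuzzy matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Suggestion:
    text: str                 # proposed target text ("" if nothing is available)
    source: str               # "tm", "dictionary", "mt", or "none"
    needs_post_editing: bool  # True only for raw MT output


def suggest(segment: str,
            tm: Dict[str, str],
            user_dictionary: Dict[str, str],
            mt: Optional[Callable[[str], str]] = None) -> Suggestion:
    """Return a suggestion for one source segment (hypothetical helper).

    Assumption: exact-match lookup only. TM is checked first, then the
    user dictionary, and MT is used only as a last resort.
    """
    if segment in tm:
        return Suggestion(tm[segment], "tm", False)
    if segment in user_dictionary:
        return Suggestion(user_dictionary[segment], "dictionary", False)
    if mt is not None:  # passing mt=None models the "no MT if preferred" option
        return Suggestion(mt(segment), "mt", True)
    return Suggestion("", "none", False)


def toy_mt(segment: str) -> str:
    """Stand-in for an MT engine; it only tags the input, it does not translate."""
    return "[MT] " + segment


tm = {"Good morning": "Bonjour"}
glossary = {"translation memory": "mémoire de traduction"}

from_tm = suggest("Good morning", tm, glossary, mt=toy_mt)
from_mt = suggest("See you tomorrow", tm, glossary, mt=toy_mt)
without_mt = suggest("See you tomorrow", tm, glossary, mt=None)
```

Recording where each suggestion came from (`source`) is a design choice made for this sketch; it is also the kind of traceability the ownership and attribution questions later in the talk depend on.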
Limiting factors
• Ethical rather than technological
• No. 1: Confidentiality of project and resources
  • Not practical for most real-world professional projects
• Technically possible to address translators’/clients’ concerns
  • Default settings
Other ethical issues not addressed
• Recognition, compensation of translators’ work
• Potential legal consequences
• Other tools support such approaches: http://mymemory.translated.net/doc/
• Ownership, attribution
• Familiar issues
• Potentially useful innovative technology falls down because it fails to take into account practical user-based scenarios, in part due to inadequate ethical framework
2. TAUS Language Search Engine (LSE)
• Online tool for searching uploaded TMX data
  • Parallel concordances, word alignment techniques
  • Intelligent dictionary
• User (mis)expectations
• Ethical framework is explicit – even a ‘model’
  • User consent
  • Quid pro quo
  • Data owner responsibility
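To make "searching uploaded TMX data" and "parallel concordances" concrete, here is a minimal sketch that parses a small TMX fragment with Python's standard library and returns the aligned source/target segments containing a query string. It is not the LSE implementation: the sample data and the helper names `load_units` and `concordance` are invented for illustration, and the search is a plain case-insensitive substring match with no word alignment.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample data, laid out as TMX 1.4 translation units (<tu>/<tuv>/<seg>).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file before closing.</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrez le fichier avant de fermer.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Open the settings menu.</seg></tuv>
      <tuv xml:lang="fr"><seg>Ouvrez le menu des paramètres.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def load_units(tmx_text):
    """Parse TMX text into a list of {language code: segment text} dicts."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                unit[tuv.get(XML_LANG)] = "".join(seg.itertext())
        units.append(unit)
    return units


def concordance(units, query, src_lang, tgt_lang):
    """Return (source, target) pairs whose source segment contains the query."""
    needle = query.lower()
    return [
        (unit[src_lang], unit[tgt_lang])
        for unit in units
        if src_lang in unit and tgt_lang in unit
        and needle in unit[src_lang].lower()
    ]


units = load_units(SAMPLE_TMX)
hits = concordance(units, "file", "en", "fr")
```

Even in this toy form, every hit exposes a complete source and target sentence from someone's uploaded data, which is why the consent and ownership questions on the following slides arise.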
But key questions remain unaddressed
• Ethical, not technical
• Ownership and consent – broader issues
• ‘Community of users and providers of translation technologies and services’ – but all large-scale, not end users or freelance translators
• Informed consent?
  • NB not legal/contractual - broader
  • Industry codes of ethics, ‘taking credit for others’ work’
  • UNESCO 1976, ‘supplementary payment’?
Key questions unaddressed
• Translator choice?
• Should ultimate responsibility afford claims to ultimate ownership?
• Avoiding harm?
• Effects on future translation quality judgments?
Positively ethical
• The aims and ambitions of these two initiatives can be seen as profoundly ethical
• Relevant principles in codes:
  • Professional review, informed critiques, raise standards, improve public understanding, contribute to society and human well-being, respect human diversity, support fellow professionals, contribute to profession’s standing, enhance quality of life
• Not just defensive, but allows case to be made for action rather than inaction
Questions/Discussion