Shared resources, shared values? Ethical implications of sharing translation resources


  1. Shared resources, shared values? Ethical implications of sharing translation resources Jo Drugan and Bogdan Babych University of Leeds, UK www.leeds.ac.uk/cts EM+/CNGL workshop

  2. Overview • Practical challenges to sharing translation resources, but also ethical and legal problems • Recent collaboration and greater openness, but focus generally on practical issues • Good reasons for failure to broach ethics • Yet essential to do so – huge and growing demand for translation can’t be met without sharing • Questions users and developers should be asking and suggested ways forward

  3. Talk map • Practical problems in sharing translation resources • Ethical problems in sharing translation resources • Case studies • Google Translation Toolkit • TAUS Language Search Engine (LSE) • Conclusion

  4. Sharing translation resources: Practical problems • Exploitation of large parallel corpora to create/populate translation resources hampered by: • “Locked-in” data: range of tools • Ineffective exchange formats • Vashee 2010: ‘Translation tools often trap your data in a silo because the vendors WANT to lock you in and make it painful for you to leave’ • Client reservations

  5. Recent progress on practical problems • Large minable multilingual corpora released online since the 1990s • Canadian Hansard, UN texts, Europarl corpus • Large-scale SMT platforms rely on such parallel corpora • European Union TM archive, 2007 • Translation Automation User Society (TAUS), 2007 • Shared online Translation Environment Tools (TEnTs), crowdsourced/collaborative translation

  6. Sharing translation resources and MT • Koehn 2010: SMT is domain-dependent to a much greater degree than RBMT • Lower quality of out-of-domain translation • Sharing translation resources essential for building high-quality SMT systems • Range of text types/subject domains • Requires consideration of ethical and legal issues

  7. And ethics?... • Conspicuous by its absence: limited to issues of (informed) consent and ‘threats’ to translators • Improved MT quality • Collaborative translation • Yet familiar issues • Trailblazers (Wikipedia) • Legal grey areas (translation as international activity par excellence)

  8. Consequences? • Two standard reactions: • ‘Don’t ask, don’t tell’ • Risks of burying your head in the sand • Legal implications, traceability • Excessive caution • Passing up potentially valuable data

  9. Consequences - MT? • ‘What has ethics got to do with MT?’ • Sharing translation resources requires consideration of ethical and legal issues • Confidentiality of data • Trade, industrial, state secrets • Intellectual property rights (moral rights?) of translators, authors, data owners

  10. Engaging with ethics • Share data confidently, arguing from clearly stated values • Draw on precedents in related fields/debates • Essential because sharing is increasingly the norm • TAUS: Information Age = ‘insatiable demand for translation services that cannot be met with existing proprietary business models and the capacity of around 300 000 professional translators worldwide’ • One way in: case studies • Ethical questions raised by what’s actually happening

  11. 1. Google Translation Toolkit • SMT • Since 2005, http://translate.google.com/ • 58 language pairs in 2010 • For assimilation, typically not integrated in translation workflow • MT post-editing concerns • Google’s move to embed MT in an online collaborative translation environment: Google Translation Toolkit

  12. Google Translation Toolkit • MT integrated with TM and user dictionary functionality • TM matches/user dictionary entries have priority; where none are available, MT output is offered for post-editing • Translators collaborate, as for Google Docs • Stored on ‘cloud’ servers but can be downloaded • User options, no MT if preferred • But limiting factors…
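
The priority order on this slide (TM match, then user dictionary, then MT for post-editing, with MT optional) can be sketched as below. This is a minimal illustration, not Google's implementation: the function name `suggest_translation`, the source labels, and the use of exact-match dictionary lookup are all assumptions made for the example. Real TM systems typically also use fuzzy matching, which is omitted here.

```python
from __future__ import annotations

from typing import Callable, Optional


def suggest_translation(
    segment: str,
    translation_memory: dict[str, str],
    user_dictionary: dict[str, str],
    machine_translate: Optional[Callable[[str], str]] = None,
) -> tuple[str, str]:
    """Return (suggestion, source_label) for one source segment.

    Hypothetical helper: exact-match lookups stand in for real TM matching.
    """
    # 1. A translation-memory match has the highest priority.
    if segment in translation_memory:
        return translation_memory[segment], "tm"
    # 2. A user dictionary entry comes next.
    if segment in user_dictionary:
        return user_dictionary[segment], "dictionary"
    # 3. Otherwise fall back to MT output, flagged for post-editing.
    #    Passing machine_translate=None models the "no MT" user option.
    if machine_translate is not None:
        return machine_translate(segment), "mt-needs-post-editing"
    # 4. Nothing available: the translator starts from an empty target.
    return "", "none"
```

Returning the source label alongside the suggestion keeps the origin of each segment traceable, which matters for the attribution and ownership questions raised later in the talk.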

  13. Limiting factors • Ethical rather than technological • No.1: Confidentiality of project and resources • Not practical for most real-world professional projects • Technically possible to address translators’/clients’ concerns • Default settings

  14. Other ethical issues not addressed • Recognition, compensation of translators’ work • Potential legal consequences • Other tools support such approaches: http://mymemory.translated.net/doc/ • Ownership, attribution • Familiar issues • Potentially useful innovative technology falls down because it fails to take into account practical user-based scenarios, in part due to inadequate ethical framework

  15. 2. TAUS Language Search Engine (LSE) • Online tool for searching uploaded TMX data • Parallel concordances, word alignment techniques • Intelligent dictionary • User (mis)expectations • Ethical framework is explicit – even a ‘model’ • User consent • Quid pro quo • Data owner responsibility
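
To make "searching uploaded TMX data" concrete, the sketch below runs a simple parallel concordance over a TMX string using Python's standard `xml.etree.ElementTree`. It is an illustration under stated assumptions, not the TAUS implementation: the `concordance` function and the sample sentences are invented, the sample TMX is simplified (a real header carries required attributes), and matching is plain case-insensitive substring search with no word alignment.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented, simplified sample data for illustration only.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>The committee adopted the report.</seg></tuv>
      <tuv xml:lang="fr"><seg>La commission a adopté le rapport.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>The session is closed.</seg></tuv>
      <tuv xml:lang="fr"><seg>La séance est levée.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def concordance(tmx_text, query, source_lang, target_lang):
    """Return (source, target) pairs whose source segment contains query."""
    root = ET.fromstring(tmx_text)
    hits = []
    for tu in root.iter("tu"):
        # Collect the text of each language variant in this translation unit.
        segments = {}
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                lang = tuv.get(XML_LANG, "").lower()
                # itertext() also gathers text nested inside inline markup.
                segments[lang] = "".join(seg.itertext())
        source = segments.get(source_lang)
        target = segments.get(target_lang)
        if source and target and query.lower() in source.lower():
            hits.append((source, target))
    return hits
```

Calling `concordance(SAMPLE_TMX, "committee", "en", "fr")` returns the one matching sentence pair. Note that every hit exposes a full aligned segment from the uploaded data, which is why consent and data-owner responsibility are central to the framework on this slide.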

  16. But key questions remain unaddressed • Ethical, not technical • Ownership and consent – broader issues • ‘Community of users and providers of translation technologies and services’ – but all large-scale, not end users or freelance translators • Informed consent? • NB not legal/contractual - broader • Industry codes of ethics, ‘taking credit for others’ work’ • UNESCO 1976, ‘supplementary payment’?

  17. Key questions unaddressed • Translator choice? • Should ultimate responsibility afford claims to ultimate ownership? • Avoiding harm? • Effects on future translation quality judgments?

  18. Positively ethical • The aims and ambitions of these two initiatives can be seen as profoundly ethical • Relevant principles in codes: • Professional review, informed critiques, raise standards, improve public understanding, contribute to society and human well-being, respect human diversity, support fellow professionals, contribute to profession’s standing, enhance quality of life • Not just defensive, but allow a case to be made for action rather than inaction

  19. Questions/Discussion
