Learn how to excel in post-editing to work faster, improve quality, and increase earnings in the translation field. This presentation explores the adoption, skills, tools, and training needed for successful post-editing.
Post-editing: how to future-proof your career in translation • Paulo Camargo, PhD • Owner, Terminologist • pcamargo@blc.com.br • BLC - Brazilian Localization Company • Web: www.blc.com.br
Purpose of this presentation • Promote the adoption of Machine Translation (MT) and Post-editing (PE) • How to work faster, better, and make more money • Target audience: • Novice, experienced, and advanced freelance translators • Small LSPs and in-house translators
Perspectives: novice translator • Introduce PE as a new profession • Background information • Current adoption of PE • PE productivity/compensation • Explore availability of PE training • Why a translator needs PE training • What the required skills are • PE certifications available: TAUS and SDL
Perspectives: experienced/advanced • Use MT output as a translation aid • Research shows MT increases productivity • Translators prefer MT over unaided translation • GT, SDL Cloud, MS Hub, Systran • Advanced: combine MT with terminology management • Term extraction / customization of MT • Generate and post-edit MT output to ↑ productivity • Replace combined TM/on-line TM servers?
Perspectives: small LSP • Large/medium LSPs have used MT (> a decade) • Small LSPs need to catch up • How to get started on a low budget • MT developments: ↓ need for specialized IT • Key resource: the in-house translator • Terminology management • Customizable MT • Preliminary analysis / PE guidelines
Definition of post-editing (TAUS) • Post-editing: “the correction of machine-generated translation output to ensure it meets a level of quality negotiated in advance between client and post-editor”. • “Post-editing seeks the minimum steps required for an acceptable text”
Background information • PE reality: driven by advances in MT • Hybrid MT: rule-based / statistics-based • Rule-based: dictionaries, rules; e.g., Systran • Statistics-based: training data (TM); e.g., GT (Google Translate) • Pre-editing • Customization: glossary, training data • Preliminary analysis: language rules, client rules, example card
Preliminary analysis (Rico, 2011) • After engine customization • Select MT samples • Check term consistency/accuracy (sketch below) • Check for recurrent MT errors • Draw up guidelines (quality acceptance) • Quality/errors to expect / how to proceed • Language-independent/dependent rules • Feedback (glossary update, error report)
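A minimal sketch of the term consistency/accuracy check described above, in Python: it flags MT sample segments where an approved glossary term from the source is missing in the output. The glossary entries and segment pairs are invented placeholders, not data from the presentation.

```python
# Hypothetical term-consistency check over MT samples (illustrative data only).
glossary = {"machine translation": "tradução automática",
            "post-editing": "pós-edição"}

mt_samples = [
    ("Post-editing of machine translation output.",
     "Pós-edição da saída de tradução de máquina."),
]

for source, mt_output in mt_samples:
    for src_term, tgt_term in glossary.items():
        # Flag segments where the source term appears but the approved target term does not.
        if src_term in source.lower() and tgt_term not in mt_output.lower():
            print(f"Term inconsistency: '{src_term}' should map to '{tgt_term}'")
            print(f"  MT output: {mt_output}")
```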
Guidelines for PE (Rico, 2011) • Language-independent rules • Fix terminology, syntax, morphology • Fix misspellings, punctuation, omissions • Edit offensive/inappropriate text • Language-dependent rules • Language-specific examples: • Example card: expected errors / how to fix them
Custom MT: what to expect (O’Brien, 2002) • Custom MT: high-level MT output • Most segments ≈ 85% TM fuzzy match • Some better than a 100% TM match (review) • Some bad translations: retranslate • The translator is critical to MT success • Human assessment is needed: always! “Not only will MTPE not replace the translator but it also will not happen without the translator”
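To make the "85% TM fuzzy" comparison concrete, here is a rough Python illustration using difflib; real CAT tools compute fuzzy scores with their own algorithms, so treat this only as an approximation.

```python
# Rough similarity score between two segments, as a stand-in for a fuzzy-match percentage.
from difflib import SequenceMatcher

def fuzzy_score(segment_a: str, segment_b: str) -> float:
    """Return an approximate similarity percentage between two segments."""
    return SequenceMatcher(None, segment_a, segment_b).ratio() * 100

reference = "Select the option and click OK to confirm."
mt_output = "Select the option and press OK to confirm."
print(f"Similarity: {fuzzy_score(reference, mt_output):.0f}%")  # lands roughly in the high fuzzy band
```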
Full MT post-editing (Dillinger, 2004) • Goal: human-quality output • Most frequent for higher-visibility texts • Quality expectations: high (TEP) • Grammatically, syntactically, semantically correct • Stylistically appropriate • Expected productivity: 4K-10K words/day
Current adoption of post-editing • Common Sense Advisory report (2012) • Freelance: 21.7% (15.4% plan) • Small LSP: 32.5% (22.6% plan) • Large LSP: 72.0% (28.0% plan) • ALC report (2015) • Small LSP: 20.0% (USA), 25.0% (Europe) • Lionbridge (Marciano, 2015) • Applied to 30% of projects (goal 50%); 60M in 2014
Post-editing productivity data • Post-editing productivity (O’Brien, 2006) • Equal to/higher than editing high TM fuzzy matches • Typical: 4K to 10K words/day • Proficiency: ~100K words (about 1 month of full-time PE) • Other productivity data • Full PE: 5K-8K words/day (DePalma, 2011)
PE compensation: follows TM fuzzy rates • TM fuzzy matches (Guerberof, 2013) • 60-66% of the full TR rate for 75-94% matches • MT full post-editing • 50-70% of the rate (Guerberof, 2013) • 65-68% of the rate (Marciano, 2015) • Smaller companies prefer to pay per hour (worked example below)
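A small worked example of these rate bands in Python. The full per-word rate and the daily volume are assumed figures for illustration, not values quoted in the presentation.

```python
# Daily earnings under the PE rate bands above (assumed full rate and volume).
full_rate = 0.10      # assumed full translation rate, USD per word (placeholder)
word_count = 5000     # one day of full post-editing (see productivity slide)

for label, share in [("TM fuzzy 75-94% (low end)", 0.60),
                     ("TM fuzzy 75-94% (high end)", 0.66),
                     ("MT full PE, low (Guerberof, 2013)", 0.50),
                     ("MT full PE (Marciano, 2015)", 0.65)]:
    print(f"{label}: {share * full_rate * word_count:.2f} USD/day")
```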
Proposal for PE training (O’Brien, 2002) • PE: what do TRs think about it? • Dislike of correcting repetitive errors • Fear of losing proficiency (poor MT output) • Dislike of limited freedom of expression • Why do TRs need PE training? • Different skills: two source texts (source + raw MT) • Quality requirements, different error types • Qualified translator ≠ successful post-editor
What skills does a post-editor need? • Same as the translator (O’Brien, 2002) • Expert in the subject area and target language • Excellent knowledge of the source language • Word-processing (WP) skills, tolerance • Skills for the post-editor only (Rico, 2011) • Advanced WP: regex, search & replace (sketch below), terminology management • Positive attitude towards MT
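A brief sketch of the advanced search-and-replace skill mentioned above: using regular expressions in Python to batch-fix recurring mechanical MT errors. The patterns are illustrative examples, not a prescribed rule set.

```python
# Illustrative regex clean-up pass for recurring mechanical MT errors.
import re

def clean_mt_segment(text: str) -> str:
    text = re.sub(r" {2,}", " ", text)                 # collapse repeated spaces
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)       # remove space before punctuation
    text = re.sub(r"(\d),(\d{3})\b", r"\1.\2", text)   # 1,500 -> 1.500 (pt-BR thousands separator)
    return text.strip()

print(clean_mt_segment("O custo foi de 1,500 dólares , aproximadamente ."))
```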
Proposal for a PE course (O’Brien, 2002) • Theoretical component • Intro to PE/MT technology / controlled language • Advanced terminology management / text linguistics • Basic programming skills • Required background • Translation skills; basic linguistics/terminology management • IT skills; intro to language technology; source/target language skills
Sources for PE certification • TAUS (Translation Automation User Society) • English > 23 languages (European, Arabic, Asian) • Also Spanish > English • Cost: 60 Euro (members), 80 Euro (non-members) • SDL MT PE Certification • Free with SDL Language Cloud MT
Perspectives: experienced/advanced What possibilities can MT offer other than post-editing? Is it worth using MT output as an aid to increase TR productivity? Can MT advantageously replace combined/on-line TM servers?
Efficiency of PE for language translation • Rigorous, controlled analysis (Green et al., 2013) • Hypothesis 1: PE reduces translation time • Hypothesis 2: PE increases quality • Hypothesis 3: MT primes the translator • Compared PE vs. unaided translation • Blind experiment: TRs did not know GT was used • Pre-interview: TRs showed strong MT dislike • 16 professional TRs per pair: EN-AR, EN-FR, EN-DE
Results clarify the value of post-editing • Which one is faster? PE (69%) • Useful? Yes 56%, No 29%, Unsure 15% • MT suggestions improved quality (all language pairs) • MT output primes the translator • PE text (closer to MT) ≠ unaided translation ≠ raw MT • The lower the TR's experience → the closer to MT output
Does MT output increase productivity? • Example 1: Google Translate • Now a paid service: $20 per 1M characters • Plug-in for SDL Trados/other CAT tools • General statistical MT engine • Not customizable • Confidentiality issues • See the appendix for the complete setup procedure (API call sketch below)
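For readers without the CAT-tool plug-in, here is a hedged sketch of calling the paid Google Translate service from a script. The endpoint and parameters follow the public Cloud Translation v2 REST API; the key is a placeholder and details may change over time.

```python
# Hedged sketch of a Cloud Translation v2 REST call (billing is per character).
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = "https://translation.googleapis.com/language/translate/v2"

response = requests.post(URL, params={
    "key": API_KEY,
    "q": "Post-editing improves productivity.",
    "source": "en",
    "target": "pt",
    "format": "text",
})
response.raise_for_status()
print(response.json()["data"]["translations"][0]["translatedText"])
```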
Does MT output increase productivity? • Example 2: SDL Cloud MT • Price range: $5-$75/month (Expert) • Plug-in for SDL Trados/other CAT tools • Complete confidentiality (nothing is stored) • Pre-trained engines: Travel, IT, Life Sciences, Automotive, Consumer Electronics • Customizable MT: can add own glossaries • Comprehensive analytics (quality analysis)
Does MT output increase productivity? • Example 3: Microsoft Translator Hub • Plug-in for SDL Trados/others; secure • First 2M characters free; 4M/month for $40 • Fully customizable MT engine • Previous translations (> 20K words) • Add glossaries • Request training / evaluate results • Option to “Use Microsoft Models” (API call sketch below)
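A hedged sketch of using a custom Hub engine from a script: the trained engine is addressed through a category ID passed to the Microsoft Translator Text API (v3 shown here). Endpoint, parameters, and headers follow Microsoft's public documentation; the subscription key and category ID are placeholders, and some deployments also require a region header.

```python
# Hedged sketch: translate with a customized engine via its category ID.
import requests

SUBSCRIPTION_KEY = "YOUR_KEY"              # placeholder
CATEGORY_ID = "YOUR_HUB_CATEGORY_ID"       # identifies the customized engine (placeholder)
URL = "https://api.cognitive.microsofttranslator.com/translate"

response = requests.post(
    URL,
    params={"api-version": "3.0", "from": "en", "to": "pt", "category": CATEGORY_ID},
    headers={"Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
             "Content-Type": "application/json"},
    json=[{"Text": "Post-editing improves productivity."}],
)
response.raise_for_status()
print(response.json()[0]["translations"][0]["text"])
```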
How about confidentiality? • Consider, e.g., Microsoft and Google • Among the largest providers of MT • Among the largest buyers of translation • They control information flow around the globe • Confidentiality should not be a problem • If Google is not an option → MS Hub/SDL Cloud • Uncomfortable sending data to MS/SDL? • Use a desktop/server solution: Systran
Changes in MT offerings for TRs • Common scenario for TRs (4 years ago) • One affordable desktop product (Systran) • Macros, RegEx, format conversion • No plug-in for CAT tools (high-end) • Current scenario • Software as a service (GT, SDL, MS Hub) • Plug-in for CAT tools is standard • Much lower IT requirements
Can MT replace combined/online TMs? • Experienced/advanced translators • Use combined/online TMs for productivity • Proud users: ↑ 50% productivity; see the TM as an asset • TMs are error-prone (inconsistency, mistranslation) • Need to check term consistency • MT has improved a lot in the last 5 years • TRs trust TM fuzzy matches > raw MT (Guerberof, 2008) • TRs mistake MT output for TM output (i.e., human translation)
MTPE can provide a better result • Avoids problems of combined/online TMs: • Terminology inconsistencies • Mistranslations • Time wasted correcting TUs that will never be reused • New approach using MTPE • Extract terms (Systran, rule-based; naive sketch below) • Customize MT (SDL Cloud or MS Hub) • Post-edit fresh MT for ↑ productivity/quality
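A deliberately naive sketch of the term-extraction step: harvesting frequent bigram candidates from source text before MT customization. Systran's rule-based extraction is linguistically informed; this frequency count is only meant to show the idea, and the corpus and stopword list are placeholders.

```python
# Naive candidate-term harvesting by bigram frequency (illustration only).
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for"}

def candidate_bigrams(text: str) -> Counter:
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return Counter(zip(words, words[1:]))

corpus = ("Machine translation output requires post-editing. "
          "Custom machine translation engines reduce post-editing effort.")
for bigram, freq in candidate_bigrams(corpus).most_common(3):
    print(" ".join(bigram), freq)
```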
How small LSPs can get started • Scenario: 40% of TRs use MT (TAUS) • An actual post-editing offer (2015) • PE of “our MT engine” output (GT?) • Payment: 900 words/hour • Instruction: make it “as readable as possible” • No pre-editing: no customization, glossary, or guidelines • Translators were really upset
Need more than just Google Translate • Pre-editing: customization, preliminary analysis • Key: the in-house translator (O’Brien, 2002) • Allocate a translator for MTPE activities • Use secure on-line customizable engines • Define suitable projects • Invest in terminology management • Develop PE guidelines • No rate discount initially (learning curve)
MT implementation at BLC (4 years) • Smaller projects: 10-50K words • Terminology extraction (Systran, rule-based) • Normal translation + editing procedure • Semi-customized MT: SDL Cloud + MultiTerm • Larger projects: > 50K words • Extract a bigger glossary (higher coverage) • Raw MT (Systran, SDL Cloud, MS Hub) • PE + editing (no pre-editing, no discount)
MT implementation at BLC • What does MT do for BLC? • Leverage my knowledge: engineering/science • Increase productivity (more/larger projects) • Increase quality (terminology/TM updates) • Future developments • MS Hub, SDL Cloud; Systran • Hire new translator (2016) • Develop a PE team/service
Conclusion The combination of machine translation (MT) and post-editing (PE) is a disruptive innovation that can improve translators’ productivity and translation quality, no matter how you plan to use it. Can you afford to ignore it?
References • Guerberof, Ana (2008). Productivity and Quality in the Post-editing of Outputs from Translation Memories and Machine Translation. Master's Dissertation, Universitat Rovira i Virgili. • O’Brien, Sharon (2006). Eye-tracking and Translation Memory Matches. Perspectives: Studies in Translatology 14 (3), 185-205. • Green, Spence et al. (2013). The Efficacy of Human Post-Editing for Language Translation. ACM Conference on Human Factors in Computing Systems (CHI). Computer Science Department, Stanford University. • Rico, Celia et al. (2011). EDI-TA: Post-editing Methodology for Machine Translation. MultilingualWeb-LT. • O’Brien, Sharon (2002). Teaching Post-editing: A Proposal for Course Content. Proceedings of the 6th EAMT Workshop on Teaching Machine Translation. EAMT/BCS, UMIST, Manchester, UK. 99-106. • DePalma, Donald (2011). Trends in Machine Translation. Common Sense Advisory. • Dillinger, Mike et al. (2004). Implementing Machine Translation. LISA Best Practice Guides. • TAUS (2014). MT Post-editing Guidelines. • Marciano, Jay (2015). Personal communication.