Assessing the Usability of Machine Translated Content: A User-Centred Study using Eye Tracking Dr. Stephen Doherty & Dr. Sharon O’Brien Centre for Next Generation Localisation School of Applied Language & Intercultural Studies Dublin City University
Outline • Introduction • Research Aims • Methods • Results • Conclusions
Introduction • Increased need for translation • Diversity of content and users • Rise in prevalence of machine translation [MT] both off- and online • Mixed reports of quality – attitudes and expectations • Divergence in R&D – translation studies/computer science • Evaluation metrics – human and automatic • Our focus here is on usability
Research Aims • To investigate if there are differences in usability between the English [source language] and the unedited machine translated target languages [FR, DE, SP, JP]. • Or in other words: how usable is machine translated content? • Adoption of the ISO/TR 16982 definition of usability • Importance of ecological validity: real materials and users
Methods • User-centred approach [n = 30]; task driven – ‘new user’ scenario • Eye tracking [Tobii 1750]: • Fixation count and average duration • Attentional shifts; percentage time in each window • Textual regressions
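As a rough illustration of how the eye-tracking measures listed above could be derived from raw fixation data, the sketch below computes fixation count, average fixation duration, percentage of fixation time per window, and the number of attentional shifts between windows. The input format (an ordered list of window name and duration pairs), the function name, and the window labels are assumptions made for illustration; they are not taken from the study or from the Tobii software.

```python
from collections import defaultdict


def summarise_fixations(fixations):
    """Summarise an ordered list of (window_name, duration_ms) fixations.

    The tuple format is a hypothetical export layout, not the study's own.
    """
    if not fixations:
        raise ValueError("no fixations recorded")

    count = len(fixations)
    total_ms = sum(duration for _, duration in fixations)

    # Accumulate fixation time per on-screen window.
    time_per_window = defaultdict(float)
    for window, duration in fixations:
        time_per_window[window] += duration

    # Share of total fixation time spent in each window (an assumption:
    # the study may have measured percentage time differently).
    pct_time = {w: 100.0 * t / total_ms for w, t in time_per_window.items()}

    # An attentional shift is counted whenever two consecutive fixations
    # fall in different windows.
    shifts = sum(
        1
        for (prev, _), (curr, _) in zip(fixations, fixations[1:])
        if prev != curr
    )

    return {
        "fixation_count": count,
        "mean_duration_ms": total_ms / count,
        "pct_time": pct_time,
        "attentional_shifts": shifts,
    }


# Hypothetical sample: two windows, four fixations.
sample = [("instructions", 200), ("task", 300), ("task", 100), ("instructions", 400)]
summary = summarise_fixations(sample)
```

Textual regressions are left out here because counting them requires word-level gaze positions, which this simplified input does not carry.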
Methods • Post-task questionnaire; five-point Likert • Comprehension • Task completion • Potential improvement • Future reuse • Recommendation • Recall
Methods • Usability • Satisfaction • Efficiency [task success/task time]
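The efficiency measure above is given only as task success divided by task time. A minimal sketch of one way to compute it follows, assuming task success is expressed as the proportion of tasks completed and task time is converted to minutes; the units, the function name, and the example figures are assumptions, not values from the study.

```python
def efficiency(tasks_completed, tasks_total, task_time_s):
    """Return task success per minute: (completed / total) / minutes.

    Units are an assumption; the slides state only 'task success/task time'.
    """
    if tasks_total <= 0:
        raise ValueError("tasks_total must be positive")
    if task_time_s <= 0:
        raise ValueError("task_time_s must be positive")

    success_rate = tasks_completed / tasks_total
    minutes = task_time_s / 60.0
    return success_rate / minutes


# Hypothetical participant: 4 of 5 tasks completed in 120 seconds.
score = efficiency(4, 5, 120)
```

Under this formulation a higher score means more successful task completion per unit of time, so scores are comparable across language groups only if every group attempts the same set of tasks.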
Eye Tracking • Task time: lowest for EN [sig. JP] • Fixation count and average duration: lowest for EN [sig. JP] for both • Attentional shifts; percentage time in each window: EN and FR spent most time in task window; EN fewest shifts of attention [sig. JP] • Textual regressions: raw number and distance, EN and SP [sig. JP]; ‘long’ regressions, JP [sig. all others]
Questionnaire Results • Comprehension: EN rated highest [sig. for FR and JP] • Task completion: EN rated highest [sig. for JP] • Potential improvement: SP & EN rated as needing least improvement, but could still be improved upon • Future reuse: FR & EN rated highest • Recommendation: EN rated highest [sig. for JP and DE] • Recall: EN scored highest [sig. for JP and DE]
Usability Results • Satisfaction: EN rated highest [sig. for FR, DE, and JP] • Task completion: EN and SP more successful [sig. JP] • Efficiency: EN most efficient [sig. JP and DE]
Conclusions • So, just how usable is raw MT? • Similar results for EN, SP, and FR • DE and JP more problematic [MT system] • Functionally usable [more than just ‘gisting’] • UX best for EN users • MT viable for certain pairs • Human intervention necessary to ensure best UX
Questions? stephen.doherty@dcu.ie sharon.obrien@dcu.ie This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation (www.cngl.ie) at Dublin City University.
Predictors of Positive UX • Satisfied users: comprehension & task time • Satisfied users: recommend to others • Task completion: textual regressions • Cognitive effort: instructions aiding task completion