
Data-Driven Machine Translation for Sign Languages



Presentation Transcript


  1. Data-Driven Machine Translation for Sign Languages Sara Morrissey PhD topic NCLT/CNGL Workshop 23rd July 2008

  2. outline • background • main problems • data-driven MT for SLs • experiments and results • conclusions

  3. background • communication • interpreters and technological aids • machine translation • automatic and confidential • native language of users • rule-based approaches (Veale et al., 1998, Marshall & Sáfár, 2002) • data-driven approaches (Bauer et al., 1999, Stein et al., 2006, Wu et al., 2007)

  4. main problems • representation: no formally adopted writing system • linguistic analysis: little research • appropriate data: difficult to find • evaluation: visual-spatial nature rules out automatic

  5. data-driven MT for SLs • initial prototype system using Dutch SL • MaTrEx system • Air Traffic Information System (ATIS) Corpus • 595 English sentences • multi-lingual – ISL parallel corpus creation • manual annotation with semantic glosses

  6. data representation (Early morning flights between Cork and Belfast) EARLY MORNING BETWEEN be-CORK CORK FLY BELFAST BETWEEN ref-BELFAST ref-CORK

  7. MATREX: data-driven machine translation English  ISL •  bilingual database

  8. translation directions • Spoken Language Text • SL Recognition • SL Generation • SL Annotation

  9. experiments and results • machine translation experiments • 2 segmentation methodologies • type 1 uses Marker Hypothesis chunks (Green, 1979) • type 2 uses a dual segmentation method • example: Early morning flights between Cork and Belfast • <ADJ> early morning flights <PREP> between Cork <CONJ> and Belfast
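The type 1 segmentation on slide 9 can be illustrated with a minimal sketch of marker-based chunking: a new chunk opens at each marker word, provided the current chunk already holds at least one non-marker word. This is an illustration only, not the MaTrEx implementation. The `MARKERS` lexicon and the function names are hypothetical, and "early" is listed as an `ADJ` marker solely so the output mirrors the tagged example on the slide.

```python
# Hypothetical toy marker lexicon (an assumption for illustration only).
# "early" is included as ADJ just to reproduce the slide's tagged example.
MARKERS = {
    "early": "ADJ",
    "between": "PREP", "from": "PREP", "to": "PREP",
    "and": "CONJ", "or": "CONJ",
    "the": "DET", "a": "DET",
}


def marker_chunks(sentence, markers=MARKERS):
    """Split a sentence into (tag, tokens) chunks at marker words.

    A marker word closes the current chunk only if that chunk already
    contains a non-marker word, so runs of markers stay together.
    """
    chunks = []
    tag, tokens, has_content = None, [], False
    for token in sentence.split():
        marker_tag = markers.get(token.lower())
        if marker_tag is not None and has_content:
            # Close the current chunk and start a new one at this marker.
            chunks.append((tag, tokens))
            tag, tokens, has_content = None, [], False
        if marker_tag is None:
            has_content = True
        elif not tokens:
            # The first marker in a chunk supplies the chunk's tag.
            tag = marker_tag
        tokens.append(token)
    if tokens:
        chunks.append((tag, tokens))
    return chunks


def format_chunks(chunks):
    """Render chunks in the '<TAG> words ...' style used on the slide."""
    parts = []
    for tag, tokens in chunks:
        prefix = f"<{tag}> " if tag else ""
        parts.append(prefix + " ".join(tokens))
    return " ".join(parts)


if __name__ == "__main__":
    print(format_chunks(marker_chunks("Early morning flights between Cork and Belfast")))
```

Chunks produced this way on both sides of a parallel corpus could then be aligned to give sub-sentential translation pairs; how MaTrEx aligns them is not described on the slides.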

  10. experiments and results

  11. animation • real human signing preferred (Naqvi, 2007) but impractical • avatar animation • criteria: realistic, consistent, functional, fluid • Poser Animation Software Version 6.0 • 50 randomly selected sentences, 66 hand-crafted videos • problem of fluidity

  12. animation • example signs shown: ‘or’, ‘how much’, ‘e’, ‘flight’ • http://www.computing.dcu.ie/~smorri/ISL_AnimationDemo.html

  13. experiments and results • human evaluation experiments • 4 native Deaf human monitors • web-based evaluation of 50 ISL translations • evaluated intelligibility and fidelity • 82% animations = intelligible • 72% animations = good-excellent translations • HCI analysis using Nielsen’s approach

  14. conclusion • MT methodology never before applied to SLs • multi-component, bidirectional system • practical, technological alternative to help alleviate communication and comprehension problems for the Deaf community • positive automatic and manual evaluation • scope for incorporating different SL representation methodologies and segmentation techniques

  15. thank you questions? smorri@computing.dcu.ie
