Computer Parsed Oral Speech Recognition and Assessment • October 2010
The Speaking Craze • Rapport • Headstart2 • AFPAK Hands • Language Enabled • Special Operations • Foreign Area Officers/Regional Area Specialists/Regional Area Officers • Interrogators • Civil Affairs
The Assessment Challenge • Desire to provide incentives • Level 0+/1 Listening/Reading/Speaking • DLPT / Very Low Range for Listening and Reading • But speaking and participatory listening are the desired skills • How to assess?
The Demand • About 9,000 per year • Projected demand: 20,000 to 40,000 • Gold standard: Face to Face, 2-rater/3rd rater • Double rater • Single rater • OPIc/ACTFL • Versant/Ordinate
Everyone wants a speaking test, but the number of testers, time, modality, and dollars set limits. Can the computer help?
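To make the tester limit concrete, the sketch below counts how many individual ratings each human-rated model would need at the volumes quoted above. The test volumes and rater models come from the slides; the function names and the `third_rater_share` parameter (the fraction of tests sent to a third rater) are hypothetical, since the slides give no such figure.

```python
# Raters per test under the simpler human-rated models named on the slide.
RATERS_PER_TEST = {"single rater": 1, "double rater": 2}


def ratings_needed(tests_per_year, raters_per_test, third_rater_share=0.0):
    """Count individual ratings per year.

    third_rater_share is a hypothetical fraction of tests that need a
    third rater; the slides do not state one.
    """
    return round(tests_per_year * (raters_per_test + third_rater_share))


def workload_table(volumes, third_rater_share=0.0):
    """Map each yearly test volume to the ratings needed per model."""
    table = {}
    for volume in volumes:
        row = {
            model: ratings_needed(volume, raters)
            for model, raters in RATERS_PER_TEST.items()
        }
        # Gold standard: two raters, plus a third for some share of tests.
        row["2-rater/3rd rater"] = ratings_needed(volume, 2, third_rater_share)
        table[volume] = row
    return table


if __name__ == "__main__":
    # 9,000 is the current yearly figure; 20,000 and 40,000 bound the projection.
    for volume, row in workload_table([9_000, 20_000, 40_000], 0.1).items():
        print(volume, row)
```

The count is of ratings only. Converting it into tester hours or dollars would need per-rating time and cost figures that the slides do not provide.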
Machine Translation Evaluation and ASR at DLIFLC • Jibbigo • Phraselator • Automated Speech Recognition (ASR) Development • DLIFLC – MIT-LL (Massachusetts Institute of Technology Lincoln Laboratory) collaboration
Machine Translation Devices • Jibbigo (iPhone-based) • 2W, S2S – Two-Way, Speech-to-Speech, Free Speech • + Easy to use; works well with basic vocabulary; created for humanitarian missions • - Longer phrases and sentences are not recognized well; military vocabulary is hardly recognized; abbreviations and proper nouns (including names of persons and places) are not recognized and are generally mistranslated
Machine Translation Devices • Phraselator • 1W, S2S – One-Way, Speech-to-Speech, Phrase-based • + Ruggedized body; 75-80% success with English automatic voice recognition of onboard phrases • - Works only with a limited number of pre-recorded phrases (about 2,500 per language); output only in target language (no input); screen hard to read; limited memory and functions because it is based on a PDA running the Windows CE operating system
Machine Translation Devices • 2W, S2S - IBM MASTOR
Machine Translation Devices • 1W, T2S - VCOM3D
Hanscom AFB / MIT-Lincoln Lab - ASR • Development of an Automated Speech Recognition (ASR) System for Arabic learners • Concept of Implementation • First version - ASR beta - to operate in 6 months • Local programmer will be trained by MIT-LL to support, operate, develop, and maintain DLIFLC ASR system • DLIFLC local data collection will support development and control costs • Cost of DLIFLC ASR programmer: minimum $85-95K/year • Cost of licensing for each language, as projected by one contractor, for example: $1.2M to develop, with annual licensing fees of 75% of initial costs thereafter
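The two cost figures above can be compared with simple arithmetic. The sketch below is a rough illustration only: it assumes the 75% licensing fee applies to the $1.2M development cost in each year after development, and it counts only the programmer's salary on the in-house side, because the slides state no other in-house costs. The function names are hypothetical.

```python
CONTRACTOR_DEVELOPMENT = 1_200_000          # dollars per language, from the slide
LICENSE_PERCENT = 75                        # percent of initial cost per year (assumed reading)
PROGRAMMER_SALARY_RANGE = (85_000, 95_000)  # dollars per year, stated as a minimum


def contractor_cost(licensing_years, languages=1):
    """Development cost plus yearly licensing for the given number of years."""
    annual_license = CONTRACTOR_DEVELOPMENT * LICENSE_PERCENT // 100
    return languages * (CONTRACTOR_DEVELOPMENT + annual_license * licensing_years)


def programmer_cost(years):
    """Low and high salary totals for one in-house ASR programmer."""
    low, high = PROGRAMMER_SALARY_RANGE
    return low * years, high * years


if __name__ == "__main__":
    for licensing_years in range(0, 4):
        total_years = licensing_years + 1  # development year plus licensing years
        print(
            total_years,
            contractor_cost(licensing_years),
            programmer_cost(total_years),
        )
```

Under these assumptions the contractor route costs $900,000 per language in each licensing year. The in-house figure is a floor, not a full cost, so the comparison is indicative rather than like-for-like.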
Hanscom AFB / MIT-Lincoln Lab - ASR • Development of an Automated Speech Recognition (ASR) System for Arabic learners • Operation • System will recognize targeted words spoken by students • System will give real-time feedback on pronunciation errors • System will score level of correctness (beta precursor to automated OPI testing) • System will enable and enhance existing DLIFLC language learning products and curriculum • System can be expanded to support additional languages
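One simple way to picture "recognize targeted words" and "score level of correctness" is to align the words a student was asked to say with the words a recognizer reports hearing. The sketch below is not the DLIFLC/MIT-LL design, which the slides do not describe. It assumes a hypothetical recognizer that returns a list of words, uses Python's standard `difflib` for the alignment, and does not cover pronunciation feedback. The transliterated words are illustrative only.

```python
from difflib import SequenceMatcher


def score_targets(target_words, recognized_words):
    """Return (score, missed).

    score is the share of target words found, in order, in the
    recognizer output; missed lists the target words not matched.
    """
    matcher = SequenceMatcher(a=target_words, b=recognized_words, autojunk=False)
    matched = set()
    for block in matcher.get_matching_blocks():
        # Each block marks a run of target words that appear in the output.
        matched.update(range(block.a, block.a + block.size))
    missed = [word for i, word in enumerate(target_words) if i not in matched]
    score = len(matched) / len(target_words) if target_words else 0.0
    return score, missed


if __name__ == "__main__":
    target = ["marhaba", "kayfa", "haluka"]      # illustrative prompt
    recognized = ["marhaba", "kayfa", "haluk"]   # hypothetical recognizer output
    score, missed = score_targets(target, recognized)
    print(f"score={score:.2f} missed={missed}")
```

A word-level match like this can only say whether a target word was recognized. Feedback on which sounds were mispronounced would need phoneme-level output from the recognizer.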
ALELO Tactical Dari Demonstration (No Endorsement Intended or Implied)
Summary • Technologies have potential • Self-teaching software should have a built-in assessment • Commercial and government products have some very good applications and potential • DLIFLC is exploring several options • Potential exists for low level assessment