Automatic Assessment of Spoken Modern Standard Arabic. NAACL, Boulder, Colorado, 5 June 2009. Pearson Knowledge Technologies, Palo Alto, California. Jian Cheng, Jared Bernstein, Ulrike Pado, Masa Suzuki.
Outline
1. Pearson Knowledge Technologies; how Versant tests operate
2. Versant Arabic Test (development)
3. Validation evidence
4. Predictive accuracy
Pearson Knowledge Technologies (PKT)
• KAT + Ordinate are now PKT
• KAT ≈ {LSA, Essay Scoring, Write-to-Learn, PTE, etc.}
• Ordinate ≈ {Versant, ORF for NCES, VersaReader, PTE, etc.}
• PKT is part of Pearson
• Pearson ≈ {FT, Economist, Penguin, Longman, PsychCorp, etc.}
• PearsonKT is in Boulder, Colorado and Palo Alto, California
Test delivery and scoring system [Figure labels: speech in, report out; Delivery Interface; Communication Network; Scoring system; Database of tests, prompts, and responses; languages English, Arabic, Dutch, Spanish; locations California and Anywhere]
How Versant tests operate [Figure labels: spoken response "The train's been delayed by one hour"; Test Delivery Server; Versant Scoring; Database]
Versant Arabic Test
• DLI purpose
• ~1000 students at DLI need predictive speaking tests
• Requirements
• Accurate test of Arabic listening & speaking
• Convenient to use at DLI and worldwide (ILR is costly)
• Suitable for repeated formative testing
• High peak capacity for mass screening
Construct Comparison
• OPI construct: oral proficiency, as manifest in an Oral Proficiency Interview, is compatible with communicative competence as reflected in the functional level and/or complexity of content accurately produced.
• Versant construct: facility in spoken language, that is, the ability to understand spoken language and speak appropriately in response at a conversational pace on everyday topics.
Versant Arabic Test: Test Structure
• Part A: Reading
• Part B: Repeat 1
• Part C: Short Answers
• Part D: Sentence Builds
• Part E: Repeat 2
• Part F: Passage Retelling
Versant Scoring [Figure labels: subscores Fluency, Sentence Mastery, Vocabulary, Pronunciation; weights shown 20%, 30%, 30%, 20%; item types Read, Repeat Sentence 1, SAQ, Sentence Build, Repeat Sentence 2, Passage; Human Scoring]
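The subscore weights on this slide suggest that the overall score is a weighted combination of the four subscores. The following is a minimal sketch, assuming a plain weighted sum. The pairing of each weight with a subscore is an assumption (the slide text lists the four percentages and the four subscores without tying them together), and `overall_score`, `ASSUMED_WEIGHTS`, and the sample values are hypothetical names and data, not part of the Versant system.

```python
from typing import Dict

# ASSUMPTION: this pairing of weights to subscores is illustrative only.
ASSUMED_WEIGHTS: Dict[str, float] = {
    "sentence_mastery": 0.30,
    "vocabulary": 0.20,
    "fluency": 0.30,
    "pronunciation": 0.20,
}


def overall_score(subscores: Dict[str, float],
                  weights: Dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Return the weighted sum of subscores; weights must sum to 1."""
    if set(subscores) != set(weights):
        raise ValueError("subscores and weights must have the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * subscores[name] for name in weights)


# Hypothetical subscores for one test taker.
example = {
    "sentence_mastery": 60.0,
    "vocabulary": 50.0,
    "fluency": 70.0,
    "pronunciation": 40.0,
}
print(round(overall_score(example), 2))
```

Passing the weights as a parameter keeps the sketch usable if the actual pairing differs from the one assumed here.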
How Versants are developed (1) [Figure labels: Test Spec; Native Test Developers; Item Text; Recorded Items; Arabic Natives; Arabic Learners; Native Scribes (transcripts); Native Judges (scale scores); Scale Estimates; Criteria; Ordinate System; Versant Scores; Validation (Versant Arabic Test): Internal and External; Concurrent ILR Interviews; ILR Scores]
Arabic Challenges: Voweling
• kutubu al-waladi: "the books of the boy"
• kataba al-waladu: "wrote the boy (subject)", i.e. "the boy wrote"
• No disambiguating short vowels are written
• Vowels carry phonetic information
• Vowels carry grammatical information
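The ambiguity described on this slide can be shown with a short sketch: once the short-vowel diacritics are removed, the written forms of kutubu and kataba are identical. The function name `strip_short_vowels` is hypothetical, and removing every Unicode combining mark is a simplification chosen for illustration (it also removes marks such as shadda and sukun).

```python
import unicodedata


def strip_short_vowels(text: str) -> str:
    """Drop all Unicode combining marks, which include Arabic short vowels."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


# Fully voweled forms, written with explicit code points.
kutubu = "\u0643\u064F\u062A\u064F\u0628\u064F"  # kaf+damma, ta+damma, ba+damma
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf+fatha, ta+fatha, ba+fatha

# The voweled strings differ, but the unvoweled skeletons are the same.
print(kutubu == kataba)
print(strip_short_vowels(kutubu) == strip_short_vowels(kataba))
```

A scoring system working from unvoweled text therefore has to recover pronunciation and grammatical role from context or from added diacritic markings.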
Complex Morphology
• li + ziyaarat + naa = "for" + "visit of" + "us": "for our visit"
• Complicates lexicon lookup, frequency estimates…
• "Short" Arabic items are harder than English items with the same number of words
Development & Run-time Processes: compilation of expectation and runtime flow
Training data sources: prompt voices and training samples
Validation Criteria
• Reliability: scores are consistent
• Validity: native and non-native speakers should be clearly distinct
• Validity: MSA and dialect speakers should be distinct (since we're testing MSA)
• Validity: machine scores should predict human scores
Educated ~ Uneducated Speakers [Figure: cumulative density plotted against Arabic Overall Score]
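A cumulative density plot of this kind shows, for each score, the fraction of a group scoring at or below it, which makes it easy to see how far two groups overlap. A minimal sketch of that computation follows; `ecdf` is a hypothetical helper and the score lists are invented for illustration, not data from the study.

```python
from bisect import bisect_right
from typing import Sequence


def ecdf(scores: Sequence[float], x: float) -> float:
    """Return the fraction of scores that are less than or equal to x."""
    if not scores:
        raise ValueError("scores must be non-empty")
    ordered = sorted(scores)
    return bisect_right(ordered, x) / len(ordered)


# Hypothetical overall scores for two groups of test takers.
group_a = [78, 82, 85, 88, 90]
group_b = [35, 42, 50, 58, 66]

# Tabulate both cumulative curves on a shared grid of score values.
for cut in (40, 60, 80):
    print(cut, ecdf(group_a, cut), ecdf(group_b, cut))
```

Curves that lie close together indicate similar score distributions; curves that are far apart indicate that the test separates the groups.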
How Versants Compare to OPIs [Figure: ILR OPI Score (logits) plotted against Versant Arabic Overall Score; N = 118, r = 0.87]
Spanish & English: Versant ~ Human
• Spanish: N = 37, r = 0.92 [Figure: ILR OPI Score (logits) plotted against Versant Spanish Score]
• English: N = 151, r = 0.86
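The r values reported on these slides are correlations between machine scores and human ratings. A minimal sketch of how such a coefficient can be computed is given here, assuming r denotes the Pearson product-moment correlation; `pearson_r` is a hypothetical helper and the paired values are invented for illustration only.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical machine scores and human ratings (in logits) for five people.
machine_scores = [30.0, 45.0, 52.0, 60.0, 75.0]
human_logits = [-1.5, -0.4, 0.1, 0.3, 1.8]
print(round(pearson_r(machine_scores, human_logits), 3))
```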
Summary
• Versant Arabic Test (VAT) is in operation
• Based on a large and wide body of transcribed spoken material
• VAT is available on demand
• Returns consistent, accurate scores that reflect real-time skills with MSA
• VAT can triage or screen for OPI tests
The End. Thanks to Waheed Samy, Naima Bousofara Omar, Eli Andrews, Mohamed Al-Saffar, Nazir Kikhia, Rula Kikhia, and Linda Istanbulli for item development and data collection/transcription in Arabic, and to Andy Freeman for providing diacritic markings.