DEPT, FERRARI and MENDELOVITS: How to analyze and exploit field test results with a view to maximizing cross-language comparability of main survey data collection instruments. Washington D.C., March 23, 2012.
State of the Art
• Proliferation of multilingual comparative studies
• Survey design includes a pilot, or Field Test (FT), carried out on a smaller scale
• Items adapted into multiple languages before the FT
• Key moment for linguistic quality control (LQC): right before the FT (translation verification)
• Comprehensive documentation of adaptation, adjudication and validation processes
Between Field Test and Main Survey
• In survey designs that include a Field Test (FT) and a Main Survey (MS), analysis of FT results = a wealth of information
• Can be used to inform item selection
• But also to perform a more focused linguistic and formal verification before the MS
• Open communication channels between item writers, national experts and verification coordinators
The PISA paradigm
• Inception in 2000, currently in its 5th survey cycle
• Double translation, double source design
• 32 national versions (2000) -> 85 national versions (2012)
• From pencil and paper to computer-delivered assessments and background questionnaires
• Compiling data on the adaptation history of each item in each language
Analysis of FT Results
• At item level: item stats (itanals)
  - Item discrimination
  - Item fit
  - Ability ordering
  - Point biserial correlation (MCQ)
• Differential item analysis
  - gender
  - country
  - language
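The slides do not say which statistic is used for differential item analysis. As an illustration only, the sketch below uses the Mantel-Haenszel common odds ratio, a widely used technique that is swapped in here and not taken from the presentation. It compares a reference group and a focal group (for example two language versions) within ability strata. The function names, the `(stratum, group, correct)` record layout and the group labels `"ref"` and `"focal"` are all assumptions made for this example.

```python
from collections import defaultdict
from math import log


def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one item.

    records: iterable of (stratum, group, correct) tuples, where stratum is
    an ability band, group is "ref" or "focal", and correct is 0 or 1.
    (Hypothetical layout, chosen for this sketch.)
    Returns None when the ratio is undefined.
    """
    # counts[stratum][group] = [number incorrect, number correct]
    counts = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for stratum, group, correct in records:
        counts[stratum][group][correct] += 1

    num = den = 0.0
    for cell in counts.values():
        ref_wrong, ref_right = cell["ref"]
        focal_wrong, focal_right = cell["focal"]
        n = ref_wrong + ref_right + focal_wrong + focal_right
        num += ref_right * focal_wrong / n
        den += ref_wrong * focal_right / n
    if den == 0:
        return None
    return num / den


def mh_delta(odds_ratio):
    """Rescale the odds ratio to the delta metric (-2.35 * ln(ratio)).

    Negative values mean the item is harder for the focal group than for
    reference-group students of comparable ability.
    """
    return -2.35 * log(odds_ratio)
```

A ratio close to 1 (delta close to 0) suggests the item behaves similarly in both groups once ability is taken into account. Any cut-off for flagging an item would be a project decision.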
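Multiple choice item: not dodgy
[Annotated item analysis output. The callouts read:]
• Item discrimination: higher than 0.2
• Item fit
• Options A, B, C, D, with the key answer marked
• Point biserial: should be positive for the key answer, should be negative for distractors
• Mean ability and standard deviation for the group of students who selected responses A, B, C or D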
Multiple choice item: dodgy
[Annotated item analysis output. The callouts read:]
• Item discrimination less than 0.2: low discrimination between high and low achievers
• Item fit value significantly higher than 1: item discrimination between high and low achievers is less than expected
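The criteria on the two slides above can be sketched as a simple screening rule. This is a minimal illustration and not the analysis software used in the project: the function names, the use of a plain score as the ability measure, and the application of the 0.2 threshold to the point biserial of the key are assumptions made for this example.

```python
from statistics import mean, pstdev


def point_biserial(indicator, scores):
    """Point-biserial correlation between a 0/1 indicator and a score.

    Returns None when undefined (nobody or everybody chose the option,
    or all scores are identical).
    """
    p = sum(indicator) / len(indicator)
    sd = pstdev(scores)
    if p in (0.0, 1.0) or sd == 0:
        return None
    chose = mean(s for i, s in zip(indicator, scores) if i == 1)
    other = mean(s for i, s in zip(indicator, scores) if i == 0)
    return (chose - other) / sd * (p * (1 - p)) ** 0.5


def flag_mcq_item(responses, key, scores, options="ABCD", threshold=0.2):
    """Return (point biserial per option, list of flags) for one MCQ item.

    responses: option chosen by each student; scores: an ability measure
    per student (hypothetical inputs for this sketch).
    """
    report = {}
    for opt in options:
        chose_opt = [1 if r == opt else 0 for r in responses]
        report[opt] = point_biserial(chose_opt, scores)

    flags = []
    key_pb = report[key]
    # Assumption: the 0.2 threshold is applied to the key's point biserial.
    if key_pb is None or key_pb < threshold:
        flags.append("low key discrimination")
    for opt in options:
        # Distractors should attract lower-ability students.
        if opt != key and report[opt] is not None and report[opt] > 0:
            flags.append(f"distractor {opt} has positive point-biserial")
    return report, flags
```

An item with an empty flag list would count as "not dodgy" under these two rules. Item fit is not covered here, since it requires a fitted measurement model.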
Action
• Dodgy item reports sent
  - to countries
  - to cApStAn
• Reflect on the data; examine the national version; explain why such results may have occurred
• As a result, FT to MS corrections proposed by:
  - Item writers / test developers
  - Countries / verifiers
MS version management
• Base national MS version prepared for countries (using final FT version)
• Segment status indicates type of action
  - Locked segments if no FT to MS changes
• Country review followed by focused verification
• Difference reports (before/after) generated automatically
• Reports examined by referee
• Final check on key corrections
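A before/after difference report of the kind mentioned above could look roughly like the sketch below. It is not the project's actual tooling: the segment dictionaries, the `locked_ids` parameter and the report fields are hypothetical, and the word-level comparison relies on Python's standard `difflib` module.

```python
import difflib


def segment_diff_report(ft_segments, ms_segments, locked_ids=()):
    """Compare the final FT version with the MS version, segment by segment.

    ft_segments, ms_segments: dicts mapping a segment id to its text
    (hypothetical structure for this sketch).
    locked_ids: ids of segments that were not supposed to change.
    Returns one entry per changed segment.
    """
    report = []
    for seg_id, ft_text in ft_segments.items():
        ms_text = ms_segments.get(seg_id, "")
        if ft_text == ms_text:
            continue  # unchanged segments are left out of the report
        diff = difflib.ndiff(ft_text.split(), ms_text.split())
        # Keep only removed ("- ") and added ("+ ") words.
        changes = [d for d in diff if d.startswith(("- ", "+ "))]
        report.append({
            "segment": seg_id,
            "locked_violation": seg_id in locked_ids,
            "changes": changes,
        })
    return report


if __name__ == "__main__":
    ft = {"s1": "How many apples are left?", "s2": "Circle one answer."}
    ms = {"s1": "How many apples remain?", "s2": "Circle one answer."}
    for entry in segment_diff_report(ft, ms, locked_ids={"s2"}):
        print(entry["segment"], entry["locked_violation"], entry["changes"])
```

A referee would then only need to read the changed segments, and any change to a locked segment stands out for the final check.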
CHALLENGES
• Convincing reviewers / verifiers that if it isn’t broken, don’t fix it
• Document each change with its justification
• Check that changes have not introduced new errors or inconsistencies
• Make more systematic use of dodgy item reports, including for background questionnaires
• Embed these processes in the platforms and IT adaptation management systems
Any Questions? Thank you
steve.dept@capstan.be