Frans Kleintjes, Cito

Two decades of Applications of Item Response Theory: Starting up, benefits, expectations, deceptions and developments. Frans Kleintjes, Cito

Relevant Numbers • 1953 • 1981 • 1988 • 500 • 20 • 3

Relevant Numbers • 1953: My year of birth • 1981: Master's (application of an LLTM) • 1988: Started with Cito-PRC • 500: Number of employees at Cito • 20: Number of employees at PRC • 3: Man-years of research at PRC

Expectations • 1953 Dr. Frederic Lord: • Ability scores are test independent • Estimated ability scores are independent of the choice of test items • Item and test characteristics are sample independent. Lord, F.M. (1953). The relation of test score to the trait underlying the test. Educational and Psychological Measurement, 13.
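The expectation of test-independent ability scores can be illustrated with a minimal sketch. It assumes a Rasch-type item response function (the OPLM form with all discrimination indices equal to 1) and item difficulties that have already been calibrated on a common scale. The function names, the difficulty values and the response patterns are hypothetical and serve only as illustration; this is not Cito's software.

```python
import math

def p_correct(theta, beta, a=1):
    """Probability of a correct response; a=1 gives the Rasch model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - beta)))

def estimate_theta(responses, betas, a=None, n_iter=50):
    """Maximum-likelihood ability estimate by Newton-Raphson.

    Needs a mixed response pattern (not all 0s or all 1s),
    otherwise the estimate is not finite.
    """
    a = a or [1] * len(betas)
    theta = 0.0
    for _ in range(n_iter):
        ps = [p_correct(theta, b, ai) for b, ai in zip(betas, a)]
        grad = sum(ai * (x - p) for x, p, ai in zip(responses, ps, a))
        info = sum(ai * ai * p * (1 - p) for p, ai in zip(ps, a))
        # Clamp the step so a poor starting value cannot overshoot.
        theta += max(-1.0, min(1.0, grad / info))
    return theta

# Hypothetical calibrated difficulties for an easy and a hard test.
easy = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard = [0.0, 0.5, 1.0, 1.5, 2.0]

# Different raw scores (4 of 5 versus 2 of 5) on different item sets
# are both translated to the same ability scale.
theta_easy = estimate_theta([1, 1, 1, 1, 0], easy)
theta_hard = estimate_theta([1, 1, 0, 0, 0], hard)
print(round(theta_easy, 2), round(theta_hard, 2))
```

The raw scores are not comparable across the two tests, but the ability estimates are, provided the item parameters really are on one scale and the model fits.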

(My) Applications of IRT • Cito Reading Index Primary Education • Large scale applications: School leaving test primary education; Entreetoets 7 • International consultancy

Cito Reading Ability

[Figure slides: Item; Text 04; Text 20; Text 43]

Large scale applications • 1988=> National assessment program PPON: five-year cycle; 16 topics; sample based • 1986=> Student Monitoring System primary education: per topic two tests each year; pre-internet: computer program; used by almost all schools • 1993-2002 Basic Education: age 15; 70 tests per year; each test in three levels; all subjects covered; in 1999 standard setting program for all subjects based on the bookmark method using IRT (=OPLM) results
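How IRT results feed a bookmark standard setting can be sketched as follows. The sketch assumes OPLM-type items (difficulty beta, integer discrimination index a) and a response-probability criterion of 2/3, which is a common choice in the bookmark literature; the criterion actually used in the 1999 program is not stated here. All names and item values are hypothetical.

```python
import math

def rp_location(beta, a=1, rp=2/3):
    """Ability at which an item is answered correctly with probability rp."""
    return beta + math.log(rp / (1 - rp)) / a

def ordered_item_booklet(items, rp=2/3):
    """Sort (name, beta, a) items from easiest to hardest RP location."""
    located = [(name, rp_location(beta, a, rp)) for name, beta, a in items]
    return sorted(located, key=lambda pair: pair[1])

def cut_score(booklet, last_mastered_index):
    """Cut score on the ability scale under one common convention:
    the RP location of the last item a borderline student should master."""
    return booklet[last_mastered_index][1]

# Hypothetical calibrated items: (name, beta, a).
items = [("i1", 0.4, 2), ("i2", -1.0, 1), ("i3", 1.2, 3), ("i4", -0.2, 2)]
booklet = ordered_item_booklet(items)
print([name for name, _ in booklet])
print(round(cut_score(booklet, 1), 3))
```

Panelists page through the ordered booklet and place the bookmark; the code only converts that judgment into a point on the ability scale. Other conventions (for example the midpoint between adjacent items) are equally possible.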

Large scale applications • 2002=> Student Monitoring System secondary education: allocation + monitoring progress; four tests, each in three levels; 5 subjects; used by one third of schools; reporting is internet based • 2008=> Test for children with special needs: IRT equating and reporting on the scale of ‘regular’ primary education; improvement of item construction

Large scale applications School leaving test primary school (age 12) • Purpose: advice on track in secondary education; school (self-)evaluation • High stakes, 145 000 students (85% of population) • New test each year with 200 items: language, maths, social science; 11 topics • Equate total test score over years to report ‘standard score’ using a pretest design

Large scale applications School leaving test primary school (age 12) • All items are pre-tested twice in an incomplete design; 23 booklets, 180 items per booklet (about 2000 in one pretest); IRT based from 2003 • Disappointment: unable to predict test characteristics for all topics from the pretest; we still need the results on the test itself. Effect of pre-testing for high-stakes testing • IRT is used to equate related tests: ‘catch up test’; easier version; CBT version; embedded anchor test; and to relate this test to other tests
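Equating two related tests through IRT can be sketched with textbook true-score equating: a score on one test is mapped to the ability at which it is the expected score, and that ability is mapped to the expected score on the other test. This is a generic illustration under the assumption that both item sets are calibrated on one scale; it is not necessarily the procedure used for the school leaving test, and all names and parameter values are hypothetical.

```python
import math

def p_correct(theta, beta, a=1):
    """OPLM-type item response function (a=1 gives the Rasch model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - beta)))

def expected_score(theta, items):
    """Test characteristic curve: expected number correct at ability theta."""
    return sum(p_correct(theta, beta, a) for beta, a in items)

def theta_for_score(score, items, lo=-10.0, hi=10.0, tol=1e-9):
    """Invert the test characteristic curve by bisection.

    The score must lie strictly between 0 and the number of items.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_score(mid, items) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def equate(score_x, items_x, items_y):
    """Score on test Y that corresponds to score_x on test X."""
    return expected_score(theta_for_score(score_x, items_x), items_y)

# Hypothetical (beta, a) pairs: a regular test and an easier version.
regular = [(-0.5, 1), (0.0, 1), (0.5, 1), (1.0, 1)]
easier = [(-1.5, 1), (-1.0, 1), (-0.5, 1), (0.0, 1)]
print(round(equate(2, regular, easier), 2))
```

The same ability corresponds to a higher expected score on the easier version, which is what allows results from both versions to be reported on one scale.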

Entreetoets Groep 7 (at age 11) • Purpose: to provide an overview of student and school achievement (profile) • Language; Maths; Social science: 450 items, 13+3 topics, 130 000 students (75% of population) • Renewed every 5 years • ‘Embedded’ field test in 2009 (avoiding pretest effects) • 9 versions, each with 200 ‘old’ and 250 ‘new’ items => optimal design • IRT ‘OPLM hammer’ to equate these versions to the ‘Entreetoets 7’ by topic for reporting • Evaluation in 2010

Thank you • Questions?
