Horn Dániel, MTA KTI and ELTEcon, horn@econ.core.hu, Health and Labour Market conference

Cheating on the competence assessment tests: an empirical experiment to test the reliability of the competence assessment (preliminary results). Horn Dániel, MTA KTI and ELTEcon, horn@econ.core.hu. Health and Labour Market conference, Szirák, 4-5 November 2011.

Contents • A brief introduction to the Jacob-Levitt method • The competence assessment in brief • Estimating the cheating rate with a modified J-L approach • Robustness tests (Note: the results are not final!)

The Jacob-Levitt method

Rotten Apples • Jacob and Levitt 2003, QJE • Chicago elementary schools • 1993-2000 • Grades 3-7 • Iowa Test of Basic Skills • Multiple-choice questions only • Teachers "grade" the tests • Two indicators of cheating: • Indicator 1: large test score fluctuation • Indicator 2: suspicious answer strings • 4 measures (M1, M2, M3 and M4)

Indicator 1: large test score fluctuation, where rankgain is the percentile rank increase achieved by class c on test b in year t

Indicator 2: suspicious answer strings • M1 • The probability that a group of students gives the same answers to a run of consecutive questions. The smaller this probability, the more likely it is that cheating occurred. • (probability of blocks of identical answer strings)

Indicator 2: suspicious answer strings • M2 (mean) and M3 (variance) • Classroom-level aggregate statistics of how unexpected the answers given to each question were.

Indicator 2: suspicious answer strings • M4 • The share of correct answers at each ability level, and its deviation from the national average.

Indicator 2: suspicious answer strings • Indicator 2 is the sum of the squared rank values of the four measures (and then the ranking of that sum).
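The aggregation described above can be sketched as follows. This is only an illustration: the function names are hypothetical, percentile ranks are assumed to be scaled to the 0..1 range with ties sharing their average rank, and every measure is assumed to be oriented so that a larger value is more suspicious (so M1, a probability, would be passed in with its sign flipped).

```python
def percentile_ranks(values):
    """Return the percentile rank (0..1) of each value; ties share the mean rank."""
    n = len(values)
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2  # zero-based average rank of the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank / (n - 1)
        pos = end + 1
    return ranks


def indicator2(m1, m2, m3, m4):
    """Sum of squared percentile ranks of the four measures, per classroom.

    Assumption: each list holds one value per classroom, oriented so that
    larger means more suspicious.
    """
    ranked = [percentile_ranks(m) for m in (m1, m2, m3, m4)]
    return [sum(r[c] ** 2 for r in ranked) for c in range(len(m1))]
```

With these assumptions the result lies between 0 and 4 for each classroom, and a classroom that ranks highest on all four measures scores 4.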

Intuition

Critique of the intuition • Indicator 1 can suffer from both type I and type II errors • Ranking Indicator 2 discards the most important information: the outliers. • A threshold is therefore needed that shows where a "non-cheating" population would stand on the same measures.

Országos Kompetenciamérés (National Assessment of Basic Competencies)

Only a 2-year panel • Not only multiple-choice questions • Not graded by the teachers • Item Response Theory!

Estimating the cheating rate, with some modification

Item Response Theory • where a is discrimination, b is difficulty, c is pseudo-guessing • These three parameters are given for every item. • and
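The three parameters named above are those of the standard three-parameter logistic (3PL) model. Assuming the slide uses the textbook form, and writing θ_s for the ability of student s (a symbol introduced here for illustration), the probability of a correct answer on item i would be:

```latex
P(x_{is} = 1 \mid \theta_s) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta_s - b_i)}}
```

Some parameterizations multiply the exponent by a scaling constant D ≈ 1.7; whether the competence assessment does so is not stated here.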

Process • Using the function given by IRT: • we computed, for each student, the probability of answering each item correctly • if this probability was greater than a random number (between 0 and 1), the student answered correctly in the new dataset • if not, incorrectly • the distribution of incorrect answers was matched, again at random, to the population distribution.
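The simulation steps above might look roughly like the following sketch. It assumes the textbook 3PL form for the IRT function; the input layout, the function names and the example parameter values are all hypothetical and are not taken from the competence assessment data.

```python
import math
import random


def p_correct(theta, a, b, c):
    """Textbook 3PL probability of a correct answer (assumed functional form)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(abilities, items, wrong_shares, rng):
    """Simulate one answer per student and item.

    abilities    -- list of student ability estimates (theta)
    items        -- list of (a, b, c) tuples, one per item
    wrong_shares -- per item, dict mapping each wrong option to its population
                    share among wrong answers (hypothetical input format)
    Returns a list of rows; each entry is "correct" or a wrong option label.
    """
    simulated = []
    for theta in abilities:
        row = []
        for (a, b, c), shares in zip(items, wrong_shares):
            # Correct if the model probability exceeds a uniform draw on [0, 1).
            if p_correct(theta, a, b, c) > rng.random():
                row.append("correct")
            else:
                # Otherwise draw a wrong option following the population shares.
                options = list(shares)
                weights = [shares[o] for o in options]
                row.append(rng.choices(options, weights=weights, k=1)[0])
        simulated.append(row)
    return simulated


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
items = [(1.2, 0.0, 0.2), (0.8, 1.0, 0.25)]  # made-up item parameters
wrong_shares = [{"B": 0.5, "C": 0.3, "D": 0.2}, {"A": 0.6, "C": 0.4}]
answers = simulate_responses([-1.0, 0.0, 1.5], items, wrong_shares, rng)
```

Because the simulated students answer independently given their ability, the resulting dataset plays the role of the "non-cheating" benchmark against which the observed measures are compared.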

Process • I computed M1, M2, M3 and M4 on the new dataset. • Where the original measure is significantly larger than the new one, cheating can be presumed (2008, grade 8, mathematics):

Robustness tests

Cheating vs. test score change

Cheating vs. underperformance

In place of a conclusion

Further work • Repeat the same for other years (2008/6, 2010/8 and 2010/10 are still possible) • Not only for mathematics but also for reading • and incorporate the suggestions received here…

Thank you for your attention! horn@econ.core.hu

Indicator 1: Large test score fluctuation, where rankgain is the percentile rank increase for class c in subject b in year t
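As a hedged illustration, the sketch below assumes the form of this score given in Jacob and Levitt (2003), which combines a large rank gain in year t with a small one in year t+1. The function name and the 0..1 scaling of the ranks are assumptions made for this example.

```python
def score(rank_gain_t, rank_gain_t1):
    """Indicator 1 for one class and subject (assumed Jacob-Levitt form).

    rank_gain_t  -- percentile rank (0..1) of the class's score gain in year t
    rank_gain_t1 -- the same percentile rank for year t+1
    A value near 2 means a top-ranked gain followed by a bottom-ranked one.
    """
    return rank_gain_t ** 2 + (1.0 - rank_gain_t1) ** 2
```

Squaring both terms gives extra weight to classes that are extreme on both components, rather than moderately unusual on each.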

Indicator 2: Suspicious answer strings • Measure 1 (M1) • Estimate the probability of each answer on each item for each student, where Y is the response of student s in class c on item i, J is the number of possible responses (four), and X is a vector of student characteristics that includes past and future test scores and some background data (free lunch, gender and race)

Indicator 2: Suspicious answer strings • Measure 1 (M1) 2) Calculate the probability, for each student, of the answer s/he actually gave, where k is the response the student gave on the specific question • Calculate this probability for a (large) set of consecutive questions, from item m to item n

Indicator 2: Suspicious answer strings • Measure 1 (M1) • Take the product of this across all students who had the same responses for the given set of questions • Finally, take the minimum of these probabilities
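The last M1 steps can be sketched as below, taking the predicted probabilities of the answers actually given as an input rather than estimating them. The function name, the minimum run length `min_len` and the requirement that a block contain at least two students are assumptions made for this example.

```python
import math
from collections import defaultdict


def m1(answers, probs, min_len=3):
    """Smallest joint probability of any block of identical answer strings.

    answers -- answers[s][i]: option chosen by student s on item i
    probs   -- probs[s][i]: predicted probability of that chosen option
    min_len -- shortest run of consecutive items considered (assumption)
    """
    n_items = len(answers[0])
    best = 1.0
    for m in range(n_items):
        for n in range(m + min_len, n_items + 1):
            # Group students by their answer string on items m..n-1.
            groups = defaultdict(list)
            for s, row in enumerate(answers):
                groups[tuple(row[m:n])].append(s)
            for members in groups.values():
                if len(members) < 2:  # a "block" needs at least two students
                    continue
                # Product over the items of the run and over the students
                # who share the identical string.
                joint = math.prod(
                    probs[s][i] for s in members for i in range(m, n)
                )
                best = min(best, joint)
    return best
```

For long tests the products become very small, so a production version would more likely work with sums of log probabilities.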

Indicator 2: Suspicious answer strings • Measure 2 (M2) • Calculate the residual for each of the possible choices a student could have made for each item: response j on item i by student s in classroom c. This gives four separate residuals per student per item

Indicator 2: Suspicious answer strings • Measure 2 (M2) 2) Sum the residuals for each response across students within a classroom (four measures per classroom per item). This measure is close to zero if there is no within-class correlation across students on a given item. That is, if students responded the same way to an item, this measure is very high.

Indicator 2: Suspicious answer strings • Measure 2 (M2) 3) Take the sum of squares across the four possible responses for each item for each classroom, and normalize by class size 4) Take the average of this within the classroom (sum over items and divide by the number of items)

Indicator 2: Suspicious answer strings • Measure 3 (M3) The third measure is simply the variance (as opposed to the mean) of the same statistic. M2 might be large due to teaching differences, e.g. a teacher might emphasize a given topic more. "If the teacher changes answers for multiple students on selected questions, the within-class correlation on those particular questions will be extremely high, while the degree of within-class correlation on other questions is likely to be typical. This leads the cross-question variance in correlations to be larger than normal in cheating classrooms." Note: this is also true if a teacher emphasizes a topic more throughout the year
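Measures 2 and 3 as described above can be sketched together, since they are the mean and the variance of the same per-item statistic. The predicted probabilities are taken as given, the function names are hypothetical, and dividing the variance by the number of items (population variance) is an assumption.

```python
from statistics import fmean, pvariance


def item_statistics(answers, pred):
    """Per-item statistic for one classroom.

    answers -- answers[s][i]: index of the option student s chose on item i
    pred    -- pred[s][i][j]: predicted probability that s picks option j on i
    """
    n_students = len(answers)
    n_items = len(answers[0])
    n_options = len(pred[0][0])
    v = []
    for i in range(n_items):
        total = 0.0
        for j in range(n_options):
            # Residual: 1 if the student chose option j, else 0, minus the
            # prediction, summed over the students in the classroom.
            e_sum = sum(
                (1.0 if answers[s][i] == j else 0.0) - pred[s][i][j]
                for s in range(n_students)
            )
            total += e_sum ** 2
        v.append(total / n_students)  # normalize by class size
    return v


def m2_m3(answers, pred):
    """Return (M2, M3): mean and population variance of the item statistic."""
    v = item_statistics(answers, pred)
    return fmean(v), pvariance(v)
```

In a classroom where everyone gives the same unexpected answer on a few items only, the mean rises modestly while the variance across items rises sharply, which is the pattern M3 is meant to pick up.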

Indicator 2: Suspicious answer strings • Measure 4 (M4) 1) Calculate where q_isc equals one if student s in classroom c answered item i correctly, and zero otherwise. A_s is the aggregate score of student s, z denotes a given score level, and ns_A denotes the number of students with an aggregate score A. This shows the fraction of students at each aggregate score level who answered each item correctly

Indicator 2: Suspicious answer strings • Measure 4 (M4) 2) Calculate a measure of how much the response pattern of student s differed from the response pattern of other students with the same aggregate score 3) Subtract out the mean deviation for all students with the same aggregate score, Z^A, and sum over the students within each classroom to obtain the fourth measure
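The three M4 steps can be sketched as follows. The function name and input layout are hypothetical, and the deviation in step 2 is assumed to be a sum of squared per-item differences; the exact formula is not shown in this transcript.

```python
from collections import defaultdict


def m4(correct, classroom):
    """Measure 4 for every classroom.

    correct   -- correct[s][i]: 1 if student s answered item i correctly, else 0
    classroom -- classroom[s]: classroom label of student s
    """
    n_items = len(correct[0])
    by_score = defaultdict(list)
    for s, row in enumerate(correct):
        by_score[sum(row)].append(s)

    # Step 1: share answering each item correctly at each aggregate score level.
    q_bar = {
        score: [sum(correct[s][i] for s in members) / len(members)
                for i in range(n_items)]
        for score, members in by_score.items()
    }
    # Step 2 (assumed form): squared deviation of each student's pattern from
    # the pattern of students with the same aggregate score.
    z = [sum((row[i] - q_bar[sum(row)][i]) ** 2 for i in range(n_items))
         for row in correct]
    # Step 3: subtract the mean deviation at that score level, sum by classroom.
    z_bar = {score: sum(z[s] for s in members) / len(members)
             for score, members in by_score.items()}
    result = defaultdict(float)
    for s, row in enumerate(correct):
        result[classroom[s]] += z[s] - z_bar[sum(row)]
    return dict(result)
```

A positive value marks a classroom whose students reach their scores through unusual combinations of items, for instance by getting hard items right while missing easy ones.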

Indicator 2: Suspicious answer strings • Indicator 2 is the sum of squares of the rank values of these measures. The Jacob-Levitt estimates:
