Item pocket method to allow response review and change in CAT Kyung T. Han http://www.linkedin.com/in/hantest
Response review • Aims to reduce examinees' anxiety during high-stakes tests. • But makes CAT less efficient and can bias score estimates. • Examinees' test-taking strategies: • Wainer strategy • Kingsbury strategy • Generalized Kingsbury (GK) strategy
Wainer strategy • Examinee answers all items incorrectly in round 1, then tries to answer all items correctly in round 2. • Results in positive bias in the θ estimate • May occur for high-ability examinees
Kingsbury strategy • Examinee can distinguish between the difficulties of the current and previous items. • Examinee goes back to change a response if the current item is easier than the previous one. • Assumptions: • (a) if θ - δ ≤ -1, the examinee guesses on the current item • (b) if θ - δ > 0.5, the examinee goes back to change the response • Low-ability examinees are the most likely to benefit
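The two threshold assumptions on this slide can be written as a small decision rule. This is only a sketch of the rule as stated here: the function name `kingsbury_action` and the labels `"guess"`, `"go_back"`, and `"answer"` are hypothetical and do not come from the original study.

```python
def kingsbury_action(theta: float, delta: float) -> str:
    """Classify one item by the gap between ability (theta) and difficulty (delta).

    The labels are illustrative only; the thresholds are the ones on the slide.
    """
    gap = theta - delta
    if gap <= -1.0:
        # Assumption (a): the item is far too hard, so the examinee guesses.
        return "guess"
    if gap > 0.5:
        # Assumption (b): the item looks easy, so the examinee goes back
        # to change the earlier response.
        return "go_back"
    # Neither threshold is met: respond in the normal way.
    return "answer"


if __name__ == "__main__":
    for theta, delta in [(0.0, 1.0), (1.0, 0.0), (0.0, 0.0)]:
        print(theta, delta, kingsbury_action(theta, delta))
```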
Generalized Kingsbury strategy • Examinee speculates on the difficulty level of the next item, not only for items with guessed responses but also for all previous items. • The strategy offered no meaningful improvement in score estimates in most situations. • Examinees were only 61% successful in distinguishing difficulty differences.
CAT with restricted revision options • Stocking (1997): reduce the Wainer effect • Model 1: responses can be changed at the end of the test for a limited number of items • Failed to control the effect when more than 2 items could be revised • Model 2: multiple separately timed sections; responses can be changed within a section • Model 3: responses can be revised only within each item set (common stimulus) • Examinees may feel anxious when deciding whether to move on • Items cannot be skipped • Examinees may still use the Kingsbury or GK strategy to find clues
Item pocket method • Pocketed items must be answered by the end of the test or they are scored as incorrect • Advantages: • Reduces anxiety • Items can be skipped and put in the pocket once • Items in the pocket do not affect the interim score or item selection (which in turn makes the Kingsbury and GK strategies ineffective) • No sections are needed
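The bookkeeping behind the item pocket can be sketched as a small class. This is a minimal illustration under stated assumptions, not the author's implementation: the class name `ItemPocket`, its method names, and the use of string item ids are all hypothetical. It encodes only the rules listed on this slide: a limited pocket size, each item pocketed at most once, and unanswered pocket items scored as incorrect at the end of the test.

```python
class ItemPocket:
    """Hypothetical container for skipped items in a CAT session."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: list[str] = []          # items currently in the pocket
        self.ever_pocketed: set[str] = set()  # items that have used their one skip

    def put(self, item_id: str) -> bool:
        """Try to pocket an item; refuse if the pocket is full or the item was pocketed before."""
        if item_id in self.ever_pocketed or len(self.items) >= self.capacity:
            return False
        self.items.append(item_id)
        self.ever_pocketed.add(item_id)
        return True

    def answer(self, item_id: str, score: int) -> tuple[str, int]:
        """Remove an item from the pocket once the examinee answers it."""
        self.items.remove(item_id)
        return item_id, score

    def finalize(self) -> dict[str, int]:
        """At the end of the test, score every item still in the pocket as incorrect (0)."""
        return {item_id: 0 for item_id in self.items}


if __name__ == "__main__":
    pocket = ItemPocket(capacity=2)
    pocket.put("item_07")
    pocket.put("item_12")
    pocket.answer("item_07", 1)
    print(pocket.finalize())
```

Because pocketed items stay out of interim scoring, a caller would pass only answered items to the ability estimator and merge the result of `finalize()` in at the very end.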
Simulation 1 • Is the method robust to a Wainer-like strategy? • Settings: • 500 items • Fixed-length CAT of 40 items • MLE • Exposure control: Sympson & Hetter (Rmax = 0.2) or none • Maximum number of items in IP: 0, 2, 4, 6 • Evaluation: mean absolute error (MAE) and bias • Replications: 25
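The two evaluation measures named in the settings are standard: bias is the mean signed difference between estimated and true θ, and MAE is the mean absolute difference. A minimal sketch follows; the function names and the toy numbers in the usage example are illustrative and are not data from the study.

```python
from statistics import fmean


def bias(estimates: list[float], truths: list[float]) -> float:
    """Mean signed error: positive values mean theta is overestimated on average."""
    if len(estimates) != len(truths):
        raise ValueError("estimates and truths must have the same length")
    return fmean(e - t for e, t in zip(estimates, truths))


def mae(estimates: list[float], truths: list[float]) -> float:
    """Mean absolute error: average size of the estimation error, ignoring sign."""
    if len(estimates) != len(truths):
        raise ValueError("estimates and truths must have the same length")
    return fmean(abs(e - t) for e, t in zip(estimates, truths))


if __name__ == "__main__":
    # Toy values only: errors of +0.5 and -0.5 cancel in bias but not in MAE.
    true_theta = [0.0, 0.0]
    estimated_theta = [0.5, -0.5]
    print(bias(estimated_theta, true_theta), mae(estimated_theta, true_theta))
```

Reporting both measures matters because errors in opposite directions cancel in the bias but still show up in the MAE.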
Simulation 1 • Assumes examinees use a Wainer-like strategy • Only IP items can be revised (examinees keep as many of the easiest items as possible in the pocket, because they think items put in the pocket will be scored as wrong). • Other, non-IP items are answered in the normal way • IP size is limited. • Impact on the final score estimates • Does not happen often in practice
Simulation 2 • Assumes examinees evaluate the relative difficulty of each item against their proficiency. • Examinees have a 50% chance of identifying a challenging item and putting it in the pocket if |θ - δ| < 0.5, and a 70% chance otherwise (they keep challenging items in the pocket). • If the IP is full, the examinee compares the easiest item in the pocket with the current challenging item. • If that “easiest” item is easier than the current challenging item, the examinee answers it and puts the challenging item in the pocket (using the 50%/70% rule). • No time limit, no fatigue
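One possible reading of the pocketing rules on this slide is sketched below. Several points are assumptions, not details from the slide: an item counts as "challenging" when δ > θ, the 50%/70% figures are treated as the chance that the examinee judges this correctly, and the pocket is represented as a plain list of item difficulties. All function names are hypothetical.

```python
import random


def recognition_probability(theta: float, delta: float) -> float:
    """Chance of judging an item correctly: 50% when |theta - delta| < 0.5, else 70%."""
    return 0.5 if abs(theta - delta) < 0.5 else 0.7


def perceives_as_challenging(theta: float, delta: float, draw: float) -> bool:
    """Return the examinee's judgment given a uniform draw in [0, 1).

    Assumption: "challenging" means delta > theta, and a wrong judgment flips it.
    """
    truly_challenging = delta > theta
    judged_correctly = draw < recognition_probability(theta, delta)
    return truly_challenging if judged_correctly else not truly_challenging


def update_pocket(pocket: list[float], delta: float, capacity: int):
    """Place a perceived-challenging item; return (new_pocket, difficulty_answered_now)."""
    if len(pocket) < capacity:
        return pocket + [delta], None      # room left: just pocket the item
    if not pocket:
        return [], delta                   # pocket size 0: the item must be answered
    easiest = min(pocket)
    if easiest < delta:
        # Full pocket: answer the easiest pocketed item and pocket the new one.
        kept = list(pocket)
        kept.remove(easiest)
        return kept + [delta], easiest
    return list(pocket), delta             # the new item is the easiest: answer it now


if __name__ == "__main__":
    theta, delta = 0.0, 1.0
    if perceives_as_challenging(theta, delta, random.random()):
        print(update_pocket([0.2, 0.8], delta, capacity=2))
```

Passing the random draw in as an argument keeps the judgment function deterministic, which makes the rule easy to check by hand.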
Results for simulation 2 • MAE increased by .069, .084, and .087 for IP sizes of 2, 4, and 6. • The increases in average bias were .057, .075, and .080.
Low-ability examinees were likely to see more difficult items (due to the simulation settings), but this was not the case for high-ability examinees.
Discussion • Time limits should be considered • For low-ability examinees, most items put in the IP were the initial items, because the item selection algorithm chose those items based on the initial estimate (around 0).
Conclusion • IP • May reduce anxiety • Minimizes the effect of a Wainer-like strategy • Immune to the Kingsbury and GK strategies • IP size: • Too small or too large
Questions • Why is the mean bias not close to zero when the IP size is zero? • Why was no difference found between the no-exposure-control condition and the SH method condition?
Future study • Fixed-precision CAT • Examinees differ in their ability (probability) to tell item difficulty. • Elapsed time when skipping an item • Multiple-choice items • Is it possible to trick the IP method? • Utilizing information from IP items (MNAR)