What is a “good” (or “bad”) multiple-choice item? (Trade secrets from a professional)
Psychology and applied psychometrics
• Rationale for multiple-choice testing
• Item quality that can be assessed before the test:
  1. Balance of content level
  2. Format of stem
• Item quality that can be assessed after the test:
  1. How difficult was the item?
  2. How well did the item discriminate?
     The Item Characteristic Curve (ICC)
     The point-biserial coefficient (r)
Wall’s Bad Example 1 (too many negative stems)
What statement below does not follow from the fact that an FAP is ballistic?
a. It can have extreme emotional content.
b. It has only a single function.
c. It cannot be varied or stopped once triggered.
d. It is relatively independent of learning or experience.
e. All of the above do not follow from that fact.
Wall’s Bad Example 2 (complex fill-in-the-blank formats, “unquestionized”)
In accordance with a decay theory of forgetting, if a participant learning a list of words was then subjected to one of the following procedures, he would forget the greatest number of words if _______
a. he slept for four versus two hours
b. he learned other lists for four versus two hours
c. he performed arithmetic problems for four versus two hours
d. he performed crossword puzzles for four versus two hours
e. All of the above would be comparable.
Wall’s Bad Example 3 (compound questions)
Of the five properties listed below, which are the most important in relation to the function of the plasma membrane in living cells?
i. Selective permeability
ii. Strength
iii. Elasticity
iv. Hydrophilicity
v. Fluidity
a. i, ii, and iii
b. i and v
c. ii and iii
d. iii and iv
e. iv and v
[Figure: Scatterplot for TT3 (W’04). Axes: score on multiple-choice and score on essay. r = +0.65]
Cognitive Level   Student Activity                                          Words in Item Stems
Evaluation        Judging based on established set of criteria              Appraise, evaluate, justify
Synthesis         Producing something new from component parts              Design, develop, create, formulate, construct
Analysis          Breaking down material to see relationships, hierarchy    Differentiate, compare-contrast, relate
Application       Using a concept to solve a problem                        Solve, show, demonstrate, compute
Comprehension     Explaining-interpreting                                   Explain, predict, infer, account for, summarize
Knowledge         Remembering facts, terms, concepts, definitions           Define, list, state, identify, label
• FILL-IN-THE-BLANK: Damage in a left-handed person in the language center often has less drastic consequences than in a right-handed person because _______
• “QUESTIONIZED”: Why does damage in the language center of a left-handed person often have less drastic consequences than in a right-handed person?
a. left-handers are slower to develop language function.
b. in left-handers, recovery in the left hemisphere is more rapid.
c. highly developed visual-spatial abilities compensate for language loss.
d. in left-handers, the hemispheres are not as functionally lateralized.
e. left-handers have several language centers.
Multiple-Choice Item Analysis (N = 335): Index of “difficulty” or “easiness”
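The slide does not spell out how the index is computed. A minimal sketch, assuming the usual convention that an item's “difficulty” (or “easiness”) is simply the proportion of students who answered it correctly; the function names and the sample responses are hypothetical, not taken from the N = 335 data set:

```python
def item_difficulty(item_scores):
    """Proportion of students answering the item correctly.

    item_scores: list of 0/1 values, one per student (1 = correct).
    Higher values mean an easier item, hence "easiness".
    """
    if not item_scores:
        raise ValueError("need at least one response")
    return sum(item_scores) / len(item_scores)


def chance_level(n_options):
    """Expected proportion correct under blind guessing (assumes equally likely options)."""
    return 1 / n_options


# Hypothetical responses from eight students to one five-option item
scores = [1, 1, 0, 1, 0, 1, 1, 1]
print("difficulty index:", item_difficulty(scores))
print("chance level:", chance_level(5))
```

An index close to the chance level suggests students were mostly guessing, while an index close to 1.0 means nearly everyone got the item right.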
[Figure: A hypothetical item analysis for a single multiple-choice item: the item characteristic curve (ICC). x-axis: student-performance groups (F, D, C, B, A); y-axis: proportion getting item correct (0 to 1.0). Reference levels marked: “Perfect performance” and “Chance performance”.]
[Figure: A hypothetical item analysis for a single multiple-choice item: the item characteristic curve (ICC). x-axis: student-performance groups (F, D, C, B, A); y-axis: proportion getting item correct (0 to 1.0). Annotations: “Perfect discriminator (between below-C and above-C students)” and “Chance performance”.]
[Figure: A hypothetical item analysis for a single multiple-choice item: possible item characteristic curves (ICCs). x-axis: student-performance groups (F, D, C, B, A); y-axis: proportion getting item correct (0 to 1.0). Annotations: “Ogive” and “?!!”.]
[Figure: Actual item analyses: good discriminators. Curves for Items 43, 8, 7, and 11. x-axis: student-performance groups (F, D, C, B, A); y-axis: proportion getting item correct (0 to 1.0).]
[Figure: Actual item analyses: poor discriminators. Curves for Items 6, 12, 10, and 42. x-axis: student-performance groups (F, D, C, B, A); y-axis: proportion getting item correct (0 to 1.0).]
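The points on an item characteristic curve like those plotted above can be tabulated directly: for each student-performance group, take the proportion of students in that group who got the item correct. The sketch below assumes students are grouped by overall letter grade (F to A, as on the x-axis of the slides); the function name and the sample records are hypothetical.

```python
from collections import defaultdict

# Performance groups from lowest to highest, matching the slides' x-axis
GROUPS = ["F", "D", "C", "B", "A"]


def icc_points(records):
    """Return {group: proportion correct} for one item.

    records: iterable of (group, correct) pairs, where correct is 0 or 1.
    Groups with no students are left out of the result.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += correct
    return {g: hits[g] / totals[g] for g in GROUPS if totals[g] > 0}


# Hypothetical responses to a single item, two students per group
records = [
    ("F", 0), ("F", 0),
    ("D", 0), ("D", 1),
    ("C", 1), ("C", 0),
    ("B", 1), ("B", 1),
    ("A", 1), ("A", 1),
]
curve = icc_points(records)
for group, proportion in curve.items():
    print(group, proportion)
```

A curve that rises from the F group to the A group indicates an item that separates weaker from stronger students; a flat curve indicates a poor discriminator.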
The issue of discriminability: Biserial statistics (N = 335)
Illustration of obtaining the point-biserial correlation for a single item:

Student   X: Score on item (0 or 1)   Y: Overall score on test
A         0                           54
B         1                           76
C         1                           87
D         0                           58
etc.
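As a rough illustration of the calculation, the sketch below applies the standard point-biserial formula, r = (M1 − M0) / s × √(pq), to the four students listed above (A to D). Here M1 and M0 are the mean test scores of students who got the item right and wrong, s is the population standard deviation of all test scores, p is the proportion who got the item right, and q = 1 − p. The function name is hypothetical, and four students are far too few for a meaningful coefficient; the real analysis would use all N = 335.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item score and the total test score.

    Assumes item_scores contains only 0 and 1. This is equivalent to the
    Pearson correlation between the two lists.
    """
    if len(item_scores) != len(total_scores):
        raise ValueError("both lists need one entry per student")
    right = [y for x, y in zip(item_scores, total_scores) if x == 1]
    wrong = [y for x, y in zip(item_scores, total_scores) if x == 0]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect response")
    p = len(right) / len(item_scores)   # proportion answering correctly
    s = pstdev(total_scores)            # population SD of total scores
    if s == 0:
        raise ValueError("total scores have no variance")
    return (mean(right) - mean(wrong)) / s * sqrt(p * (1 - p))


# Students A to D from the slide
x = [0, 1, 1, 0]      # score on item
y = [54, 76, 87, 58]  # overall score on test
r = point_biserial(x, y)
print(round(r, 3))
```

A strongly positive r means that students who answered the item correctly also tended to score well on the test as a whole, which is what a good discriminator should show.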