Test Development • Test conceptualization • Test construction • Test tryout • Item analysis • Test revision
Test construction • Scaling: process of setting rules for assigning numbers in measurement; how numbers are assigned to different amounts of the construct being measured • Scaling methods based on objectives of test
Scaling Methods • Rating scales: judgments along a less-to-more continuum • Summative scale: ratings on each item are summed for a total score • Likert scale: 5 or 7 response alternatives, usually on an agree/disagree continuum • Guttman scale: items range from weaker to stronger expressions; respondents who agree with a stronger statement also agree with the milder statements • Ordinal measurement
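As a concrete illustration of summative (Likert-type) scoring, here is a minimal sketch in Python; the respondents, items, and 1-5 agree/disagree ratings are hypothetical.

```python
# Minimal sketch of summative (Likert-type) scoring.
# Hypothetical data: each respondent rates 4 items on a 1-5
# agree/disagree scale; the total score is the sum of the ratings.

responses = {
    "respondent_1": [5, 4, 4, 3],
    "respondent_2": [2, 1, 3, 2],
}

for person, ratings in responses.items():
    total = sum(ratings)  # summative scale: item ratings summed for a total
    print(person, "total =", total)
```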
Test Tryout • What is a good item? • Item Analysis • Item-difficulty index: proportion of the total test takers who get the item right (p ranges from 0 to 1) • The higher the p, the easier the item
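A minimal sketch of computing the item-difficulty index p (proportion of test takers answering each item correctly); the 0/1 response matrix below is hypothetical.

```python
# Item-difficulty index: p = proportion of test takers who answer the item
# correctly (p = 0: no one correct; p = 1: everyone correct).
# Hypothetical 0/1 scores: rows are test takers, columns are items.

scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
]

n_takers = len(scores)
n_items = len(scores[0])

for item in range(n_items):
    p = sum(row[item] for row in scores) / n_takers
    print(f"item {item + 1}: p = {p:.2f}")  # the higher the p, the easier the item
```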
Good items (cont.) • ITEM-VALIDITY INDEX: indication of the degree to which a test measures what it intends to measure • A function of the item-score standard deviation and the correlation between the item score and the criterion score • Item-validity index = item-score standard deviation × item–criterion correlation
Good items (cont.) • ITEM-RELIABILITY INDEX: indication of the internal consistency of the test • Product of the item-score standard deviation and the r between the item score and total test score
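Both indices above have the same form: the item-score standard deviation multiplied by a correlation (with an external criterion for the validity index, with the total test score for the reliability index). A minimal sketch, using hypothetical item scores, criterion scores, and total test scores:

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


# Hypothetical data for one item across five test takers.
item_scores = [1, 0, 1, 1, 0]          # 0/1 scoring on the item
criterion = [85, 60, 90, 75, 55]       # external criterion measure
total_scores = [42, 30, 45, 38, 28]    # total test scores

s_item = statistics.pstdev(item_scores)  # item-score standard deviation

item_validity_index = s_item * pearson_r(item_scores, criterion)
item_reliability_index = s_item * pearson_r(item_scores, total_scores)

print("item-validity index:", round(item_validity_index, 3))
print("item-reliability index:", round(item_reliability_index, 3))
```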
Good items (cont.) • ITEM-DISCRIMINATION INDEX: Does the item differentiate (discriminate) between high scorers and low scorers? • d = the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly • The higher the d value, the greater the proportion of high (relative to low) scorers answering correctly; a negative d value is a red flag • d ranges from -1 to +1 • High and low groups are typically the highest- and lowest-scoring 27% of test takers
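A minimal sketch of the item-discrimination index d using extreme groups; the total scores, item responses, and group size are hypothetical, with the top/bottom 27% cut mentioned above.

```python
# Item-discrimination index: d = (proportion correct in the high-scoring group)
# - (proportion correct in the low-scoring group), computed on the highest and
# lowest 27% of test takers by total test score.
# Hypothetical data: (total test score, 0/1 score on the item of interest).

import math

takers = [
    (95, 1), (90, 1), (88, 1), (84, 1), (80, 0),
    (76, 1), (70, 0), (65, 1), (60, 0), (55, 0),
    (50, 0), (45, 1), (40, 0), (35, 0), (30, 0),
]

takers.sort(key=lambda t: t[0], reverse=True)
n_extreme = math.ceil(len(takers) * 0.27)   # size of each extreme group

high_group = takers[:n_extreme]
low_group = takers[-n_extreme:]

p_high = sum(item for _, item in high_group) / n_extreme
p_low = sum(item for _, item in low_group) / n_extreme

d = p_high - p_low   # ranges from -1 to +1; a negative d is a red flag
print(f"d = {d:.2f}")
```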
Good items (cont.) • ANALYSIS OF ITEM ALTERNATIVES: How many test takers chose each alternative on a multiple-choice test? • Are there patterns? • Does any one distractor attract a disproportionate share of answers? • How do high scorers and low scorers compare?
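A minimal sketch of analyzing item alternatives: tally how often high scorers and low scorers chose each option on one multiple-choice item. All responses below are hypothetical, and option "B" is assumed to be the keyed (correct) answer.

```python
from collections import Counter

# Hypothetical choices on one multiple-choice item ("B" is the keyed answer).
high_scorer_choices = ["B", "B", "B", "C", "B", "B", "A", "B"]
low_scorer_choices = ["C", "B", "C", "D", "A", "C", "C", "B"]

high_counts = Counter(high_scorer_choices)
low_counts = Counter(low_scorer_choices)

print("option  high  low")
for option in ["A", "B", "C", "D"]:
    print(f"{option:<8}{high_counts[option]:<6}{low_counts[option]}")

# Patterns to look for: a distractor chosen more often than the keyed answer,
# or one that pulls in many high scorers, flags an item that may need revision.
```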
Guessing • Correction formula: • Assumes all alternatives are equally plausible and that items can be divided into those an examinee knows perfectly and those he/she does not know • Estimated # correct = Right − Wrong / (k − 1), where k = number of alternatives per item • Example (4 alternatives, so k − 1 = 3): 50 − 15/3 = 45
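A minimal sketch of the correction-for-guessing formula as stated above; the counts come from the slide's example, and the 4-alternative item format is inferred from the 15/3 division in that example.

```python
def corrected_score(right, wrong, k):
    """Estimated number correct = Right - Wrong / (k - 1),
    where k is the number of alternatives per item."""
    return right - wrong / (k - 1)


# Slide example: 50 right, 15 wrong, 4 alternatives per item (k - 1 = 3).
print(corrected_score(right=50, wrong=15, k=4))  # 50 - 15/3 = 45.0
```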