46-320-01 Tests and Measurements • Intersession 2006
Writing Items • DeVellis (1991) guidelines • Define clearly what you want to measure • Generate an item pool • Avoid exceptionally long items • Keep the reading level appropriate for respondents • Avoid double-barreled items (two questions in one) • Mix positively and negatively worded items • Be sensitive to cultural/ethnic differences
Item Format • Dichotomous format • Two alternatives (e.g., true/false) • Pros: ease of construction and scoring; calls for an absolute judgment • Cons: encourages memorization; 50% chance of answering correctly by guessing
Item Format • Polytomous format • More than two alternatives (e.g., multiple choice) • Pros: lower chance of guessing correctly, quick to answer and score, distractors give diagnostic information • Corrected scores: should guessing be penalized? (see the sketch below)
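The correction-for-guessing question above is usually answered with the standard formula: number right minus number wrong divided by (alternatives − 1). A minimal sketch follows; the function name and example counts are illustrative assumptions, not anything from the slides.

```python
def corrected_score(num_right, num_wrong, num_choices):
    """Standard correction-for-guessing: subtract a penalty for wrong answers,
    assuming each wrong answer reflects a blind guess among num_choices options.
    Omitted items count as neither right nor wrong."""
    return num_right - num_wrong / (num_choices - 1)

# Example: 40 right and 10 wrong on four-option multiple-choice items
print(corrected_score(40, 10, 4))   # ~36.67 (40 - 10/3)
```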
Item Format • Likert format • Respondents rate degree of agreement • Five alternatives vs. six (an even number removes the neutral midpoint) • Reverse scoring for negatively worded items (see the sketch below) • Category format • 10-point scale – why 10? • Remember that context affects ratings • Visual analogue scale • Mark a point on a 100-mm line
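Reverse scoring a negatively worded Likert item simply mirrors the response around the scale midpoint. A small sketch; the scale endpoints here are assumed, not taken from the slides.

```python
def reverse_score(raw, low=1, high=5):
    """Reverse-score a Likert response so negatively worded items run in
    the same direction as positively worded ones."""
    return (high + low) - raw

print(reverse_score(2))         # 4 on a 1-5 scale
print(reverse_score(5, 1, 7))   # 3 on a 1-7 scale
```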
Item Format • Checklist • Usually adjectives the respondent endorses or not • Q-sort • Increases the number of options (typically 9 piles) • Sorting is forced to approximate a normal distribution
Item Analysis • Purpose: shorten a test while increasing reliability and validity • Item difficulty • Proportion of test-takers who get the item correct • Compare against the probability of answering correctly by chance • Optimum level: halfway between chance and 1.0 • Items usually vary in difficulty (about 0.3 to 0.7) • Internal criterion = total test score (see the sketch below)
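A sketch of the difficulty calculations above: difficulty is the proportion correct, and a common rule of thumb puts the optimum halfway between the chance level and 1.0. The response vector below is hypothetical.

```python
def item_difficulty(responses):
    """Proportion of test-takers answering the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

def optimal_difficulty(num_choices):
    """Halfway between the chance level and 1.0 - a common target for
    selected-response items."""
    chance = 1.0 / num_choices
    return (1.0 + chance) / 2.0

item = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # hypothetical responses to one item
print(item_difficulty(item))             # 0.7
print(optimal_difficulty(4))             # 0.625 for a four-option item
```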
Discriminability • Extreme group method • Discrimination index = difference between high and low scorers on the item • Negative discriminator: low scorers do better than high scorers • Point-biserial method: correlate the item (0/1) with the total test score (see the sketch below) • With a small number of items, the item's own contribution to the total can inflate the correlation • The higher the correlation, the better the item
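A sketch of the point-biserial method: correlate each dichotomous item with the total score (Pearson r computed on a 0/1 variable is the point-biserial). The response matrix is hypothetical.

```python
import numpy as np

# Hypothetical data: rows = test-takers, columns = items (1 correct, 0 incorrect)
scores = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
])
total = scores.sum(axis=1)

# Point-biserial correlation of each item with the total test score;
# higher values indicate better-discriminating items
for j in range(scores.shape[1]):
    r_pb = np.corrcoef(scores[:, j], total)[0, 1]
    print(f"item {j + 1}: r_pb = {r_pb:.2f}")
```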
Table Explained • Class n = 60, split into upper (U), middle (M), and lower (L) thirds of 20 • Discrimination: rough index = U – L (correct answers in the upper third minus the lower third; computation sketched below) • Item difficulty: U + M + L (total number correct; divide by 60 for the proportion) • Example items: • Item 2 = too easy • Item 7 = too difficult • Items 4 & 5 = negative discriminative value
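The table itself is not reproduced here, so the counts below are hypothetical; the sketch only shows how the rough index and difficulty described above are computed from correct-answer counts in the upper (U), middle (M), and lower (L) thirds.

```python
def rough_item_stats(upper, middle, lower, group_size=20):
    """Extreme-group summary for one item in a class of 3 * group_size students:
    rough discrimination index (U - L) and proportion-correct difficulty."""
    discrimination = upper - lower
    difficulty = (upper + middle + lower) / (3 * group_size)
    return discrimination, difficulty

print(rough_item_stats(18, 15, 9))    # (9, 0.70)   - a good positive discriminator
print(rough_item_stats(20, 20, 19))   # (1, ~0.98)  - too easy
print(rough_item_stats(10, 12, 14))   # (-4, 0.60)  - negative discriminative value
```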
Item Characteristic Curve • X axis: total test score (an estimate of the underlying trait) • Y axis: proportion of test-takers answering the item correctly • Scores are often grouped into class intervals (see the sketch below)
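A sketch of building an empirical item characteristic curve by class intervals: group test-takers by total score and report the proportion answering the item correctly in each interval. The simulated data are purely illustrative.

```python
import numpy as np

def empirical_icc(item, total, bins=4):
    """Proportion answering the item correctly within each class interval
    of the total test score."""
    edges = np.linspace(total.min(), total.max(), bins + 1)
    idx = np.digitize(total, edges[1:-1])          # interval index 0..bins-1
    return [(edges[b], edges[b + 1], item[idx == b].mean())
            for b in range(bins) if np.any(idx == b)]

rng = np.random.default_rng(0)
total = rng.integers(10, 41, size=200).astype(float)        # total test scores
item = (rng.random(200) < (total - 10) / 30).astype(int)    # item favors high scorers
for lo, hi, p in empirical_icc(item, total):
    print(f"total {lo:.0f}-{hi:.0f}: proportion correct = {p:.2f}")
```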
Discriminability • Best scenario
Item Response Theory • Each item has its own item characteristic curve • A specific range of difficulty can be identified with a test characteristic curve • Items are described by difficulty and discriminability parameters (see the sketch below) • Sample items across that range • Peaked conventional vs. rectangular conventional vs. adaptive testing
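One widely used IRT model consistent with the idea that every item has its own curve is the two-parameter logistic; the parameter values below are made up for illustration.

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic model: probability of a correct response at
    trait level theta, with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An easy, weakly discriminating item vs. a hard, sharply discriminating one
for theta in (-2, -1, 0, 1, 2):
    p_easy = icc_2pl(theta, a=0.8, b=-1.0)
    p_hard = icc_2pl(theta, a=2.0, b=1.0)
    print(f"theta = {theta:+d}: easy item P = {p_easy:.2f}, hard item P = {p_hard:.2f}")
```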
Criterion-Referenced Tests • Specify learning objectives – this itself aids learning • Give the test to two groups: one exposed to the material, one not • The antimode (the low point between the two groups' score distributions) becomes the cutting score (see the sketch below) • Any problems with this approach?
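A minimal sketch of the antimode idea: combine the two groups' score distributions and take the least frequent score between their modes as the cutting score. The score lists are hypothetical.

```python
from collections import Counter

def antimode_cut(exposed, unexposed):
    """Least frequent score between the two groups' modes in the combined
    distribution - a simple antimode cutting score."""
    counts = Counter(exposed) + Counter(unexposed)
    lo_mode = max(set(unexposed), key=unexposed.count)
    hi_mode = max(set(exposed), key=exposed.count)
    between = {s: c for s, c in counts.items() if lo_mode < s < hi_mode}
    return min(between, key=between.get) if between else None

unexposed = [3, 4, 4, 5, 5, 5, 6, 6, 7]         # not exposed to the material
exposed = [7, 8, 9, 9, 10, 10, 10, 11, 12]      # exposed to the material
print(antimode_cut(exposed, unexposed))         # 8 for these hypothetical scores
```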
Test Manuals • Proprietary tests – purchasers must meet user qualifications • Nonproprietary tests – publicly available • Standards for Educational and Psychological Testing* • Reflects changes in federal law and measurement trends affecting validity • Testing individuals with disabilities or from different linguistic backgrounds • New types of tests as well as new uses of existing tests • *Taken from apa.org
Test Manuals • Should include: • How to administer (standard conditions) • How to score • How to interpret • Information on reliability, validity, norms • Be critical!
Base Rates and Hit Rates • What does the test contribute beyond what is already known? • Meeting the cutting score does not guarantee a correct decision • Compare the hit rate against the base rate (see the sketch below) • Weigh false negatives against false positives
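A sketch of the hit-rate vs. base-rate comparison from a 2x2 decision table (test decision vs. actual outcome); the counts are hypothetical. The test earns its keep only if the hit rate clearly exceeds what the base rate alone would give.

```python
def classification_rates(tp, fp, fn, tn):
    """Summarize a 2x2 table of test decisions against actual outcomes.
    tp/fp/fn/tn = true positives, false positives, false negatives, true negatives."""
    n = tp + fp + fn + tn
    hit_rate = (tp + tn) / n        # proportion of correct decisions using the test
    base_rate = (tp + fn) / n       # proportion who succeed regardless of the test
    return hit_rate, base_rate, fp / n, fn / n

hit, base, fp_rate, fn_rate = classification_rates(tp=45, fp=15, fn=10, tn=30)
print(f"hit rate = {hit:.2f}, base rate = {base:.2f}")            # 0.75 vs. 0.55
print(f"false positives = {fp_rate:.2f}, false negatives = {fn_rate:.2f}")
```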
Taylor-Russell Tables • What does the test contribute beyond the base rate? • You need: • A definition of success • The base rate • The selection ratio • The test validity coefficient • The table gives the likelihood that someone selected on the basis of the test will succeed
Taylor-Russell Tables Source: Fisher, Schoenfeldt, & Shaw (2003), Table 7.2
Taylor-Russell Tables • Best case: validity high, selection ratio low • Worst case: validity low, selection ratio high • Useless: no validity (the selected group succeeds at the base rate) • What about selecting low scorers? (see the sketch below)
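The published Taylor-Russell tables are simply looked up; as a hedged illustration of what they encode, here is a Monte Carlo sketch under a bivariate-normal assumption. The parameter values are examples, not entries from the Fisher, Schoenfeldt, & Shaw table.

```python
import numpy as np

def expected_success_rate(validity, base_rate, selection_ratio, n=200_000, seed=0):
    """Simulate predictor and criterion scores correlated at the validity
    coefficient, select the top fraction (selection ratio) on the predictor,
    and return the proportion of those selected who succeed (criterion above
    the base-rate cutoff)."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, validity], [validity, 1.0]]
    predictor, criterion = rng.multivariate_normal([0, 0], cov, size=n).T
    success_cut = np.quantile(criterion, 1 - base_rate)
    select_cut = np.quantile(predictor, 1 - selection_ratio)
    selected = predictor >= select_cut
    return (criterion[selected] >= success_cut).mean()

# With a 0.60 base rate: higher validity and a lower selection ratio
# both raise the expected success rate among those selected
print(expected_success_rate(validity=0.50, base_rate=0.60, selection_ratio=0.10))
print(expected_success_rate(validity=0.50, base_rate=0.60, selection_ratio=0.70))
print(expected_success_rate(validity=0.10, base_rate=0.60, selection_ratio=0.10))
```

Running this reproduces the pattern in the slide: high validity with a low selection ratio gives the largest gain over the base rate, while low validity barely moves it.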
Incremental Validity • The unique information gained from using the test • Predicting future behavior and self-ratings • Before adopting a test, ask: • Would a simpler method do? • A less expensive method? • A method that puts less strain on the test-taker?
Mental Measurements Yearbook • Test reviews