
EPSY 546: LECTURE 3 GENERALIZABILITY THEORY AND VALIDITY


Presentation Transcript


  1. EPSY 546: LECTURE 3 GENERALIZABILITY THEORY AND VALIDITY • George Karabatsos

  2. GENERALIZABILITY THEORY

  3. TRUE SCORE MODEL • Recall the true score model: X_n = T_n + e_n, where X_n is the observed test score of person n, T_n the true test score (unknown), and e_n the random error (unknown).

  4. TRUE SCORE MODEL • Recall the true score model. • One may view the true score model as defining error narrowly: it is a one-variable, simple ANOVA decomposition, splitting observed-score variance into between-person (true score) variance plus within-person (random error) variance.
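Stated compactly (a standard restatement of the classical decomposition; the reliability ratio at the end is not on the slide but follows directly from it):

```latex
% Classical true score model and its one-way variance decomposition
\[
X_n = T_n + e_n, \qquad \sigma^2_X = \sigma^2_T + \sigma^2_e,
\]
\[
\text{reliability} \;=\; \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_e}
\quad \text{(between-person variance over total variance).}
\]
```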

  5. GENERALIZABILITY THEORY • Generalizability Theory extends the true score model by acknowledging that multiple factors contribute to measurement variance. • Multivariable ANOVA: the observed test response is a function of two or more variables, their interactions, and random measurement error.

  6. G-THEORY MODEL (example) • X_njt = μ Grand mean • + (μ_n − μ) Person n’s effect • + (μ_j − μ) Item j’s effect • + (μ_t − μ) Time t’s effect • + (μ_nt − μ_n − μ_t + μ) Person × Time effect • + (μ_nj − μ_n − μ_j + μ) Person × Item effect • + (μ_tj − μ_t − μ_j + μ) Time × Item effect • + residual Three-way interaction and error
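To make the model concrete, here is a minimal Python sketch (mine, not part of the lecture) that simulates scores X_njt from independent person, item, time, and interaction effects; all sample sizes and standard deviations are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) numbers of persons, items, and occasions
n_p, n_i, n_t = 200, 10, 3

# Assumed standard deviations for each effect in the model
mu = 50.0
person = rng.normal(0, 4.0, size=(n_p, 1, 1))      # person effect
item   = rng.normal(0, 2.0, size=(1, n_i, 1))      # item effect
time   = rng.normal(0, 1.0, size=(1, 1, n_t))      # time effect
p_i    = rng.normal(0, 1.5, size=(n_p, n_i, 1))    # person x item
p_t    = rng.normal(0, 1.0, size=(n_p, 1, n_t))    # person x time
i_t    = rng.normal(0, 0.5, size=(1, n_i, n_t))    # item x time
resid  = rng.normal(0, 2.0, size=(n_p, n_i, n_t))  # 3-way interaction + error

# Observed score for person n on item j at time t
X = mu + person + item + time + p_i + p_t + i_t + resid
print(X.shape)  # (200, 10, 3)
```

Because the effects are simulated as independent, the variance of X is simply the sum of the component variances, which is the partition shown on the next slide.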

  7. G-THEORY VARIANCE PARTITION • Systematic: Persons σ²_P • Measurement error (facet contributions): Items σ²_I • Time σ²_T • Person × Time σ²_PT • Person × Item σ²_PI • Time × Item σ²_TI • 3-way interaction + error σ²_PIT,error
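A G-study for this fully crossed persons × items × time design is usually carried out with random-effects ANOVA. The sketch below (hypothetical g_study function, placeholder data) estimates the seven components from the mean squares of a three-way array with one observation per cell:

```python
import numpy as np

def g_study(X):
    """Estimate variance components for a fully crossed p x i x t design
    (one observation per cell) from ANOVA mean squares."""
    n_p, n_i, n_t = X.shape
    m = X.mean()
    m_p = X.mean(axis=(1, 2)); m_i = X.mean(axis=(0, 2)); m_t = X.mean(axis=(0, 1))
    m_pi = X.mean(axis=2); m_pt = X.mean(axis=1); m_it = X.mean(axis=0)

    # Mean squares for main effects and two-way interactions
    ms_p = n_i * n_t * np.sum((m_p - m) ** 2) / (n_p - 1)
    ms_i = n_p * n_t * np.sum((m_i - m) ** 2) / (n_i - 1)
    ms_t = n_p * n_i * np.sum((m_t - m) ** 2) / (n_t - 1)
    ms_pi = n_t * np.sum((m_pi - m_p[:, None] - m_i[None, :] + m) ** 2) / ((n_p - 1) * (n_i - 1))
    ms_pt = n_i * np.sum((m_pt - m_p[:, None] - m_t[None, :] + m) ** 2) / ((n_p - 1) * (n_t - 1))
    ms_it = n_p * np.sum((m_it - m_i[:, None] - m_t[None, :] + m) ** 2) / ((n_i - 1) * (n_t - 1))
    # Residual (3-way interaction confounded with error)
    resid = (X - m_pi[:, :, None] - m_pt[:, None, :] - m_it[None, :, :]
             + m_p[:, None, None] + m_i[None, :, None] + m_t[None, None, :] - m)
    ms_pit = np.sum(resid ** 2) / ((n_p - 1) * (n_i - 1) * (n_t - 1))

    # Solve the expected-mean-square equations for the variance components
    return {
        "pit,e": ms_pit,
        "pi": (ms_pi - ms_pit) / n_t,
        "pt": (ms_pt - ms_pit) / n_i,
        "it": (ms_it - ms_pit) / n_p,
        "p": (ms_p - ms_pi - ms_pt + ms_pit) / (n_i * n_t),
        "i": (ms_i - ms_pi - ms_it + ms_pit) / (n_p * n_t),
        "t": (ms_t - ms_pt - ms_it + ms_pit) / (n_p * n_i),
    }

# Placeholder data; in practice X holds the observed scores X_njt
components = g_study(np.random.default_rng(1).normal(size=(50, 8, 2)))
print(components)
```

Negative component estimates can occur by chance with this method and are conventionally set to zero.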

  8. G-THEORY OF DECISIONS • Relative decisions: Decisions based on the rank ordering of persons (e.g., college admission, pass-fail testing). • Variance contributing to measurement error for relative decisions: • σ²_Rel = σ²_PI + σ²_PT + σ²_PIT,error • (all variance components involving interactions with persons)

  9. G-THEORY OF DECISIONS • Absolute decisions: Decisions based on the level of the observed score, without regard to the performance of others (e.g., a driver’s license test). • Variance contributing to measurement error for absolute decisions: • σ²_Abs = σ²_T + σ²_I + σ²_PI + σ²_PT + σ²_IT + σ²_PIT,error • (all variance components associated with the facets, which introduce “constant” effects into absolute decisions)
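In code, the two error variances are just different sums over the estimated components. A small sketch with illustrative (assumed) values, keyed as in the partition above:

```python
# Illustrative (assumed) variance components, keyed as in the partition above
var = {"p": 16.0, "i": 4.0, "t": 1.0,
       "pi": 2.2, "pt": 1.0, "it": 0.3, "pit,e": 4.1}

# Relative error: components involving interactions with persons
var_rel = var["pi"] + var["pt"] + var["pit,e"]

# Absolute error: relative error plus the facet main effects and their interaction
var_abs = var["i"] + var["t"] + var["it"] + var_rel

print(var_rel, var_abs)  # 7.3  12.6
```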

  10. GENERALIZABILITY COEFFICIENT • Indicates how accurately the observed test scores allow us to generalize about persons’ behavior in a defined universe of situations (Cronbach, 1972).
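The coefficient is usually written as a ratio of person variance to person variance plus error variance. The standard forms from the G-theory literature (not shown on the slide), using the relative and absolute error variances defined above, are:

```latex
% Generalizability coefficient (relative decisions) and
% dependability coefficient (absolute decisions)
\[
E\rho^2 = \frac{\sigma^2_P}{\sigma^2_P + \sigma^2_{\mathrm{Rel}}},
\qquad
\Phi = \frac{\sigma^2_P}{\sigma^2_P + \sigma^2_{\mathrm{Abs}}}.
\]
```

Eρ² applies to relative decisions and Φ (the dependability coefficient) to absolute decisions; in a D-study each error component is divided by the number of conditions sampled for that facet, as in the sketch after slide 12.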

  11. STUDIES • G-Study (Generalizability Study): Aims to estimate the variance components underlying a measurement process by defining the universe of admissible observations as broadly as possible.

  12. STUDIES • D-Study (Decision Study): Uses the G-study results to address “what if” questions about the measurement design (Thompson & Melancon, 1987). This helps pinpoint the sources of error and specify protocol modifications that will achieve the desired level of generalizability.
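A minimal D-study sketch (assumed component values, hypothetical projected_g function) that projects the generalizability coefficient for relative decisions under candidate numbers of items and occasions:

```python
# A D-study "what if": project relative error variance and the
# generalizability coefficient for candidate numbers of items (n_i)
# and occasions (n_t), using illustrative (assumed) G-study components.
var = {"p": 16.0, "pi": 2.2, "pt": 1.0, "pit,e": 4.1}

def projected_g(n_i, n_t, var=var):
    # Relative error variance for a mean score over n_i items and n_t occasions
    rel_error = var["pi"] / n_i + var["pt"] / n_t + var["pit,e"] / (n_i * n_t)
    return var["p"] / (var["p"] + rel_error)

for n_i in (5, 10, 20):
    for n_t in (1, 2):
        print(f"n_i={n_i:2d} n_t={n_t}: E rho^2 = {projected_g(n_i, n_t):.3f}")
```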

  13. EXAMPLES OF G-THEORY • Nice illustrations are offered in Webb, Rowley, & Shavelson (1988) and in Crowley, Thompson, & Worchel (1994).

  14. VALIDITY

  15. TEST VALIDITY • VALIDITY: A test is valid if it measures what it claims to measure. • Types: Face, Content, Concurrent, Predictive, Construct.

  16. TEST VALIDITY • Face validity: When the test items appear to measure what the test claims to measure. • Content validity: When the content of the test items, according to domain experts, adequately represents the latent trait that the test intends to measure.

  17. TEST VALIDITY • Concurrent validity: When the test, which intends to measure a particular latent trait, correlates highly with another test that measures that trait. • Predictive validity: When the scores of the test predict some meaningful criterion.

  18. TEST VALIDITY • Construct validity: A test has construct validity when the results of using the test fit hypotheses about the theoretical nature of the latent trait. The better the fit, the stronger the construct validity.

  19. MESSICK’S UNIFIED CONSTRUCT VALIDITY • Content: Item content relevance, representativeness, and technical quality (subsumes face validity). • Substantive: Theoretical rationales for the observed consistencies in test responses. • Structural: Fidelity of the scoring structure to the structure of the content domain. • Generalizability: The extent to which score properties and interpretations generalize across population groups, settings, and tasks. • External: Concurrent/convergent, discriminant, and predictive evidence. • Consequential: The (potential and actual) consequences of test use.
