Validity and Reliability
Validity • The degree of truthfulness in a test • Are we measuring what we think we are measuring? • To be valid, a test must have relevance and reliability
Objectivity • The degree of interrater reliability; the ability of two or more raters to score a test equivalently
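One simple index of objectivity is the percent agreement between two raters who scored the same performances. A minimal Python sketch (all scores below are hypothetical):

```python
# Objectivity sketch: percent agreement between two raters who
# independently scored the same eight performances (hypothetical data).
rater_a = [3, 4, 2, 5, 4, 3, 2, 4]
rater_b = [3, 4, 2, 4, 4, 3, 2, 4]

# Fraction of performances on which the raters gave identical scores.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(agreement)  # 7 of 8 scores match -> 0.875
```

Percent agreement is only the simplest starting point; chance-corrected indices such as Cohen's kappa are often preferred in practice.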
Relevance • The degree to which a test pertains to the objectives of the measurement • (e.g., using body weight to predict fitness)
Types of validity • Content validity evidence • Criterion validity evidence • Construct validity evidence
Content related validity • Evidence of truthfulness based on logical decision making and interpretation; also called face or logical validity • A validation approach based on the subjectively established judgment that the test measures the desired attribute
Criterion related validity • Evidence that a test possesses a statistical relationship with the trait being measured; also called statistical validity or correlational validity (criteria may include expert ratings, tournament standings, or other predetermined criterion measures) • Concurrent • Predictive
Concurrent validity • The relationship between a test (a surrogate measure) and a criterion when the two measures are taken relatively close together in time. It is based on the Pearson product-moment (PPM) correlation coefficient between the test and the criterion. (e.g., a timed run used to estimate VO2 max)
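The PPM correlation on which concurrent validity rests can be computed directly. A minimal Python sketch of the run-for-time example from the slide, with hypothetical run times and lab-measured VO2 max values:

```python
def pearson_r(x, y):
    """Pearson product-moment (PPM) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: 1.5-mile run times (minutes) and lab VO2 max (ml/kg/min)
# measured close together in time for the same six students.
run_time = [9.5, 10.2, 11.0, 12.4, 13.1, 14.0]
vo2_max  = [55.0, 52.1, 49.8, 44.5, 42.0, 39.2]

# Slower runs go with lower VO2 max, so a strong negative r is expected.
r = pearson_r(run_time, vo2_max)
```

A large |r| between the surrogate test and the criterion is the statistical evidence of concurrent validity.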
Predictive validity • The relationship between a test (a surrogate measure) and a criterion when the criterion is measured in the future. It is based on the PPM correlation coefficient between the test and the criterion.
Construct validity • The highest form of validity; it combines logical and statistical evidence of validity through the gathering of a variety of statistical information that, when viewed collectively, adds evidence for the existence of the theoretical construct being measured
Content related evidence • "Demonstrates the degree to which the sample items, tasks, or questions on a test are representative of some defined universe or domain of content." • A written knowledge test yields scores on which valid interpretations can be made when • Its questions are reliable • Its questions are based on material taught • Its questions sample the stated educational objectives
Criterion related evidence • Demonstrates that test scores are systematically related to one or more outcome criteria • Based on having a true criterion measure available
Development of criterion • Actual participation • Known valid criterion • Expert judges • Tournament participation • Known valid test
Construct related evidence • Often used to validate measures that are unobservable, yet exist in theory • Construct related evidence can also supplement criterion related validation evidence.
Reliability • The degree to which repeated measurements of the same trait are reproducible under the same conditions; consistency
Scores • Observed • Error • True • Observed score = true score + error score (error may be positive or negative)
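The score model can be illustrated by simulation: if error is random with mean zero, the average of many observed scores approaches the true score. A minimal sketch, where the true score of 70 and the error spread of 3 are hypothetical choices:

```python
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# Observed score = true score + random error with mean zero.
true_score = 70.0
observed = [true_score + random.gauss(0, 3) for _ in range(10_000)]

# Averaging many observations washes out the random error component.
mean_observed = sum(observed) / len(observed)
mean_error = mean_observed - true_score  # should be near zero
```

This is why repeated trials give a better estimate of a student's true score than any single measurement.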
Interclass reliability • Based on the correlation between two sets of scores • Test-retest • Test students twice with the same instrument (e.g., a skill test) • Equivalence • Group takes each of two test forms • Split halves • Each half of the test is evaluated as a separate test
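A split-halves estimate can be computed by scoring the odd and even items as two half-tests and correlating them; stepping the half-test correlation up to full test length with the Spearman-Brown formula is a common companion step, though it is not named on the slide. A sketch with a hypothetical matrix of item scores:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical item scores: rows = students, columns = items (1 = correct).
items = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]

# Score the odd-numbered and even-numbered items as two separate half-tests.
odd_half  = [sum(row[0::2]) for row in items]
even_half = [sum(row[1::2]) for row in items]

r_half = pearson_r(odd_half, even_half)  # reliability of a half-length test
r_full = 2 * r_half / (1 + r_half)       # Spearman-Brown step-up to full length
```

The step-up matters because a half-length test is less reliable than the full test whose reliability is being estimated.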
Intraclass reliability • Based on ANOVA • Kuder-Richardson 20 (KR20) • Cronbach's alpha
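Coefficient alpha, with KR20 as its special case for items scored 0/1, can be computed from the item variances and the variance of the total scores. A minimal sketch using a hypothetical dichotomous item matrix, so the value below is both alpha and KR20:

```python
def variance(xs):
    """Population variance of a list of scores."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """Coefficient alpha; equals KR20 when every item is scored 0/1."""
    k = len(items[0])  # number of items
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical dichotomous item scores: rows = students, columns = items.
items = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]

alpha = cronbach_alpha(items)  # about 0.8 for this matrix
```

Unlike the interclass approach, alpha uses every item's variance at once, which is why it is grouped with the ANOVA-based intraclass methods.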