Intro to Stats: Reliability & Validity
Why discuss? • Reliability and validity limit all inferences that can be drawn from later tests • If a scale is reliable and valid, we can have confidence in the findings • If a scale is unreliable or invalid, we need to be very cautious
Measurement [Diagram: a CONSTRUCT measured by Item 1, Item 2, and Item 3, shown with related measures & outcomes and unrelated measures & outcomes]
Correlation coefficient • Captures how the value of one variable changes when the value of the other changes • Ranges from -1 to +1 • A Pearson correlation is based on continuous variables • Important to remember this is a relationship for a group, not for each person/item • The squared correlation (r²) reflects the proportion of variability shared by the two variables
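Computations
$$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
• $r_{xy}$ = correlation coefficient between X & Y • n = size of sample • X = score on X variable • Y = score on Y variable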
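The computational formula can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the five pairs of scores are made up for the example and are not part of the original slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation using the raw-score computational formula."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of scores")
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))   # sum of the cross-products XY
    sum_x2 = sum(a * a for a in x)              # sum of squared X scores
    sum_y2 = sum(b * b for b in y)              # sum of squared Y scores

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return numerator / denominator

# Hypothetical scores for five people on two variables (illustrative only).
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
r = pearson_r(x_scores, y_scores)
print(round(r, 2))  # 0.77
```

Here n = 5, ΣXY = 66, ΣX = 15, ΣY = 20, ΣX² = 55, and ΣY² = 86, so r = 30 / √1500 ≈ .77.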
Interpretations of Size • .80 to 1.0: very strong • .60 to .80: strong • .40 to .60: moderate • .20 to .40: weak • .00 to .20: weak/none • Relationships of .70 or stronger are generally considered acceptable in reliability analyses
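As a rough sketch, the size bands can be written as a small lookup. Two assumptions here are not stated on the slide: the bands are applied to the absolute value of r (so -.65 counts as strong), and a value sitting exactly on a boundary (for example .80) is placed in the higher band. The function names are hypothetical.

```python
def describe_strength(r):
    """Return the slide's verbal label for a correlation coefficient."""
    size = abs(r)  # assumption: strength ignores the sign of r
    if size > 1:
        raise ValueError("a correlation must lie between -1 and +1")
    # Assumption: boundary values fall into the higher band.
    if size >= 0.80:
        return "very strong"
    if size >= 0.60:
        return "strong"
    if size >= 0.40:
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "weak/none"

def acceptable_for_reliability(r):
    """Rule of thumb from the slide: .70 or stronger is acceptable."""
    return r >= 0.70

print(describe_strength(0.77))           # strong
print(acceptable_for_reliability(0.77))  # True
```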
Reliability • The extent to which a scale measures a construct consistently • Any measurement is an observed score: observed score = true score + error • Reliability = true score / (true score + error), formally a ratio of variances • Less error = observed score is closer to true score (more reliable) • We never know the “true score”
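The ratio above is conceptual, since the true score is never known. In classical test theory it is usually stated in terms of variances, and a small sketch with made-up variance values shows how reliability rises as error shrinks. The function name and the numbers are hypothetical.

```python
def reliability(true_variance, error_variance):
    """Share of observed-score variance that is true-score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("reliability is undefined with no variance at all")
    return true_variance / observed_variance

# Hypothetical values: the same true-score variance with more or less error.
noisy_scale = reliability(true_variance=8.0, error_variance=8.0)
cleaner_scale = reliability(true_variance=8.0, error_variance=2.0)
print(noisy_scale, cleaner_scale)  # 0.5 0.8
```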
1. Test-retest reliability • Extent to which a test is reliable over time • Calculate the correlation between each person's scores at two time points • Scores at the two time points should relate positively *Sometimes you expect the scores to be different
2. Parallel forms reliability • Extent to which two forms of a test are equivalent • Calculate the correlation between the two forms of the test
3. Internal consistency reliability • Extent to which items are consistent with one another and represent one dimension • Correlation between individual item scores and the total score • Also estimate correlations among the items • Important that all items use the same scale and are scored in the same direction • Summarized with Cronbach’s alpha (α)
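Cronbach’s alpha
$$\alpha = \left(\frac{k}{k-1}\right)\left(\frac{s_y^2 - \sum s_i^2}{s_y^2}\right)$$
• k = number of items • $s_y^2$ = variance associated with the observed (total) score • $\sum s_i^2$ = sum of the variances of the individual items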
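A minimal sketch of the alpha formula in Python follows. The data layout (one row per respondent, one column per item), the function name `cronbach_alpha`, and the scores themselves are assumptions made for illustration. Sample variances (n - 1 in the denominator) are used for both the items and the total.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per respondent)."""
    k = len(responses[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")

    item_columns = list(zip(*responses))                  # one tuple per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_scores = [sum(row) for row in responses]        # observed total score
    total_variance = variance(total_scores)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    return (k / (k - 1)) * ((total_variance - sum_item_variances) / total_variance)

# Hypothetical data: five respondents answering three items on a 1-5 scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 5],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.94
```

In this example the item variances sum to 6.0 and the total-score variance is 16.2, so α = (3/2) × (10.2 / 16.2) ≈ .94.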
4. Interrater reliability • Agreement between two raters
$$\text{ir} = \frac{\text{number of agreements}}{\text{number of possible agreements}}$$
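The agreement ratio can be sketched as follows, assuming each rater codes the same set of observations and that every observation counts as one possible agreement. The function name and the ratings are hypothetical.

```python
def interrater_agreement(rater_a, rater_b):
    """Proportion of observations on which two raters gave the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same observations")
    if not rater_a:
        raise ValueError("at least one observation is required")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return agreements / len(rater_a)

# Hypothetical codes (1 = behavior present, 0 = absent) for ten observations.
rater_a = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
ir = interrater_agreement(rater_a, rater_b)
print(ir)  # 0.8
```

The raters agree on 8 of the 10 observations, so ir = 8 / 10 = .80.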
Validity • The extent to which the scale measures what it is intended to measure • Can be reliable without being valid
1. Content Validity • Items sample the universe of items for a construct • Can ask an expert (or several) whether items seem representative
2. Criterion Validity • Scale relates to other measures or behaviors in ways that would be expected • Concurrent: relates to measures taken at the same time • Predictive: predicts later scores
3. Construct validity • Scale measures the underlying construct as intended • Relation to the behaviors that the construct represents