Exploring Scores, Reliability, and Test Equivalence in Psychometrics

Session 3: Normal Distribution, Scores, Reliability

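Normal Curve (figure) • Mean, median, and mode coincide at the center of the curve. • The horizontal axes are aligned by standard deviation (s.d.):
• s.d.: -4s, -3s, -2s, -1s, 0, 1s, 2s, 3s, 4s
• T score: 10, 20, 30, 40, 50, 60, 70, 80, 90
• CEEB: 200, 300, 400, 500, 600, 700, 800 (-3s to 3s)
• Wechsler: 55, 70, 85, 100, 115, 130, 145 (-3s to 3s)
• SB (Stanford-Binet): 52, 68, 84, 100, 116, 132, 148 (-3s to 3s)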
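Each scale on the normal curve is a linear transformation of the z-score (the number of standard deviations from the mean). The sketch below assumes the means and standard deviations implied by the figure's axes (T: 50 and 10, CEEB: 500 and 100, Wechsler: 100 and 15, SB: 100 and 16). The names SCALES, from_z, and to_z are illustrative, not part of the slides.

```python
# Mean and standard deviation of each scale, as implied by the figure's axes.
SCALES = {
    "T": (50, 10),
    "CEEB": (500, 100),
    "Wechsler": (100, 15),
    "SB": (100, 16),
}

def from_z(z, scale):
    """Convert a z-score to a score on the named scale."""
    mean, sd = SCALES[scale]
    return mean + z * sd

def to_z(score, scale):
    """Convert a score on the named scale back to a z-score."""
    mean, sd = SCALES[scale]
    return (score - mean) / sd

# Print each scale's values at -2, -1, 0, +1, and +2 standard deviations.
for name in SCALES:
    print(name, [from_z(z, name) for z in (-2, -1, 0, 1, 2)])
```

Going through the z-score is also how a score on one scale can be compared with a score on another, for example from_z(to_z(115, "Wechsler"), "T").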

Age or grade equivalent scales • Age Equivalence: • Can really only compare to same age. • Grade Equivalence: • Can really only compare to same grade. • Problems: • Norm-referenced, so the groups are not comparable • “Lake Wobegon syndrome” • Development is not linear

Norm group • How were people recruited and how many? • Random, Stratified, Cluster, Convenience. • Who was included and who was excluded? • Age, gender, ethnicity, national origin, SES, geographic, educational background, diagnosis. • How appropriate is the norm group for your client?

Reliability - Consistency • Classical Test Theory • Observed Score = True Score + Error • A reliability coefficient estimates the proportion of observed-score variance that is true-score variance. • If an instrument manual reports a score reliability of .79, then 79% of the observed variance is true variance and 21% is error variance.
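The variance interpretation above can be sketched in a few lines. This assumes, as classical test theory does, that true scores and errors are uncorrelated, so observed variance is the sum of true and error variance. The function name and the variance values are made up for illustration.

```python
def reliability(true_variance, error_variance):
    """Proportion of observed-score variance that is true-score variance."""
    # Assumption: true score and error are uncorrelated, so variances add.
    observed_variance = true_variance + error_variance
    return true_variance / observed_variance

# Hypothetical variances chosen to match the .79 example on the slide.
r = reliability(true_variance=79.0, error_variance=21.0)
print(f"reliability = {r:.2f}")
print(f"error share = {1 - r:.2f}")
```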

Reliability • Systematic error versus unsystematic error • Error variance is unsystematic error • Test-taker variables • Test-administration variables

Correlation Coefficients • Consistency between two sets of scores. • Correlation is often used (e.g., the Pearson product-moment correlation). • r ranges from -1 to +1 and represents the relationship between the two sets of data. • The closer |r| is to 1, the stronger the relationship between the two sets of scores; the closer r is to 0, the weaker the evidence of a relationship. • The – and + signs indicate only the direction of the relationship: inverse (negative) or positive.
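As a rough illustration of the Pearson product-moment correlation, the sketch below computes r from two sets of scores using the standard deviation-from-the-mean formula. The function name and the scores are hypothetical, chosen only to show the calculation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one set of scores has no variance")
    return sxy / sqrt(sxx * syy)

# Made-up scores for four test-takers on two administrations.
first = [1, 2, 3, 4]
second = [1, 3, 2, 4]
r = pearson_r(first, second)
print(f"r = {r:.2f}, r squared = {r * r:.2f}")
```

Squaring r gives the proportion of variance the two sets of scores share.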

Coefficient of Determination • If r = .70, then r² = .49, which means that 49% of the variance is shared between the two sets of scores.

Types of Reliability • Test-Retest • Alternate or Parallel Forms • Internal Consistency • Split-Half (if this is appropriate) • KR-20 (homogeneous domain) and KR-21 (heterogeneous domain) • Coefficient alpha or Cronbach’s alpha
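Of the internal-consistency estimates listed, coefficient alpha is the easiest to sketch. The slides do not give a formula; the code below uses the standard one, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha; rows holds one list of item scores per test-taker."""
    k = len(rows[0])                      # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 test-takers")
    items = list(zip(*rows))              # regroup the scores item by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Made-up responses: four test-takers, three items scored 1 to 5.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

The sketch uses the sample variance (n - 1 in the denominator) for both the items and the totals. The choice of denominator cancels in the ratio as long as it is applied consistently.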

Standard Error of Measurement • The Standard Error of Measurement (SEM) estimates the range within which a test-taker's true score is likely to fall, based on how his or her observed scores would vary if he or she took the test many times. • SEM = s√(1 – r), where s = the standard deviation for the test and r = the reliability coefficient for the test.

Example • A Wechsler test with a split-half reliability coefficient of .96 and a standard deviation of 15 gives us an SEM of 3. • SEM = s√(1 – r) = 15√(1 – .96) = 15√.04 = 15 × .2 = 3

Example • Luisa took the Wechsler test and received a score of 100. Build a “band of error” around Luisa’s test score of 100, using a 68% interval. • A 68% interval is approximately equal to 1 SEM on either side of the observed score. • Luisa’s true test score = performance test score ± 1(SEM) = 100 ± (1 × 3) = 100 ± 3 • Chances are 68 out of 100 that Luisa’s true score falls within the range of 97 to 103. • What about a 95% interval?
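The worked example can be checked with a short sketch. The function names are illustrative. The 95% band assumes normally distributed measurement error and uses a multiplier of 1.96 SEM, which is often rounded to 2; that multiplier is not stated on the slides.

```python
from math import sqrt

def sem(sd, reliability):
    """Standard error of measurement: s * sqrt(1 - r)."""
    return sd * sqrt(1 - reliability)

def error_band(observed, sd, reliability, n_sem=1.0):
    """Band of error: observed score plus or minus n_sem SEMs."""
    half_width = n_sem * sem(sd, reliability)
    return observed - half_width, observed + half_width

# Luisa's Wechsler score of 100, with s = 15 and r = .96 as in the example.
low68, high68 = error_band(100, 15, 0.96, n_sem=1.0)
low95, high95 = error_band(100, 15, 0.96, n_sem=1.96)  # assumed multiplier
print(f"SEM = {sem(15, 0.96):.2f}")
print(f"68% band: {low68:.2f} to {high68:.2f}")
print(f"95% band: {low95:.2f} to {high95:.2f}")
```

With these inputs the 68% band matches the slide's 97 to 103, and the 95% band comes out to roughly 94 to 106.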
