Criterion-Referenced Measurement: Advantages, Limitations, and Applications

Chapter 7: Criterion-Referenced Measurement • Poor • Sufficient • Better

Criterion-Referenced Testing • Mastery Learning • Standard Development: Judgmental, Normative, Empirical, Combination

Guidelines for Writing Behavioral Objectives (Mager, 1962) • Identify the desired behavior by name • Define the desired behavior • Specify the criteria of acceptable performance

Advantages of Criterion-Referenced Measurement • Represent specific, desired performance levels linked to a criterion • Are independent of the proportion of the population that meets the standard • If not met, specific diagnostic evaluations can be made • Degree of performance is not important . . . reaching the standard is

Limitations of Criterion-Referenced Measurement • Cutoff scores always involve subjective judgment • Misclassifications can be severe • Students who meet the cutoff may no longer be motivated to do better

Setting a Cholesterol “Cut-Off” (figure; axes: N of deaths, Cholesterol mg/dl)

Statistical Analysis of CRTs • Nominal Data • Contingency Table Development • Phi Coefficient (PPM) • Chi-Square Analysis
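The phi coefficient and chi-square listed above can both be computed directly from a 2 × 2 contingency table. The sketch below is a minimal illustration, assuming the uncorrected Pearson chi-square (df = 1), for which chi-square equals N times phi squared. The function name and the cell counts are hypothetical and do not come from the chapter.

```python
import math


def phi_and_chi_square(a, b, c, d):
    """Return (phi, chi_square) for the 2 x 2 table [[a, b], [c, d]].

    a and d are the agreeing cells (for example fail/fail and pass/pass);
    b and c are the disagreeing cells.
    """
    n = a + b + c + d
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("every row and column needs at least one case")
    phi = (a * d - b * c) / denom
    # Uncorrected Pearson chi-square for a 2 x 2 table (df = 1).
    chi_square = n * phi ** 2
    return phi, chi_square


# Hypothetical counts, for illustration only.
phi, chi_square = phi_and_chi_square(40, 10, 10, 40)
print(round(phi, 3), round(chi_square, 2))
```

With these made-up counts phi is 0.6 and chi-square is 36. A statistics package may also report a continuity-corrected chi-square, which will be smaller.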

Considerations with CRT • The same as norm-referenced testing • Reliability: consistency of measurement • Validity: truthfulness of measurement

Figure 7.1 (a) FITNESSGRAM Standards • 2 × 2 classification • Criterion: below vs. above the criterion VO2max • Field test: did not achieve vs. did achieve the standard on the run/walk test

Figure 7.1 (b) AAHPERD Physical Best Standards • 2 × 2 classification • Criterion: below vs. above the criterion VO2max • Field test: did not achieve vs. did achieve the standard on the run/walk test

Meeting Criterion-Referenced Standards: Possible Decisions

Table 7.1 Test-Retest Reliability Example (Day 1 by Day 2) • P = .825 • K = .576 • Phi = .586 • Chi square = 137.13, df = 1, p < .001
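P and K in Table 7.1 are read here as the proportion of agreement and the kappa coefficient, which is an assumption since the slide does not spell them out. The sketch below shows how both would be computed from a Day 1 by Day 2 table. The function name and the cell counts are hypothetical; the cell counts behind Table 7.1 are not given in this transcript.

```python
def agreement_and_kappa(a, b, c, d):
    """Return (P, kappa) for the 2 x 2 table [[a, b], [c, d]].

    a = failed both days, d = passed both days (agreements);
    b and c = classified differently on the two days.
    """
    n = a + b + c + d
    p = (a + d) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p - p_chance) / (1 - p_chance)
    return p, kappa


# Hypothetical counts, for illustration only.
p, kappa = agreement_and_kappa(45, 5, 10, 40)
print(round(p, 3), round(kappa, 3))
```

With these made-up counts P is 0.85 and kappa is 0.7. Kappa is lower than P because it discounts the agreement expected by chance alone.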

Table 7-2 Criterion-Referenced Equivalence Reliability Between the 1 Mile Run/Walk and PACER

Figure 7.3 A theoretical example of the divergent group method

Examples of Criterion-Referenced Standards • Cholesterol < 240 mg/dl • Systolic Blood Pressure < 140 mmHg • Diastolic Blood Pressure < 90 mmHg • FITNESSGRAM 1-mile run time for a boy age 10 < 11:30 • President’s Challenge Health Fitness curl-ups for a girl age 14 > 24
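A criterion-referenced standard turns a score into a met / not met decision, regardless of how far the score is from the cutoff. The sketch below encodes four of the standards listed above; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
import operator

# Cutoffs copied from the list above; the keys are hypothetical labels.
STANDARDS = {
    "cholesterol_mg_dl": (operator.lt, 240),
    "systolic_mmHg": (operator.lt, 140),
    "diastolic_mmHg": (operator.lt, 90),
    "curl_ups_girl_age_14": (operator.gt, 24),
}


def meets_standard(measure, value):
    """Return True if value satisfies the standard for the given measure."""
    compare, cutoff = STANDARDS[measure]
    return compare(value, cutoff)


print(meets_standard("cholesterol_mg_dl", 239))   # just under the cutoff
print(meets_standard("cholesterol_mg_dl", 240))   # at the cutoff, not under it
print(meets_standard("curl_ups_girl_age_14", 25))
```

A cholesterol value of 239 and a value of 150 are classified identically, which is the sense in which the degree of performance is not important.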

CRT Reliability • 2 × 2 table • Rows: Day 1 (Fail, Pass) • Columns: Day 2 (Fail, Pass)

CRT Validity • 2 × 2 table • Rows: Field Test (Fail, Pass) • Columns: Criterion (Fail, Pass)

Racquetball Example • Can a wall volley test serve as a good criterion measure to determine who should enter intermediate racquetball? • Example • Reliability study • Validity study

The test (court diagram: front wall, broken line, 2 extra racquetballs) • You must always hit the ball from behind the broken line • 4 attempts, 30 secs each • Trial 1 = attempts 1 + 2 • Trial 2 = attempts 3 + 4

Reliability study • Set a standard for passing the field test • Our standard is set at 25 hits • You must hit the ball against the front wall at least 25 times in a trial. This meets the “standard” for entry into intermediate racquetball • Recall a trial consists of the total of two attempts • You want to see if players can achieve the standard on each trial. If you determine the consistency of their meeting the standard, this is a criterion-referenced reliability study.
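The scoring rule above can be sketched as follows: each trial is the total of two 30-second attempts, and each trial is classified against the 25-hit standard. The function names and the example hit counts are hypothetical.

```python
STANDARD_HITS = 25  # cutoff stated on the slide


def trial_totals(attempts):
    """attempts: four hit counts, one per 30-second attempt."""
    if len(attempts) != 4:
        raise ValueError("expected exactly four attempts")
    trial_1 = attempts[0] + attempts[1]
    trial_2 = attempts[2] + attempts[3]
    return trial_1, trial_2


def classify(attempts):
    """Return (met on Trial 1, met on Trial 2) for one player."""
    trial_1, trial_2 = trial_totals(attempts)
    return trial_1 >= STANDARD_HITS, trial_2 >= STANDARD_HITS


def is_consistent(attempts):
    """True if the player is classified the same way on both trials."""
    met_1, met_2 = classify(attempts)
    return met_1 == met_2


# Hypothetical player: 26 hits on Trial 1, 24 hits on Trial 2.
player = [12, 14, 11, 13]
print(trial_totals(player), classify(player), is_consistent(player))
```

This hypothetical player meets the standard on Trial 1 but not on Trial 2, so the player would fall in a disagreement cell of the reliability table.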

Trial 1 by Trial 2 table • Each trial classified as Failed to meet standard (< 25) or Met the standard (>= 25)

SPSS output • Chi square = 23.6, p < .001 • Phi = 0.65 • Percent agreement = (37 + 11)/56 = 48/56 = 85.7% • This field test demonstrates acceptable criterion-referenced reliability

Validity study • The standard for passing the field test is 25 hits • We need a criterion measure of TRUE racquetball ability • We used self-reported racquetball experience • Inexperienced = novice player • Experienced = skilled OR completed beginning racquetball class • You want to see if experienced players are more likely to achieve the standard on the field test and inexperienced players are less likely to meet it. This is a criterion-referenced validity study.

Criterion by field test table • Criterion: Experienced, Inexperienced • Field test: < 25 hits, >= 25 hits

SPSS output: Trial 1 vs. Criterion • Chi square = 6.7, p < .01 • Phi = 0.35 • Percent agreement = (33 + 8)/56 = 41/56 = 73%

SPSS output: Trial 2 vs. Criterion • Chi square = 4.8, p < .03 • Phi = 0.29 • Percent agreement = (30 + 9)/56 = 39/56 = 70% • The results of the TWO validity studies suggest that this field test, with the criterion of 25 hits, is a moderately valid measure of racquetball experience
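The three SPSS summaries in the racquetball example can be cross-checked from the reported values alone. The sketch below assumes the reported chi square is the uncorrected Pearson statistic for a 2 × 2 table, so that it should be close to N times phi squared, and it allows for phi having been rounded to two decimals. The dictionary labels and function names are hypothetical.

```python
N = 56  # players in every table of the racquetball example

# Values copied from the three SPSS output slides.
REPORTED = {
    "reliability": {"phi": 0.65, "chi_square": 23.6, "agreements": 37 + 11},
    "validity_trial_1": {"phi": 0.35, "chi_square": 6.7, "agreements": 33 + 8},
    "validity_trial_2": {"phi": 0.29, "chi_square": 4.8, "agreements": 30 + 9},
}


def chi_square_range(phi, n, half_width=0.005):
    """Chi-square values compatible with a phi rounded to two decimals."""
    return n * (phi - half_width) ** 2, n * (phi + half_width) ** 2


def chi_square_is_consistent(phi, chi_square, n):
    """True if the reported chi square fits the reported phi and n."""
    low, high = chi_square_range(phi, n)
    return low <= chi_square <= high


for label, values in REPORTED.items():
    percent = 100 * values["agreements"] / N
    consistent = chi_square_is_consistent(values["phi"], values["chi_square"], N)
    print(label, round(percent, 1), consistent)
```

All three reported chi-square values fall inside the range implied by their phi coefficients, and the percent agreements work out to 85.7, 73.2, and 69.6.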

Table 7-8 Research Designs in Epidemiology (Type: Description) • Experimental • Randomized clinical trial: Randomly assign subjects to treatments or exposures • Community trial: Randomly assign whole communities to treatments or exposures • Observational • Case series: Noting cases at a particular time or place • Cross-sectional: A snapshot of identifiable groups at one point in time • Proportionate mortality or morbidity study: Compare results of a study group to the population • Case-control: Known cases of mortality or morbidity are compared to matched non-cases • Cohort: Longitudinal, generally long-term tracking of populations

Epidemiological statistics • Incidence – the number, proportion, rate, or percentage of new cases of mortality and morbidity. Incidence could be calculated in a randomized clinical trial or a prospective, longitudinal cohort study. • Prevalence – the number, proportion, rate, or percentage of total cases of mortality and morbidity. Prevalence would be calculated in a cross-sectional study.

Estimates of Risk • Absolute risk - the risk (proportion, percentage, rate) of mortality or morbidity in a population that is exposed or not exposed to a risk factor. • Relative risk - the ratio of the risks in the exposed and unexposed populations. This statistic is calculated with incidence measures. • Odds ratio - an estimate of relative risk used in prevalence studies. • Attributable risk - the risk of mortality and morbidity directly related to a risk factor. It can be thought of as the reduction in risk related to removing a risk factor.

Table 7-9 Results of a hypothetical study relating cholesterol and heart attack mortality (rows: Exposure; columns: Outcome) • High Cholesterol: Heart Attack Deaths A = 25, No Heart Attack Deaths B = 31 • No High Cholesterol: Heart Attack Deaths C = 7, No Heart Attack Deaths D = 37
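The risk estimates defined earlier can be worked through with the A, B, C, and D cells of Table 7-9. This is a minimal sketch: attributable risk is computed here as the difference between the exposed and unexposed risks, which is one common reading of the definition and an assumption on my part, and the function name is hypothetical.

```python
def risk_estimates(a, b, c, d):
    """Risk estimates for a 2 x 2 exposure-by-outcome table.

    a = exposed with outcome, b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "absolute_risk_exposed": risk_exposed,
        "absolute_risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
        # Assumption: attributable risk taken as the risk difference.
        "attributable_risk": risk_exposed - risk_unexposed,
    }


# Cell counts from Table 7-9: A = 25, B = 31, C = 7, D = 37.
for name, value in risk_estimates(25, 31, 7, 37).items():
    print(name, round(value, 2))
```

For Table 7-9 this gives absolute risks of about 0.45 (high cholesterol) and 0.16 (no high cholesterol), a relative risk of about 2.81, an odds ratio of about 4.26, and an attributable risk of about 0.29.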
