The State Assessment Database and NAEP. Don McLaughlin, June 24, 2006.
Outline • History of the database • School as a unit of analysis • Different measures, different grades, different tests, different subjects: does it matter? • Multi-state analyses • Demographic adjustments • Enhancing NAEP
History of the database • 1994 NAEP Secondary Analysis Grant • Used state assessment scores to validate the NAEP school substitution procedure. • 1997 Linked state assessment scores to SASS and NAEP • 1998 Reports to State Assessment Directors • Relations between NAEP and state assessment results • 2000 PES funding for the NLSLSASD (National Longitudinal School-Level State Assessment Score Database) • 2001 “Dispelling the Myth Online” (The Education Trust); GreatSchools.Net • 2003 NAEP State Analysis Project funds the NLSLSASD • Extended reports on the relations between NAEP and state assessment results • 2006 Scores for ’03/’04 and ’04/’05 added.
School as a unit of analysis • How different are student population achievement statistics based on school averages from those based directly on student records? • Means • Standard deviations • Relations (Regression coefficients)
School as a unit of analysis [table not reproduced] Source: Statistics are based on 2003 state assessment results in four states, estimated from scores of students in the NAEP sample.
School as a unit of analysis Population averages based on school means are the same as averages based directly on student records.
School as a unit of analysis Achievement standard deviations based on school means are between 35% and 50% as large as when based directly on student records.
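To make the contrast concrete, here is a minimal Python sketch on simulated data (no NLSLSASD fields are used): the average of school means reproduces the student-level mean, while the standard deviation of school means is only a fraction of the student-level standard deviation.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulate 200 schools of 50 students each, with between-school variance
# much smaller than within-school variance (the usual pattern).
schools = np.repeat(np.arange(200), 50)            # school id per student
school_effect = rng.normal(0, 15, 200)[schools]    # between-school spread
scores = 500 + school_effect + rng.normal(0, 40, schools.size)

df = pd.DataFrame({"school": schools, "score": scores})
school_means = df.groupby("school")["score"].mean()

# With equal school sizes, the simple mean of school means equals the
# student-level mean (in general, the enrollment-weighted mean does).
print("student-level mean: ", round(df["score"].mean(), 1))
print("mean of school means:", round(school_means.mean(), 1))

# But the SD of school means is only a fraction of the student-level SD,
# consistent with the 35%-50% range reported above.
print("student-level SD: ", round(df["score"].std(), 1))
print("SD of school means:", round(school_means.std(), 1))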
Standard score difference associated with all vs. none of students eligible for free or reduced-price lunch Based on linear regression coefficients predicting achievement scores from student eligibility for free or reduced-price lunch, coded (free, reduced, not eligible) = (1, .5, 0); a sketch of this regression follows these slides.
Standard score difference associated with all vs. none of students eligible for free or reduced-price lunch Poverty is more strongly associated with whole-school achievement than with individual student achievement.
Standard score difference associated with all vs. none of students eligible for free or reduced-price lunch The association between poverty and achievement appears in both state assessments and NAEP.
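A minimal sketch of the regression these slides describe, on simulated data: eligibility is coded (1, .5, 0), achievement is standardized, and the fitted slope then estimates the standard score difference between an all-eligible and a none-eligible population.

import numpy as np

rng = np.random.default_rng(1)
n = 5000
# Coding from the slide: free = 1, reduced = .5, not eligible = 0.
frl = rng.choice([1.0, 0.5, 0.0], size=n, p=[0.3, 0.1, 0.6])
# Simulated achievement with a negative poverty association.
score = -0.6 * frl + rng.normal(0.0, 1.0, n)
z = (score - score.mean()) / score.std()   # standardize to SD units

# Slope = predicted change in standard score going from frl=0 to frl=1,
# i.e., the "all vs. none eligible" difference.
slope, intercept = np.polyfit(frl, z, 1)
print(f"all-vs-none difference: {slope:.2f} SD units")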
Different tests and measures: does it matter? • The question is not whether percentile ranks and scale scores are the same, or whether the SAT/9 and Terra Nova are the same, but whether the results of analyses would be the same if a different measure were used. • The critical statistic is the correlation between the measures.
Different tests and measures: does it matter? • To find out how strongly different measures correlate, I extracted more than 20,000 of the school-level correlations that can be computed among achievement measures on the NLSLSASD. • 8,076 correlations were between different statistical summary measures for the same test in the same grade, subject, year, and state. • 95 correlations were between different tests in the same grade, subject, year, and state, using the same summary statistic.
Different measures: does it matter? • The average correlation between raw scores, scale scores, percentile ranks, median scores, normal curve equivalents, accountability indexes, and percents meeting a mid-level standard is .95. • If one measure is the percent meeting an extreme standard, the average correlation is .77.
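A minimal sketch of the extraction described above, assuming a school-level file with hypothetical columns (state, grade, subject, year, scale_score, percentile_rank): within each state/grade/subject/year cell, schools' scores on two summary measures of the same test are correlated.

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Simulated school-level file; column names are illustrative only.
df = pd.DataFrame({
    "state": np.repeat(["A", "B"], 100),
    "grade": 4,
    "subject": "math",
    "year": 2003,
    "scale_score": rng.normal(200, 20, 200),
})
# A percentile rank is a monotone transform of the scale score plus
# reporting noise, so the within-cell correlation should be high.
df["percentile_rank"] = df["scale_score"].rank(pct=True) * 100 + rng.normal(0, 3, 200)

# One correlation per state/grade/subject/year cell.
corrs = df.groupby(["state", "grade", "subject", "year"]).apply(
    lambda g: g["scale_score"].corr(g["percentile_rank"])
)
print(corrs.round(2))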
Different tests: does it matter? • The average correlation between scores on two different tests in the same subject, grade, and year is .92. • Based on scores in 5 states • Excludes Texas, where the two tests were English and Spanish versions of the same assessment
Different grades: does it matter? • The average correlation between scores on two tests in the same subject and year, one grade apart, is .76. • Based on 680 correlations in 25 states • If the scores are based on the same student cohort (in adjacent grades in successive years), the average correlation is .04 higher. • Based on 380 matched pairs of correlations
Different subjects: does it matter? • The average correlation between scores on tests of two different subjects in the same grade and year is .86. • Based on 3,142 correlations among reading, mathematics, language arts, and science, in 50 states
Multi-state analyses • Many educational programs and many databases cover a wide range of states. • However, due to differences in testing between states, analyses of the NLSLSASD must be “within-state.” • For example, state standards for “proficient” vary widely.
Multi-state analyses • Multi-state analyses of state assessment scores, which pool the within-state analyses, have substantial power, • but they cannot detect between-state relations between policy and achievement outcomes. • NAEP can detect between-state variation in achievement scores.
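A minimal sketch of a pooled within-state analysis, on simulated data with a hypothetical policy indicator: scores are standardized within each state to remove between-state scale differences, and a single pooled regression then estimates the within-state relation only.

import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "state": np.repeat(["A", "B", "C"], 200),
    "in_program": rng.integers(0, 2, 600),   # hypothetical policy flag
})
# Each state reports on its own scale; simulate a small within-state effect.
scale = df["state"].map({"A": 200.0, "B": 50.0, "C": 1000.0})
df["score"] = scale + 0.2 * df["in_program"] + rng.normal(0, 1, 600)

# Standardize within state, then pool. Between-state differences in scale
# (and in standards) drop out, so only within-state relations remain.
df["z"] = df.groupby("state")["score"].transform(
    lambda s: (s - s.mean()) / s.std()
)
slope, _ = np.polyfit(df["in_program"], df["z"], 1)
print(f"pooled within-state difference: {slope:.2f} SD units")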
Demographic adjustments • When comparing average achievement in schools participating in a program, such as Schoolwide Title I, with achievement in other schools in the same state, a fair comparison requires adjusting for demographic differences, such as poverty. • Estimate expected achievement from comparison schools and compare it to actual achievement in program schools.
Demographic adjustments • Questions to ask: • Is the relation strong? • Is the relation linear? • Is the database of comparison schools sufficient to estimate the H0 relation? • Is the demographic adjustment permissible, in accord with program design?
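A minimal sketch of the adjustment procedure, on simulated school-level data: the poverty-achievement (H0) relation is estimated from comparison schools only, and program schools are then compared with the achievement that relation predicts at their poverty levels.

import numpy as np

rng = np.random.default_rng(4)

# Comparison (non-program) schools: achievement falls with percent FRL-eligible.
frl_comp = rng.uniform(0, 1, 300)
score_comp = 60 - 20 * frl_comp + rng.normal(0, 5, 300)

# Program schools are higher-poverty; simulate a +2 point program effect.
frl_prog = rng.uniform(0.4, 1.0, 100)
score_prog = 60 - 20 * frl_prog + 2 + rng.normal(0, 5, 100)

# Estimate the no-program (H0) relation from comparison schools only,
# then compare program schools to their demographic expectation.
slope, intercept = np.polyfit(frl_comp, score_comp, 1)
expected = intercept + slope * frl_prog
print(f"adjusted program effect: {np.mean(score_prog - expected):.2f} points")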
Enhancing NAEP • In 1994, state assessment results were used to validate NAEP school substitution methods. • Since 1996, comparisons with state assessment results have enriched NAEP reporting. • Since 1998, state assessment scores have been used to increase the efficiency of NAEP school samples. • Since 2001, individual-level state-assessment-to-NAEP linkages have been used to validate NAEP full population estimates.
Validation of NAEP full population estimates • Problem: NAEP reports ignore the existence of a population of students who are selected for, but excluded from, participation in NAEP due to disabilities or limited English proficiency. • When exclusion increases, achievement gains are overestimated; when exclusion decreases, they are underestimated. • Solution: Estimate the achievement of excluded students by matching profiles on the SD/ELL questionnaire to profiles of included students with disabilities and limited English proficiency.
Validation of NAEP full population estimates • Question: How accurate are the imputations of achievement for excluded students? • HumRRO generated Monte Carlo data to test the accuracy of the estimates. • The profile matching method proved very effective at eliminating bias. • I compared the state assessment scores of excluded students with the imputations. • How do the scores of excluded students compare to those of included SD/ELLs and to non-SD/ELLs? A rough sketch of the matching idea follows.
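The actual method draws plausible values from matched profiles; as a rough illustration on simulated data, this sketch matches each excluded student to the included SD/ELL student with the nearest questionnaire profile and borrows that student's score as the imputation.

import numpy as np

rng = np.random.default_rng(5)

# Included SD/ELL students: numeric questionnaire profiles plus observed scores.
profiles_inc = rng.integers(0, 4, size=(500, 6)).astype(float)
scores_inc = 230 - 5 * profiles_inc.sum(axis=1) + rng.normal(0, 10, 500)

# Excluded students: questionnaire profiles only; their NAEP scores are missing.
profiles_exc = rng.integers(0, 4, size=(80, 6)).astype(float)

# Match each excluded student to the nearest included profile (Euclidean
# distance) and borrow that student's score as the imputed value.
dists = np.linalg.norm(profiles_exc[:, None, :] - profiles_inc[None, :, :], axis=2)
imputed = scores_inc[dists.argmin(axis=1)]
print(f"mean imputed score for excluded students: {imputed.mean():.1f}")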
Validation of NAEP full population estimates Deficits, compared to non-SD/ELL achievement, based on 24 linkages. (Measured in standard deviation units)
Summary • Policy analyses that are nationally relevant can be done using school-level state assessment data, even though the tests differ between states. • NAEP adds power to the interpretation of state assessment data. • State assessment data enhance NAEP. • The combined State Assessment and NAEP database is more powerful than either is alone.
Sources • The NLSLSASD is available for free download at www.schooldata.org. • To obtain 2004 and 2005 scores, contact StateData@air.org. • Various reports are available at www.schooldata.org. • For excluded student plausible values and for other reports, contact me at Don.McLaughlin@StatisticsandStrategies.com.