
An Empirical Study to Examine Whether Monetary Incentives Improve 12th Grade Reading Performance


Presentation Transcript


  1. An Empirical Study to Examine Whether Monetary Incentives Improve 12th Grade Reading Performance. Henry Braun (Boston College), Irwin Kirsch (ETS), Kentaro Yamamoto (ETS). Presented at the PDII Conference, Princeton, NJ, October 3, 2008

  2. What is NAEP? Why this study? What were the design criteria? How were they operationalized? Do monetary incentives make a difference? If so, which ones, how much and for whom? How robust are the findings? What are the implications? Overview

  3. Large-scale national/state surveys of academic achievement begun in 1969 Tests students in grades 4, 8, 12 Subjects: Reading, Mathematics, Science, Geography, Civics, History, etc. NAEP (“The Nation’s Report Card”) provides a snapshot of student achievement overall, by state and by various subgroups The National Assessment of Educational Progress

  4. National sample only Lower participation rates than grades 4 and 8 Concerns about levels of motivation/effort Undergoing expansion to state level 12th Grade NAEP

  5. Increasing reliance on low-stakes large-scale assessment surveys for education policy Issues relate to both national and international LSASs In the U.S., NAEP is the only source of nationally comparable data on student achievement that can be used for state-level comparisons Under NCLB, 4th and 8th grade NAEP play an expanded role in monitoring state-level results Strong interest in expanding role of 12th grade NAEP National Commission on 12th grade NAEP (2004) recommendations: redesign to report on student readiness; expand to state level; increase participation and motivation Study Rationale

  6. Goal: To estimate the effects of different monetary incentives on student performance on 12th grade NAEP Internal validity External validity Adequate power Design Criteria

  7. Experiments Focus on mathematics O’Neil et al. (NAEP items) Baumert et al. (PISA items) Psychology Intrinsic vs. extrinsic motivation Behavioral Economics Monetary incentives can work Participants must be cognizant of incentives Literature

  8. Focus on NAEP reading Randomized trial for internal validity Prepared detailed implementation protocol Employed experienced administrative staff External validity (i.e., link directly to NAEP) Used released NAEP materials Followed NAEP administrative and data processing procedures Carried out NAEP-like psychometric and statistical analyses Heterogeneous school sample Large study for sufficient power to detect effects Study Features

  9. Control: Standard NAEP instructions Incentive 1: Standard NAEP instructions + Promise of a $20 gift card at the conclusion of the session Incentive 2: Standard NAEP instructions + $5 gift card + $15 for a correct answer to each of two questions chosen at random at the conclusion of the session Study Design: Incentives
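
As a small illustration of the Incentive 2 payout rule described in this slide, here is a sketch in Python; the function name and response coding are hypothetical, and in the study the award was a gift card rather than cash:

```python
import random

def incentive2_payout(responses, seed=0):
    """Illustrative payout under Incentive 2: $5 up front, plus $15 for each of
    two randomly chosen questions answered correctly.
    `responses` maps item ids to True/False (correct/incorrect)."""
    rng = random.Random(seed)
    base = 5
    chosen = rng.sample(list(responses), 2)            # two questions drawn at random
    bonus = 15 * sum(responses[item] for item in chosen)
    return base + bonus                                # ranges from $5 to $35

# Example: a student with mostly correct answers
print(incentive2_payout({"q1": True, "q2": False, "q3": True, "q4": True}))
```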

  10. All students in both incentive conditions were asked to select Target or Barnes & Noble for the gift card and to indicate their preference on a sign-up sheet Students in all three conditions actually received $35 gift cards at the end of the sessions Students were informally debriefed before leaving Study Design: Incentives (2)

  11. Mapping to the NAEP Reading Framework (3 contexts): * Reading for literary experience (35%) * Reading for information (45%) * Reading to perform a task (20%) Assembling test booklets: 2 reading blocks + background questionnaire Each reading block consists of a passage and a set of associated questions Each block is expected to take 25 minutes Blocks vary with respect to the total number of questions and the proportions of multiple choice, short answer and extended response questions Study Design: Instrumentation
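
As an illustration of the booklet structure described above, a minimal sketch assuming each booklet simply pairs two reading blocks and appends the background questionnaire; the block labels and pairing scheme are hypothetical, not the study's actual booklet map:

```python
from itertools import combinations

# Illustrative block labels; the actual released NAEP blocks differ
reading_blocks = ["Literary-1", "Information-1", "Information-2", "Task-1"]

# Each booklet: two reading blocks (25 minutes each) plus the background questionnaire
booklets = [
    {"blocks": pair, "questionnaire": "background"}
    for pair in combinations(reading_blocks, 2)
]

for i, b in enumerate(booklets, start=1):
    print(f"Booklet {i}: {b['blocks'][0]} + {b['blocks'][1]} + background questionnaire")
```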

  12. Booklet Design

  13. Items drawn from operational questionnaire Two sets of items Set I Demographics and parental education Home environment School absences Set II Reading practices Future educational expectations Level of effort Survey Design: Background Questionnaire

  14. Power analysis indicated need for a sample of 60 schools with 60 students per school (20 per condition in each school) Worked with NAEP state coordinators and Westat to obtain a (final) convenience sample of 59 schools Student recruitment was carried out using standard NAEP methods (but no special incentives) Number of participating students was lower than target Study Design: Sample Selection
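
The 60-school, 20-per-condition target came from a power analysis; a rough sketch of that kind of calculation, assuming a simple two-group comparison inflated by a design effect for clustering within schools (the effect size and intraclass correlation below are illustrative assumptions, not the study's planning values):

```python
from statsmodels.stats.power import TTestIndPower

# Assumed planning inputs (illustrative only)
effect_size = 0.20                     # standardized mean difference to detect
alpha, power = 0.05, 0.80
icc, cluster_size = 0.10, 20           # students per condition per school
deff = 1 + (cluster_size - 1) * icc    # design effect for school clustering

n_per_group_srs = TTestIndPower().solve_power(effect_size=effect_size,
                                              alpha=alpha, power=power)
n_per_group = n_per_group_srs * deff   # inflate simple-random-sample n for clustering
n_schools = n_per_group / cluster_size
print(f"~{n_per_group:.0f} students per condition, ~{n_schools:.0f} schools")
```

With these assumed inputs the calculation lands near the 60-school target quoted in the slide.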

  15. Student Response Rates by State

  16. Random samples of 12th graders invited to participate In each school students randomly allocated to the three conditions Fall (not spring) administration Sessions in a school were simultaneous or consecutive to eliminate possibility of contamination Limited accommodations No make-up sessions Administration
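
A minimal sketch of the within-school allocation described above, assuming sampled students are simply shuffled and dealt round-robin into the three conditions (function and ID names are hypothetical):

```python
import random

def allocate_conditions(student_ids, seed=0):
    """Randomly split one school's sampled students across the three
    conditions (Control, Incentive 1, Incentive 2) in roughly equal groups."""
    rng = random.Random(seed)
    shuffled = rng.sample(student_ids, len(student_ids))
    conditions = ["Control", "Incentive 1", "Incentive 2"]
    return {sid: conditions[i % 3] for i, sid in enumerate(shuffled)}

# Example: 60 sampled students in one school -> about 20 per condition
assignment = allocate_conditions([f"S{i:03d}" for i in range(60)])
print(sum(c == "Control" for c in assignment.values()), "assigned to Control")
```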

  17. Student Response Rates by Condition

  18. Scoring was conducted by NCS/Pearson Preliminary item analysis held no surprises: Differences by condition in Proportions correct; Percentage of omitted items (highest for extended CR items); Percentage of off-task responses (generally very small, << 1%); Percentage of items not reached (particularly high for the last CR item) Data Preparation: Scoring and Item Analysis
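
A minimal sketch of the item-level statistics listed above (proportion correct, percent omitted, percent not reached, by condition), using a hypothetical coded response table; the codes and column names are assumptions:

```python
import pandas as pd

# Hypothetical coded responses: 1 = correct, 0 = incorrect,
# "OM" = omitted, "NR" = not reached
responses = pd.DataFrame({
    "condition": ["Control", "Control", "Incentive 1", "Incentive 2"],
    "item_01": [1, 0, 1, 1],
    "item_02": [0, "OM", 1, "NR"],
})

def item_stats(df, item):
    """Per-condition proportion correct (among numerically scored responses),
    percent omitted, and percent not reached for one item."""
    scored = pd.to_numeric(df[item], errors="coerce")
    return pd.DataFrame({
        "p_correct": scored.groupby(df["condition"]).mean(),
        "pct_omitted": (df[item] == "OM").groupby(df["condition"]).mean() * 100,
        "pct_not_reached": (df[item] == "NR").groupby(df["condition"]).mean() * 100,
    })

print(item_stats(responses, "item_02"))
```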

  19. Average Item Proportions Correct by Item Type and Incentive Condition

  20. Scaling by subscale Fit item characteristic curves to data Compare to archival results Estimate three-group model Reasonable fit Conditioning Combine cognitive data with ancillary data from questionnaires Obtain posterior score distribution for each student Generate “plausible values” Linking Linear transformation to the NAEP scale Construct composite reporting scale Data Preparation: Scaling, Conditioning and Linking
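
The linking step above places study results on the NAEP reporting metric through a linear transformation of the latent scale; a minimal sketch with hypothetical linking constants (in practice the constants come from aligning the study's theta metric with the archival NAEP metric):

```python
import numpy as np

# Hypothetical linking constants (slope, intercept), not the study's actual values
A, B = 36.0, 288.0

def link_to_naep(theta):
    """Apply the linear transformation theta -> A*theta + B that places
    latent-proficiency draws (plausible values) on the NAEP reporting scale."""
    return A * np.asarray(theta, dtype=float) + B

# Example: five plausible values for one student on the latent (theta) scale
print(link_to_naep([-0.3, 0.1, -0.1, 0.4, 0.0]))   # -> NAEP-scale values
```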

  21. Effect Sizes by Subscale, Item Parameters Based on Study Data Only; Effect Sizes by Subscale, Item Parameters Based on Archival Data

  22. Effects of incentives range from 3 to 5 points on the NAEP scale overall Male-female differences relatively stable White-Black and White-Hispanic differences grow somewhat larger under incentives Effects of incentives generally positive for subgroups Estimates reasonably robust Selected Results
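
The reported effects are contrasts in mean scale scores between incentive and control conditions; a minimal sketch of how such a contrast can be computed from plausible values, with the statistic evaluated once per plausible value and then averaged, as is standard with plausible-value data (the data and column names below are hypothetical):

```python
import pandas as pd

# Hypothetical student-level data: condition plus five plausible values (pv1..pv5)
df = pd.DataFrame({
    "condition": ["Control", "Control", "Incentive 2", "Incentive 2"],
    "pv1": [280, 295, 287, 301], "pv2": [278, 297, 290, 299],
    "pv3": [283, 292, 286, 303], "pv4": [279, 296, 289, 300],
    "pv5": [281, 294, 288, 302],
})
pv_cols = [c for c in df.columns if c.startswith("pv")]

# Compute the group-mean difference separately for each plausible value,
# then average the five estimates
effects = [
    df.loc[df.condition == "Incentive 2", pv].mean()
    - df.loc[df.condition == "Control", pv].mean()
    for pv in pv_cols
]
print(f"Estimated Incentive 2 effect: {sum(effects) / len(effects):.1f} NAEP points")
```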

  23. Comparison of Effects by Incentive Condition

  24. Study Statistics by Incentive and Gender

  25. Study Statistics by Incentive and Race/Ethnicity

  26. Study Statistics by Condition, Gender, and Race/Ethnicity

  27. Study Statistics by Condition, Gender, and Mother’s Education Level

  28. Study Statistics by Condition, Gender, and Mother’s Education Level

  29. Study Statistics by Condition, Gender, and Number of Days Absent From School Last Month

  30. Study Statistics by Condition, Gender, and Number of Days Absent From School Last Month

  31. Study Statistics by Condition, Gender, and Frequency of Reading for Fun on Own Time

  32. Study Statistics by Condition, Gender, and Frequency of Reading for Fun on Own Time

  33. Although treatment groups were determined randomly, there were differences in various characteristics that might have contributed to the estimated treatment effects. We ran an ANOVA adjusting NAEP scores for a number of demographic and home environment characteristics, as well as students’ reading habits. The ANOVAs were run separately for males and females and yield adjusted least squares means that can be compared to the raw means. Sensitivity Analysis (1)
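
A minimal sketch of this kind of covariate adjustment, fit separately by gender; the data below are synthetic and the covariate list is illustrative, not the study's actual model:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the analysis file (one row per student); the real
# covariates came from the background questionnaire
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "naep_score": rng.normal(288, 36, n),
    "condition": rng.choice(["Control", "Incentive 1", "Incentive 2"], n),
    "gender": rng.choice(["Male", "Female"], n),
    "mother_ed": rng.choice(["HS or less", "Some college", "College grad"], n),
    "days_absent": rng.choice(["0", "1-2", "3+"], n),
})

# ANCOVA-style adjustment of scores for covariates, fit separately by gender;
# the condition coefficients give covariate-adjusted contrasts vs. Control
for gender, grp in df.groupby("gender"):
    fit = smf.ols("naep_score ~ C(condition) + C(mother_ed) + C(days_absent)",
                  data=grp).fit()
    print(gender, fit.params.filter(like="condition").round(2).to_dict())
```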

  34. Impact of “leverage” groups was examined by identifying those subgroups with the largest positive effect (Incentive 2) and a large enough sample size to rule out sampling fluctuations. (i) Male, White, Absent more than 3 days in the last month [Effect ~3x larger than overall effect for males] [95/802] Removing this group would reduce effect of Incentive 2 by ~25%. (ii) Female, Hispanic, Not ELL [Effect ~3x larger than overall effect for females] [82/919] Removing this group would reduce effect of Incentive 2 by ~13%. Sensitivity Analysis (2)
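
A minimal sketch of the leverage-group check described above: recompute the Incentive 2 vs. Control contrast with and without a candidate subgroup (the DataFrame, column, and flag names are hypothetical):

```python
import pandas as pd

def incentive2_effect(df):
    """Mean score difference between the Incentive 2 and Control conditions."""
    return (df.loc[df.condition == "Incentive 2", "score"].mean()
            - df.loc[df.condition == "Control", "score"].mean())

def leverage_check(df, group_mask):
    """Compare the Incentive 2 effect with and without one subgroup,
    returning the full-sample effect, the reduced-sample effect, and the
    percentage of the effect attributable to the removed group."""
    full = incentive2_effect(df)
    reduced = incentive2_effect(df[~group_mask])
    return full, reduced, (full - reduced) / full * 100

# Usage (assuming a student-level DataFrame `df` with boolean subgroup flags):
# full, reduced, pct = leverage_check(
#     df, (df.gender == "Male") & (df.race == "White") & df.absent_gt3_days)
```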

  35. Summary • Data clearly indicate that the design criteria for this study were met • Monetary incentives improve NAEP reading performance • Type of incentive makes a difference: by reporting subgroup; by quantile

  36. Caveats • Fall rather than Spring administration • Represented two of the three NAEP subscales • Lower student participation rate than in operational NAEP • Subgroup sample size • Relationship of the sample to the NAEP population

  37. Implications • 12th grade NAEP results should be interpreted cautiously • Expansion of 12th grade NAEP ought to wait on policy action on incentives • Measuring reading as NAEP does may be problematic in the current context • In modifying NAEP cognitive instruments (e.g. for readiness), the administrative setting should be taken into account
