
26th Annual MIS Conference February 13, 2013 Washington, DC






Presentation Transcript


  1. Disclosure Avoidance and the U.S. Department of Education’s School-Level Assessment Data Release 26th Annual MIS Conference February 13, 2013 Washington, DC Michael Hawes Statistical Privacy Advisor U.S. Department of Education

  2. Overview • Details of the Data Release • FERPA and the Need for Disclosure Avoidance • Disclosure Avoidance Techniques • Challenges • Strategy and Objectives • Process and Decisions • Selected Methodology • Evaluation • Questions

  3. Details of the Data Release • School-level Math and Reading Proficiency (# of Valid Tests, % Proficient) • By Grade • By Sub-group • Race/Ethnicity • Gender • Children with Disabilities • Economically Disadvantaged • Limited English Proficiency • Homeless • Migrant • School Years 2008-2009, 2009-2010, 2010-2011 • Available via https://explore.data.gov

  4. Data Release Public Data – Contains no PII

  5. FERPA and the Need for Disclosure Avoidance • FERPA protects personally identifiable information (PII) from education records from unauthorized disclosure • Requirement for written consent before sharing PII • Exceptions from the consent requirement for: • “Studies” • “Audits and Evaluations” • Health and Safety emergencies • And other purposes as specified in §99.31

  6. Personally Identifiable Information (PII) • Name • Name of parents or other family members • Address • Personal identifier (e.g., SSN, Student ID#) • Other indirect identifiers (e.g., date or place of birth) • “Other information that, alone or in combination, is linked or linkable to a specific student that would allow a reasonable person in the school community, who does not have personal knowledge of the relevant circumstances, to identify the student with reasonable certainty.” (34 CFR § 99.3)

  7. Reporting vs. Privacy • Department of Education regulations require reporting on a number of issues, often broken down across numerous sub-groups, including: • Gender • Race/Ethnicity • Disability Status • Limited English Proficiency • Migrant Status • Economically Disadvantaged Students • BUT, slicing the data this many ways increases the risks of disclosure, and the regulations also require states to “implement appropriate strategies to protect the privacy of individual students…” (§200.7)

  8. Disclosure Avoidance Techniques Publishing even aggregate data may include PII – You must use disclosure avoidance!!! Three Basic Flavors: • Suppression • Blurring • Perturbation

  9. Suppression

  10. Blurring

  11. Perturbation
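As a rough illustration of the three flavors named on the preceding slides, the sketch below applies each to a toy cell value. The thresholds, range width, and noise scale are illustrative assumptions, not the Department's actual parameters.

```python
import random

def suppress(count, threshold=10):
    """Suppression: withhold any cell below a minimum size.

    Returns None for cells that would be reported as "*".
    Threshold of 10 is a common choice among states, per slide 18.
    """
    return count if count >= threshold else None

def blur(pct, width=5):
    """Blurring: report a range instead of an exact percentage.

    Width of 5 points is an illustrative assumption.
    """
    low = int(pct // width) * width
    return (low, min(low + width, 100))

def perturb(count, scale=2):
    """Perturbation: add small random noise to the true count.

    Scale of +/-2 is an illustrative assumption.
    """
    return max(0, count + random.randint(-scale, scale))
```

Each technique trades a different kind of utility for privacy: suppression removes cells entirely, blurring keeps every cell but loses precision, and perturbation keeps precision but introduces error.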

  12. Challenges • “Reasonable Person” Standard • Small Subgroups Challenge • State-published Data

  13. Strategy and Objectives In 2011, the Department created a Data Release Working Group to develop a coordinated Departmental strategy for protecting privacy in public data releases.

  14. Data Release Working Group

  15. Strategy and Objectives DRWG Objectives: • Cross-departmental coordination • Protect privacy while ensuring maximum data utility (n.b., data utility ≠ data quality)

  16. Process and Decisions • Two-fold Data Release • Public Data Release (privacy protected) • Restricted-use NCES Licensing (raw data) • Qualified researchers • Security requirements • NCES’ (ESRA) legal protections • FERPA-permitted uses

  17. Process and Decisions • Examination of State Methods • PTAC review of each state’s disclosure avoidance for assessment data • For more information, please come see today’s 4:15 session: Session V-F, “Privacy Technical Assistance Center’s (PTAC) Analysis of State Public Reports,” 4:15-5:15, Massachusetts Room

  18. Process and Decisions • Most states are using suppression to protect privacy • Implementation details vary widely (e.g., suppression at the cell or subgroup level, with or without complementary suppression) • Suppression threshold (n-size) varies from 5 to 30 (most use n=10)

  19. Process and Decisions Privacy Threshold • Each reported cell is really two cells (% proficient, % not proficient) • Rule of Three • Goal: For any reported data, avoid revealing with certainty the proficiency of students when there are fewer than three individuals in either the proficient or the non-proficient category • Implications for subgroups with fewer than six valid tests, and (for larger groups) for extreme values where there are fewer than three individuals in either outcome category.
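The Rule of Three on this slide can be sketched as a simple check. The function name and the rounding of the percentage back to a count are my own assumptions; the logic is just that both outcome cells must hold at least three students.

```python
def violates_rule_of_three(n_valid, pct_proficient):
    """Return True if either outcome cell (proficient or not proficient)
    would contain fewer than 3 students.

    Hypothetical helper illustrating the slide's privacy threshold;
    recovers approximate counts from the reported percentage.
    """
    n_proficient = round(n_valid * pct_proficient / 100)
    n_not_proficient = n_valid - n_proficient
    return n_proficient < 3 or n_not_proficient < 3
```

This makes the two implications on the slide concrete: any subgroup with fewer than six valid tests fails automatically (the two cells cannot both reach three), and even large subgroups fail at extreme percentages near 0% or 100%.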

  20. Process and Decisions Initial Attempt: • Primary suppression (n<6) • Top/Bottom coding of extreme values • Complementary suppression

  21. Process and Decisions Result: [stock image]

  22. Process and Decisions Fictitious Data – Contains no PII

  23. Process and Decisions Fictitious Data – Contains no PII

  24. Process and Decisions Problems with Initial Approach: • “Swiss-cheese” effect • Implementation Challenges • Vulnerability We needed something SIMPLER, that made more data available for small groups

  25. Selected Methodology • Primary Suppression of small cells (n=1-5) • “Blurring” of remaining data • Top/Bottom coding of extreme values • Ranges for subgroups with n=16-300 • Whole number percentages for n>300 • Special rule for All Students/All Grades

  26. Selected Methodology Magnitude of reported ranges determined by the size of the group:
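The selected methodology's reporting rules (slides 25-26) can be sketched as a single dispatch on subgroup size. The actual range widths came from the magnitude table on slide 26, which is not reproduced in this transcript; the widths below are illustrative assumptions only, and top/bottom coding of extreme values is omitted for brevity.

```python
def report_proficiency(n_valid, pct_proficient):
    """Return the published value for one subgroup cell.

    Sketch of the selected methodology: primary suppression for n=1-5,
    ranges for small-to-medium groups, whole-number percentages for
    n>300. Range widths (20/10/5 points) are assumptions, not the
    Department's actual table.
    """
    if n_valid <= 5:
        return "*"  # primary suppression of small cells
    if n_valid > 300:
        return f"{round(pct_proficient)}%"  # whole-number percentage
    # Remaining groups: report a range whose width shrinks as n grows
    width = 20 if n_valid <= 15 else 10 if n_valid <= 60 else 5
    low = int(pct_proficient // width) * width
    return f"{low}-{min(low + width, 100)}%"
```

Compared with the initial suppression-heavy approach, every subgroup with at least six valid tests now yields some data, which addresses the "Swiss-cheese" problem noted on slide 24.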

  27. Process and Decisions Fictitious Data – Contains no PII

  28. Process and Decisions Fictitious Data – Contains no PII

  29. Evaluation • More (albeit less precise) data about small subgroups • Cross-comparison of state reports with ED methodology does not increase disclosure risk • Simple to program • Your feedback?

  30. Questions Why are you reporting the number of valid tests for small subgroups? Isn’t that a disclosure?

  31. Questions Why are you reporting the number of valid tests for small subgroups? Isn’t that a disclosure? • Number of Valid Tests ≠ Number of Students • Movement in/out of the district • Unreadable or damaged tests • Students absent during testing • The number of valid tests is often published on state websites – US ED did not want to rely on privacy protection through obscurity

  32. Questions The data that ED has reported doesn’t match the data on our state/school website. Why?

  33. Questions The data that ED has reported doesn’t match the data on our state/school website. Why? • States may update their websites on different schedules than they use to report to ED. • States may only count students who were present for the full academic year (ED includes all students with valid tests, regardless of full-academic-year status).

  34. Other Questions [stock photo]

  35. Contact Information • Michael Hawes • Statistical Privacy Advisor • U.S. Department of Education • Michael.Hawes@ed.gov • (202) 453-7017
