Intruder Testing: Demonstrating practical evidence of disclosure protection in 2011 UK Census

This presentation describes how intruder testing was used to provide practical evidence of disclosure protection in the 2011 UK Census, whose outputs were protected by targeted record swapping. It covers the considerations in setting up the test, intruder feedback, validation of claims, results, and conclusions.


Presentation Transcript


  1. Intruder Testing: Demonstrating practical evidence of disclosure protection in 2011 UK Census • Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality, Ottawa, 28-30 October 2013 • Keith Spicer, Caroline Tudor and George Cornish

  2. Forthcoming Attractions • 2011 UK Census • SDC method: targeted record swapping • Sufficient uncertainty • Intruder testing: • Considerations • The intruders • Feedback • Validating Claims • Results • Conclusions

  3. 2011 UK Census • Context of user criticism in 2001 • Small cell adjustment • Poor utility of some outputs • Needed additivity and consistency • Evaluation of possible SDC methods • Record swapping selected • Swap households (individuals in communals) • Targeted to ‘risky’ records • Swap rate sufficiently low to maintain utility
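The idea of targeted record swapping on slide 3 can be illustrated with a toy sketch. This is not the ONS implementation: the `Household` fields, the `risky` flag, the matching on household size, and the `targeted_swap` function are all hypothetical names and simplifications chosen for illustration. The sketch swaps the area codes of pairs of households, considering households flagged as risky first and keeping the number of swapped households at or below a chosen rate.

```python
import random
from dataclasses import dataclass


@dataclass
class Household:
    hid: int      # household identifier (hypothetical)
    area: str     # geography code; this is the value that gets swapped
    size: int     # number of people; used here as the only matching key
    risky: bool   # hypothetical flag marking a high disclosure-risk record


def targeted_swap(households, swap_rate, seed=0):
    """Swap area codes between pairs of households, risky ones first.

    swap_rate is the approximate share of households to be swapped, so
    the number of pairs is at most len(households) * swap_rate / 2.
    Returns the list of (hid, partner_hid) pairs that were swapped.
    """
    rng = random.Random(seed)
    max_pairs = int(len(households) * swap_rate / 2)
    # Targeting: risky households come first; order is random within each group.
    order = sorted(households, key=lambda h: (not h.risky, rng.random()))
    used, pairs = set(), []
    for h in order:
        if len(pairs) >= max_pairs:
            break
        if h.hid in used:
            continue
        # Partner: an unswapped household of the same size in another area.
        candidates = [
            p for p in households
            if p.hid not in used and p.hid != h.hid
            and p.size == h.size and p.area != h.area
        ]
        if not candidates:
            continue
        partner = rng.choice(candidates)
        h.area, partner.area = partner.area, h.area
        used.update({h.hid, partner.hid})
        pairs.append((h.hid, partner.hid))
    return pairs
```

Because each swap exchanges two households of equal size, household and person counts per area are unchanged in this toy version, which is one way to see why swapping avoids the additivity and consistency problems attributed on slide 3 to small cell adjustment.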

  4. Level of Protection: Sufficient Uncertainty • SRSA 2007 – Personal information must not be disclosed • Impossible to get zero risk • There will be 1s, 2s and attribute disclosures in tables • Some will be real • Some will be fake • Census White Paper: “no statistics will be produced that allow the identification of an individual......with a high degree of confidence” • Needs to be “sufficient uncertainty”

  5. ICO Code of Practice • ICO = Information Commissioner’s Office, who oversee interpretation of Data Protection and Freedom of Information Acts • Issued Code of Practice in light of abortions FOI case • Encouraged empirical evidence of disclosure risk • Intruder testing of reconviction data by Ministry of Justice provided a steer

  6. Intruder Testing - Considerations • Recruitment of intruders • Security of Census database • Creation of pre-publication tables • Tables for own Output area (c. 300 population) • Tables for own MSOA (c. 7,500 population) • Maps for local areas • Unrestricted internet access (2nd laptop) • Briefing material • Validating claims • Ethical considerations

  7. Intruder Testing – The intruders • 18 intruders • ONS staff or contractors with security clearance • Few with SDC experience • All with excellent IT skills and adept with data • Range of grades up to Divisional Director • Range of local areas in England & Wales • Availability for at least ½ day

  8. Intruder Testing – Other issues • Intruders' claims • Only general feedback given • No specific claim confirmed or denied • Checking claims • Potentially of people the checker knows (e.g. a self-identification made by a work colleague) • Consent of intruders • Websites • Paying for access • Retaining search details (intruder's identity) • Laptops wiped after each intruder

  9. Intruder Feedback • For each claim: • Name of person • Address • Table and cell reference • Type of claim: identification or attribute (and which attribute) • Reasoning, variables, tables, websites used • Level of confidence in claim

  10. Intruder Feedback • Intruders took between 1.5 and 6 hours • 16 of 18 intruders made at least one claim • >50 claims made in total • Tables looked sensible for their areas • Swap rate looked low • Generally intruders felt utility preserved

  11. Validating Claims • Cell reference and table used to obtain form id • Form id → Census image on the image database (very restricted access) • Correct claim if match name and address • Check of logic used by intruder
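The claim fields on slide 9 and the matching rule on slide 11 can be sketched as follows. This is an illustration under assumptions, not the procedure used by the authors: the `Claim` class, the `normalise` helper, and the `lookup_form` callable (a stand-in for the restricted lookup from table and cell reference to the census form) are hypothetical. The manual check of the intruder's logic is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    name: str          # name of the person claimed to be identified
    address: str
    table: str         # table the claim is based on
    cell: str          # cell reference within that table
    claim_type: str    # "identification" or "attribute"
    confidence: int    # intruder's stated confidence, in percent


def normalise(text):
    """Lower-case, drop commas and collapse whitespace before comparing."""
    return " ".join(text.lower().replace(",", " ").split())


def is_correct(claim, lookup_form):
    """Return True if the claimed name and address both match the form.

    lookup_form(table, cell) is a hypothetical callable returning a dict
    with "name" and "address" keys, or None if no form is found.
    """
    record = lookup_form(claim.table, claim.cell)
    if record is None:
        return False
    return (normalise(claim.name) == normalise(record["name"])
            and normalise(claim.address) == normalise(record["address"]))
```

Exact matching after light normalisation is a deliberate simplification; a real check would need a policy for spelling variants and partial addresses.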

  12. Results • [Figure: level of confidence in intruder's claim]

  13. Results • 48% of claims correct overall • Best success rate for claims made with 60-79% confidence (67% correct) • Claims about self / family 61% correct (vs 36% for others) • Very few attribute disclosure claims (<10%) • Tables used most: • Age x sex x industry • Age x sex x marital status • Age x sex x economic activity • Sex x industry x economic activity • Age x sex x health x disability
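Success rates by confidence band, such as the 60-79% band quoted on slide 13, could be tabulated from validated claims along these lines. The function name `success_rates`, the input format, and the 20-point band width are assumptions for illustration; the slides only mention the 60-79% band, and no claim-level data from the test is reproduced here.

```python
from collections import defaultdict


def success_rates(claims, band_width=20):
    """Group validated claims into confidence bands.

    claims: iterable of (confidence_percent, is_correct) pairs.
    Returns {band_label: (number_of_claims, share_correct)}.
    """
    totals = defaultdict(lambda: [0, 0])  # band label -> [claims, correct]
    for confidence, correct in claims:
        # Lower edge of the band; a confidence of 100 falls in the top band.
        low = min(confidence // band_width * band_width, 100 - band_width)
        high = 100 if low + band_width >= 100 else low + band_width - 1
        label = f"{low}-{high}%"
        totals[label][0] += 1
        totals[label][1] += int(bool(correct))
    return {label: (n, ok / n) for label, (n, ok) in totals.items()}
```

With only around 50 claims spread over several bands, each band's rate rests on a handful of claims, so differences between bands should be read with caution.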

  14. How could so many claims be wrong? • Non-response • Imputation (both person and item) • Capture error (e.g. write-in date of birth) • Processing (esp. coding from free text) • Respondent error • Record swapping • Intruder error

  15. Conclusions for Census • Fewer than half claims correct • Fewer than half “high confidence” claims correct • How much uncertainty is “sufficient”? • ICO have endorsed this work and said “risk is manageable” • Special attention to the most used tables and their “close relatives” • National Statistician content • Communication strategy important

  16. Conclusions for Intruder Testing • Useful for assessing risk empirically • Considerable resource needed • Needs a lot of support • Wouldn't suggest doing it for every output • Need assessment of what "success" looks like • Use in conjunction with theoretical work

  17. Any Questions?
