
Tragedy of the Deidentified Data Commons An Appeal for Transparency and Access



Presentation Transcript


  1. Tragedy of the Deidentified Data Commons: An Appeal for Transparency and Access. Jane Bambauer, James E. Rogers College of Law, University of Arizona

  2. The Data Commons • Information collected by the government: tax information, epidemiological data, census surveys, educational records, home mortgage data • Information collected by private companies • Anonymized and released*

  3. The Anonymization Problem • Research subjects can be reidentified in anonymized databases “with astonishing ease.” Examples: AOL; re-identification of Gov. Weld; Netflix re-identification • Every privacy law must be rewritten to eliminate dependence on anonymization and to restrict access to all data (even deidentified data) without consent. Paul Ohm, Broken Promises of Privacy, 57 UCLA L. REV. 1701

  4. Save the Data Commons. The Data Commons has been used to: • Detect housing and employment discrimination • Debunk the myth of the “welfare queen” • Inform the healthcare and mortgage lending policy debates • Correct longstanding misconceptions about crime and law enforcement • Lots more… Source: Jane Yakowitz, Tragedy of the Data Commons

  5. Hazards of Covert Noise-Adding

  6. Hazards of Covert Noise-Adding

  7. Exaggerated Risks of Reidentification: The Gov. Weld Example

  8. Exaggerated Risks of Reidentification: The Gov. Weld Example

  9. Exaggerated Risks of Reidentification: The Gov. Weld Example

  10. Gov. Weld Reidentification • Latanya Sweeney collected Gov. Weld’s voter registration information and publicly available hospital data • Only one hospital patient matched Gov. Weld’s DOB, zip, and gender • Conclusion from analysis of US Census data: 87% can be uniquely identified from DOB, zip, and gender • Golle recalculations: 63% are unique using DOB, zip, and gender
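
The 87% and 63% figures on slide 10 are shares of people whose combination of DOB, zip, and gender is shared by no one else. A minimal sketch of that measurement follows. The function name, the field names, and the toy records are all hypothetical and invented for illustration; this is not Sweeney's or Golle's data or method.

```python
from collections import Counter


def fraction_unique(records, keys=("dob", "zip", "gender")):
    """Share of records whose quasi-identifier combination appears exactly once."""
    if not records:
        return 0.0
    # Reduce each record to its tuple of quasi-identifier values.
    combos = [tuple(record[k] for k in keys) for record in records]
    counts = Counter(combos)
    # A record is "unique" when no other record shares its combination.
    unique = sum(1 for combo in combos if counts[combo] == 1)
    return unique / len(records)


# Hypothetical toy records (assumption: not drawn from any real dataset).
toy = [
    {"dob": "1950-01-01", "zip": "00001", "gender": "M"},
    {"dob": "1950-01-01", "zip": "00001", "gender": "M"},
    {"dob": "1962-03-15", "zip": "00002", "gender": "F"},
    {"dob": "1971-11-30", "zip": "00002", "gender": "F"},
]

print(fraction_unique(toy))  # 2 of the 4 toy records are unique -> 0.5
```

Being unique in a dataset is only a precondition for reidentification: an intruder still needs an outside source, such as a voter roll, that carries the same fields together with a name.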

  11. Daniel Barth-Jones, “Reidentification” of Governor William Weld

  12. Sweeney et al. 2013 PGP Study • 579 Personal Genome Project participants provided their DOB, zip code, and gender • Using voter registration records and other commercial data sources, Sweeney et al. were able to reidentify 28% (accuracy unclear)

  13. 2009 ONC Study • Out of 15,000 HIPAA-compliant records, 2 could be reidentified: a .013% chance of reidentification • For comparison’s sake, the chance of dying from an auto accident this year: .017%
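
The percentage on slide 13 follows directly from the two counts reported there. A short check, assuming only those counts and the .017% comparison figure as given on the slide (the function name is hypothetical):

```python
def rate_percent(events, total):
    """Express events out of total as a percentage."""
    return 100.0 * events / total


# Counts as reported on the slide: 2 reidentified records out of 15,000.
reid_rate = rate_percent(2, 15_000)
print(f"{reid_rate:.3f}%")  # 0.013%

# Comparison figure taken from the slide as-is (not independently verified).
auto_death_rate = 0.017
print(reid_rate < auto_death_rate)  # True
```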

  14. Total Number of Known Malicious Reidentifications: 0 or 1*

  15. If I Were a Malicious Intruder… • 3,101 reported data breaches in the U.S. (about half a billion records) • 700 reported breaches of health records

  16. If I Were a Malicious Intruder… • Sift through Garbage • Make Inferences from Facebook Profiles • Swab a Coffee Cup

  17. What We Have to Lose • Fewer Opportunities for Replication • Fewer Voluntary Research Databases • Fewer Involuntary Public Databases • Increased Regulatory Precautions • More Status Quo Bias

  18. Vioxx “What If” Study (from Richard Platt’s FDA testimony in 2007) • Vioxx approved May 1999 • Removed from market September 2004 (64 months) • Data on 7 million patients: 34 months • Data on 100 million: 3 months • 88,000-139,000 avoidable heart attacks • 27,000-55,000 avoidable deaths
