De-identification Risk and Resolution

Bradley Malin, Ph.D., Assistant Professor, Vanderbilt University. De-identified is not Anonymous (Sweeney 1998, 2000).


Presentation Transcript


  1. 29e Conférence internationale des commissaires à la protection de la vie privée (29th International Conference of Data Protection and Privacy Commissioners)

  2. De-identification Risk and Resolution. Bradley Malin, Ph.D., Assistant Professor, Vanderbilt University

  3. De-identified is not Anonymous (Sweeney 1998, 2000)
     • Voter List: Name, Address, Date registered, Party affiliation, Date last voted
     • Hospital Discharge Data: Ethnicity, Visit date, Diagnosis, Procedure, Medication, Total charge
     • Fields shared by both: Zip, Birthdate, Sex
     • 87% of the United States population is re-identifiable from these shared fields
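
The linkage described on slide 3 can be sketched as a join on the shared fields. This is a minimal illustration, not code from the presentation: the records, names, and the helper functions `qi_key` and `link` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data; none of these records come from the presentation.
voter_list = [
    {"name": "Alice Adams", "zip": "37203", "birthdate": "1961-02-14", "sex": "F"},
    {"name": "Bob Brown", "zip": "37203", "birthdate": "1975-07-30", "sex": "M"},
    {"name": "Carl Clark", "zip": "37212", "birthdate": "1975-07-30", "sex": "M"},
    {"name": "Dan Davis", "zip": "37212", "birthdate": "1975-07-30", "sex": "M"},
]
discharge_data = [
    {"zip": "37203", "birthdate": "1961-02-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "37212", "birthdate": "1975-07-30", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "37215", "birthdate": "1980-01-01", "sex": "F", "diagnosis": "fracture"},
]

# The fields that appear in both data sets.
QUASI_IDENTIFIERS = ("zip", "birthdate", "sex")


def qi_key(record):
    """Return the record's quasi-identifier values as a hashable tuple."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(deidentified, identified):
    """Pair each de-identified record with every name sharing its quasi-identifiers."""
    names_by_key = defaultdict(list)
    for person in identified:
        names_by_key[qi_key(person)].append(person["name"])
    return [
        (record["diagnosis"], names_by_key.get(qi_key(record), []))
        for record in deidentified
    ]


linked = link(discharge_data, voter_list)
# A record is re-identified only when exactly one named person matches it.
reidentified = [(diag, names[0]) for diag, names in linked if len(names) == 1]
print(reidentified)
```

In this toy data only the first discharge record matches a single voter. The second matches two voters and the third matches none, so neither is uniquely re-identified.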

  4. DNA Re-identification
     • Many deployed genomic privacy technologies leave DNA susceptible to re-identification (Malin 2005)
     • DNA is re-identified by automated methods, such as:
       • Genotype-Phenotype Inference (Malin & Sweeney, 2000, 2002)

  5. Genealogy Re-identification (Malin 2006)
     • IdentiFamily: software that links de-identified pedigrees to named individuals
     • Uses publicly available information, such as obituaries, death records, and the Social Security Death Index database, to build genealogies

  6. Genealogy Re-identification (Malin 2006)

  7. System Susceptibility (Malin, JAMIA 2005)
     • Figure legend: Susceptible / Not Susceptible

  8. Altering Data Does Not Guarantee Protection
     • Science Magazine (Lin et al., 2004)
       • Fewer than 100 SNPs make DNA unique
       • Proposed protection: perturb DNA, e.g., replace A with T
       • aaaact → atacct
       • Increasing perturbation decreases internal correlations (graph axes: Utility (Correlations) vs. Privacy (Perturbation))
     • Conclusions
       • Too much perturbation is needed to prevent linkage
       • Keep records under lock and key
     • DISCLAIMER: Uniqueness does not guarantee that privacy will be compromised
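
To make the perturbation idea on slide 8 concrete, the sketch below randomly substitutes bases and then tries to link each perturbed sequence back to its source by smallest Hamming distance. It is an illustration under simplifying assumptions (uniformly random sequences, independent substitutions), not the procedure of Lin et al.; the functions `perturb`, `hamming`, `nearest`, and `linkage_rate` are hypothetical names.

```python
import random

BASES = "acgt"


def perturb(sequence, rate, rng):
    """Replace each base with a different base with probability `rate`."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest(query, candidates):
    """Index of the candidate with the smallest Hamming distance to `query`."""
    return min(range(len(candidates)), key=lambda i: hamming(query, candidates[i]))


def linkage_rate(rate, n_records=50, length=60, seed=0):
    """Fraction of perturbed records whose nearest original is their own source."""
    rng = random.Random(seed)
    originals = [
        "".join(rng.choice(BASES) for _ in range(length)) for _ in range(n_records)
    ]
    hits = 0
    for i, seq in enumerate(originals):
        if nearest(perturb(seq, rate, rng), originals) == i:
            hits += 1
    return hits / n_records


if __name__ == "__main__":
    # Toy experiment: how often does linkage survive a given perturbation rate?
    for rate in (0.0, 0.2, 0.4, 0.6):
        print(rate, linkage_rate(rate))
```

Running the loop at several rates gives a rough feel for how much perturbation it takes before nearest-neighbor linkage starts to fail on this toy data; the numbers say nothing about real genomic data.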

  9. Formal Re-identification Model
     • Diagram: De-identified Biobank Data linked to Identified Data (Already Public) through a Linkage Model
     • Necessary conditions for re-identification: UNIQUENESS, LINKAGE MODEL
     • Protection strategy 1: Make Data Non-unique
     • Protection strategy 2: Certify No Linkage Route

  10. Formal Protection
     • k-Map (Sweeney, 2002)
       • Each shared record refers to at least k entities in the population
     • k-Anonymity (Sweeney, 2002)
       • Each shared record is equivalent to at least k-1 other records
     • k-Unlinkability (Malin 2006)
       • Each shared record links to at least k identities via its trail
       • Satisfies k-Map protection model
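
The first two definitions on slide 10 can be checked mechanically. The sketch below is a minimal illustration assuming records are dictionaries and that the quasi-identifier fields are known in advance; the functions `k_anonymity_level` and `k_map_level` and all data are hypothetical, and both functions assume a non-empty release.

```python
from collections import Counter

QUASI_IDENTIFIERS = ("zip", "birth_year")


def qi_key(record, quasi_identifiers):
    """Return the record's quasi-identifier values as a hashable tuple."""
    return tuple(record[q] for q in quasi_identifiers)


def k_anonymity_level(released, quasi_identifiers):
    """Largest k for which the release is k-anonymous: the size of the smallest
    group of released records that share the same quasi-identifier values."""
    counts = Counter(qi_key(r, quasi_identifiers) for r in released)
    return min(counts.values())


def k_map_level(released, population, quasi_identifiers):
    """Largest k for which the release satisfies k-map: the smallest number of
    population entities that any released record could refer to."""
    pop_counts = Counter(qi_key(p, quasi_identifiers) for p in population)
    return min(pop_counts[qi_key(r, quasi_identifiers)] for r in released)


# Hypothetical toy data: the population is larger than the release.
population = [
    {"zip": "372**", "birth_year": 1961},
    {"zip": "372**", "birth_year": 1961},
    {"zip": "372**", "birth_year": 1975},
    {"zip": "372**", "birth_year": 1975},
    {"zip": "372**", "birth_year": 1975},
]
released = [
    {"zip": "372**", "birth_year": 1961},
    {"zip": "372**", "birth_year": 1975},
    {"zip": "372**", "birth_year": 1975},
]

print(k_anonymity_level(released, QUASI_IDENTIFIERS))        # smallest group in the release
print(k_map_level(released, population, QUASI_IDENTIFIERS))  # smallest group in the population
```

In this toy data the 1961 record is alone in the release but corresponds to two people in the population, so the release satisfies 2-map without being 2-anonymous.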

  11. Beyond Ad hoc Protections
     • Perturbation does not guarantee privacy
     • Alternative: Generalization of data (Lin et al. 2004) (Malin 2005)
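
One way to picture generalization of DNA data is to replace each position where two sequences disagree with a code that covers both bases. The sketch below uses the standard IUPAC two-base ambiguity codes for this; it is an assumption-laden illustration rather than the method of the cited papers, and `generalize_pair` is a hypothetical helper that only accepts the bases A, C, G, and T.

```python
# Standard IUPAC nucleotide codes for single bases and two-base ambiguity classes.
IUPAC = {
    frozenset("A"): "A",
    frozenset("C"): "C",
    frozenset("G"): "G",
    frozenset("T"): "T",
    frozenset("AG"): "R",  # purine
    frozenset("CT"): "Y",  # pyrimidine
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def generalize_pair(seq_a, seq_b):
    """Return one generalized sequence that covers both inputs.

    Positions where the inputs agree keep their base; positions where they
    differ are replaced by the IUPAC code for the two observed bases.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return "".join(
        IUPAC[frozenset((a, b))] for a, b in zip(seq_a.upper(), seq_b.upper())
    )


# Both records are published as the same generalized string, so neither is unique.
print(generalize_pair("aaaact", "atacct"))  # AWAMCT
```

Unlike perturbation, the generalized value remains truthful: every published symbol still includes the original base, only with less precision.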

  12. Learning Who You Are From Where You Have Been (“Trails”) (Malin & Sweeney, 2001, 2004; Malin & Airoldi 2006)
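
As a rough sketch of the trail idea, assume a trail is the set of locations (for example, hospitals) at which a person's data appears, and that a de-identified sample can be matched to any named individual with the same trail. The code below uses exact matching only and is not the algorithm from the cited papers; the data and the functions `trail` and `candidates_per_sample` are hypothetical.

```python
def trail(locations_visited, all_locations):
    """Encode a set of visited locations as a tuple of booleans, one per location."""
    return tuple(loc in locations_visited for loc in all_locations)


def candidates_per_sample(dna_visits, named_visits, all_locations):
    """For each de-identified sample, list the named individuals whose trail
    matches the sample's trail exactly."""
    named_trails = {
        name: trail(visits, all_locations) for name, visits in named_visits.items()
    }
    result = {}
    for sample, visits in dna_visits.items():
        sample_trail = trail(visits, all_locations)
        result[sample] = sorted(
            name for name, t in named_trails.items() if t == sample_trail
        )
    return result


# Hypothetical toy data: three hospitals, three named patients, two DNA samples.
hospitals = ["H1", "H2", "H3"]
named_visits = {
    "Alice": {"H1", "H2"},
    "Bob": {"H2", "H3"},
    "Carl": {"H2", "H3"},
}
dna_visits = {
    "sample_1": {"H1", "H2"},
    "sample_2": {"H2", "H3"},
}

matches = candidates_per_sample(dna_visits, named_visits, hospitals)
print(matches)
```

Here `sample_1` matches a single named trail and is therefore re-identified, while `sample_2` matches two identities and stays ambiguous between them.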

  13. Preventing Trails: Cystic Fibrosis Population (1149 samples)
     • Figure panels: BEFORE STRANON, AFTER STRANON
     • Figure labels: "100% Samples In Repository", "0% Samples k-Re-identified"

  14. Benefit: Quantified Risk
     • Change in re-identification risk
     • Shift burden of increased risk to requesting analyst
     • Ties together legal and computational models
     • Figure labels: Initial Setting, Forced Setting, Requested Quantity
