
experience-based access management & privacy-preserving record linkage

Elizabeth Ashley Durham, Thursday, November 11, 2010. Roadmap: experience-based access management; privacy-preserving record linkage (definition, steps in record linkage, experiment, conclusions).


Presentation Transcript


  1. experience-based access management & privacy-preserving record linkage • Elizabeth Ashley Durham • Thursday, November 11, 2010

  2. roadmap • experience-based access management • privacy-preserving record linkage • definition • steps in record linkage • experiment • conclusions • open research questions in record linkage TRUST 2010

  4. access management • Least privilege: how can we limit each provider's access to only the information required to do their job? • Identity and Access Management (IAM), e.g., role-based access control • IAM in health care organizations: complex workflows and routine emergencies

  5. the problem with access controls [diagram labels: Ideal Model; the problem; Enforced Control] • study: 43% of providers accessed records for which they did not have permissions • L. Røstad and N. Øystein, “Access control and integration of health care systems: an experience report and future challenges,” Proc. Availability, Reliability & Security, 2007, 871-878.

  6. the experience-based access management (EBAM) lifecycle [diagram labels: Ideal Model; Access Log; Expected Model; Enforced Control] • For more information, see: • USENIX Health Security workshop: http://www.usenix.org/event/healthsec10/ • Copy of the paper: http://seclab.uiuc.edu/pubs/GunterML10.pdf • Video of the presentation: http://www.usenix.org/multimedia/healthsec10gunter • C. Gunter, D. Liebovitz, and B. Malin, “EBAM: Experience-Based Access Management for Healthcare,” USENIX HealthSec ’10 workshop.

  7. record linkage in surveillance [diagram labels: hospital privacy office; human resources: “Karen Lewis”; access logs: “Karen Lewis”]

  9. privacy-preserving record linkage (PPRL) [diagram labels: set of records from data holder A; set of records from data holder B]

  10. roadmap • experience-based access management • privacy-preserving record linkage • definition • applications • steps in record linkage • experiment • conclusions • open research questions in record linkage

  11. steps in record linkage* • blocking → field comparison → record pair comparison → record pair classification → matches / non-matches • * I assume a common schema and a common method of data standardization. I also assume that the records from each institution have been deduplicated (i.e., record linkage has been applied within each institution so that an individual is represented by only a single record within that institution).
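
The four stages on this slide can be sketched as one small pipeline. This is only an illustrative sketch: the function `link`, its parameters, the sample records, and the threshold are all hypothetical choices, and the comparison and scoring rules are the simplest possible stand-ins rather than the methods evaluated in the talk.

```python
from itertools import product

def link(records_a, records_b, block_key, compare_fields, score, threshold):
    """Run the four stages in order and return (matches, non_matches)."""
    matches, non_matches = [], []
    for a, b in product(records_a, records_b):
        # 1. Blocking: skip pairs that do not share a blocking key.
        #    (A real implementation would group by key up front instead of
        #    enumerating every pair; this filter only shows the logic.)
        if block_key(a) != block_key(b):
            continue
        # 2. Field comparison: one entry per field.
        vector = compare_fields(a, b)
        # 3. Record pair comparison: collapse the vector into a single score.
        pair_score = score(vector)
        # 4. Record pair classification: threshold the score.
        (matches if pair_score >= threshold else non_matches).append((a, b))
    return matches, non_matches

# Hypothetical records: (first name, last name, city)
A = [("jon", "smith", "nashville"), ("karen", "lewis", "durham")]
B = [("jon", "smith", "nashville"), ("karyn", "lewis", "raleigh")]

matches, non_matches = link(
    A, B,
    block_key=lambda r: r[1][0],  # first letter of last name
    compare_fields=lambda a, b: [int(x == y) for x, y in zip(a, b)],
    score=sum,                    # number of agreeing fields
    threshold=3,                  # assumed cutoff: all three fields agree
)
```

Each stage is passed in as a function so that a privacy-preserving comparison or a different scoring rule can be swapped in without changing the overall flow.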

  12. field comparison [table rows: fields; record a; record b; comparison vector (cell values not preserved)]
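
Since the table values on this slide did not survive, here is a minimal sketch of what a comparison vector looks like under exact matching. The field names and the two records are hypothetical and are not the ones shown in the talk.

```python
def comparison_vector(record_a, record_b, fields):
    """Return one 0/1 entry per field: 1 if the two values agree exactly."""
    return [1 if record_a[f] == record_b[f] else 0 for f in fields]

# Hypothetical schema and records, for illustration only.
FIELDS = ["first_name", "last_name", "city"]
record_a = {"first_name": "jon", "last_name": "smith", "city": "nashville"}
record_b = {"first_name": "john", "last_name": "smith", "city": "nashville"}

vector = comparison_vector(record_a, record_b, FIELDS)
print(vector)  # [0, 1, 1]
```

Approximate comparators replace the 0/1 entries with similarity scores between 0 and 1, which is what the edit similarity and Bloom filter options later in the talk provide.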

  14. privacy-preserving field comparison experiment: the dataset • 1,000 records from the North Carolina Voter Registration database • fields: • 1,000 “corrupted” records produced by a data corrupter • repeated 100 times to examine statistical significance • P. Christen and A. Pudjijono, “Accurate Synthetic Generation of Realistic Personal Information,” Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 2009.

  15. privacy-preserving field comparison experiment • option 1: hash & compare • option 2: secure edit similarity • option 3: Bloom filter

  16. privacy-preserving field comparison option 1: hash & compare • field values are hashed with SHA-1; “salting” is used to prevent dictionary attacks • [table rows: record a; record b; comparison vector (cell values not preserved)]
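
A minimal sketch of hash & compare, assuming both data holders agree on a secret salt in advance. The salt value, the helper names, and the sample records are hypothetical; SHA-1 is used only because the slide names it.

```python
import hashlib

def salted_hash(value: str, salt: bytes) -> str:
    """SHA-1 digest of the salt followed by the field value."""
    return hashlib.sha1(salt + value.encode("utf-8")).hexdigest()

def hashed_record(record, salt):
    """Hash every field of a record independently."""
    return [salted_hash(value, salt) for value in record]

SALT = b"shared-secret-salt"  # hypothetical value known to both data holders

hashed_a = hashed_record(["jon", "smith"], SALT)
hashed_b = hashed_record(["john", "smith"], SALT)

# Only the digests are exchanged; equal digests mean equal field values.
vector = [int(x == y) for x, y in zip(hashed_a, hashed_b)]
print(vector)  # [0, 1]
```

Because a one-character difference ("jon" versus "john") yields an unrelated digest, this option can only detect exact agreement, which is the limitation the other two options address.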

  17. privacy-preserving field comparison experiment option 2: secure edit similarity • edit distance: the minimal number of insertions, deletions, and substitutions required to convert one string into another • edit similarity: [formula not preserved] • “secure” edit distance: calculated by iteratively using homomorphic encryption to compute the value of each cell of the matrix used in the dynamic programming algorithm for edit distance • W. Du and M. J. Atallah, “Protocols for Secure Remote Database Access with Approximate Matching,” Technical Report, CERIAS, Purdue University, 2001.
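
The dynamic programming algorithm referred to above can be sketched in plaintext as follows. This is not the secure protocol: in the secure variant each cell would be computed under homomorphic encryption. The normalization in `edit_similarity` (one minus distance over the longer length) is a common choice and an assumption here, since the slide's own formula did not survive.

```python
def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance, filling the DP matrix one row at a time."""
    prev = list(range(len(t) + 1))  # row 0: distance from "" to t[:j]
    for i, cs in enumerate(s, start=1):
        curr = [i]                  # column 0: distance from s[:i] to ""
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def edit_similarity(s: str, t: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - edit_distance(s, t) / longest

print(edit_distance("jon", "john"))    # 1
print(edit_similarity("jon", "john"))  # 0.75
```

The matrix has (len(s) + 1) × (len(t) + 1) cells, so a secure version needs one encrypted computation per cell for every pair of strings compared.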

  18. privacy-preserving field comparison experiment option 3: Bloom filters • record a: “jon” → bigrams _j, jo, on, n_ • record b: “john” → bigrams _j, jo, oh, hn, n_ • the bigrams are hashed (h1, h2) into the bit arrays α and β [bit values not preserved] • 1,000 bits & 30 hash functions (all variations of SHA-1; “salting” used to prevent dictionary attacks) • Rainer Schnell, Tobias Bachteler, and Jörg Reiher, “Privacy-preserving record linkage using Bloom filters,” BMC Medical Informatics and Decision Making 9, 2009.
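
A minimal sketch of the Bloom filter encoding, using the slide's parameters (1,000 bits, 30 hash functions) and padded bigrams. How the 30 SHA-1 "variations" were actually derived is not stated, so the index-prefixed salted SHA-1 below is an assumption, as are the salt value and function names. The filters are compared with the Dice coefficient, the set similarity used in the cited Schnell et al. paper.

```python
import hashlib

M = 1000  # bits per filter (from the slide)
K = 30    # hash functions per bigram (from the slide)

def bigrams(s: str) -> set:
    """Padded bigrams, e.g. "jon" -> {"_j", "jo", "on", "n_"}."""
    padded = f"_{s}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def bloom_filter(s: str, salt: bytes, m: int = M, k: int = K) -> set:
    """Return the set of bit positions that are set to 1 for string s."""
    bits = set()
    for gram in bigrams(s):
        for i in range(k):
            # Assumed construction: the i-th hash is SHA-1 over salt, index, bigram.
            data = salt + b":" + str(i).encode() + b":" + gram.encode("utf-8")
            digest = hashlib.sha1(data).digest()
            bits.add(int.from_bytes(digest, "big") % m)
    return bits

def dice(x: set, y: set) -> float:
    """Dice coefficient of two sets of set-bit positions."""
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))

SALT = b"shared-secret-salt"  # hypothetical value known to both data holders
alpha = bloom_filter("jon", SALT)
beta = bloom_filter("john", SALT)
similarity = dice(alpha, beta)
```

Shared bigrams set identical bits in both filters, so similar strings produce a high Dice score even though neither party sees the other's plaintext.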

  19. privacy-preserving field comparison experiment: run time [chart not preserved] • hardware: 2.5 GHz quad-core PC with 4 GB of memory • Elizabeth Durham, Yuan Xue, Murat Kantarcioglu, and Bradley Malin. Submitted to Information Fusion. 2010.

  20. privacy-preserving field comparison experiment: correctness [chart not preserved] • Elizabeth Durham, Yuan Xue, Murat Kantarcioglu, and Bradley Malin. Submitted to Information Fusion. 2010.

  22. conclusions [table: columns hash & compare, Bloom filter, secure edit distance; rows accuracy, speed, security, overall (cell values not preserved)]

  24. open research questions in record linkage [diagram labels: distributed; centralized]

  25. thanks • funding for the EBAM and privacy-preserving record linkage work: NLM 2-T15LM07450-06 • NIH R01 LM009989 • NSF CNS-0964063 (EBAM) • NSF CCF-0424422 (TRUST)

  26. roadmap • experience-based access management • privacy-preserving record linkage • definition • applications • steps in record linkage (blocking, field comparison, record pair comparison, record pair classification) • experiment • design • results • open research questions in record linkage

  27. blocking [legend: match; non-match] • records: Jon Smyth, …; Karyn Lewis, …; Joy Beck, …; Marty Smith, …; Laura Root, …; John Smith, …; Bob Beck, …; Bob Taylor, …; Karen Lewis, …; Alice Todd, … • no blocking: |A||B| = 25 record pair comparisons • blocking (first letter of last name): 4 record pair comparisons
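
The counts on this slide can be reproduced with a short sketch. The transcript does not say which five records belong to which data holder, so the split into `A` and `B` below is an assumption; it is one split that yields the slide's 4 comparisons.

```python
from collections import defaultdict

# Assumed split of the ten records between the two data holders.
A = ["Jon Smyth", "Karyn Lewis", "Joy Beck", "Marty Smith", "Laura Root"]
B = ["John Smith", "Bob Beck", "Bob Taylor", "Karen Lewis", "Alice Todd"]

def block_key(name: str) -> str:
    """Blocking key from the slide: first letter of the last name."""
    return name.split()[-1][0].upper()

def candidate_pairs(records_a, records_b):
    """Pair each record in A only with the records in B that share its block."""
    blocks = defaultdict(list)
    for b in records_b:
        blocks[block_key(b)].append(b)
    return [(a, b) for a in records_a for b in blocks.get(block_key(a), [])]

print(len(A) * len(B))               # 25 comparisons without blocking
print(len(candidate_pairs(A, B)))    # 4 comparisons with blocking
```

Grouping one side by key first means the non-candidate pairs are never enumerated at all, which is where the savings come from on large datasets.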

  29. continuous Fellegi-Sunter [formulas not preserved] • * Note: this assumes a uniform distribution of similarity scores. • Edward H. Porter and William E. Winkler, “Approximate String Comparison and its Effect on an Advanced Record Linkage System,” Research Report RR97/02, U.S. Census Bureau, 1997.

  30. record pair comparison: Fellegi-Sunter (FS) • conditional probability vectors:* • m[i] = P(a[i] == b[i] | (a,b) is a match) • u[i] = P(a[i] == b[i] | (a,b) is a non-match), where i = 1, …, # fields • weight vectors: • agreement weight: wa[i] = log(m[i] / u[i]) • disagreement weight: wd[i] = log((1 - m[i]) / (1 - u[i])) • scoring: [formula not preserved] • the probabilities and weights are calculated once per record linkage over all record pairs; the score is calculated for each record pair • * The Expectation Maximization (EM) algorithm, or a subset of records for which the true match status is known, can be used to determine these conditional probabilities.
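
The weight definitions above can be sketched as follows. The m and u values are made up for illustration, the natural logarithm is an assumed choice of base, and the score uses the standard Fellegi-Sunter sum (agreement weight where the fields agree, disagreement weight otherwise) because the slide's own scoring formula did not survive.

```python
import math

def fs_weights(m, u):
    """Agreement and disagreement weights, one per field."""
    wa = [math.log(mi / ui) for mi, ui in zip(m, u)]
    wd = [math.log((1 - mi) / (1 - ui)) for mi, ui in zip(m, u)]
    return wa, wd

def fs_score(vector, wa, wd):
    """Sum wa[i] where field i agrees and wd[i] where it disagrees."""
    return sum(a if agree else d for agree, a, d in zip(vector, wa, wd))

# Hypothetical probabilities for two fields; in practice they would come
# from EM or from record pairs with known match status.
m = [0.9, 0.95]
u = [0.1, 0.05]

wa, wd = fs_weights(m, u)              # computed once per linkage
score = fs_score([1, 0], wa, wd)       # computed for each record pair
```

Agreement on a field where m[i] is much larger than u[i] pushes the score up strongly, while disagreement on the same field pushes it down, so highly discriminating fields dominate the total.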

  31. Fellegi-Sunter [formulas not preserved] • conditional probability vectors: • weight vectors: • I. Fellegi and A. Sunter, “A theory for record linkage,” Journal of the American Statistical Association, 1969.

  33. record pair classification • match score → record pair classification: +3 → match • +3 → match • +2 → non-match • +1 → non-match • +1 → non-match • +1 → non-match • 0 → non-match • 0 → non-match • 0 → non-match
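
The classification on this slide amounts to thresholding the match score. In the sketch below the scores are the nine from the slide; the cutoff of 3 is an assumption, since any value above 2 and at most 3 reproduces the slide's labels.

```python
def classify(scores, threshold):
    """Label each record pair by comparing its match score to a cutoff."""
    return ["match" if s >= threshold else "non-match" for s in scores]

scores = [3, 3, 2, 1, 1, 1, 0, 0, 0]   # match scores listed on the slide
labels = classify(scores, threshold=3)  # assumed cutoff

print(labels.count("match"), labels.count("non-match"))  # 2 7
```

Lowering the threshold trades missed matches for more false matches, which is why the cutoff is usually tuned on pairs with known status.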

  34. open research questions in record linkage: blocking [legend: match; non-match] • records: Jon Smyth, …; Karyn Lewis, …; Joy Beck, …; Marty Smith, …; Laura Root, …; John Smith, …; Bob Beck, …; Bob Taylor, …; Karen Lewis, …; Alice Todd, … • no blocking: |A||B| = 25 record pair comparisons • blocking (first letter of last name): 4 record pair comparisons

  35. open research questions in record linkage [figure labels: predicted; actual]
