Privacy-Preserving Data Sharing
Michael Siegenthaler, Ken Birman
Cornell University
Introduction
• Today, personal data is typically stored electronically
• But systems at distinct organizations have no way to communicate with each other
System Model
• Data owners: Acme Food and Drug; General Hospital; Special Treatment Clinic, Inc.
• Legacy databases (each stored at a data owner)
Example Query
• Drug interaction check at pharmacy
• A pharmacist is dispensing a drug but doesn’t know what else the patient may be taking
• Patient’s medical record is stored at primary care provider and various specialists
• Is it safe for the patient to take this drug?
Guarantees
• Data privacy
  • E.g. pharmacist receives yes/no answer, not the underlying data
• Query privacy
  • E.g. hospital does not learn which drug is currently being dispensed
• Anonymous communication
  • E.g. hospital and pharmacy do not learn each other’s identities
Anonymous Communication
• Onion skin routing
• Providers P_i
• Encryption function E
• Public keys K_{P_i}
• Example: reference to patient 34 at Provider 2, routed through Provider 1
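The slide's worked example did not survive extraction, so the following is only a minimal sketch of how such a layered reference could be built. It assumes the handle has the shape E under Provider 1's key of (Provider 2, E under Provider 2's key of (34)). The names `build_handle`, `peel`, and `toy_encrypt` are hypothetical. The cipher is an insecure XOR keystream used as a stand-in so the sketch runs with the standard library alone; a real deployment would use public-key encryption under K_{P_i}.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """ASSUMPTION: insecure stand-in for E with key K_{P_i}. XOR is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


toy_decrypt = toy_encrypt


def build_handle(route, record_ref, keys):
    """Wrap a record reference in one layer per provider on `route`.

    `route` lists provider names in forwarding order; the last one holds the record.
    """
    # Innermost layer: only the final provider can read the record reference.
    layer = toy_encrypt(keys[route[-1]], json.dumps({"record": record_ref}).encode())
    # Wrap outward: each earlier hop learns only the name of the next hop.
    for hop, next_hop in zip(reversed(route[:-1]), reversed(route[1:])):
        payload = json.dumps({"next": next_hop, "inner": layer.hex()}).encode()
        layer = toy_encrypt(keys[hop], payload)
    return layer


def peel(provider, blob, keys):
    """Remove the layer addressed to `provider` and return its contents."""
    return json.loads(toy_decrypt(keys[provider], blob))


# Hypothetical keys for the slide's two providers.
keys = {"Provider 1": b"key-1", "Provider 2": b"key-2"}
handle = build_handle(["Provider 1", "Provider 2"], 34, keys)
outer = peel("Provider 1", handle, keys)
inner = peel(outer["next"], bytes.fromhex(outer["inner"]), keys)
```

In this sketch Provider 1 sees only that the next hop is Provider 2, and only Provider 2 recovers the reference to patient 34.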
Requirements
• “Locate” remote records
  • Translate a real-world identifier (name, SSN, DOB...) into a data handle: an onion skin route that can be used to communicate with the providers that own the data
• Execute the desired query
  • Use data handles to perform a privacy-preserving query
Global Search Mechanism
Example: search for user with SSN 343-56-7878
• Hierarchy of provider groups
• Each group has a designated contact who tracks its membership
Bloom Filters
• M = 12 bits (positions 0 to 11), K = 3 hash functions
• Insert SSN1 = 987-65-4321: hash1(SSN1) = 2, hash2(SSN1) = 4, hash3(SSN1) = 8, so bits 2, 4, and 8 are set
• Does a record for SSN3 = 444-88-2222 exist? hash1(SSN3) = 4 (set), hash2(SSN3) = 3 (not set), hash3(SSN3) = 8 (set): No!
• Insert SSN2 = 112-33-4455: hash1(SSN2) = 3, hash2(SSN2) = 10, hash3(SSN2) = 8, so bits 2, 3, 4, 8, and 10 are set
• Does a record for SSN3 exist? Bits 4, 3, and 8 are now all set: Yes. (false positive!)
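The walk-through on this slide can be replayed in a few lines. This is a sketch only: the slide does not say which hash functions are used, so the bit positions are taken from a lookup table (`SLIDE_HASHES`, a hypothetical name) that holds the slide's hash values. The class and method names are likewise illustrative.

```python
# Bit positions copied from the slide; the actual hash functions are not given.
SLIDE_HASHES = {
    "987-65-4321": (2, 4, 8),    # SSN1
    "112-33-4455": (3, 10, 8),   # SSN2
    "444-88-2222": (4, 3, 8),    # SSN3 (never inserted)
}


class BloomFilter:
    def __init__(self, m, positions_for):
        self.bits = [0] * m                 # M bits, all initially clear
        self.positions_for = positions_for  # maps an item to its K bit positions

    def insert(self, item):
        for p in self.positions_for(item):
            self.bits[p] = 1

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[p] for p in self.positions_for(item))


bf = BloomFilter(12, SLIDE_HASHES.__getitem__)
bf.insert("987-65-4321")
before = bf.might_contain("444-88-2222")   # bit 3 is still clear
bf.insert("112-33-4455")
after = bf.might_contain("444-88-2222")    # bits 4, 3, 8 all set by other SSNs
```

Here `before` is `False` and `after` is `True`, matching the slide's "No!" and "Yes. (false positive!)" outcomes.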
Using False Positives
• Adjust Bloom filter parameters for desired trade-off between privacy and performance
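To get a feel for how the parameters move the false-positive rate, the sketch below uses the standard textbook approximation (1 - e^(-kn/m))^k for a filter with m bits, k hash functions, and n inserted items. The formula is not from the slides. It assumes independent, uniformly distributed hash functions and is only rough for a filter as small as the 12-bit example. The function name is hypothetical.

```python
import math


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Approximate probability that a lookup for an absent item returns 'yes'."""
    return (1.0 - math.exp(-k * n / m)) ** k


# The slide's filter (M = 12, K = 3) after two insertions.
rate_small = false_positive_rate(12, 3, 2)
# Same number of items in a filter with ten times as many bits.
rate_large = false_positive_rate(120, 3, 2)
```

A smaller filter yields more false positives, which presumably means a "yes" reveals less about whether a record really exists, at the cost of more wasted follow-up queries; a larger filter shifts the balance toward performance.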
Query Execution
Example: A pharmacy checking for drug interactions
Diagram: Acme Food and Drug, General Hospital, and a Random Intermediary exchange four messages: record access request, drug interaction query, prescription record with name/address stripped, and yes/no answer
• All messages are sent anonymously using a MIX
• The hospital does not learn the nature of the query
• The pharmacy does not learn which other drugs the patient is taking
• The random intermediary cannot do anything nefarious with the data it has received, since that data is out of context
Query to find drug interactions
Query formulated at the pharmacy:
SELECT EXISTS (
  SELECT *
  FROM conflicts
    CROSS JOIN nonces
    INNER JOIN remote(drug_history)
      ON nonces.nonce = drug_history.nonce
  WHERE conflicts.drug = drug_history.drug
);
Split query: data gathering
Query sent to the data owner(s):
SEND (
  SELECT nonce, drug
  FROM drug_history
  WHERE drug_history.nonce = Ω(34)
);
Diagram label: mix_host
Split query: joining
Query executed at the third-party MIX host:
SELECT EXISTS (
  SELECT *
  FROM query_table
    INNER JOIN drug_history
      ON query_table.nonce = drug_history.nonce
  WHERE query_table.drug = drug_history.drug
);
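The split can be tried end to end with SQLite. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes `query_table` holds the (nonce, drug) rows of `conflicts CROSS JOIN nonces` shipped by the pharmacy, replaces `remote(...)`, `SEND`, and `Ω(34)` with plain function calls and a literal nonce string, and uses made-up drug names. All function names are hypothetical.

```python
import sqlite3


def build_query_table(conflicting_drugs, nonce):
    """Pharmacy: materialize conflicts CROSS JOIN nonces as (nonce, drug) rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE conflicts (drug TEXT)")
    db.execute("CREATE TABLE nonces (nonce TEXT)")
    db.executemany("INSERT INTO conflicts VALUES (?)",
                   [(d,) for d in conflicting_drugs])
    db.execute("INSERT INTO nonces VALUES (?)", (nonce,))
    rows = db.execute(
        "SELECT nonces.nonce, conflicts.drug FROM conflicts CROSS JOIN nonces"
    ).fetchall()
    db.close()
    return rows


def gather_drug_history(owner_rows, nonce):
    """Data owner: release only the (nonce, drug) rows for the requested nonce."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE drug_history (nonce TEXT, drug TEXT)")
    db.executemany("INSERT INTO drug_history VALUES (?, ?)", owner_rows)
    rows = db.execute(
        "SELECT nonce, drug FROM drug_history WHERE drug_history.nonce = ?",
        (nonce,),
    ).fetchall()
    db.close()
    return rows


def join_at_mix_host(query_rows, history_rows):
    """Third-party host: join both inputs and return only a yes/no answer."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE query_table (nonce TEXT, drug TEXT)")
    db.execute("CREATE TABLE drug_history (nonce TEXT, drug TEXT)")
    db.executemany("INSERT INTO query_table VALUES (?, ?)", query_rows)
    db.executemany("INSERT INTO drug_history VALUES (?, ?)", history_rows)
    (answer,) = db.execute(
        """SELECT EXISTS (
               SELECT * FROM query_table
               INNER JOIN drug_history
                 ON query_table.nonce = drug_history.nonce
               WHERE query_table.drug = drug_history.drug)"""
    ).fetchone()
    db.close()
    return bool(answer)


# Hypothetical data: nonce "n34" stands in for the slide's Ω(34).
hospital_rows = [("n34", "drug_a"), ("n34", "drug_b"), ("n99", "drug_c")]
history = gather_drug_history(hospital_rows, "n34")
conflict = join_at_mix_host(build_query_table(["drug_b"], "n34"), history)
no_conflict = join_at_mix_host(build_query_table(["drug_c"], "n34"), history)
```

Only the boolean leaves the third-party host, which mirrors the yes/no answer the pharmacy receives in the slides.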
Answering the query
• Pharmacy asks: Is there a conflict?
• mix_host_1 (on hospital’s behalf): conflict found
• mix_host_2 (on other pharmacy’s behalf): no conflict here
• Answer: YES
Conclusion and Future Work
• Selective sharing of personal information across distributed databases
  • Data privacy
  • Query privacy
  • Anonymous communication
• Working on: how to enforce a policy on which data may be revealed to whom
• Also: how to prevent data mining attacks?