Towards an end-to-end architecture for handling sensitive data • Hector Garcia-Molina, Rajeev Motwani, and students
DB Perspective • Performance • Preservation • Distribution (P2P) • Bad Guys: • eavesdrop • corrupt • Trust
DB Perspective • Preservation • [Figure: axes "preservation" and "privacy", each running from - to +; regions labeled "easy"; target labeled "goal"]
Privacy Spectrum • Prevention • Detection • Containment
Prevention: Our Work • Privacy-Preserving OLAP • Distributed Architecture for Secure DBMS (P) • Data Preservation in P2P Systems • P2P Trust and Reputation Management (P) • P2P Privacy Preserving Indexing (P)
Distributed Architecture for Secure DBMS • Motivation: Outsourcing • Secure Database Provider (SDP) • [Diagram: Client, Encrypt, Service Provider]
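Performance Problem • [Diagram: Client sends Query Q to the Client-side Processor, which sends Q’ to the Service Provider holding encrypted data; “Relevant Data” comes back and the Client-side Processor produces the Answer] • Problem: Q’ ≈ “SELECT *”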
The Power of Two • [Diagram: Client connected to DSP1 and DSP2]
Basic Idea • { CC#, expDate, name } is split into { CC# } and { expDate, name }
Another Example • { salary } is split into { salary + rand } and { rand }
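The salary example can be read as additive splitting: one provider stores salary + rand, the other stores rand, and only a client who sees both can subtract. The following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes non-negative integer salaries and reduces the sum modulo a bound so the masked value reveals nothing by its size; `MODULUS`, `split_salary`, and `recover_salary` are hypothetical names that do not appear in the slides.

```python
import secrets

# Hypothetical bound (an assumption, not from the slides): 0 <= salary < MODULUS.
MODULUS = 2**32


def split_salary(salary: int) -> tuple:
    """Return (masked, rand): masked goes to one provider, rand to the other."""
    if not 0 <= salary < MODULUS:
        raise ValueError("salary out of range for this sketch")
    rand = secrets.randbelow(MODULUS)      # uniform random mask
    masked = (salary + rand) % MODULUS     # the "salary + rand" share
    return masked, rand


def recover_salary(masked: int, rand: int) -> int:
    """Client-side step: combine both shares to get the salary back."""
    return (masked - rand) % MODULUS
```

Usage: `masked, rand = split_salary(85000)` produces the two shares, and `recover_salary(masked, rand)` gives back 85000. Each share on its own is a uniformly distributed number in this sketch.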
The Power of Two • [Diagram: Query Q goes to the Client-side Processor, which sends Q1 to DSP1 and Q2 to DSP2] • Key: Ensure Cost(Q1) + Cost(Q2) ≈ Cost(Q)
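One way to picture the client-side processor is as a small routine that sends a sub-query to each provider and joins the partial answers on the shared id. The sketch below is illustrative only. It assumes each provider holds a vertical fragment keyed by id, that the query is a conjunction of one predicate per fragment, and that in-memory dictionaries stand in for DSP1 and DSP2. `run_split_query` and the sample rows are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Fragment = Dict[int, Row]  # id -> attributes stored at one provider


def run_split_query(
    dsp1: Fragment,
    dsp2: Fragment,
    pred1: Callable[[Row], bool],
    pred2: Callable[[Row], bool],
) -> List[Row]:
    """Evaluate Q as Q1 on DSP1 and Q2 on DSP2, then join on id at the client."""
    hits1 = {rid: row for rid, row in dsp1.items() if pred1(row)}  # Q1
    hits2 = {rid: row for rid, row in dsp2.items() if pred2(row)}  # Q2
    common_ids = sorted(hits1.keys() & hits2.keys())
    return [{"id": rid, **hits1[rid], **hits2[rid]} for rid in common_ids]


# Hypothetical data following the { CC# } / { expDate, name } split.
DSP1_DATA: Fragment = {1: {"cc": "1111"}, 2: {"cc": "2222"}}
DSP2_DATA: Fragment = {
    1: {"exp_date": "2030-01", "name": "Alice"},
    2: {"exp_date": "2031-06", "name": "Bob"},
}
```

For example, `run_split_query(DSP1_DATA, DSP2_DATA, lambda r: True, lambda r: r["name"] == "Alice")` returns the single joined row for id 1. Neither simulated provider ever sees a card number together with a name.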
Challenges • Find a decomposition that • Obeys all privacy constraints • Minimizes execution cost for given workload • For given query, find good plan
Example • R(id, a, b, c), privacy constraint: { a, b, c } • Candidate decompositions: R1(id, a, b) R2(id, b, c); …; R1(id, a) R2(id, b, c); R1(id, a, b) R2(id, c); R1(id, a, c) R2(id, b, c) • Most popular queries: • Select on a, b • Select on b, c • Resulting decomposition: R1(id, a, b) R2(id, b, c)
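The example can be reproduced with a brute-force search over decompositions. This is a toy sketch, not the authors' algorithm. It assumes each attribute may be stored at R1, at R2, or at both, that a privacy constraint is violated when a single fragment holds every attribute in the constraint, and that a query costs 1 if one fragment can answer it and 2 if the client must join both fragments. The cost numbers and all function names are assumptions made for illustration.

```python
from itertools import product

ATTRS = ("a", "b", "c")
PRIVACY_CONSTRAINTS = [frozenset({"a", "b", "c"})]
WORKLOAD = [frozenset({"a", "b"}), frozenset({"b", "c"})]  # "select on a, b", "select on b, c"


def candidates():
    """Yield every (R1, R2) pair; each attribute lives in R1, R2, or both."""
    for placement in product(("R1", "R2", "both"), repeat=len(ATTRS)):
        r1 = frozenset(x for x, p in zip(ATTRS, placement) if p in ("R1", "both"))
        r2 = frozenset(x for x, p in zip(ATTRS, placement) if p in ("R2", "both"))
        yield r1, r2


def is_private(r1, r2):
    """No single fragment may contain all attributes of any constraint."""
    return not any(c <= r1 or c <= r2 for c in PRIVACY_CONSTRAINTS)


def cost(r1, r2):
    """Toy cost model: 1 per query answerable at one provider, else 2."""
    return sum(1 if (q <= r1 or q <= r2) else 2 for q in WORKLOAD)


def best_decomposition():
    """Cheapest decomposition among those that obey every privacy constraint."""
    legal = [(r1, r2) for r1, r2 in candidates() if is_private(r1, r2)]
    return min(legal, key=lambda d: cost(*d))
```

Under these assumptions the search returns the fragments {a, b} and {b, c}, which matches R1(id, a, b) R2(id, b, c) on the slide. Exhaustive enumeration is only workable for a handful of attributes.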
Detection: Our Work • Simulatable Auditing (P) • k-Anonymity • algorithms and hardness
Containment: Our Work • Paranoid Platform for Privacy Preferences (P) • Entity Resolution
Containment • Trusting • privacy policies • Paranoid
Example: Trusting • Example P3P Policies: • Current purpose: completion and support of the recurring subscription activity • Recipients: DealsRUs and/or entities acting as their agents or entities for whom DealsRUs are acting as an agent... • Steps: (1) browse policy (2) give info (3) cross fingers • [Diagram: alice, dealsRus]
Example: Email • (1) temp a12@w (2) a12@w (3) To: a12@w (4) To: a@z • [Diagram: alice (a@z), alice’s agent, dealsRus]
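The four steps can be sketched as a small forwarding agent: it issues a temporary address, the temporary address is what the merchant sees, and mail sent to it is routed to the real address until the alias is revoked. This is an illustrative sketch only. The `AliasAgent` class, its methods, and the random alias format are hypothetical and are not part of P4P as described in the slides.

```python
import secrets
from typing import Dict, Optional


class AliasAgent:
    """Hands out temporary addresses and forwards mail to the real one."""

    def __init__(self, domain: str) -> None:
        self._domain = domain
        self._forward: Dict[str, str] = {}  # alias -> real address

    def issue(self, real_address: str) -> str:
        """Step (1): create a temporary address such as a12@w."""
        alias = f"{secrets.token_hex(4)}@{self._domain}"
        self._forward[alias] = real_address
        return alias

    def route(self, to_address: str) -> Optional[str]:
        """Steps (3)-(4): map 'To: alias' to 'To: real address', if still valid."""
        return self._forward.get(to_address)

    def revoke(self, alias: str) -> None:
        """Stop forwarding, e.g. once the merchant is no longer trusted."""
        self._forward.pop(alias, None)
```

Usage: `agent = AliasAgent("w")`, then `alias = agent.issue("a@z")` gives the address handed to the merchant in step (2). `agent.route(alias)` returns `"a@z"` until `agent.revoke(alias)` is called, after which it returns `None`.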
P4P: Paranoid Platform for Privacy Preferences • Framework • Data/Control Types: t1 ... tn • API • Strategy/Reference Implementation
Private Information • [Figure: classification of private information; labels as extracted: sharable accountable no integration control no predicate input limited time use complete privacy function copy identifier service handle input to predicate ownership individual organization]
Entity Resolution • Applications: • mailing lists, customer files, counter-terrorism, ... • [Figure: two records, labeled e1 and e2: { N: a, A: b, CC#: c, Ph: e } and { N: a, Exp: d, Ph: e }]
Privacy • [Figure: Alice and Bob, with records such as { Nm: Alice, Ad: 32 Fox, Ph: 5551212 }, { Nm: Alice, Ad: 32 Fox }, and { Ad: 14 Cat }, annotated with confidences 1.0, 1.0, 0.7 and 1.0, 1.0]
Leakage • [Figure: Alice and Bob; records { Nm: Alice, Ad: 32 Fox, Ph: 5551212 } and { Nm: Alice, Ad: 32 Fox, Ph: 5550000 }, annotated with confidences 1.0 and 0.7] • L = 0.6 (between 0 and 1)
Multi-Record Leakage • [Figure: Alice's record { Nm: Alice, Ad: 32 Fox, Ph: 5551212 }, annotated 1.0; Bob holds r1, L = 0.9; r2, L = 0.8; r3, L = 0.7] • LL = 0.9 (between 0 and 1, e.g., max L)
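The slide suggests the maximum of the per-record leakages as one possible aggregate. A minimal sketch of that aggregate follows. The function name is hypothetical, the per-record measure L is taken as given because the slides do not define it, and returning 0.0 for an empty record set is an assumption.

```python
from typing import Iterable


def multi_record_leakage(record_leakages: Iterable[float]) -> float:
    """Aggregate per-record leakages L into LL using max, one option named on the slide."""
    values = list(record_leakages)
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each leakage must lie between 0 and 1")
    # Assumption: with no records at all, nothing has leaked.
    return max(values) if values else 0.0
```

With the slide's values, `multi_record_leakage([0.9, 0.8, 0.7])` gives 0.9, matching LL = 0.9. Other aggregates are possible; max only reflects the single most revealing record.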
Q1: Added Vulnerability? • [Figure: Alice (p); Bob with records r1, r2, r3, r4] • r4 may cause Bob’s records to snap together! • ΔLL = ??
Q2: Disinformation? • [Figure: Alice (p); Bob with records r1, r2, r3, r4 (lies)] • What is the most cost-effective disinformation? • ΔLL = ??
Q3: Verification? • [Figure: Alice (p); Bob with hypothesis h (0.6) and records r1, 0.9; r2, 0.8; r3, 0.7; ...] • What is the best fact to verify to increase confidence in the hypothesis?
Privacy Spectrum • Prevention • Detection • Containment