
Online Auditing - How may Auditors Inadvertently Compromise Your Privacy

Kobbi Nissim (Microsoft), with Nina Mishra (HP/Stanford). Work in progress.


Presentation Transcript


  1. Online Auditing - How may Auditors Inadvertently Compromise Your Privacy. Kobbi Nissim (Microsoft), with Nina Mishra (HP/Stanford). Work in progress.

  2. The Setting
     • Statistical database. Dataset: d = {d_1,…,d_n}
     • Entries d_i: Real, Integer, Boolean
     • Query: q = (f, i_1,…,i_k), answered with f(d_{i_1},…,d_{i_k})
     • f: Min, Max, Median, Sum, Average, Count…
     • Bad users will try to breach the privacy of individuals
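
The query model on this slide can be sketched in a few lines of Python. This is only an illustration: the names `AGGREGATES` and `answer` are hypothetical, and the dataset values are made up.

```python
from statistics import median

# Hypothetical lookup table for the aggregate functions named on the slide.
AGGREGATES = {
    "min": min,
    "max": max,
    "median": median,
    "sum": sum,
    "average": lambda xs: sum(xs) / len(xs),
    "count": len,
}

def answer(d, q):
    """Evaluate a query q = (f, i_1, ..., i_k) on dataset d.

    Indices are 1-based, matching the slide's notation d_1, ..., d_n.
    """
    f, *indices = q
    values = [d[i - 1] for i in indices]
    return AGGREGATES[f](values)

d = [3, 8, 5, 1]                      # made-up dataset
print(answer(d, ("sum", 1, 2, 3)))    # 16
print(answer(d, ("max", 2, 4)))       # 8
```

Every auditor discussed later sits in front of a function like `answer` and decides whether its output may be released.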

  3. The Data Privacy Game: an Information-Privacy Tradeoff
     • Private functions: want to hide π_i(d) = d_i
     • Information functions: want to reveal query answers f(d_{i_1},…,d_{i_k})
     • Major question: what may be computed over d (and given to users) without breaching privacy?
     • Confidentiality control methods:
       • Perturbation methods: give `noisy’ answers
       • Query restriction methods: limit the queries users may post, usually imposing some structure (e.g. size/overlap restrictions)

  4. Auditing
     • [AW89] classify auditing as a query restriction method: “Auditing of an SDB involves keeping up-to-date logs of all queries made by each user (not the data involved) and constantly checking for possible compromise whenever a new query is issued”
     • Partial motivation: may allow for more queries to be posed, if no privacy threat occurs
     • Early work: Hofmann 1977; Schlorer 1976; Chin, Ozsoyoglu 1981, 1986
     • Recent interest: Kleinberg, Papadimitriou, Raghavan 2000; Li, Wang, Wang, Jajodia 2002; Jonsson, Krokhin 2003

  5. Auditing (diagram)
     • User: Here’s a new query: q_{i+1}
     • The auditor sits between the user and the statistical database and keeps the query log q_1,…,q_i
     • Auditor: Here’s the answer OR Query denied (as the answer would cause privacy loss)

  6. Design choices in Prior Work (1)
     1. Privacy definition:
       • Privacy breached (only) when a database entry may be deduced fully, or within some given accuracy
       • These privacy guarantees do not generally suffice
       • Should take into account: adversary’s computational power, prior knowledge, access to other databases…
     2. Exact answers given
       • Auditors viewed as a way to give `quality’ answers???

  7. Design choices in Prior Work (2)
     3. Which information is taken into account in the auditor decision procedure:
       • Decision made based on queries q_1,…,q_i, q_{i+1} and their answers a_1,…,a_i, a_{i+1}
       • Denials ignored
     4. Offline vs. Online:
       • Offline auditing: queries and answers checked for compromise at the end of the day. Only detects breaches
       • Online auditing: answer/deny queries on the fly. Prevents breaches just before they happen

  8. Example 1: Sum/Max auditing
     • d_i real, sum/max queries, privacy breached if some d_i is learned
     • User: q_1 = sum(d_1,d_2,d_3). Auditor: sum(d_1,d_2,d_3) = 15
     • User: q_2 = max(d_1,d_2,d_3). Auditor: Denied (the answer would cause privacy loss)
     • User: Oh well…

  9. Some Prior Work on Auditors

                                | Data        | Queries | Breach                    | Complexity
     Sum/Max [Chin]             | real        | sum/max | d_i learned               | NP-hard
     Boolean [KPR00]            | 0/1         | sum     | d_i learned               | NP-hard*
     Max [KPR00]                | real        | max     | d_i learned               | PTIME
     Interval based [LWWJ02]    | d_i ∈ [a,b] | sum     | d_i within given accuracy | PTIME
     Generalized results [JK03] |             |         |                           | NP-hard / PTIME

     * Approx version in PTIME
     • Can we use the offline version for online auditing?

  10. … After Two Minutes …
      • d_i real, sum/max queries, privacy breached if some d_i is learned
      • User: q_1 = sum(d_1,d_2,d_3). Auditor: sum(d_1,d_2,d_3) = 15
      • User: q_2 = max(d_1,d_2,d_3). Auditor: Denied (the answer would cause privacy loss)
      • User: Oh well… There must be a reason for the denial… q_2 is denied iff d_1 = d_2 = d_3 = 5. I win!
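
The inference on this slide can be checked by brute force. The sketch below is an illustration under loud assumptions, not the auditor from the cited work: the reals are replaced by a small integer domain (`DOMAIN`), "breach" means some entry takes a single value across all consistent datasets, and all function names are hypothetical.

```python
from itertools import product

DOMAIN = range(11)  # assumption: small integer domain standing in for the reals

def consistent(log, n=3):
    """All datasets in DOMAIN^n that agree with every (f, idx, answer) in log."""
    return [d for d in product(DOMAIN, repeat=n)
            if all(f(d[i] for i in idx) == a for f, idx, a in log)]

def breaches(log, n=3):
    """True if some entry has one possible value across all consistent datasets."""
    cands = consistent(log, n)
    return any(len({d[i] for d in cands}) == 1 for i in range(n))

def naive_auditor(data, log, f, idx):
    """Answer unless the TRUE answer would pin down an entry; None means denied.
    Denials are not recorded, mirroring 'denials ignored' in prior work."""
    a = f(data[i] for i in idx)
    if breaches(log + [(f, idx, a)], len(data)):
        return None
    log.append((f, idx, a))
    return a

def datasets_where_q2_denied():
    """Among datasets with sum 15, those for which the max query is denied."""
    denied = []
    for d in consistent([(sum, (0, 1, 2), 15)]):
        log = []
        naive_auditor(d, log, sum, (0, 1, 2))
        if naive_auditor(d, log, max, (0, 1, 2)) is None:
            denied.append(d)
    return denied

print(datasets_where_q2_denied())  # [(5, 5, 5)]
```

Only one dataset leads to a denial, so the denial itself tells the user the whole dataset, exactly as the slide argues.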

  11. Example 2: Interval Based Auditing
      • d_i ∈ [0,100], sum queries, accuracy parameter = 1 (PTIME)
      • User: q_1 = sum(d_1,d_2). Auditor: Sorry, denied
      • User: q_2 = sum(d_2,d_3). Auditor: sum(d_2,d_3) = 50
      • Denial ⇒ d_1,d_2 ∈ [0,1] or [99,100]
      • User concludes: d_1,d_2 ∈ [0,1], d_3 ∈ [49,50]
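
The interval reasoning on this slide can be traced with a short sketch. It assumes that a two-entry sum is denied exactly when the answer would confine an entry to an interval of width at most 1; the constant and function names are hypothetical, and whether the real threshold is strict is not stated on the slide.

```python
LO, HI, WIDTH = 0.0, 100.0, 1.0  # entry range and accuracy threshold (assumed)

def feasible_range(total):
    """Range of one entry when it and one other entry in [LO, HI] sum to total."""
    return max(LO, total - HI), min(HI, total)

def pair_sum_denied(total):
    """Assumed rule: deny when the answer would confine an entry to width <= WIDTH."""
    lo, hi = feasible_range(total)
    return hi - lo <= WIDTH

def attacker_view(q2_answer):
    """What a user learns from 'sum(d1,d2) denied' plus sum(d2,d3) = q2_answer."""
    # A denial leaves two cases for d2: close to LO or close to HI.
    cases = [(LO, LO + WIDTH), (HI - WIDTH, HI)]
    feasible = []
    for d2_lo, d2_hi in cases:
        d3_lo = max(q2_answer - d2_hi, LO)
        d3_hi = min(q2_answer - d2_lo, HI)
        if d3_lo <= d3_hi:  # drop cases that would force d3 outside [LO, HI]
            feasible.append(((d2_lo, d2_hi), (d3_lo, d3_hi)))
    return feasible

print(pair_sum_denied(0.7), pair_sum_denied(50.0))  # True False
print(attacker_view(50.0))  # [((0.0, 1.0), (49.0, 50.0))]
```

The second query is harmless on its own (each entry could be anywhere in [0,50]), but combined with the earlier denial it confines d_3 to an interval of width 1.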

  12. Sounds Familiar?
      • Colonel Oliver North, on the Iran-Contra Arms Deal: “On the advice of my counsel I respectfully and regretfully decline to answer the question based on my constitutional rights.”
      • David Duncan, former auditor for Enron and partner in Andersen: “Mr. Chairman, I would like to answer the committee's questions, but on the advice of my counsel I respectfully decline to answer the question based on the protection afforded me under the Constitution of the United States.”

  13. Max Auditing
      • d_i real (diagram: database entries d_1,…,d_n)
      • q_1 = max(d_1,d_2,d_3,d_4): answered with M_{1234}
      • q_2 = max(d_1,d_2,d_3): M_{123} / denied. If denied: d_4 = M_{1234}
      • q_3 = max(d_1,d_2): M_{12} / denied. If denied: d_3 = M_{123}

  14. Adversary’s Success
      • q_1 = max(d_1,d_2,d_3,d_4)
      • q_2 = max(d_1,d_2,d_3): denied with probability 1/4. If denied: d_4 = M_{1234}
      • q_3 = max(d_1,d_2): denied with probability 1/3. If denied: d_3 = M_{123}
      • Success probability: 1/4 + (1 − 1/4)·1/3 = 1/2
      • Recover 1/8 of the database!
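
The probabilities on this slide can be reproduced by exact enumeration. The sketch assumes what the figures 1/4 and 1/3 suggest but the slide does not state: the four entries are distinct and every ordering is equally likely. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def learns_an_entry(d):
    """True if the max-query attack on a block of four entries reveals an entry."""
    d1, d2, d3, d4 = d
    # Second query max(d1,d2,d3): denied exactly when d4 is the overall maximum.
    if max(d1, d2, d3) < max(d):
        return True
    # Third query max(d1,d2): denied exactly when d3 is the maximum of the first three.
    if max(d1, d2) < max(d1, d2, d3):
        return True
    return False

orders = list(permutations([1, 2, 3, 4]))  # all relative orders of distinct values
success = Fraction(sum(learns_an_entry(d) for d in orders), len(orders))

print(success)      # 1/2
print(success / 4)  # 1/8, expected fraction of entries recovered per block of four
```

Repeating the attack on disjoint blocks of four entries gives the slide's estimate of one eighth of the database.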

  15. Boolean Auditing?
      • d_i Boolean (diagram: database entries d_1,…,d_n)
      • q_1 = sum(d_1,d_2): 1 / denied
      • q_2 = sum(d_2,d_3): 1 / denied
      • …
      • q_i denied iff d_i = d_{i+1} ⇒ learn database/complement
      • Let d_i,d_j,d_k not all equal, where q_{i-1}, q_i, q_{j-1}, q_j, q_{k-1}, q_k all denied
      • q = sum(d_i,d_j,d_k): 1 / 2
      • Recover the entire database!
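
A minimal sketch of this attack follows. It makes simplifying assumptions that go beyond the slide: the pair auditor is modelled directly as "deny iff the two bits are equal", and the final three-entry sum is simply assumed to be answered (the slide's extra condition on which neighbouring queries were denied is not modelled). The function names are hypothetical.

```python
def pair_response(d, i):
    """Response to sum(d[i], d[i+1]): a sum of 0 or 2 would reveal both bits,
    so only a sum of 1 is answered. None stands for 'denied'."""
    s = d[i] + d[i + 1]
    return s if s == 1 else None

def attack(d):
    """Recover a Boolean database from pair-sum responses plus one triple sum."""
    n = len(d)
    # Phase 1: a denial means equal neighbours, an answer means different ones.
    guess = [0]  # arbitrary first bit, so the guess is right up to complement
    for i in range(n - 1):
        same = pair_response(d, i) is None
        guess.append(guess[-1] if same else 1 - guess[-1])
    # Phase 2: choose three positions whose guessed bits are not all equal.
    i = next((j for j in range(n - 1) if guess[j] != guess[j + 1]), None)
    if i is None or n < 3:
        return None  # constant or too-short database: not covered by this sketch
    k = next(j for j in range(n) if j not in (i, i + 1))
    triple = (i, i + 1, k)
    true_sum = sum(d[j] for j in triple)       # assumed to be answered: 1 or 2
    guessed_sum = sum(guess[j] for j in triple)
    # The complement would give 3 - guessed_sum, which never equals guessed_sum.
    return guess if true_sum == guessed_sum else [1 - b for b in guess]

secret = [1, 1, 0, 1, 0, 0, 0, 1]
print(attack(secret) == secret)  # True
```

The pattern of denials alone fixes the database up to complement; a single answered query then removes the remaining ambiguity.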

  16. Two Problems
      • Obvious problem: denied queries ignored
        • Algorithmic problem: not clear how to incorporate denials in the decision
      • Subtle problem:
        • Query denials leak (potentially sensitive) information
        • Users cannot decide denials by themselves
      • (Diagram: within the possible assignments to {d_1,…,d_n}, the assignments consistent with (q_1,…,q_i, a_1,…,a_i); q_{i+1} denied)

  17. A Spectrum of Auditors
      • Decision based on q_1,…,q_i, q_{i+1} and a_1,…,a_i: “Safe”
      • Decision based on q_1,…,q_i, q_{i+1} and a_1,…,a_i, a_{i+1}: “Unsafe”
      • Decision based on q_1,…,q_i, q_{i+1} only: “Safe”
      • Diagram labels: size overlap restriction, algebraic structure; > privacy, < utility
      • *Note: can work in the “unsafe” region, but need to prove denials do not leak crucial information

  18. Simulatable Auditing*
      • An auditor is simulatable if a simulator exists s.t. the simulator, given only q_1,…,q_i, a_1,…,a_i and q_{i+1} (with no access to the statistical database), reaches the same deny/answer decision as the auditor
      • Simulation ⇒ denials do not leak information
      • * `self auditors’ in [DN03]
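
One way to picture a simulatable decision rule is a brute-force sketch that never looks at the data. This is an illustration under assumptions, not the construction from the talk: the domain is a small integer range (`DOMAIN`), "breach" means some entry is uniquely determined, and the rule conservatively denies whenever any dataset consistent with the log would be breached by its answer. All names are hypothetical.

```python
from itertools import product

DOMAIN = range(11)  # assumption: small integer domain standing in for the reals

def consistent(log, n):
    """All datasets in DOMAIN^n that agree with every (f, idx, answer) in log."""
    return [d for d in product(DOMAIN, repeat=n)
            if all(f(d[i] for i in idx) == a for f, idx, a in log)]

def breaches(log, n):
    """True if some entry has one possible value across all consistent datasets."""
    cands = consistent(log, n)
    return any(len({d[i] for d in cands}) == 1 for i in range(n))

def simulatable_decision(log, f, idx, n):
    """Return True (answer) or False (deny) using only past queries and answers
    plus the new query. The true answer to the new query is never consulted,
    so any user could compute this decision unaided."""
    for d in consistent(log, n):
        hypothetical = f(d[i] for i in idx)
        if breaches(log + [(f, idx, hypothetical)], n):
            return False  # some consistent dataset would be compromised
    return True

log = [(sum, (0, 1, 2), 15)]  # the sum query from Example 1, already answered
print(simulatable_decision(log, max, (0, 1, 2), 3))  # False
```

Here the max query is denied whatever the actual data are, so the denial carries no information beyond what the log already implies. The price is utility: the query is refused even for datasets where answering would have been harmless.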

  19. Why Simulatable Auditors do not Leak Information?
      • (Diagram: within the possible assignments to {d_1,…,d_n}, the assignments consistent with (q_1,…,q_i, a_1,…,a_i); q_{i+1} denied/allowed)

  20. Summary
      • Improper usage of auditors may lead to privacy breaches, due to information leakage in the decision procedure
        • Cell suppression / some k-anonymity methods should be checked similarly
        • Should make sure offline auditors do not leak information in decision
      • Simulatable auditors provably don’t leak information
        • Give best utility while still “safe”
        • A launching point for further research on auditors
      • Further research:
        • Auditors with more reasonable privacy guarantees
