220 likes | 573 Views
Differential Privacy (2). Outline . Using differential privacy Database queries Data mining Non interactive case New developments. Definition. Mechanism: K(x) = f(x) + D, D is some noise. It is an output perturbation method. Sensitivity function.
E N D
Outline • Using differential privacy • Database queries • Data mining • Non interactive case • New developments
Definition Mechanism: K(x) = f(x) + D, D is some noise. It is an output perturbation method.
Sensitivity function • Captures how great a difference must be hidden by the additive noise How to design the noise D? It is actually linked back to the function f(x)
Adding LAP noise Why does this work?
Proof sketch Let K(x) = f(x) + D =r. Thus, r-f(x) has Lap distribution with the scale df/e. Similarly, K(x’) = f(x’)+D=r, and r-f(x’) has the same distribution P(K(x) = r) = exp(-|f(x)-r|(e/df)) P(K(x’)= r) = exp(-|f(x’)-r|(e/df)) P(K(x)=r)/P(K(x’)=r) = exp( (|f(x’)-r|-|f(x)-r|)(e/df)) apply triangle inequality <= exp( |f(x’)-f(x)|(e/df)) = exp(e)
Composition • Sequential composition • Parallel composition --for disjoint sets, the ultimate privacy guarantee depends only on the worst of the guarantees of each analysis, not the sum.
Database queries (PINQ) • Basic aggregate operations • Noisy count • Noisy sum • Noisy average • composition rule • Stable transformation |T(A) - T(B)| <= c|A-B|, and M provides e-diff privacy => Composite computation M(T(x)) is ce-diff privacy
Data mining with differential privacy (paper) • Decision tree • Basic operation: scan through the domain to find the split that maximizes some classification measure • Basic idea of the diff-privacy version • Users interact with the data server to find out required information • These operations can be transformed to counting operations -- apply NoisyCount • Sensitivity of the function is determined by the classification measure
Privacy budget e • User specified total budget e • Composite operations need a specific e’ for each operation
Non interactive differential privacy • Noisy histogram release
New settings • Against an adversary who has access to the algorithm’s internal state • Differential privacy under continual observation