This work-in-progress talk outline discusses private approximation protocols, with a focus on privately approximating the L2 distance. The goal is sublinear communication without giving up security.
Efficient Private Approximation Protocols
Piotr Indyk, David Woodruff
Work in progress
Outline
• Private approximation of L2 distance
• Private near neighbor
• Private approximate near neighbor
Secure communication
Alice has a ∈ {0,1}^n; Bob has b ∈ {0,1}^n
• Want to compute some function F(a,b)
• Security: the protocol does not reveal anything except the value F(a,b)
  - Semi-honest: both parties follow the protocol
  - Malicious: parties are adversarial
• Efficiency: want to exchange few bits
Secure Function Evaluation (SFE)
• [Yao, GMW]: If F is computed by a circuit C, then F can be computed securely with Õ(|C|) bits of communication
• [GMW] + … + [NN]: can assume the parties are semi-honest
  - A semi-honest protocol can be compiled to give security against malicious parties
• Problem: circuit size is at least linear in n
* Õ(·) hides factors poly(k, log n)
Secure and Efficient Function Evaluation
• Can we achieve sublinear communication?
• Ideally: secure computation with communication comparable to the insecure case
• With sublinear communication, many interesting problems can be solved only approximately
• What does it mean to have a private approximation?
Private Approximation
• [FIMNSW’01]: A protocol computing an approximation G(a,b) of F(a,b) is private if each party can simulate its view of the protocol given the exact value F(a,b)
• Note: it is not sufficient to evaluate a non-private G(a,b) using SFE
• Example (F = Hamming distance Δ):
  - Define G(a,b) bit by bit, where bin(·)_i is the i-th bit of the binary representation:
    - bin(G(a,b))_i = bin(Δ(a,b))_i if i > 0
    - bin(G(a,b))_0 = a_0
  - G(a,b) is a ±1-approximation of Δ(a,b), but it is not private: its lowest bit reveals a_0
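The example above can be made concrete with a minimal Python sketch. The function names `hamming` and `leaky_approx` are illustrative and not from the slides; the code only demonstrates that the output of G is accurate to within 1 yet exposes Alice's bit a_0.

```python
def hamming(a, b):
    """Exact Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def leaky_approx(a, b):
    """G(a, b): copy the Hamming distance but overwrite its lowest bit with a[0]."""
    return (hamming(a, b) & ~1) | a[0]

a, b = [1, 0, 1, 1], [0, 0, 1, 0]
g = leaky_approx(a, b)
assert abs(g - hamming(a, b)) <= 1   # accuracy: additive error at most 1
assert g & 1 == a[0]                 # leak: the output's lowest bit equals a[0]
```

Even if G were evaluated inside a perfectly secure SFE, Bob would still learn a_0 from the output alone, which is why privacy has to be defined relative to the exact value F(a,b).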
Concrete Pitfall: Dimension Reduction
• A basic problem: Hamming distance Δ(a,b)
• Approximate decision version: with prob. 1-δ,
  - If Δ(a,b) ≤ r, answer NO
  - If Δ(a,b) ≥ r(1+ε), answer YES
• [Kushilevitz-Ostrovsky-Rabani’98]:
  - Create an m × n binary matrix D, where Pr[D_ij = 1] = 1/(2r), for m = Õ(log(1/δ)/ε^2)
  - Exchange Da, Db (mod 2)
  - Answer YES if wt[D(a-b)] > r', where r' is a function of r and ε
NOTE: This protocol was not designed to be private
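A minimal Python sketch of the sketch-and-threshold test follows. It assumes the rows of D have independent entries, in which case a single row satisfies Pr[⟨d, x⟩ = 1 mod 2] = (1 - (1 - 1/r)^wt(x))/2. The slides do not say how r' is chosen; the midpoint rule in `kor_decide` is a hypothetical choice made only for illustration, as are all function names.

```python
import random

def row_parity_prob(w, r):
    """Pr[<d, x> = 1 mod 2] for one row d with Pr[d_j = 1] = 1/(2r) and wt(x) = w."""
    return (1 - (1 - 1 / r) ** w) / 2

def kor_sketch(bits, D):
    """Compute D * bits over GF(2); D is a list of 0/1 rows."""
    return [sum(d & x for d, x in zip(row, bits)) % 2 for row in D]

def kor_decide(a, b, r, eps, m, rng):
    """Return True (YES, far) or False (NO, close) using only the two sketches."""
    n = len(a)
    D = [[int(rng.random() < 1 / (2 * r)) for _ in range(n)] for _ in range(m)]
    sa, sb = kor_sketch(a, D), kor_sketch(b, D)
    weight = sum(u ^ v for u, v in zip(sa, sb))   # wt[D(a - b)] over GF(2)
    # Hypothetical r': midpoint of the expected sketch weights at distance r and r(1+eps).
    r_prime = m * (row_parity_prob(r, r) + row_parity_prob(r * (1 + eps), r)) / 2
    return weight > r_prime
```

Because the sketch weight concentrates around m times `row_parity_prob(wt(x), r)`, which is increasing in wt(x), a single threshold separates the two cases once m is large enough.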
Non-Privacy of KOR
• Let x = a - b. If wt(x) = r and r log n ≈ m, then one can recover x from D and Dx in O(mn) time!
• Algorithm: for j = 1…n, estimate Pr[⟨d_i, x⟩ = 1 | d_ij = 1] = Pr[⟨d_i, x⟩ = 1 ∧ d_ij = 1] / Pr[d_ij = 1], where d_i is the i-th row of D
  - If x_j = 1 then Pr[⟨d_i, x⟩ = 1 | d_ij = 1] is high
  - If x_j = 0 then Pr[⟨d_i, x⟩ = 1 | d_ij = 1] is low
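The recovery attack can be simulated with a short Python sketch. It assumes independent entries with Pr[d_ij = 1] = 1/(2r); under that assumption the conditional probability is (1 + (1 - 1/r)^(r-1))/2 > 1/2 when x_j = 1 and (1 - (1 - 1/r)^r)/2 < 1/2 when x_j = 0, so thresholding the empirical frequency at 1/2 separates the two cases. The sizes n, r, m below are illustrative and deliberately generous; they are not taken from the slides.

```python
import random

def recover_difference(D, y):
    """Guess x from D and y = Dx (mod 2) by thresholding Pr[y_i = 1 | D[i][j] = 1]."""
    n = len(D[0])
    x_hat = []
    for j in range(n):
        hits = [y[i] for i in range(len(D)) if D[i][j] == 1]
        # Empirical frequency of y_i = 1 among the rows that include coordinate j.
        freq = sum(hits) / len(hits) if hits else 0.0
        x_hat.append(1 if freq > 0.5 else 0)
    return x_hat

rng = random.Random(1)
n, r, m = 32, 4, 4000            # illustrative sizes, not from the slides
x = [1] * r + [0] * (n - r)      # wt(x) = r
rng.shuffle(x)
D = [[int(rng.random() < 1 / (2 * r)) for _ in range(n)] for _ in range(m)]
y = [sum(d & v for d, v in zip(row, x)) % 2 for row in D]
print(recover_difference(D, y) == x)
```

The point of the simulation is that Bob, who sees D and Da and knows b, learns far more than the one-bit answer the decision problem calls for.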
Approximating Hamming Distance
• [FIMNSW’01]: A private protocol with complexity Õ(n^{1/2}/ε)
  - wt(x) small: compute wt(x) using Õ(wt(x)) bits
  - wt(x) high: sample Õ(n/wt(x)) coordinates x_i, estimate wt(x)
• Our result:
  - Complexity: Õ(1/ε^2) bits
  - Works even for the L2 norm, i.e., estimates ||x||_2 for a, b ∈ {1…M}^n
* Õ(·) hides factors poly(k, log n, log M, log(1/δ))
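The high-weight case rests on plain coordinate sampling, which the following Python sketch illustrates in the clear. It is not the private protocol of [FIMNSW’01]; it only shows the estimator, and the name `estimate_weight` is an assumption made for this example.

```python
import random

def estimate_weight(x, samples, rng):
    """Estimate wt(x) by sampling coordinates uniformly with replacement."""
    n = len(x)
    hits = sum(x[rng.randrange(n)] for _ in range(samples))
    return n * hits / samples
```

Each sampled coordinate is 1 with probability wt(x)/n, so on the order of n/wt(x) samples are needed before any hits are seen at all; this is why the sample count on the slide scales as n/wt(x).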
Crypto Tools
• SFE of circuits [Yao’86]: Õ(|circuit|) communication
• Efficient SPIR or OT^n_1:
  - Alice has A[1], …, A[n] ∈ {0,1}^m; Bob has i ∈ [n]
  - Goal: Bob privately learns A[i] and nothing else
  - Can be done using Õ(m) communication [CMS99, NP99]
• Circuits with ROM [Naor, Nissim’01]:
  - Standard AND/OR/NOT gates
  - Lookup gates: in: i; out: M_gate[i]
  - Communication at most Õ(m|C|)
• Takes care of the security of computation:
  - begin secure … end secure
  - Can just focus on privacy of the output
High-dimensional tools
• Random projection:
  - Take a random orthonormal n × n matrix D, that is, ||Dx|| = ||x|| for all x
  - There exists c > 0 s.t. for any x ∈ R^n and i = 1…n: Pr[(Dx)_i^2 > k·||Dx||^2/n] < e^{-ck}
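One standard way to draw such a matrix is to orthonormalize Gaussian vectors with Gram-Schmidt. The pure-Python sketch below does this for small n and checks that norms are preserved; the function names are illustrative, and nothing here addresses how Alice and Bob would agree on D inside the protocol.

```python
import math
import random

def random_orthonormal(n, rng):
    """Rows form an orthonormal basis, built by Gram-Schmidt on Gaussian vectors."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:                       # remove components along earlier rows
            dot = sum(a * b for a, b in zip(u, v))
            v = [b - dot * a for a, b in zip(u, v)]
        norm = math.sqrt(sum(b * b for b in v))
        if norm > 1e-9:                      # skip an (unlikely) degenerate draw
            rows.append([b / norm for b in v])
    return rows

def rotate(D, x):
    """Compute the matrix-vector product Dx."""
    return [sum(d * xi for d, xi in zip(row, x)) for row in D]

def sq_norm(v):
    return sum(t * t for t in v)
```

After the rotation the squared mass ||x||^2 is spread over the n coordinates, so with high probability no single (Dx)_i^2 exceeds k times the average ||Dx||^2/n, which is exactly the property the sampling step relies on.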
Approximating ||a-b||_2
• Recall:
  - Alice has a ∈ [M]^d, Bob has b ∈ [M]^d
  - Goal: estimate ||x||_2, where x = a - b
Algorithm
• Alice and Bob create a random orthonormal matrix D such that, for each i = 1…n, (Dx)_i^2 < k·||x||^2/n
• T = M^2·n + 1
• Repeat
  - {Assertion: ||x||^2 ≤ T}
  - Invoke PRIVATESAMPLE to get L = Õ(1/ε^2) independent bits z_i such that Pr[z_i = 1] = ||Dx||^2/(Tk)
  - T = T/2
• Until Σ_i z_i ≥ L/(4k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||x||^2
Correctness:
• Unbiased estimator
• High probability from Chernoff bound
SECURE!
PRIVATESAMPLE
Generate independent bits z_i with E[z_i] = ||Dx||^2/(Tk)
• P = Tk/n
• Pick random t ∈ [n]
• Retrieve (Da)_t, (Db)_t
• Compute (Dx)_t = (Da)_t - (Db)_t
• Define v = [(Dx)_t]^2
• If v ≤ P then generate z s.t. Pr[z = 1] = v/P; else output fail
• Output z
Correct as long as (Dx)_i^2 < Tk/n for each i = 1…n
SECURE!
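The expectation claimed for PRIVATESAMPLE can be checked with a small Python sketch that runs the steps in the clear, with a vector y standing in for Dx. Everything secure about the real procedure (the retrieval of (Da)_t and (Db)_t and the coin flip happen inside a secure computation) is omitted here, and the function names are assumptions of this example.

```python
from fractions import Fraction
import random

def sample_bit(y, T, k, rng):
    """One PRIVATESAMPLE draw in the clear; y plays the role of Dx. None means fail."""
    n = len(y)
    P = Fraction(T * k, n)
    t = rng.randrange(n)          # random coordinate t in [n]
    v = y[t] ** 2                 # v = (Dx)_t^2
    if v > P:
        return None               # fail: the precondition (Dx)_t^2 <= Tk/n is violated
    return int(rng.random() < float(v / P))

def expected_bit(y, T, k):
    """Exact E[z] = sum_t (1/n) * y_t^2 / P, valid when no coordinate fails."""
    n = len(y)
    P = Fraction(T * k, n)
    return sum(Fraction(1, n) * Fraction(v) ** 2 / P for v in y)
```

Since P = Tk/n, the sum collapses to ||y||^2/(Tk): for example, y = [1, 2, 0, 1] with T = 8 and k = 2 gives ||y||^2 = 6 and Tk = 16.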
Algorithm, again
• Alice and Bob create a random* orthonormal** matrix D such that, for each i = 1…n, (Dx)_i^2 < k·||x||^2/n
• T = M^2·n + 1
• Repeat
  - {Assertion: ||x||^2 ≤ T}
  - Invoke PRIVATESAMPLE to get L = Õ(1/ε^2) independent bits z_i such that Pr[z_i = 1] = ||Dx||^2/(Tk)
    {Works as long as (Dx)_i^2 < Tk/n for each i = 1…n}
  - T = T/2
• Until Σ_i z_i ≥ L/(4k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||x||^2
If the Assertion is not true, then Pr[z_i = 1] > 1/(2k), so E[Σ_i z_i] > L/(2k) >> L/(4k)
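The halving loop can be simulated in Python by drawing the bits directly as Bernoulli(||x||^2/(Tk)), much as a simulator would. This is a sketch under stated assumptions: the input `sq_norm` must be positive, the values of k and L are illustrative, and the clamp to 1.0 is a safety guard that is not on the slides.

```python
import random

def estimate_sq_norm(sq_norm, M, n, k, L, rng):
    """Run the halving loop with bits drawn directly as Bernoulli(sq_norm / (T k))."""
    T = M * M * n + 1                       # initial T exceeds any possible ||x||^2
    while True:
        p = min(1.0, sq_norm / (T * k))     # Pr[z_i = 1] for the current T
        ones = sum(rng.random() < p for _ in range(L))
        T_used = T
        T = T / 2
        if ones >= L / (4 * k):
            break
    # The slides output (sum z_i / L) * 2Tk with T already halved, i.e. T_used * k.
    return ones / L * T_used * k
```

The loop stops once roughly a 1/(4k) fraction of the bits are ones, which happens when T has dropped to within a constant factor of ||x||^2; at that point the count is large enough for a Chernoff bound to give relative error ε with L = Õ(1/ε^2) bits.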
Simulation
SIMULATION
• Repeat
  - Choose L independent bits z_i such that Pr[z_i = 1] = ||x||^2/(Tk)
  - T = T/2
• Until Σ_i z_i ≥ L/(4k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||x||^2
ALGORITHM
• Repeat
  - {Assertion: ||x||^2 ≤ T}
  - Invoke PRIVATESAMPLE to get L independent bits z_i such that Pr[z_i = 1] = ||Dx||^2/(Tk)
  - T = T/2
• Until Σ_i z_i ≥ L/(4k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||x||^2
Recall:
• ||Dx|| = ||x||, so the two bit distributions are identical
Communication: Õ(1/ε^2)
Private Near Neighbor
Alice has P = p_1, p_2, …, p_n ∈ {1, 2, …, U}^d = [U]^d; Bob has q ∈ [U]^d
• Distance function: f(x,y)
• Correctness: Bob learns min_i f(q, p_i)
• Privacy: Alice learns nothing, Bob learns nothing else
• Goal: minimize communication
Private Near Neighbor
• n points, dimension d, universe [U]
• [DA] needs a 3rd party, we don’t
• Approach: homomorphic encryption + secure function evaluation (SFE)
“Coordinate-wise” distance functions
Alice has P = p_1, p_2, …, p_n ∈ [U]^d; Bob has q ∈ [U]^d
“Coordinate-wise” distance functions: f(a,b) = Σ_i f_i(a_i, b_i)
Bob:
  1. For each coordinate j, create a degree-(U-1) polynomial g_j(x) = Σ_i a_{i,j} x^i such that g_j(u) = f_j(q_j, u) for all u ∈ [U]
  2. Generate (SK, PK) for the Paillier encryption scheme. Send PK and E_PK(a_{i,j}) for all i, j
Alice:
  1. For all i, compute E(Σ_j g_j(p_{i,j})) = E(f(q, p_i))
SFE:
  Inputs: Alice: E(f(q, p_i)); Bob: SK
  1. Bob gets min_i D_SK(E(f(q, p_i)))
Homomorphic properties used: E(x), E(y) → E(x + y); E(x), c → E(cx)
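The steps above can be sketched end to end in Python with a toy Paillier instance. Everything in this block is an illustrative assumption rather than the authors' implementation: the primes are tiny and insecure, the universe is indexed 0…U-1 instead of 1…U, the distance is squared difference, and the final SFE step that computes the minimum is left out.

```python
import math
import random

# Toy Paillier parameters (tiny, insecure primes chosen only for illustration).
P_, Q_ = 1009, 1013
N = P_ * Q_
N2 = N * N
LAM = (P_ - 1) * (Q_ - 1) // math.gcd(P_ - 1, Q_ - 1)   # lcm(p-1, q-1)
MU = pow(LAM, -1, N)                                      # valid for generator g = N + 1

def enc(m, rng):
    r = rng.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = rng.randrange(1, N)
    return pow(N + 1, m % N, N2) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def interpolate(values):
    """Coefficients a_0..a_{U-1} mod N of the polynomial g with g(u) = values[u]."""
    U = len(values)
    coeffs = [0] * U
    for u, fu in enumerate(values):
        basis, denom = [1], 1                  # basis = prod_{v != u} (x - v)
        for v in range(U):
            if v == u:
                continue
            basis = [(lo - v * hi) % N for lo, hi in zip([0] + basis, basis + [0])]
            denom = denom * (u - v) % N
        scale = fu * pow(denom, -1, N) % N
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % N
    return coeffs

def eval_encrypted(enc_coeffs, p):
    """Alice: from E(a_i) compute E(sum_i a_i p^i) via E(x)E(y)=E(x+y), E(x)^c=E(cx)."""
    acc = 1                                    # 1 decrypts to 0
    for i, c in enumerate(enc_coeffs):
        acc = acc * pow(c, pow(p, i, N), N2) % N2
    return acc

def encrypted_distance(q, p, U, rng):
    """Bob encrypts the g_j; Alice evaluates them at p; the result decrypts to f(q, p)."""
    total = 1
    for qj, pj in zip(q, p):
        coeffs = interpolate([(qj - u) ** 2 for u in range(U)])   # f_j(q_j, u)
        total = total * eval_encrypted([enc(a, rng) for a in coeffs], pj) % N2
    return dec(total)
```

With q = [1, 4, 2], p = [3, 0, 2] and U = 5, the per-coordinate squared differences are 4, 16 and 0, so `encrypted_distance` should decrypt to 20. In the protocol on the slide Bob never decrypts individual distances; decryption happens only inside the SFE that outputs the minimum.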
Generic distance functions
• Security:
  1. Replace the SFE with an oracle
  2. Alice's view is indistinguishable from PK, E(0), E(0), …, E(0), since E is semantically secure
  3. Bob's view is just the output
• Efficiency:
  1. Sending the polynomials: Õ(dU)
  2. SFE: Õ(n) (simple circuit)
Private Near Neighbor
• n points, dimension d, universe [U]
• Alice has x_1, …, x_n ∈ {0,1}^d; Bob has y_1, …, y_n ∈ {0,1}^d; threshold t
• Bob gets all x_i s.t. Δ(x_i, y_j) < t for some j
• Communication: Õ(n^2 + nd^2) (homomorphic tricks). Resolves an open question of [FNP04]:
  - [FNP04] achieve Õ((d choose t)·nt), which may be superpolynomial in n
Private Near Neighbor
• Drawback: the protocols depend linearly on the number of points n
• Necessary? Not if an algebraically homomorphic E exists
• Our approach: solve the approximate problem
Private c-Approximate Near Neighbor
Alice has P = {p_1, …, p_n} ⊆ {0,1}^d; Bob has q ∈ {0,1}^d
Notation: P_r = P ∩ B(q, r)
Correctness: P_r nonempty ⇒ Bob learns some element of P_cr
Privacy: Bob’s view is simulatable given q and P_cr
[Figure: the set P_r drawn inside P_cr]
Private Approximate Near Neighbor
• Definition remarks:
  - Privacy: we don’t care what Bob gets as long as it follows from P_cr; the simulator gets P_cr
  - Correctness: nothing is specified if P_r is empty, but the view is still simulatable
• Our results:
  - Õ(n^{1/2} + d) communication
  - If Bob just wants some coordinate of an element of P_cr, this improves to Õ(n^{1/2} + polylog(d))
Private Approximate Near Neighbor
Two approaches:
  1. Dimensionality reduction in the Hamming cube [KOR98]
  2. Locality-sensitive hashing [IM98]
This talk: protocol using #1
Dimensionality Reduction
• [KOR]: Let A be a random m × d binary matrix, m = O(log d / ε^2)
• Then there is a separator r’ s.t. with probability 1 - 1/n^2, for any p, q ∈ {0,1}^d:
  1. Δ(p,q) > cr ⇒ Δ(Ap, Aq) > r’
  2. Δ(p,q) ≤ r ⇒ Δ(Ap, Aq) < r’
Idea: Alice
  1. Applies A to P, so the dimension becomes small
  2. Enumerates all w ∈ {0,1}^m and forms the array B[w] = {p ∈ P s.t. Δ(Ap, w) < r’}
  3. Uses oblivious ROM
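Alice's bucket array can be sketched in Python as follows. The matrix A and the separator r' are taken as given inputs, since the slide does not specify how they are drawn, and the function names are illustrative. The lookup is done in the clear here, whereas the protocol performs it through the oblivious ROM.

```python
from itertools import product

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def project(A, p):
    """Compute A p over GF(2) for an m x d binary matrix A given as a list of rows."""
    return tuple(sum(a & x for a, x in zip(row, p)) % 2 for row in A)

def build_buckets(A, points, r_prime):
    """B[w] = all p in P whose sketch A p is at Hamming distance < r' from w."""
    m = len(A)
    sketches = [(p, project(A, p)) for p in points]
    return {w: [p for p, s in sketches if hamming(s, w) < r_prime]
            for w in product((0, 1), repeat=m)}
```

Bob's query then reduces to reading the single entry `B[project(A, q)]`. The table has 2^m entries, which stays polynomial in d because m is only logarithmic.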
Dimensionality reduction protocol
Protocol:
  1. Randomly sample Õ(n^{1/2}) points P1
     - If |P_cr| > n^{1/2}, then P1 ∩ P_cr ≠ ∅ w.h.p.
  2. Agree on k matrices A_1, …, A_k
  3. Create array B_i based on A_i
  4. B_i[p] contains any n^{1/2} points p’ ∈ P s.t. Δ(A_i p’, p) < r’
  5. Alice sets the ROM to be the B_i's
  6. If P1 ∩ P_cr ≠ ∅, SFE outputs a random element of P1 ∩ P_cr. Otherwise, SFE uses ∪_i B_i[A_i q] to output a random element of P_r
Dimensionality Reduction Analysis
Properties:
  1. If |P_cr| > n^{1/2}, we output a random element of P_cr, w.h.p.
  2. If |P_cr| < n^{1/2}, by properties of A, Pr_A[∀ p ∈ P_r, Δ(Ap, Aq) < r’ and ∀ p ∉ P_cr, Δ(Ap, Aq) > r’] > 1 - 1/n
  3. Since the bucket size is n^{1/2} and |P_cr| < n^{1/2}, every p ∈ P_r lies in some B_i[A_i q], i.e., P_r ⊆ ∪_i B_i[A_i q]
Correctness:
  If |P_cr| > n^{1/2}, output an element from P_cr
  Else output an element from P_r
Dimensionality Reduction Analysis
• Communication:
  1. Sampling Õ(n^{1/2}) elements to ensure |P_cr| < n^{1/2}
  2. OT on Õ(1) buckets of size n^{1/2}
  Steps 1 and 2 are balanced, giving Õ(d·n^{1/2}) total communication
• Simulatability:
  - The output is either a random element of P_cr or a random element of P_r
Dimensionality Reduction Analysis
• Dependence on d:
  1. Homomorphic encryption: Õ(d + n^{1/2})
     1. Bob sends E(q_1), …, E(q_d)
     2. Alice computes E(Δ(p_i, q))
     - Uses these for sampling and bucketing
  2. Reduce to Õ(polylog(d) + n^{1/2}) if Bob just wants a coordinate of a point in P_cr: use approximations
Conclusions
• Extensions: can achieve O(n^{1/3} + d) communication if the protocol is allowed to “leak” k bits of information
• Open problems:
  1. Polylogarithmic private approximation of other distances
  2. More efficient protocols for exact near neighbor. Tricks for PIR may be useful
  3. Polylogarithmic c-approximate NN protocol