310 likes | 571 Views
Keyword Search and Oblivious Pseudo-Random Functions. Mike Freedman NYU Yuval Ishai, Benny Pinkas, Omer Reingold. X 1. X 2. X 3. X 4. …. X n. Background: Oblivious Transfer. Oblivious Transfer (OT) [R], 1-out-of-N [EGL]: Input: Server: x 1 ,x 2 ,…,x n Client: 1 ≤ j ≤ n Output:
E N D
Keyword Search and Oblivious Pseudo-Random Functions Mike Freedman NYU Yuval Ishai, Benny Pinkas, Omer Reingold
X1 X2 X3 X4 … Xn Background: Oblivious Transfer • Oblivious Transfer (OT) [R], 1-out-of-N [EGL]: • Input: • Server: x1,x2,…,xn • Client: 1 ≤ j ≤ n • Output: • Server: nothing • Client: xj • Privacy: • Server learns nothing about j • Client learns nothing about xi for i ≠ j • 4 • Well-studied, good solutions: O(n) overhead j Xj
Background: Private Information Retrieval (PIR) • Private Information Retrieval (PIR) [CGKS,KO] • Client hides which element retrieved • Client can learn more than a single xj • o(N) communication, O(N) computation • Symmetric Private Information Retrieval (SPIR) [GIKM,NP] • PIR in which client learns onlyxj • Hence, privacy for both client and server • “OT with sublinear communication”
Motivation: Sometimes, OT is not enough • Bob (“Application Service Provider”) • Advises merchants on credit card fraud • Keeps list of fraudulent card numbers • Alice (“Merchant”) • Received a credit card, wants to check if fraudulent • Wants to hide credit-card details from Bob, vice-versa • Use OT? • Table of 1016 ≈ 253 entries, 1 if fraudulent, 0 otherwise?
(x1,p1) (x2,p2) … (xn,pn ) Server: Client: w Client output: (xj ,pj ) iff w=xj Keyword Search (KS): definition • Input: • Server: database X={ (xi,pi ) } , 1 ≤ i ≤ N • xiis a keyword(e.g. number of a corrupt card) • pi is the payload(e.g. why card is corrupt) • Client:search wordw(e.g. credit card number) • Output: • Server: nothing • Client: • piif i : xi = w • otherwise nothing
Keyword Search from data structures? [KO,CGN] • Take any efficient query-able data structure • Hash table, search tree, trie, etc. • Replace direct query with OT / PIR • Achieves client privacy We’re done?
… Keyword Search from hashing + OT [KO] • Use hash function H tomap (xi,pi) to bin H(xi) • Client uses OT to read bin H(w) • Multiple per bin: no server privacy: client gets > 1 elt • One per bin, N bins: no server privacy: H leaks info • One per bin, >> N bins: not efficient
Keyword Search • Variants • Multiple queries • Adaptive queries • Allowing setup • Malicious parties • Prior Work • OT + Hashing = KS without server privacy [KO] • Add server privacy using trie and many rounds [CGN] • Adaptive KS [OK] • But, setup with linear communication, RO model, one-more-RSA-inversion assumption
Keyword Search: Results • Specific protocols for KS • One-time KS based on OPE (homomorphic encryption) • First 1-round KS with sublinear communication • Adaptive KS by generic reduction • Semi-private KS + oblivious PRFs • New notions and constructions of OPRFs • Fully-adaptive (DDH- or factoring-specific) • T-time adaptive (black-box use of OT)
C S w P(x) = Σaixi P(w) Specific KS protocols using polynomials • Tool: Oblivious Polynomial Evaluation (OPE) [NP] • Privacy: Server: nothing about w. Client: nothing but P(w)
1-round KS protocol using polynomials • OPE implementation based on homomorphic encryption • Given E(x), E(y), can compute E(x+y), E(c·x), w/o secret key • Server defines on input X={(xi,pi )}, • Z(x) = r · P(x) + Q(x) , with fresh random r xi If xi X : 0 + pi |0k If xi X : rand • Client/server run OPE of Z(w), overhead O(N) • C sends: E(w), E(w2), …, E(wd), PK • S returns: E(r·Σpi w i + Σqi w i ) = E(r·P(w) + Q(w)) = E(Z(w))
Reducing the overhead using hashing… • Client sends input for L OPE’s of degree m • Server has E(Z1(w)), … ,E(ZL(w)) • Client uses PIR to obtain OPE output from bin H(w) • Comm: O(m = log N) + PIR overhead (polylog N) • Comp: O(N) server, O(m = log N) client publichash function Hm independentof X … L bins Z2 Z1 Fresh random r for Zj(x) = r · Pj(x) + Qj(x) m
What about malicious parties? • Efficient 1 round protocol for non-adaptive KS • Only consider privacy: server need not commit or know DB • Similar relaxation used before in like contexts (PIR, OT) • Privacy against a malicious server? • Server only sees client’s interaction in an OT / PIR protocol • Malicious clients? • Message in OPE might not correspond to polynomial values • Can enforce correct behavior with about same overhead • 1 OPE of degree-m polynomials → m OPEs of linear poly’s
Keyword Search based on Oblivious Evaluation of Pseudo-Random Functions
Semi-Private Keyword Search • Goal: Obtain KS from semi-private KS + OPRF • Semi-Private Keyword Search (PERKY [CGN]) • Provides privacy for client but not for server • Similar privacy to that of PIR • Examples • Send database to client: O(N) communication • Hash-based solutions + PIR to obtain bin • Use any fancy data structure + PIR to query
C S x k Fk(x) Oblivious Evaluation of Pseudo-Random Functions • Pseudo-Random Function: Fk : {0,1}n {0,1}n • Keyed by k (chooses a specific instantiation of F ) • Without k, the output of Fk cannot be distinguished from that of a random function • Oblivious evaluation of a PRF (OPRF) • Client: PRF output, nothing about k • Server: Nothing
key, payload masked by PRF KS from Semi-Private KS + OPRF (xi, pi ) X, Let x’i | p’i ← Fk(xi ) Let( xi, pi )← (x’i , pi p’i) S chooses k, defining Fk(·) • Client • Uses OPRF to compute x’ | p’ ← Fk(w) • Uses semi-private KS to obtain ( xi, pi )where xi = x’ • If entry in database, recovers pi = pi p’
key, payload masked by PRF KS from Semi-Private KS + OPRF (xi, pi ) X, Let x’i | p’i ← Fk(xi ) Let( xi, pi )← (x’i , pi p’i) S chooses k, defining Fk(·) • Security • Preserved even if client obtains all pseudo-database… • Requires that client can’t determine output of OPRF other than at inputs from legitimate queries
Weaker OPRF definition suffices for KS • Strong OPRF: Secure 2PC of PRF functionality • No info leaked about key k for arbitrary fk, other than what follows from legitimate queries • Same OPRF on multiple inputs w/o losing server privacy • Relaxed OPRF: No info about outputs of random fk, other than what follows from legitimate queries • Does not preclude learning partial info about k • Query set size bounded by t for t-time OPRFs • Indistinguishability: Outputs on unqueried inputs cannot be distinguished from outputs of random function
Other results: constructions of OPRF • OPRF based on non-black-box OT [Y,GMW] • OPRF based on specific assumptions [NP] • E.g., based on DDH or factoring • Fully adaptive • Quite efficient • OPRF based on black-box OT • Uses relaxed definition of OPRF • Good for up to t adaptive queries
OPRF based on DDH [“scaled up” NP] • The Naor-Reingold PRF: • Key k = [ a1, …, aL ] • Input x = x1 x2 x3 … xL • Pseudorandom based on DDH • OPRF based on PRF + OT • Server: [ a1, …, aL ], [ r1, …, rL ] • Client: x = x1 x2 x3 … xL • L OT’s: ri if xi = 0, ai ri otherwise r1 a1r1 OT12 x1=0 r2 a2r2 OT12 x2=1 rL aLrL OT12 xL=1 g^(1 / r1r2…rL)
Relaxed OPRF based on OT a0,2t • Server key: L x 2t matrix • Client input: x = { x1, x2, … , xL} • Client gets L keys using OT12t • After t calls, learns t Lkeys a0,0 aL,0 • Map inputs to locations in L-dimensions using a (t+2)-wise independent, secret mapping h • Client first obliviously computes h(x), then F(h(x)) • Learns t of 2t keys in L dimensions • Probability that other value uses these keys is (1/2)L
Conclusions • Keyword search is an important tool • We show: • Efficient constructions based on OPE • Generic reduction to OPRF + semi-private KS • Fully-adaptive based on DDH • Black-box reduction via OT, yet only good for t invoc’s • Open problem: • Black-box reduction to OT good for poly invoc’s?