Keyword Search and Oblivious Pseudo-Random Functions

Keyword Search and Oblivious Pseudo-Random Functions Mike Freedman NYU Yuval Ishai, Benny Pinkas, Omer Reingold

X1 X2 X3 X4 … Xn Background: Oblivious Transfer • Oblivious Transfer (OT) [R], 1-out-of-N [EGL]: • Input: • Server: x1,x2,…,xn • Client: 1 ≤ j ≤ n • Output: • Server: nothing • Client: xj • Privacy: • Server learns nothing about j • Client learns nothing about xi for i ≠ j • 4 • Well-studied, good solutions: O(n) overhead j Xj

Background: Private Information Retrieval (PIR) • Private Information Retrieval (PIR) [CGKS,KO] • Client hides which element retrieved • Client can learn more than a single xj • o(N) communication, O(N) computation • Symmetric Private Information Retrieval (SPIR) [GIKM,NP] • PIR in which client learns onlyxj • Hence, privacy for both client and server • “OT with sublinear communication”

Motivation: Sometimes, OT is not enough • Bob (“Application Service Provider”) • Advises merchants on credit card fraud • Keeps list of fraudulent card numbers • Alice (“Merchant”) • Received a credit card, wants to check if fraudulent • Wants to hide credit-card details from Bob, vice-versa • Use OT? • Table of 1016 ≈ 253 entries, 1 if fraudulent, 0 otherwise?

(x1,p1) (x2,p2) … (xn,pn ) Server: Client: w Client output: (xj ,pj ) iff w=xj Keyword Search (KS): definition • Input: • Server: database X={ (xi,pi ) } , 1 ≤ i ≤ N • xiis a keyword(e.g. number of a corrupt card) • pi is the payload(e.g. why card is corrupt) • Client:search wordw(e.g. credit card number) • Output: • Server: nothing • Client: • piif  i : xi = w • otherwise nothing

Keyword Search from data structures? [KO,CGN] • Take any efficient query-able data structure • Hash table, search tree, trie, etc. • Replace direct query with OT / PIR • Achieves client privacy We’re done?

… Keyword Search from hashing + OT [KO] • Use hash function H tomap (xi,pi) to bin H(xi) • Client uses OT to read bin H(w) • Multiple per bin: no server privacy: client gets > 1 elt • One per bin, N bins: no server privacy: H leaks info • One per bin, >> N bins: not efficient

Keyword Search • Variants • Multiple queries • Adaptive queries • Allowing setup • Malicious parties • Prior Work • OT + Hashing = KS without server privacy [KO] • Add server privacy using trie and many rounds [CGN] • Adaptive KS [OK] • But, setup with linear communication, RO model, one-more-RSA-inversion assumption

Keyword Search: Results • Specific protocols for KS • One-time KS based on OPE (homomorphic encryption) • First 1-round KS with sublinear communication • Adaptive KS by generic reduction • Semi-private KS + oblivious PRFs • New notions and constructions of OPRFs • Fully-adaptive (DDH- or factoring-specific) • T-time adaptive (black-box use of OT)

Keyword Search based on Oblivious Evaluation of Polynomials

C S w P(x) = Σaixi P(w) Specific KS protocols using polynomials • Tool: Oblivious Polynomial Evaluation (OPE) [NP] • Privacy: Server: nothing about w. Client: nothing but P(w)

1-round KS protocol using polynomials • OPE implementation based on homomorphic encryption • Given E(x), E(y), can compute E(x+y), E(c·x), w/o secret key • Server defines on input X={(xi,pi )}, • Z(x) = r · P(x) + Q(x) , with fresh random r xi If xi  X : 0 + pi |0k If xi X : rand • Client/server run OPE of Z(w), overhead O(N) • C sends: E(w), E(w2), …, E(wd), PK • S returns: E(r·Σpi w i + Σqi w i ) = E(r·P(w) + Q(w)) = E(Z(w))

Reducing the overhead using hashing… • Client sends input for L OPE’s of degree m • Server has E(Z1(w)), … ,E(ZL(w)) • Client uses PIR to obtain OPE output from bin H(w) • Comm: O(m = log N) + PIR overhead (polylog N) • Comp: O(N) server, O(m = log N) client publichash function Hm independentof X … L bins Z2 Z1 Fresh random r for Zj(x) = r · Pj(x) + Qj(x) m

What about malicious parties? • Efficient 1 round protocol for non-adaptive KS • Only consider privacy: server need not commit or know DB • Similar relaxation used before in like contexts (PIR, OT) • Privacy against a malicious server? • Server only sees client’s interaction in an OT / PIR protocol • Malicious clients? • Message in OPE might not correspond to polynomial values • Can enforce correct behavior with about same overhead • 1 OPE of degree-m polynomials → m OPEs of linear poly’s

Keyword Search based on Oblivious Evaluation of Pseudo-Random Functions

Semi-Private Keyword Search • Goal: Obtain KS from semi-private KS + OPRF • Semi-Private Keyword Search (PERKY [CGN]) • Provides privacy for client but not for server • Similar privacy to that of PIR • Examples • Send database to client: O(N) communication • Hash-based solutions + PIR to obtain bin • Use any fancy data structure + PIR to query

C S x k Fk(x) Oblivious Evaluation of Pseudo-Random Functions • Pseudo-Random Function: Fk : {0,1}n {0,1}n • Keyed by k (chooses a specific instantiation of F ) • Without k, the output of Fk cannot be distinguished from that of a random function • Oblivious evaluation of a PRF (OPRF) • Client: PRF output, nothing about k • Server: Nothing

key, payload masked by PRF KS from Semi-Private KS + OPRF  (xi, pi )  X, Let x’i | p’i ← Fk(xi ) Let( xi, pi )← (x’i , pi  p’i) S chooses k, defining Fk(·) • Client • Uses OPRF to compute x’ | p’ ← Fk(w) • Uses semi-private KS to obtain ( xi, pi )where xi = x’ • If entry in database, recovers pi = pi p’

key, payload masked by PRF KS from Semi-Private KS + OPRF  (xi, pi )  X, Let x’i | p’i ← Fk(xi ) Let( xi, pi )← (x’i , pi  p’i) S chooses k, defining Fk(·) • Security • Preserved even if client obtains all pseudo-database… • Requires that client can’t determine output of OPRF other than at inputs from legitimate queries

Weaker OPRF definition suffices for KS • Strong OPRF: Secure 2PC of PRF functionality • No info leaked about key k for arbitrary fk, other than what follows from legitimate queries • Same OPRF on multiple inputs w/o losing server privacy • Relaxed OPRF: No info about outputs of random fk, other than what follows from legitimate queries • Does not preclude learning partial info about k • Query set size bounded by t for t-time OPRFs • Indistinguishability: Outputs on unqueried inputs cannot be distinguished from outputs of random function

Other results: constructions of OPRF • OPRF based on non-black-box OT [Y,GMW] • OPRF based on specific assumptions [NP] • E.g., based on DDH or factoring • Fully adaptive • Quite efficient • OPRF based on black-box OT • Uses relaxed definition of OPRF • Good for up to t adaptive queries

OPRF based on DDH [“scaled up” NP] • The Naor-Reingold PRF: • Key k = [ a1, …, aL ] • Input x = x1 x2 x3 … xL • Pseudorandom based on DDH • OPRF based on PRF + OT • Server: [ a1, …, aL ], [ r1, …, rL ] • Client: x = x1 x2 x3 … xL • L OT’s: ri if xi = 0, ai ri otherwise r1 a1r1 OT12 x1=0 r2 a2r2 OT12 x2=1 rL aLrL OT12 xL=1 g^(1 / r1r2…rL)

Relaxed OPRF based on OT a0,2t • Server key: L x 2t matrix • Client input: x = { x1, x2, … , xL} • Client gets L keys using OT12t • After t calls, learns t Lkeys a0,0 aL,0 • Map inputs to locations in L-dimensions using a (t+2)-wise independent, secret mapping h • Client first obliviously computes h(x), then F(h(x)) • Learns t of 2t keys in L dimensions • Probability that other value uses these keys is (1/2)L

Conclusions • Keyword search is an important tool • We show: • Efficient constructions based on OPE • Generic reduction to OPRF + semi-private KS • Fully-adaptive based on DDH • Black-box reduction via OT, yet only good for t invoc’s • Open problem: • Black-box reduction to OT good for poly invoc’s?

Thanks….

Keyword Search and Oblivious Pseudo-Random Functions

Keyword Search and Oblivious Pseudo-Random Functions

Presentation Transcript

“Trick” for Keyword Search

Pseudo Random and Random Numbers

Pseudo-Random-Number-Generators Security Perspective

Pseudo-random Number Generation

Random Search Methods

15 bit Pseudo Random Bit Generator

Random/Exhaustive Search

Linux Pseudo random Number Generator (LPRNG)

Pseudo-Random Number Generation

Weighted Random Oblivious Routing on Torus Networks

Functions of Random Variables

PKCS #14: Pseudo-Random Number Generation

CHAPTER 16: KEYWORD SEARCH

Pseudo Random Numbers

HyKSS: Hybrid Keyword and Semantic Search

Functions of Random Variables

Oblivious Search Trees

Pseudo-Random Noise Radar

XML Keyword Search Refinement

Functions of Random Variables

Keyword Search and Keyword Selection

Functions of Random Variables