Private Information Retrieval. Yuval Ishai, Technion. Private Information Retrieval (PIR) [CGKS95]. Goal: allow user to query database while hiding the identity of the data-items she is after. Motivation: patent databases, web searches, etc.
Private Information Retrieval Yuval Ishai, Technion
Private Information Retrieval (PIR) [CGKS95] • Goal: allow a user to query a database while hiding the identity of the data items she is after. • Motivation: patent databases, web searches, etc. • Paradox(?): imagine buying in a store without the seller knowing what you buy. (Encrypting requests is useful against third parties, not against the owner of the data.)
Modeling • Database: n-bit string x (n should be thought of as being large). • User: wishes to retrieve x_i and keep i private.
[Figure: the User obtains x_i from the Server; the Server's view of the request is marked "???".]
Some “solutions”
1. User downloads entire database. Drawback: n communication bits (vs. log n + 1 w/o privacy). Main research goal: minimize communication complexity.
2. User masks i with additional random indices. Drawback: gives a lot of information about i.
3. Enable anonymous access to database. Note: addresses a different concern: hides identity of user, not the fact that x_i is retrieved.
Fact: PIR as described so far requires Ω(n) communication bits.
Two Approaches • Computational PIR [CG97,KO97,CMS99,...]: computational privacy, based on cryptographic assumptions. • Information-Theoretic PIR [CGKS95,Amb97,...]: replicate database among k servers; unconditional privacy against t servers. Default: t=1.
Computational PIR for Dummies
Tool: homomorphic encryption, E(b) · E(b') = E(b + b').
Database: an n^{1/2} × n^{1/2} bit matrix X; the User wants column j. Example (j = 3):
X =
0 1 1 0
1 1 1 0
1 1 0 0
0 0 0 1
Protocol:
• User sends E(e_j), e.g. E(0) E(0) E(1) E(0) (= c_1 c_2 c_3 c_4).
• Server replies with E(X·e_j), one ciphertext per row: c_2c_3, c_1c_2c_3, c_1c_2, c_4.
• User recovers the jth column of X.
PIR with ~O(n^{1/2}) communication.
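The protocol above can be sketched in a few lines. This is a toy illustration only: the slide does not say which homomorphic scheme is meant, so the sketch assumes Goldwasser-Micali encryption (XOR-homomorphic) with tiny hard-coded primes, and all function names are hypothetical.

```python
import math
import secrets

# Toy Goldwasser-Micali parameters (an assumption of this sketch; far too small
# to be secure). Both primes are 3 mod 4, so -1 is a non-residue mod each.
P, Q = 499, 547
N = P * Q
Y = N - 1  # public quadratic non-residue modulo both P and Q

def encrypt(bit):
    """E(bit) = Y^bit * r^2 mod N for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (pow(Y, bit, N) * r * r) % N

def decrypt(c):
    """The plaintext is 0 exactly when c is a square mod P (Euler's criterion)."""
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1

def user_query(j, width):
    """Encrypt the unit vector e_j entry by entry (j is 0-based here)."""
    return [encrypt(1 if l == j else 0) for l in range(width)]

def server_answer(X, query):
    """For each row, multiply the ciphertexts at the positions where the row has a 1."""
    answer = []
    for row in X:
        c = 1  # empty product; decrypts to 0
        for bit, c_l in zip(row, query):
            if bit:
                c = (c * c_l) % N
        answer.append(c)
    return answer

def user_decode(answer):
    return [decrypt(c) for c in answer]

X = [[0, 1, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
column = user_decode(server_answer(X, user_query(2, 4)))  # third column of X
```

The server only ever multiplies ciphertexts, which is why both the query and the answer consist of n^{1/2} ciphertexts.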
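Information-Theoretic PIR for Dummies
Database: an n^{1/2} × n^{1/2} bit matrix X, held by both S_1 and S_2; the User wants column j.
• User picks q_1, q_2 at random in Z_2^{n^{1/2}} subject to q_1 + q_2 = e_j, and sends q_1 to S_1 and q_2 to S_2.
• S_1 replies with a_1 = X·q_1; S_2 replies with a_2 = X·q_2.
• User computes a_1 + a_2 = X·e_j.
2-server PIR with O(n^{1/2}) communication.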
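A minimal sketch of this 2-server scheme over GF(2), assuming the database is given as a list of bit rows; the function names are hypothetical and indices are 0-based.

```python
import secrets

def matvec_gf2(X, q):
    """Matrix-vector product X*q over GF(2)."""
    return [sum(a & b for a, b in zip(row, q)) % 2 for row in X]

def user_queries(j, width):
    """Pick a uniformly random q1 and set q2 = q1 + e_j, so that q1 + q2 = e_j."""
    q1 = [secrets.randbelow(2) for _ in range(width)]
    q2 = [b ^ (1 if l == j else 0) for l, b in enumerate(q1)]
    return q1, q2

def reconstruct(a1, a2):
    """a1 + a2 = X*q1 + X*q2 = X*e_j, the j-th column of X."""
    return [u ^ v for u, v in zip(a1, a2)]

X = [[0, 1, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
q1, q2 = user_queries(2, 4)
column = reconstruct(matvec_gf2(X, q1), matvec_gf2(X, q2))
```

Each query on its own is a uniformly random vector, so a single server learns nothing about j.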
Bounds for Computational PIR
Reference   servers   comm.        assumption
[CG97]      2         O(n^ε)       one-way function
[KO97]      1         O(n^ε)       QRA / homomorphic encryption
[CMS99]     1         polylog(n)   Φ-hiding
[KO00]      1         n-o(n)       trapdoor permutation
Necessary assumptions: nontrivial 1-server PIR implies OWF [BIKM99] … and even OT [DMO00].
Bounds for i.t. PIR
Upper bounds:
• O(log n / loglog n) servers, polylog(n) [BF90,BFKR91,CGKS95]
• 2 servers, O(n^{1/3}); k servers, O(n^{1/k}) [CGKS95]
• k servers, O(n^{1/(2k-1)}) [Amb97,Ito99,IK99,BI01]; t-private, O(n^{t/(2k-1)}) [BI01]
• Today: k servers, O(n^{c·loglog k/(k log k)}) [BIKR02]. Better for k ≥ 3.
Lower bounds:
• log n + 1 (no privacy)
• 2 servers, ~4 log n; k servers, c_k log n [Man98,KT00]
Restricted lower bounds:
• 2 servers, 1 round, 1-bit answers: linear: 2n [CGKS95]; nonlinear: Ω(n) [KW03,BFG02]
• 2 servers, 1 round, linear, user reads m bits: Ω(n^{1/(m+1)}) [GKST02] (more generally, |q| = Ω(n/|a|^m))
Why Information-Theoretic PIR?
Pros: • Neat question • Unconditional privacy • Better “practical” efficiency • Allows very short queries or very short answers (+apps) [DIO98,BIM99] • Essentially equivalent to a natural coding question [KT00]
Cons: • Requires multiple servers • Privacy against limited collusions only • Worse asymptotic complexity (with const. k): O(n^c) vs. n^{o(1)} or even polylog(n)
Locally Decodable Codes [KT00]
• k-server PIR corresponds to a k-query LDC: y = answers to all possible queries on database x.
• Code length = 2^{PIR_Query_Length}
• Alphabet = {0,1}^{PIR_Answer_Length}
• #queries = k
• Binary LDC corresponds to PIR with one answer bit per server.
[Figure: x is encoded as y; x_i is decoded from a few positions of y.]
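As a hedged illustration of this correspondence, the sketch below turns the 2-server matrix scheme (servers answer X·q over GF(2)) into a code: the codeword lists the answer to every possible query, so its length is 2^{query length} and each symbol is an answer. The names are hypothetical, and no corruption is simulated, so only the "local" part of local decoding is shown.

```python
from itertools import product
import secrets

def matvec_gf2(X, q):
    """Server answer X*q over GF(2)."""
    return tuple(sum(a & b for a, b in zip(row, q)) % 2 for row in X)

def encode(X):
    """Codeword y: one symbol (a server answer) per possible query q."""
    width = len(X[0])
    return {q: matvec_gf2(X, q) for q in product((0, 1), repeat=width)}

def local_decode(codeword, row, col, width):
    """Read k = 2 codeword positions, q and q + e_col, and combine them."""
    q1 = tuple(secrets.randbelow(2) for _ in range(width))
    q2 = tuple(b ^ (1 if l == col else 0) for l, b in enumerate(q1))
    return codeword[q1][row] ^ codeword[q2][row]

X = [[0, 1, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
y = encode(X)
bit = local_decode(y, 1, 2, 4)  # recovers X[1][2]
```

Here the query length is 4 bits, so the code has 2^4 = 16 symbols, each over the alphabet {0,1}^4.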
PIR-Related Work • Extensions • Symmetric PIR [GIKM98,NP99] • Private storage [OS98] • Retrieval by keywords [CGN99] • Additional settings for PIR [GGM98,DIO98,BIM00,...] • PIR as a building-block • Private statistical queries [CIKRRW01] • Private approximations [FIMNSW01] • Communication-preserving secure function evaluation [NN01]
Time Complexity of PIR Focus so far: communication complexity. Obstacle: time complexity • Protocols require (at least) linear computation per query. • Thm: servers must spend at least linear expected time. Workarounds: • PIR with preprocessing [BIM00] • some savings possible in multi-server case • single-server case seems hopeless • Amortizing computation over multiple queries • single-user case essentially solved [IKOS02] • multi-user case open
Open Questions • Communication complexity of i.t. PIR: • 2 servers • 3 servers, binary answers (⇔ 3-query binary LDC) • t>1 • Sufficient assumptions for 1-server PIR • OT ⇒ nontrivial PIR? • Trapdoor permutation ⇒ good PIR? • Your favorite assumption ⇒ great PIR? • Time complexity of PIR • Better bounds for PIR with preprocessing • Better amortization in a multi-user setting • Does 1-server PIR require many “public-key” operations?
Breaking the O(n^{1/(2k-1)}) Barrier for Information-Theoretic PIR. Joint work with A. Beimel, E. Kushilevitz, J.F. Raymond
[Figure: comparison for k = 2, 3, 4, 5, 6; the values are not recoverable.]
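Model [Figure: servers S_1, S_2, …, S_k each hold a copy of X; the User holds i and learns x_i; each server's view of the query is marked "???".]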
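Arithmetization • x ↦ P_x ∈ F[Z_1,…,Z_m] • i ↦ z_i ∈ F^m • ∀i ∈ [n], P_x(z_i) = x_i [Figure: servers S_1, S_2, …, S_k each hold P; the User holds z and learns P(z); each server's view is marked "???".]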
Parameters
• Field F = GF(2)
• Degree d = const.
• #vars m s.t. C(m,d) ≥ n; m = O(n^{1/d}) suffices
Ex. d=3, m=8, n = C(8,3) = 56
z_1 = 11100000, z_2 = 11010000, …, z_n = 00000111
M_1 = Z_1Z_2Z_3, M_2 = Z_1Z_2Z_4, …, M_n = Z_6Z_7Z_8
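The encoding can be checked with a small sketch. It assumes the natural choice P_x = Σ_i x_i·M_i, which the slides do not spell out, and uses hypothetical helper names with 0-based indices.

```python
from itertools import combinations

def build_encoding(m, d):
    """Index i gets a d-subset of the m variables (monomial M_i) and the
    weight-d indicator vector of that subset (point z_i)."""
    subsets = list(combinations(range(m), d))
    points = [[1 if v in s else 0 for v in range(m)] for s in subsets]
    return subsets, points

def evaluate(x, subsets, z):
    """Evaluate P_x(z) = sum_i x_i * prod_{v in M_i} z_v over GF(2)."""
    total = 0
    for x_i, s in zip(x, subsets):
        if x_i and all(z[v] for v in s):
            total ^= 1
    return total

subsets, points = build_encoding(8, 3)
n = len(subsets)
x = [(i * i + i // 3) % 2 for i in range(n)]  # arbitrary example database
recovered = [evaluate(x, subsets, z) for z in points]
```

The check works because M_l(z_i) = 1 only when the variables of M_l all lie in the support of z_i, and both sets have size d, so l = i.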
Effect of Degree Reduction: size n, degree d, m variables → size O(n^{1/c}), degree d/c, m variables
Degree Reduction Using Partial Information [BKL95,BI01] • Before: servers S_1, S_2, …, S_k each hold P; the User holds z and wants P(z); z is hidden from servers. • After: servers each hold Q; the User holds y and wants Q(y); each entry of y is known to all but one server.
k=3, d=6 [Figure: the monomials of Q are divided among S_1, S_2, S_3, giving Q_1, Q_2, Q_3.] Q(y) = Q_1(y) + Q_2(y) + Q_3(y), deg Q_j ≤ ⌊d/k⌋ = 2
Back to PIR • Before: servers hold P, User holds z and wants P(z); z is hidden from servers: n comm. bits. • After: servers hold Q, User holds y and wants Q(y); each entry of y is known to all but one server: O(n^{1/k}) comm. bits. • Let z = y_1+…+y_k, where the y_j are otherwise random, and Q(Y_1,…,Y_k) = P(Y_1+…+Y_k).
Initial Protocol
• User picks random y_1,…,y_k s.t. y_1+…+y_k = z, and sends to S_j all y's except y_j. [Cost: O(m) = O(n^{1/d})]
• Servers define an mk-variate degree-d polynomial Q(Y_1,…,Y_k) = P(Y_1+…+Y_k).
• Each S_j computes a degree-⌊d/k⌋ polynomial Q_j such that Q(y) = Q_1(y)+…+Q_k(y).
• S_j sends a description of Q_j to User. [Cost: O(n^{1/k})]
• User computes Σ_j Q_j(y) = x_i.
A Closer Look
• ∀M ∃S_j missing at most ⌊d/k⌋ variables ⇒ deg Q_j ≤ ⌊d/k⌋.
Useful parameters:
• d=k-1 ⇒ query length O(n^{1/(k-1)}); ⌊d/k⌋=0 ⇒ answer length 1. (Best previous binary PIR)
• d=2k-1 ⇒ query length O(n^{1/(2k-1)}); ⌊d/k⌋=1 ⇒ answer length O(n^{1/(2k-1)}). (Best previous PIR)
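For the d = k-1 case, where every answer is a single bit, the whole protocol fits in a short sketch. It assumes P_x = Σ_i x_i·M_i over GF(2) with weight-d points z_i, and it assigns each expanded monomial to the lowest-numbered server that misses none of its variables; that tie-breaking rule and all names are choices of this sketch, not taken from the slides.

```python
from itertools import combinations, product
import secrets

def share_point(z, k):
    """Split z into k random GF(2) vectors y_1..y_k with y_1 + ... + y_k = z."""
    shares = [[secrets.randbelow(2) for _ in z] for _ in range(k - 1)]
    last = list(z)
    for y in shares:
        last = [a ^ b for a, b in zip(last, y)]
    return shares + [last]

def server_answer(j, x, subsets, view, k):
    """One-bit answer of server j; view[j] is None (the share S_j never sees)."""
    bit = 0
    for x_i, s in zip(x, subsets):
        if not x_i:
            continue
        # Expand prod_v (Y_1[v] + ... + Y_k[v]): one term per choice of owners.
        for owners in product(range(k), repeat=len(s)):
            # With d = k-1 variables, some server owns none of them.
            if min(t for t in range(k) if t not in owners) != j:
                continue
            if all(view[o][v] for o, v in zip(owners, s)):
                bit ^= 1
    return bit

def retrieve(x, i, k, m):
    """Run the protocol end to end and return the User's output."""
    subsets = list(combinations(range(m), k - 1))
    z = [1 if v in subsets[i] else 0 for v in range(m)]
    shares = share_point(z, k)
    result = 0
    for j in range(k):
        view = [None if t == j else shares[t] for t in range(k)]
        result ^= server_answer(j, x, subsets, view, k)
    return result

x = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]  # n = C(5,2) = 10 bits, k = 3, d = 2, m = 5
value = retrieve(x, 3, 3, 5)
```

Each server sees k-1 of the k shares, which are jointly uniform, so no single server learns anything about z.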
Boosting the Integer Truncation Effect
• Idea: apply multiple “partial” degree reduction steps.
• Generalized degree reduction: assign each monomial to the k' servers V which jointly miss the least number of variables.
Example: k=3, d=6, k'=2, with monomials assigned to the pairs S_1S_2, S_1S_3, S_2S_3:
Q(y) = Σ_{|V|=k'} Q_V(y)
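The degree left after a (generalized) reduction step can be found by brute force. The sketch below assumes one reading of "jointly miss": a variable owned by server j is missed by a set V exactly when j is in V. The function name is hypothetical.

```python
from itertools import combinations, product

def worst_case_degree(k, d, k_prime):
    """Largest number of missed variables over all expanded monomials, when each
    monomial goes to the k_prime-subset V of servers that misses the fewest."""
    worst = 0
    for owners in product(range(k), repeat=d):  # owners[t]: server missing variable t
        best = min(sum(1 for o in owners if o in V)
                   for V in combinations(range(k), k_prime))
        worst = max(worst, best)
    return worst

plain = worst_case_degree(3, 6, 1)    # ordinary reduction with k=3, d=6
partial = worst_case_degree(3, 7, 2)  # partial reduction from replication 3 to 2
```

Under this reading, k' = 1 reproduces the ⌊d/k⌋ bound, and a partial step from (k=3, d=7) to k'=2 leaves degree 4.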
Chaining reductions
replication   degree    size
k             d         n
  ↓ reduction
k'            d'        n' = O(n^{d'/d})
  ↓ reduction
k''           d''       n'' = O(n'^{d''/d'})
  ↓ reduction
…
1             ⌊d/k⌋     n^{⌊d/k⌋/d}
The missing operator:
replication   degree    size   #vars
k             d         n      m
  ↓ conversion
k             d'        n      m' = O(m^{d/d'})
Additional cost: re-distribute new point y'.
Example: k=3
step         replication   degree   #vars         size
(start)      3             7        O(n^{1/7})    n
reduction    2             4        O(n^{1/7})    O(n^{4/7})
conversion   2             3        O(n^{4/21})   O(n^{4/7})
reduction    1             1        O(n^{4/21})   O(n^{4/21})
Queries: O(n^{4/21}) (#vars). Answers: O(n^{4/21}) (size). Total: O(n^{4/21}) communication.
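The exponents in this example can be re-derived mechanically from the two rules: a reduction to degree d' scales the size exponent by d'/d, and a conversion to degree d' scales the #vars exponent by d/d'. The sketch tracks only the exponents of n with exact fractions and takes the conversion step as given; the helper names are hypothetical.

```python
from fractions import Fraction

def reduction(state, new_k, new_d):
    """Degree reduction: size n becomes n^(d'/d); #vars is unchanged."""
    k, d, vars_exp, size_exp = state
    return (new_k, new_d, vars_exp, size_exp * Fraction(new_d, d))

def conversion(state, new_d):
    """Polynomial conversion: #vars m becomes m^(d/d'); size is unchanged."""
    k, d, vars_exp, size_exp = state
    return (k, new_d, vars_exp * Fraction(d, new_d), size_exp)

# State: (replication, degree, exponent of #vars, exponent of size).
start = (3, 7, Fraction(1, 7), Fraction(1))
step1 = reduction(start, 2, 4)
step2 = conversion(step1, 3)
step3 = reduction(step2, 1, 1)
comm_exponent = max(step3[2], step3[3])  # queries ~ #vars, answers ~ size
```

For comparison, the earlier O(n^{1/(2k-1)}) bound gives exponent 1/5 for k=3, and 4/21 < 1/5.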
In Search of the Missing Operator
Conversion: P (degree d, m variables) → P' (degree d', m' variables), y ↦ y', with P(y) = P'(y').
• Must have m' = Ω(m^{d/d'}).
• Question: for which d'<d can we get m' = O(m^{d/d'})?
• Possible when d'|d.
• Open otherwise. A positive result ⇒ PIR with complexity n^{O(1/k^2)}.
• Simplest open case: d=3, d'=2, m' = O(m^{3/2})
A Workaround • Can’t solve the polynomial conversion problem in general. • … but easy to solve given the promise weight(y)=const. • Stronger degree reduction: • Main technical lemma: good parameters for strong degree reduction.
Open Problems • Better upper bounds • What is the true limit of our technique? • Tight bounds for polynomial conversion • The case t>1 • Lower bounds