Private Information Retrieval. Yuval Ishai, Computer Science Department, Technion. Talk Overview. Intro to PIR: Motivation and problem definition, Toy examples, State of the art. Relation with other primitives: Locally Decodable Codes (Oblivious Transfer, CRHF). Constructions. Open problems.
Private Information Retrieval Yuval Ishai Computer Science Department Technion
Talk Overview • Intro to PIR • Motivation and problem definition • Toy examples • State of the art • Relation with other primitives • Locally Decodable Codes • (Oblivious Transfer, CRHF) • Constructions • Open problems
Private Information Retrieval (PIR) [CGKS95]
• Goal: allow a user to access a database while hiding what she is after.
• Motivation: patent databases, web searches, etc.
• Paradox(?): imagine buying in a store without the seller knowing what you buy.
Note: Encrypting requests is useful against third parties, but not against the server holding the data.
Modeling
[Figure: a Server and a User; the User learns $x_i$, while the Server's view of the request is marked “???”.]
Some “solutions”
1. User downloads entire database. Drawback: $n$ communication bits (vs. $\log n + 1$ w/o privacy). Main research goal: minimize communication complexity.
2. User masks $i$ with additional random indices. Drawback: gives a lot of information about $i$.
3. Enable anonymous access to database. Addresses a different concern: hides identity of user, not the fact that $x_i$ is retrieved.
Fact: PIR as described so far requires $\Omega(n)$ communication bits.
Two Approaches
• Computational PIR [KO97,CMS99,...]: computational privacy, based on cryptographic assumptions.
• Information-Theoretic PIR [CGKS95,Amb97,...]: replicate database among $k$ servers. Unconditional privacy against $t$ servers. Default: $t=1$.
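Model for I.T. PIR
[Figure: servers $S_1, S_2, \ldots, S_k$ each hold a copy of the database $X$; the User, holding $i$, learns $x_i$; each server's view of $i$ is marked “???”.]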
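Information-Theoretic PIR for Dummies
• View the database as an $n^{1/2} \times n^{1/2}$ matrix $X$, held by both $S_1$ and $S_2$.
• User $U$ sends $q_1$ to $S_1$ and $q_2$ to $S_2$, where $q_1$ is uniformly random and $q_1 + q_2 = e_i$.
• Servers reply with $a_1 = X \cdot q_1$ and $a_2 = X \cdot q_2$.
• User computes $a_1 + a_2 = X \cdot e_i$, the column of $X$ that contains the desired bit.
⇒ 2-server PIR with $O(n^{1/2})$ communication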
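The two-server scheme on this slide can be sketched in a few lines. This is a minimal illustration over GF(2), assuming both servers are simulated in one process; the function names (`make_queries`, `answer`, `retrieve`) and the 4×4 example matrix are illustrative choices, not part of the talk.

```python
import secrets

def make_queries(col, width):
    """Split the unit vector e_col into two shares with q1 XOR q2 = e_col."""
    q1 = [secrets.randbelow(2) for _ in range(width)]  # uniformly random
    q2 = q1.copy()
    q2[col] ^= 1                                       # differs from q1 only at col
    return q1, q2

def answer(X, q):
    """Server side: compute X·q over GF(2), one bit per row of X."""
    return [sum(x & b for x, b in zip(row, q)) % 2 for row in X]

def retrieve(X, row, col):
    """User side: query two simulated servers that hold the same matrix X."""
    q1, q2 = make_queries(col, len(X[0]))
    a1, a2 = answer(X, q1), answer(X, q2)
    column = [u ^ v for u, v in zip(a1, a2)]           # a1 + a2 = X·e_col
    return column[row]

# Example database of n = 16 bits arranged as a 4x4 matrix.
X = [[0, 1, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
```

Each server sees a single uniformly random vector, so neither learns the column index on its own; the user sends and receives about $2 n^{1/2}$ bits per server pair.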
Computational PIR for Dummies
Tool: homomorphic encryption, $E(a) \cdot E(b) = E(a+b)$
Protocol (database viewed as an $n^{1/2} \times n^{1/2}$ matrix $X$):
• User sends $E(e_i)$, e.g. $E(0)\,E(0)\,E(1)\,E(0)$ $(= c_1\, c_2\, c_3\, c_4)$
• Server replies with $E(X \cdot e_i)$, here $c_2c_3$, $c_1c_2c_3$, $c_1c_2$, $c_4$
• User recovers the $i$th column of $X$
Example:
X =
0 1 1 0
1 1 1 0
1 1 0 0
0 0 0 1
⇒ PIR with ~$O(n^{1/2})$ communication
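The slide only asks for some additively homomorphic encryption scheme. As a hedged sketch, the code below plugs in a toy Paillier instance; the choice of Paillier, the tiny primes, and all function names are assumptions made for illustration, and the parameters are far too small to be secure.

```python
import math
import secrets

# Toy Paillier parameters (assumption: illustrative only, insecure sizes).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                # inverse of LAM mod N

def encrypt(m):
    """E(m) = (1+N)^m * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """Recover m from c using the secret values LAM and MU."""
    return (((pow(c, LAM, N2) - 1) // N) * MU) % N

def server_reply(X, cts):
    """For each row, multiply the ciphertexts at positions where the row has a 1.
    The product encrypts the inner product of the row with e_i."""
    reply = []
    for row in X:
        acc = 1                      # neutral element; a real server would re-randomize
        for bit, c in zip(row, cts):
            if bit:
                acc = (acc * c) % N2
        reply.append(acc)
    return reply

def retrieve_column(X, i):
    """User: send E(e_i), decrypt the reply to obtain column i of X."""
    cts = [encrypt(1 if j == i else 0) for j in range(len(X[0]))]
    return [decrypt(c) for c in server_reply(X, cts)]

X = [[0, 1, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
print(retrieve_column(X, 2))  # third column of X: [1, 1, 0, 0]
```

With the slide's query $E(0)\,E(0)\,E(1)\,E(0)$, the four row products are exactly $c_2c_3$, $c_1c_2c_3$, $c_1c_2$, $c_4$.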
Bounds for Computational PIR
          servers   comm.               assumption
[CG97]    2         $O(n^{\epsilon})$   one-way function
[KO97]    1         $O(n^{\epsilon})$   QRA / homomorphic encryption
[CMS99]   1         polylog(n)          $\Phi$-hiding, …, DCRA [Lipmaa]
[KO00]    1         $n - o(n)$          trapdoor permutation
Bounds for I.T. PIR
Upper bounds:
• $O(\log n / \log\log n)$ servers, polylog(n) [BF90,BFKR91,CGKS95]
• 2 servers, $O(n^{1/3})$; $k$ servers, $O(n^{1/k})$ [CGKS95]
• $k$ servers, $O(n^{1/(2k-1)})$ [Amb97,Ito99,IK99,BI01,WY05]
• $t$-private, $O(n^{t/(2k-1)})$ [BI01,WY05]
• $k$ servers, $O(n^{c \log\log k/(k \log k)})$ [BIKR02]
Lower bounds:
• $\log n + 1$ (no privacy)
• 2 servers, ~$5 \log n$; $k$ servers, $c_k \log n$ [Man98,WdW04]
• Better for restricted 2-server protocols [CGKS95,GKST02,BFG02,KdW03,WdW04]
[Figure: the protocols CGKS, AMB, IK, BI, WY and BIKR placed on two axes, “clean” vs. “dirty” and inefficient vs. efficient.]
Why Information-Theoretic PIR?
Pros:
• Neat question
• Unconditional privacy
• Better “real-life” efficiency
• Allows very short queries or very short answers (+ apps [DIO98,BIM99])
• Closely related to a natural coding problem [KT00]
Cons:
• Requires multiple servers
• Privacy against limited collusions
• Worse asymptotic complexity (with const. $k$): $O(n^c)$ vs. polylog(n)
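Locally Decodable Codes [KT00]
[Figure: a message $x$ is encoded into a codeword $y$; the decoder recovers $x_i$.]
Requirements:
• High fault-tolerance
• Local decoding
Recover from $\delta m$ faults… … with $\frac{1}{2}+\epsilon$ probability
Question: how large should $m(n)$ be in a $k$-query LDC?
• $k=2$: $2^{\Theta(n)}$
• $k=3$: upper bound $2^{O(n^{0.5})}$, lower bound $\Omega(n^2)$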
From I.T. PIR to LDC
• $k$-server PIR with $\alpha$-bit queries and $\beta$-bit answers ⇒ $k$-query LDC of length $2^{\alpha}$ over $\Sigma = \{0,1\}^{\beta}$, with $y[q] = \mathrm{Answer}(x,q)$.
• Converse relation also holds.
• Binary LDC ⇔ PIR with one answer bit per server.
• Best known LDC are obtained from PIR protocols. For const. $q$: $m = \exp(n^{c \log\log q/(q \log q)})$.
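As a small illustration of the rule $y[q] = \mathrm{Answer}(x,q)$, the sketch below applies it to a simple 2-server PIR with $n$-bit queries and one-bit answers (the server answers with the inner product of $x$ and $q$ over GF(2)). The resulting code is the Hadamard code of length $2^n$, a 2-query LDC. The choice of this particular PIR scheme and the names `encode` and `decode_bit` are assumptions made for the example.

```python
import secrets

def encode(x):
    """Codeword entry y[q] is the PIR answer to query q: <x, q> over GF(2).
    The query q is an n-bit integer; bit j of q selects x[j]."""
    n = len(x)
    return [sum(x[j] for j in range(n) if (q >> j) & 1) % 2
            for q in range(2 ** n)]

def decode_bit(y, n, i, q=None):
    """Local decoder: read two positions, q and q XOR e_i, and XOR them.
    These are exactly the two PIR queries for index i."""
    if q is None:
        q = secrets.randbelow(2 ** n)
    return y[q] ^ y[q ^ (1 << i)]

x = [1, 0, 1, 1, 0]
y = encode(x)                      # length 2^5 = 32
recovered = [decode_bit(y, len(x), i) for i in range(len(x))]
```

Each of the two probes is individually uniform, so if a fraction $\delta$ of the codeword is corrupted, the decoder errs with probability at most $2\delta$.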
A Question about MPC
[Ben-Or, Goldwasser, Wigderson, 1988; Chaum, Crépeau, Damgård, 1988]
• Information-theoretic MPC is feasible!
• $k \ge 3$ players can compute any function $f$ of their inputs with total work = poly(circuit-size)
• … or with work = poly(formula-size) and constant rounds [BB89,…]
• Can this be done using a constant number of rounds?
[Beaver, Micali, Rogaway, 1990; B., Feigenbaum, Kilian, R., 1990]
• Open question: Can $k$ computationally unbounded players compute an arbitrary $f$ with communication = poly(input-length)?
Question Reformulated
Is the communication complexity of MPC strongly correlated with the computational complexity of the function being computed?
[Figure: the set of all functions with the efficiently computable functions inside it; the legend distinguishes functions with communication-efficient MPC from functions with no communication-efficient MPC.]
Connecting MPC and LDC
[Figure: timeline with the marks 1990, 1995 and 2000 showing MPC, PIR and LDC, linked by [IK04] and [KT00].]
• The three problems are “essentially equivalent”
• up to considerable deterioration of parameters
cPIR and the Crypto World
[Figure: diagram relating cPIR to other primitives: Homomorphic Encryption, Trapdoor Permutation*, OT, CRHF*, Secure Computation, NI UH Commitment, KA.]
PIR as a Building Block • Private storage [OS98] • Sublinear-communication secure computation • 1-out-of-n Oblivious Transfer (SPIR) [GIKM98,NP99,…] • Keyword search [CGN99,FIPR05] • Statistical queries [CIKRRW01] • Approximate distance [FIMNSW01, IW04] • Communication-preserving secure function evaluation [NN01]
Time Complexity of PIR
Focus so far: communication complexity
Obstacle: time complexity
• Server/s must spend at least linear time.
Workarounds:
• Preprocessing [BIM00]
• Amortization [BIM00, IKOS04]
[Table: single user vs. multiple users, non-adaptive vs. adaptive queries; several entries are marked “?”.]
High level structure of all known protocols
• User maps $i$ into a point $z \in F^m$
• User secret-shares $z$ between servers using some $t$-threshold LSSS over $F$
• Server $j$ responds with a linear function of $x$ determined by its share of $z$.
Two types of protocols:
• Polynomial-based [BF90,BFKR91,CGKS95,…,WY05]
  • LSSS = Shamir
  • Scale well with $k,t$
• Replication-based [IK99,BI01,BIKR02]
  • LSSS = CNF
  • Do not scale well with $k,t$: involve $\binom{k}{t}$ replication overhead
  • However, dominate over polynomial-based up to $\binom{k}{t}$ factor [CDI05]
  • Best known protocols for constant $k$
Polynomial-Based Protocols
Step 1: Arithmetization
• Fix a degree parameter $d$ (will be determined by $k$)
• Goal: Communication = $O(n^{1/d})$
• User maps $i \in [n]$ into a weight-$d$ vector $z$ of length $m = O(n^{1/d})$, e.g. for $d=3$: $1 \mapsto 11100\ldots0$, $2 \mapsto 11010\ldots0$, …, $n \mapsto 00\ldots0111$
• Servers view $x$ as a degree-$d$ $m$-variate polynomial $P(Z_1,\ldots,Z_m) = x_1 Z_1Z_2Z_3 + x_2 Z_1Z_2Z_4 + \ldots + x_n Z_{m-2}Z_{m-1}Z_m$
• Privately retrieving the $i$-th bit of $x$ ⇔ privately evaluating $P$ on $z$.
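The arithmetization step can be checked directly on a small database. The sketch below assumes 0-based indices, enumerates the weight-$d$ vectors in lexicographic order of their supports (which matches the $11100\ldots0$, $11010\ldots0$ pattern on the slide), and works modulo a prime `p`; the helper names are hypothetical.

```python
from itertools import combinations
from math import comb

def smallest_m(n, d):
    """Smallest m with C(m, d) >= n, so that m = O(n^(1/d)) for constant d."""
    m = d
    while comb(m, d) < n:
        m += 1
    return m

def encode_index(i, m, d):
    """Map index i to the i-th weight-d 0/1 vector of length m."""
    support = list(combinations(range(m), d))[i]   # fine for small demo sizes
    return [1 if j in support else 0 for j in range(m)]

def eval_P(x, z, d, p):
    """Evaluate P(z) = sum_i x_i * prod_{j in support_i} z_j  (mod p)."""
    total = 0
    for xi, support in zip(x, combinations(range(len(z)), d)):
        term = xi
        for j in support:
            term = (term * z[j]) % p
        total = (total + term) % p
    return total

x = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]     # n = 10 database bits
d = 3
m = smallest_m(len(x), d)              # m = 5, since C(5, 3) = 10
values = [eval_P(x, encode_index(i, m, d), d, p=101) for i in range(len(x))]
```

Only the monomial whose support equals the support of $z$ survives, so `values` reproduces `x` entry by entry.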
Basic Protocol: t=1
[Figure: the points $z, z+\delta, z+2\delta, z+3\delta, z+4\delta$ on a line in $F^m$.]
• Goal: user learns $P(z)$ without revealing $z$.
• Step 2: Secret sharing of $z$ ($t=1$)
  • Pick random “direction” $\delta \in F^m$
  • $z_j = z + j\delta$ goes to $S_j$
• Step 3: $S_j$ responds with $P(z_j)$
• User can extrapolate $P(z)$ from $P(z_1),\ldots,P(z_k)$ if $k > d$.
  • Define deg-$d$ univariate poly $Q(W) = P(z + W\delta)$
  • $Q(0) = P(z)$ can be extrapolated from $d+1$ distinct values $Q(j) = P(z_j)$
• Query length $m = O(n^{1/d})$, answer length 1
• Using $k$ servers, $O(n^{1/(k-1)})$ communication
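A minimal end-to-end sketch of the $t=1$ protocol follows, with all $k = d+1$ servers simulated locally. It assumes the field is $F_p$ with the illustrative prime $p = 101$ (any prime larger than $k$ works here), 0-based indices, and hypothetical helper names.

```python
import secrets
from itertools import combinations

P_MOD = 101  # assumption: a small prime field F_p with p > k

def eval_P(x, z, d):
    """Server side: P(z) = sum_i x_i * prod_{j in support_i} z_j  (mod p)."""
    total = 0
    for xi, support in zip(x, combinations(range(len(z)), d)):
        term = xi
        for j in support:
            term = (term * z[j]) % P_MOD
        total = (total + term) % P_MOD
    return total

def lagrange_at_zero(points):
    """Given pairs (j, Q(j)) for deg(Q)+1 distinct j, return Q(0) mod p."""
    total = 0
    for j, qj in points:
        num, den = 1, 1
        for l, _ in points:
            if l != j:
                num = (num * (-l)) % P_MOD
                den = (den * (j - l)) % P_MOD
        total = (total + qj * num * pow(den, -1, P_MOD)) % P_MOD
    return total

def pir_retrieve(x, i, d, m):
    """User side: share z along a random line, collect k = d+1 answers."""
    support = list(combinations(range(m), d))[i]
    z = [1 if j in support else 0 for j in range(m)]
    delta = [secrets.randbelow(P_MOD) for _ in range(m)]   # random direction
    answers = []
    for j in range(1, d + 2):
        z_j = [(a + j * b) % P_MOD for a, b in zip(z, delta)]  # query to S_j
        answers.append((j, eval_P(x, z_j, d)))                 # one field element back
    return lagrange_at_zero(answers)

x = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # n = 10, d = 3, m = 5, k = 4 servers
results = [pir_retrieve(x, i, d=3, m=5) for i in range(len(x))]
```

Each query $z_j = z + j\delta$ is uniformly distributed in $F_p^m$ for $j \ne 0$, which is why a single server learns nothing about $z$.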
Basic Protocol: General t
• Goal: user learns $P(z)$ without revealing $z$.
• Step 2: Secret sharing of $z$ (general $t$)
  • $z_j = z + j\delta_1 + j^2\delta_2 + \ldots + j^t\delta_t$
• Step 3: $S_j$ responds with $P(z_j)$
• User can extrapolate $P(z)$ from $P(z_1),\ldots,P(z_k)$ if $k > dt$.
  • Define deg-$dt$ univariate poly $Q(W) = P(z + W\delta_1 + W^2\delta_2 + \ldots + W^t\delta_t)$
  • $P(z) = Q(0)$ can be extrapolated from $dt+1$ distinct values $Q(j)$
• $O(n^{1/d})$ communication using $k = dt+1$ servers.
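The extrapolation step for general $t$ can be written out explicitly. The block below is a sketch using standard Lagrange interpolation at $W = 0$; the symbols $\delta_1,\ldots,\delta_t$ (random vectors in $F^m$) and $\lambda_j$ (interpolation coefficients) are notation chosen here, and the field is assumed to have more than $dt+1$ elements.

```latex
\begin{align*}
z_j &= z + j\,\delta_1 + j^2\delta_2 + \dots + j^t\delta_t,
      \qquad j = 1,\dots,k,\\
Q(W) &= P\bigl(z + W\delta_1 + W^2\delta_2 + \dots + W^t\delta_t\bigr),
      \qquad \deg Q \le dt,\\
P(z) &= Q(0) = \sum_{j=1}^{dt+1} \lambda_j\, Q(j),
      \qquad
      \lambda_j = \prod_{\substack{l=1\\ l \ne j}}^{dt+1} \frac{l}{l-j}.
\end{align*}
```

Since the shares $z_j$ are points of a random degree-$t$ curve through $z$ (Shamir sharing applied coordinate-wise), any $t$ of them are jointly independent of $z$.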
Improved Variant [WY05]
• Goal: user learns $P(z)$ without revealing $z$.
• Step 2: Secret sharing of $z$ (general $t$)
  • $z_j = z + j\delta_1 + j^2\delta_2 + \ldots + j^t\delta_t$
• Step 3: $S_j$ responds with $P(z_j)$ along with all $m$ partial derivatives of $P$ evaluated at $z_j$
• User can extrapolate $P(z)$ if $k > dt/2$.
  • Define deg-$dt$ univariate poly $Q(W) = P(z + W\delta_1 + W^2\delta_2 + \ldots + W^t\delta_t)$
  • $P(z) = Q(0)$ can be extrapolated from the $2k > dt$ values $Q(j), Q'(j)$
• Complexity: $O(m)$ communication both ways
• Same communication using half as many servers!
Breaking the $O(n^{1/(2k-1)})$ Barrier [BIKR02]
[Figure: comparison for $k = 2, 3, 4, 5, 6$.]
Arithmetization
• As before, except that now $F = GF(2)$
• Fix a degree parameter $d$ (will be determined by $k$)
• Goal: Communication = $O(n^{1/d})$
• User maps $i \in [n]$ into a weight-$d$ vector $z$ of length $m = O(n^{1/d})$, e.g. for $d=3$: $1 \mapsto 11100\ldots0$, $2 \mapsto 11010\ldots0$, …, $n \mapsto 00\ldots0111$
• Servers view $x$ as a degree-$d$ $m$-variate polynomial $P(Z_1,\ldots,Z_m) = x_1 Z_1Z_2Z_3 + x_2 Z_1Z_2Z_4 + \ldots + x_n Z_{m-2}Z_{m-1}Z_m$
• Privately retrieving the $i$-th bit of $x$ ⇔ privately evaluating $P$ on $z$.
Effect of Degree Reduction
size $n$, degree $d$, $m$ variables → size $O(n^{1/c})$, degree $d/c$, $m$ variables
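Degree Reduction Using Partial Information [BKL95,BI01]
[Figure, left: servers $S_1, S_2, \ldots, S_k$ each hold $P$; the User holds $z$ and learns $P(z)$; $z$ is hidden from servers. Right: servers $S_1, S_2, \ldots, S_k$ each hold $Q$; the User holds $y$ and learns $Q(y)$; each entry of $y$ is known to all but one server.]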
$k=3$, $d=6$
[Figure: each monomial of $Q$ is assigned to one of the servers $S_1, S_2, S_3$, splitting $Q$ into $Q_1, Q_2, Q_3$.]
$Q(y) = Q_1(y) + Q_2(y) + Q_3(y)$, $\deg Q_j \le d/k = 2$
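Back to PIR
[Figure, left: servers $S_1, S_2, \ldots, S_k$ each hold $P$; the User holds $z$ and learns $P(z)$; $z$ is hidden from servers; $n$ comm. bits. Right: servers each hold $Q$; the User holds $y$ and learns $Q(y)$; each entry of $y$ is known to all but one server; $O(n^{1/k})$ comm. bits.]
Let $z = y_1 + \ldots + y_k$, where the $y_j$ are otherwise random, and $Q(Y_1,\ldots,Y_k) = P(Y_1 + \ldots + Y_k)$.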
Initial Protocol
• User picks random $y_1,\ldots,y_k$ s.t. $y_1 + \ldots + y_k = z$, and sends to $S_j$ all $y$'s except $y_j$. [Queries: $O(m) = O(n^{1/d})$]
• Servers define an $mk$-variate degree-$d$ polynomial $Q(Y_1,\ldots,Y_k) = P(Y_1 + \ldots + Y_k)$.
• Each $S_j$ computes a degree-$\lfloor d/k \rfloor$ poly. $Q_j$, such that $Q(y) = Q_1(y) + \ldots + Q_k(y)$.
• $S_j$ sends a description of $Q_j$ to User. [Answers: $O(n^{1/k})$]
• User computes $\sum_j Q_j(y) = x_i$.
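The steps above can be sketched for the simplest non-trivial setting, $k=3$ and $d=k-1=2$ over GF(2), where every monomial of $Q$ touches at most two of the three shares, so some server knows all of its variables and each answer is a single bit. The assignment rule (lowest-index eligible server), the 0-based indexing, and the function names are assumptions made for this illustration.

```python
import secrets
from itertools import combinations

K = 3  # number of servers; degree d = K - 1 = 2

def owner(u, v):
    """Lowest-index server that knows both shares y_u and y_v (it is neither u nor v)."""
    return min(j for j in range(K) if j != u and j != v)

def server_answer(j, x, known, m):
    """S_j evaluates the monomials y_u[a]*y_v[b] assigned to it.
    `known` maps share index -> vector and never contains y_j."""
    bit = 0
    for xi, (a, b) in zip(x, combinations(range(m), 2)):
        if not xi:
            continue
        for u in range(K):
            for v in range(K):
                if owner(u, v) == j:
                    bit ^= known[u][a] & known[v][b]
    return bit

def retrieve(x, i, m):
    """User: additively share z over GF(2), give S_j every share except y_j."""
    a, b = list(combinations(range(m), 2))[i]
    z = [1 if t in (a, b) else 0 for t in range(m)]
    y = [[secrets.randbelow(2) for _ in range(m)] for _ in range(K - 1)]
    y.append([zt ^ (sum(share[t] for share in y) % 2) for t, zt in enumerate(z)])
    result = 0
    for j in range(K):
        known = {u: y[u] for u in range(K) if u != j}
        result ^= server_answer(j, x, known, m)   # sum of Q_j(y) over GF(2)
    return result

x = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # n = 10 = C(5, 2), so m = 5
results = [retrieve(x, i, m=5) for i in range(len(x))]
```

Server $S_j$ sees only $k-1$ of the $k$ additive shares, which are jointly uniform, so its view is independent of $z$.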
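A Closer Look
• For every monomial $M$ there is a server $S_j$ missing at most $\lfloor d/k \rfloor$ variables ⇒ $\deg Q_j \le \lfloor d/k \rfloor$
Useful parameters:
• $d = k-1$: query length $O(n^{1/(k-1)})$, $\lfloor d/k \rfloor = 0$, answer length 1 (best previous binary PIR)
• $d = 2k-1$: query length $O(n^{1/(2k-1)})$, $\lfloor d/k \rfloor = 1$, answer length $O(n^{1/(2k-1)})$ (best previous PIR)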
Boosting the Integer Truncation Effect
• Idea: apply multiple “partial” degree reduction steps.
• Generalized degree reduction: assign each monomial to the $k'$ servers $V$ which jointly miss the least number of variables, so that $Q(y) = \sum_{|V| = k'} Q_V(y)$.
[Figure: $k=3$, $d=6$, $k'=2$; the monomials of $Q$ are assigned to the server pairs $S_1S_2$, $S_2S_3$ and $S_1S_3$.]
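Repeated degree reduction
              replication   degree                   size
start         $k$           $d$                      $n$
reduction     $k'$          $d'$                     $n' = O(n^{d'/d})$
reduction     $k''$         $d''$                    $n'' = O(n'^{\,d''/d'})$
…
end           1             $\lfloor d/k \rfloor$    $n^{\lfloor d/k \rfloor / d}$
The missing operator (conversion): replication $k$, degree $d$, size $n$ → replication $k$, degree $d'$, size $n$, with #vars $m$ → $m' = O(m^{d/d'})$
Additional cost: re-distribute new point $y'$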
Example: k=3
              replication   degree   #vars            size
start         3             7        $O(n^{1/7})$     $n$
reduction     2             4        $O(n^{1/7})$     $O(n^{4/7})$
conversion    2             3        $O(n^{4/21})$    $O(n^{4/7})$
reduction     1             1        $O(n^{4/21})$    $O(n^{4/21})$
Queries, answers: $O(n^{4/21})$ communication
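The exponents in this example can be re-derived mechanically. The sketch below tracks only the exponents of $n$, using the two rules stated in the talk: a degree reduction from $d$ to $d'$ maps size $n$ to $O(n^{d'/d})$ and leaves the number of variables alone, while a conversion from $d$ to $d'$ maps $m$ variables to $O(m^{d/d'})$ and leaves the size alone. The tuple layout and function names are hypothetical.

```python
from fractions import Fraction as F

def reduction(state, new_k, new_d):
    """Degree reduction: size exponent scales by d'/d; #vars exponent unchanged."""
    k, d, vars_exp, size_exp = state
    return (new_k, new_d, vars_exp, size_exp * F(new_d, d))

def conversion(state, new_d):
    """Conversion: #vars exponent scales by d/d'; replication and size unchanged."""
    k, d, vars_exp, size_exp = state
    return (k, new_d, vars_exp * F(d, new_d), size_exp)

# (replication, degree, exponent of #vars, exponent of size)
state = (3, 7, F(1, 7), F(1))
state = reduction(state, 2, 4)    # (2, 4, 1/7, 4/7)
state = conversion(state, 3)      # (2, 3, 4/21, 4/7)
state = reduction(state, 1, 1)    # (1, 1, 4/21, 4/21)
print(state[2], state[3])         # 4/21 4/21
```

Both the final number of variables and the final size are $O(n^{4/21})$, matching the communication bound in the table.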
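In Search of the Missing Operator
[Figure: a polynomial $P$ (degree $d$, $m$ variables) is converted into $P'$ (degree $d'$, $m'$ variables), and a point $y$ into $y'$, such that $P(y) = P'(y')$.]
• Must have $m' = \Omega(m^{d/d'})$.
• Question: For which $d' < d$ can we get $m' = O(m^{d/d'})$?
  • Possible when $d' \mid d$.
  • Open otherwise. Positive result ⇒ Better PIR
  • Simplest open case: $d=3$, $d'=2$, $m' = O(m^{3/2})$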
A Workaround • Can’t solve the polynomial conversion problem in general. • … but easy to solve given the promise weight(y)=const. • Stronger degree reduction: • Main technical lemma: good parameters for strong degree reduction.
Open Problems
• Better upper bounds
  • Known: $O(n^{c \log\log k/(k \log k)})$
  • What is the true limit of our technique?
  • Generalize best upper bound to $t > 1$
  • Tight bounds for polynomial conversion
• Lower bounds
  • Known: $c \log n$
  • Simplest cases:
    • $k=2$
    • $k=3$, single answer bit per server