
Private Information Retrieval



Presentation Transcript


  1. Private Information Retrieval Yuval Ishai Computer Science Department Technion

  2. Talk Overview • Intro to PIR • Motivation and problem definition • Toy examples • State of the art • Relation with other primitives • Locally Decodable Codes • (Oblivious Transfer, CRHF) • Constructions • Open problems

  3. Private Information Retrieval (PIR) [CGKS95] • Goal: allow a user to access a database while hiding what she is after. • Motivation: patent databases, web searches, etc. • Paradox(?): imagine buying in a store without the seller knowing what you buy. Note: encrypting requests is useful against third parties, but not against the server holding the data.

  4. Modeling (figure: a User holding an index i interacts with a Server holding the database x; the user learns x_i, while the server should learn nothing about i)

  5. Some “solutions” • 1. User downloads the entire database. Drawback: n communication bits (vs. log n + 1 without privacy). Main research goal: minimize communication complexity. • 2. User masks i with additional random indices. Drawback: gives a lot of information about i. • 3. Enable anonymous access to the database. Addresses a different concern: hides the identity of the user, not the fact that x_i is retrieved. Fact: PIR as described so far requires Ω(n) communication bits.

  6. Two Approaches • Computational PIR [KO97, CMS99, ...]: computational privacy, based on cryptographic assumptions. • Information-Theoretic PIR [CGKS95, Amb97, ...]: replicate the database among k servers; unconditional privacy against t servers. Default: t=1.

  7. Model for I.T. PIR (figure: the database x is replicated among k servers S1, …, Sk; the user sends a query to each server and reconstructs x_i from the answers, while each server on its own learns nothing about i)

  8. Information-Theoretic PIR for Dummies • View x as an n^{1/2} × n^{1/2} matrix X. • User U picks random q1 and sets q2 so that q1 + q2 = e_i (the unit vector selecting the column containing bit i), sending q1 to S1 and q2 to S2. • Each server replies with a_b = X·q_b. • The user computes a1 + a2 = X·e_i, a column of X containing x_i. ⇒ 2-server PIR with O(n^{1/2}) communication.
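The two-server scheme above fits in a few lines of Python; this is a minimal sketch over GF(2) (function and variable names are mine, not from the talk):

```python
import secrets

def to_matrix(x_bits):
    """View the n-bit database as an s x s bit matrix X (s = ceil(sqrt(n)))."""
    s = 1
    while s * s < len(x_bits):
        s += 1
    padded = x_bits + [0] * (s * s - len(x_bits))
    return [padded[r * s:(r + 1) * s] for r in range(s)], s

def make_queries(i, s):
    """User: split e_c (c = column of bit i) into random XOR shares q1 + q2 = e_c."""
    r, c = divmod(i, s)
    q1 = [secrets.randbelow(2) for _ in range(s)]
    q2 = [b ^ (1 if j == c else 0) for j, b in enumerate(q1)]
    return r, q1, q2

def server_answer(X, q):
    """Server: reply with X*q over GF(2), i.e. the XOR of the columns selected by q."""
    return [sum(xj & qj for xj, qj in zip(row, q)) % 2 for row in X]

def reconstruct(a1, a2, r):
    """a1 + a2 = X*e_c, so position r of the XOR is bit (r, c) of X."""
    return a1[r] ^ a2[r]
```

Each query is a uniformly random s-bit vector on its own, so neither server learns anything about i; the total communication is 4s = O(n^{1/2}) bits.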

  9. Computational PIR for Dummies • Tool: homomorphic encryption: E(a)·E(b) = E(a+b). • View x as an n^{1/2} × n^{1/2} matrix X; e.g. for n=16, X has rows 0110, 1110, 1100, 0001. • Protocol: User sends E(e_i), e.g. (c1, c2, c3, c4) = (E(0), E(0), E(1), E(0)). • Server replies with E(X·e_i): for each row, it multiplies the ciphertexts selected by that row's 1-entries, here c2·c3, c1·c2·c3, c1·c2, c4. • User decrypts and recovers the i-th column of X. ⇒ PIR with ~O(n^{1/2}) communication.
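One concrete instantiation of the homomorphic-encryption tool is the Paillier cryptosystem (not named on the slide; any additively homomorphic scheme works). A toy sketch, with tiny hard-coded primes that give no security and are for illustration only:

```python
import math
import secrets

# Toy Paillier cryptosystem: additively homomorphic, E(a)*E(b) mod N^2 = E(a+b).
# The tiny hard-coded primes below are illustrative only -- not secure.
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # decryption helper, valid for generator g = N + 1

def enc(m):
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def pir_query(col, s):
    """User: encrypt the unit vector e_col, one ciphertext per column."""
    return [enc(1 if j == col else 0) for j in range(s)]

def pir_answer(X, cts):
    """Server: for each row, multiply the ciphertexts selected by the row's
    1-entries, homomorphically computing E(X . e_col) entry by entry."""
    answers = []
    for row in X:
        acc = enc(0)
        for bit, ct in zip(row, cts):
            if bit:
                acc = acc * ct % N2
        answers.append(acc)
    return answers
```

On the 4×4 matrix from the slide, decrypting entry r of the reply yields X[r][col]; communication is O(n^{1/2}) ciphertexts in each direction.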

  10. Bounds for Computational PIR
  • [CG97]: 2 servers, O(n^ε) communication, assuming a one-way function
  • [KO97]: 1 server, O(n^ε), assuming QRA (more generally, homomorphic encryption)
  • [CMS99]: 1 server, polylog(n), assuming Φ-hiding (DCRA [Lipmaa])
  • [KO00]: 1 server, n − o(n), assuming a trapdoor permutation

  11. Bounds for I.T. PIR
  Upper bounds:
  • O(log n / loglog n) servers, polylog(n) [BF90, BFKR91, CGKS95]
  • 2 servers, O(n^{1/3}); k servers, O(n^{1/k}) [CGKS95]
  • k servers, O(n^{1/(2k−1)}) [Amb97, Ito99, IK99, BI01, WY05]
  • t-private, O(n^{t/(2k−1)}) [BI01, WY05]
  • k servers, O(n^{c·loglog k/(k·log k)}) [BIKR02]
  Lower bounds:
  • log n + 1 (no privacy)
  • 2 servers, ~5·log n; k servers, c_k·log n [Man98, WdW04]
  • Better bounds for restricted 2-server protocols [CGKS95, GKST02, BFG02, KdW03, WdW04]
  (figure: protocols ranging from “clean” but inefficient [CGKS, Amb, IK] to “dirty” but efficient [BI, WY, BIKR])

  12. Why Information-Theoretic PIR? Pros: • Neat question • Unconditional privacy • Better “real-life” efficiency • Allows very short queries or very short answers (+ apps [DIO98, BIM99]) • Closely related to a natural coding problem [KT00]. Cons: • Requires multiple servers • Privacy only against limited collusions • Worse asymptotic complexity (with constant k): O(n^c) vs. polylog(n).

  13. Locally Decodable Codes [KT00] • Encode x ∈ {0,1}^n into a codeword y of length m. Requirements: • High fault-tolerance: recover any bit x_i from a codeword corrupted in up to δm positions… • Local decoding: …by reading only k positions of y, with success probability ≥ ½ + ε. • Question: how large must m(n) be in a k-query LDC? k=2: 2^{Θ(n)}; k=3: between Ω(n²) and 2^{O(n^{1/2})}.
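The classical k=2 example behind the 2^{Θ(n)} bound is the Hadamard code, which the slide does not spell out; a minimal sketch (naming mine):

```python
import secrets

def hadamard_encode(x_bits):
    """Codeword of length 2^n: y[a] = <a, x> mod 2 for every n-bit mask a."""
    n = len(x_bits)
    x_int = sum(b << j for j, b in enumerate(x_bits))
    return [bin(a & x_int).count("1") & 1 for a in range(1 << n)]

def decode_bit(y, i, n):
    """2-query local decoding: x_i = y[a] XOR y[a XOR e_i] for a random mask a.
    Each probed position is uniform on its own, and decoding succeeds whenever
    neither of the two probed positions is corrupted."""
    a = secrets.randbelow(1 << n)
    return y[a] ^ y[a ^ (1 << i)]
```

If a δ fraction of y is corrupted, a union bound gives success probability at least 1 − 2δ, matching the ½ + ε requirement for δ < ¼.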

  14. From I.T. PIR to LDC • A k-server PIR with α-bit queries and β-bit answers yields a k-query LDC of length 2^α over Σ = {0,1}^β, via y[q] = Answer(x, q). • The converse relation also holds: a binary LDC gives PIR with one answer bit per server. • The best known LDCs are obtained from PIR protocols. • For constant q: m = exp(n^{c·loglog q/(q·log q)}).

  15. A Question about MPC • Ben-Or, Goldwasser, Wigderson, 1988 and Chaum, Crépeau, Damgård, 1988: information-theoretic MPC is feasible! k ≥ 3 players can compute any function f of their inputs with total work = poly(circuit-size) … or with work = poly(formula-size) and constant rounds [BB89, …]. • Open question: Can k computationally unbounded players compute an arbitrary f with communication = poly(input-length)? Can this be done using a constant number of rounds? [Beaver, Micali, Rogaway, 1990; Beaver, Feigenbaum, Kilian, Rogaway, 1990]

  16. Question Reformulated • Is the communication complexity of MPC strongly correlated with the computational complexity of the function being computed? (figure: the space of all functions vs. the efficiently computable ones; ✓ = communication-efficient MPC, ✗ = no communication-efficient MPC)

  17. Connecting MPC and LDC (timeline figure: MPC ca. 1990 → PIR ca. 1995 → LDC ca. 2000, connected by [KT00] and [IK04]) • The three problems are “essentially equivalent” • up to considerable deterioration of parameters

  18. cPIR and the Crypto World (diagram relating cPIR to other primitives: homomorphic encryption, trapdoor permutations*, oblivious transfer (OT), collision-resistant hashing (CRHF*), secure computation, non-interactive unconditionally-hiding (NI UH) commitment, and key agreement (KA))

  19. PIR as a Building Block • Private storage [OS98] • Sublinear-communication secure computation • 1-out-of-n Oblivious Transfer (SPIR) [GIKM98,NP99,…] • Keyword search [CGN99,FIPR05] • Statistical queries [CIKRRW01] • Approximate distance [FIMNSW01, IW04] • Communication-preserving secure function evaluation [NN01]

  20. Time Complexity of PIR • Focus so far: communication complexity. • Obstacle: time complexity: the server(s) must spend at least linear time. • Workarounds: preprocessing [BIM00], amortization [BIM00, IKOS04]. (table: single user vs. multiple users × non-adaptive vs. adaptive queries; some settings are solved (✓), others open (?))

  21. Protocols

  22. High-level structure of all known protocols • User maps i into a point z ∈ F^m • User secret-shares z between the servers using some t-threshold LSSS over F • Server j responds with a linear function of x determined by its share of z. • Two types of protocols: • Polynomial-based [BF90, BFKR91, CGKS95, …, WY05]: LSSS = Shamir; scale well with k, t • Replication-based [IK99, BI01, BIKR02]: LSSS = CNF; do not scale well with k, t (involve a (k choose t) replication overhead); however, they dominate the polynomial-based ones up to a (k choose t) factor [CDI05], and give the best known protocols for constant k

  23. Polynomial-Based Protocols • Step 1: Arithmetization • Fix a degree parameter d (to be determined by k). Goal: communication = O(n^{1/d}). • User maps i ∈ [n] into a weight-d vector z of length m = O(n^{1/d}): 1 ↦ 11100…0, 2 ↦ 11010…0, …, n ↦ 00…0111 • Servers view x as a degree-d m-variate polynomial (here with d = 3): P(Z1, …, Zm) = x1·Z1Z2Z3 + x2·Z1Z2Z4 + … + xn·Z(m−2)Z(m−1)Zm • Privately retrieving the i-th bit of x ≡ privately evaluating P at z.
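The arithmetization step can be made concrete as follows; this is a sketch in which I identify each index i with the i-th d-subset of [m] (names are my own):

```python
import math
from itertools import combinations

def arithmetize(n, d):
    """Pick m with C(m, d) >= n, i.e. m = O(n^{1/d}), and map index i to the
    i-th d-subset of [m]; z_i is the weight-d 0/1 vector with that support."""
    m = d
    while math.comb(m, d) < n:
        m += 1
    return m, list(combinations(range(m), d))[:n]

def point_for(i, m, subsets):
    """The weight-d vector z_i encoding index i."""
    support = set(subsets[i])
    return [1 if j in support else 0 for j in range(m)]

def eval_P(x, subsets, z, p):
    """Evaluate P(Z) = sum_t x_t * prod_{j in S_t} Z_j over GF(p).
    At z = z_i, the monomial for S_t is 1 iff S_t lies inside the support of
    z_i; both sets have size d, so only the i-th term survives."""
    total = 0
    for xt, S in zip(x, subsets):
        term = xt
        for j in S:
            term = term * z[j] % p
        total = (total + term) % p
    return total
```

So eval_P(x, subsets, point_for(i, m, subsets), p) equals x[i], with query points of length m = O(n^{1/d}).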

  24. Basic Protocol: t=1 • Goal: user learns P(z) without revealing z. • Step 2: Secret sharing of z • t=1: pick a random “direction” ω ∈ F^m; share z_j = z + jω goes to S_j • Step 3: S_j responds with P(z_j) • User can extrapolate P(z) from P(z_1), …, P(z_k) if k > d: define the degree-d univariate polynomial Q(W) = P(z + Wω); Q(0) = P(z) can be extrapolated from d+1 distinct values Q(j) = P(z_j) • Query length m = O(n^{1/d}), answer length 1 • Using k servers: O(n^{1/(k−1)}) communication.
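Putting steps 1-3 together gives a complete k-server protocol for t=1; here is a self-contained sketch over a prime field (the field size and all names are my choices, not from the talk):

```python
import math
import secrets
from itertools import combinations

PRIME = 1_000_003  # any prime larger than the number of servers works

def arithmetize(n, d):
    """m = O(n^{1/d}) variables; index i maps to the i-th d-subset of [m]."""
    m = d
    while math.comb(m, d) < n:
        m += 1
    return m, list(combinations(range(m), d))[:n]

def eval_P(x, subsets, z):
    """Evaluate the degree-d polynomial P(Z) = sum_t x_t * prod_{j in S_t} Z_j."""
    total = 0
    for xt, S in zip(x, subsets):
        term = xt
        for j in S:
            term = term * z[j] % PRIME
        total = (total + term) % PRIME
    return total

def pir_t1(x, i, d):
    """k = d+1 servers; server j receives the single point z + j*omega."""
    k = d + 1
    m, subsets = arithmetize(len(x), d)
    z = [1 if j in set(subsets[i]) else 0 for j in range(m)]
    omega = [secrets.randbelow(PRIME) for _ in range(m)]     # random direction
    shares = [[(zc + j * wc) % PRIME for zc, wc in zip(z, omega)]
              for j in range(1, k + 1)]
    answers = [eval_P(x, subsets, sh) for sh in shares]      # one field element each
    # Q(W) = P(z + W*omega) has degree d; Lagrange-interpolate Q(0) = P(z)
    # from the k = d+1 values Q(1), ..., Q(k).
    result = 0
    for j in range(1, k + 1):
        coeff = 1
        for l in range(1, k + 1):
            if l != j:
                coeff = coeff * l % PRIME * pow(l - j, -1, PRIME) % PRIME
        result = (result + answers[j - 1] * coeff) % PRIME
    return result
```

Each individual share z_j = z + jω is uniform in F^m, so no single server learns anything about i.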

  25. Basic Protocol: General t • Goal: user learns P(z) without revealing z. • Step 2: Secret sharing of z • General t: pick random ω1, …, ωt ∈ F^m; z_j = z + jω1 + j²ω2 + … + j^t·ωt • Step 3: S_j responds with P(z_j) • User can extrapolate P(z) from P(z_1), …, P(z_k) if k > dt: define the degree-dt univariate polynomial Q(W) = P(z + Wω1 + W²ω2 + … + W^t·ωt); P(z) = Q(0) can be extrapolated from dt+1 distinct values Q(j) • O(n^{1/d}) communication using k = dt+1 servers.
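The t=1 sketch extends directly to t-privacy with k = dt+1 servers: the shares now lie on a random degree-t curve through z. Again a self-contained sketch with my own naming:

```python
import math
import secrets
from itertools import combinations

PRIME = 1_000_003

def arithmetize(n, d):
    m = d
    while math.comb(m, d) < n:
        m += 1
    return m, list(combinations(range(m), d))[:n]

def eval_P(x, subsets, z):
    total = 0
    for xt, S in zip(x, subsets):
        term = xt
        for j in S:
            term = term * z[j] % PRIME
        total = (total + term) % PRIME
    return total

def pir(x, i, d, t):
    """k = d*t + 1 servers; any t of the shares z_j are jointly random."""
    k = d * t + 1
    m, subsets = arithmetize(len(x), d)
    z = [1 if j in set(subsets[i]) else 0 for j in range(m)]
    omegas = [[secrets.randbelow(PRIME) for _ in range(m)] for _ in range(t)]
    shares = []
    for j in range(1, k + 1):
        # z_j = z + j*omega_1 + j^2*omega_2 + ... + j^t*omega_t
        share = list(z)
        for s, omega in enumerate(omegas, start=1):
            js = pow(j, s, PRIME)
            share = [(sc + js * wc) % PRIME for sc, wc in zip(share, omega)]
        shares.append(share)
    answers = [eval_P(x, subsets, sh) for sh in shares]
    # Q(W) = P(z + W*omega_1 + ... + W^t*omega_t) has degree d*t;
    # interpolate Q(0) = P(z) from the k = d*t + 1 values Q(1), ..., Q(k).
    result = 0
    for j in range(1, k + 1):
        coeff = 1
        for l in range(1, k + 1):
            if l != j:
                coeff = coeff * l % PRIME * pow(l - j, -1, PRIME) % PRIME
        result = (result + answers[j - 1] * coeff) % PRIME
    return result
```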

  26. Improved Variant [WY05] • Goal: user learns P(z) without revealing z. • Step 2: Secret sharing of z (general t): z_j = z + jω1 + j²ω2 + … + j^t·ωt • Step 3: S_j responds with P(z_j) along with all m partial derivatives of P evaluated at z_j • User can extrapolate P(z) if k > dt/2: define the degree-dt univariate polynomial Q(W) = P(z + Wω1 + … + W^t·ωt); P(z) = Q(0) can be extrapolated from the 2k > dt values Q(j), Q′(j) • Complexity: O(m) communication in both directions • Same communication using half as many servers!

  27. Breaking the O(n^{1/(2k−1)}) Barrier [BIKR02] (figure: improvements for k = 2, 3, 4, 5, 6 servers)

  28. Arithmetization • As before, except that now F = GF(2) • Fix a degree parameter d (to be determined by k). Goal: communication = O(n^{1/d}). • User maps i ∈ [n] into a weight-d vector z of length m = O(n^{1/d}): 1 ↦ 11100…0, 2 ↦ 11010…0, …, n ↦ 00…0111 • Servers view x as a degree-d m-variate polynomial P(Z1, …, Zm) = x1·Z1Z2Z3 + x2·Z1Z2Z4 + … + xn·Z(m−2)Z(m−1)Zm • Privately retrieving the i-th bit of x ≡ privately evaluating P at z.

  29. Effect of Degree Reduction (figure: a database of size n, viewed as a degree-d polynomial in m variables, shrinks to size O(n^{1/c}) when the degree is reduced to d/c with the same m variables)

  30. Degree Reduction Using Partial Information [BKL95, BI01] (figure: left, servers S1, …, Sk hold P and the user learns P(z), where z is hidden from the servers; right, servers hold Q and the user learns Q(y), where each entry of y is known to all but one server)

  31. Example: k=3, d=6 • Each degree-6 monomial of Q is assigned to a server that misses at most d/k = 2 of its variables, giving Q = Q1 + Q2 + Q3 with Q(y) = Q1(y) + Q2(y) + Q3(y) and deg Q_j ≤ d/k = 2.

  32. Back to PIR • Let z = y1 + … + yk, where the y_j are otherwise random, and define Q(Y1, …, Yk) = P(Y1 + … + Yk). (figure: the n-bit protocol in which the user learns P(z), with z hidden from the servers, becomes an O(n^{1/k})-bit protocol in which the user learns Q(y), with each entry of y known to all but one server)

  33. Initial Protocol • User picks random y1, …, yk s.t. y1 + … + yk = z, and sends to S_j all the y's except y_j [O(m) = O(n^{1/d}) bits]. • Servers define the mk-variate degree-d polynomial Q(Y1, …, Yk) = P(Y1 + … + Yk). • Each S_j computes a degree-(d/k) polynomial Q_j such that Q(y) = Q1(y) + … + Qk(y). • S_j sends a description of Q_j to the User [O(n^{1/k}) bits]. • User computes ΣQ_j(y) = Q(y) = P(z) = x_i.

  34. A Closer Look • Each monomial M is assigned to a server S_j missing at most ⌊d/k⌋ of its variables ⇒ deg Q_j ≤ ⌊d/k⌋. Useful parameters: • d = k−1: query length O(n^{1/(k−1)}), ⌊d/k⌋ = 0, answer length 1 (best previous binary PIR) • d = 2k−1: query length O(n^{1/(2k−1)}), ⌊d/k⌋ = 1, answer length O(n^{1/(2k−1)}) (best previous PIR)

  35. Boosting the Integer Truncation Effect • Idea: apply multiple “partial” degree-reduction steps. • Generalized degree reduction: assign each monomial to the k′ servers V which jointly miss the least number of variables, so that Q(y) = Σ_{V: |V|=k′} Q_V(y). • Example: k=3, d=6, k′=2: the monomials of Q are split among the pairs {S1,S2}, {S1,S3}, {S2,S3}.

  36. (figure: a chain of degree-reduction steps, each taking replication k, degree d, size n to replication k′, degree d′, size n′ = O(n^{d′/d}), then to k″, d″, n″ = O(n′^{d″/d′}), and so on, ending at replication 1, degree ⌊d/k⌋, size n^{⌊d/k⌋/d}) • The missing operator: a conversion keeping replication k while reducing degree d to d′, at the cost of increasing the number of variables from m to m′ = O(m^{d/d′}). • Additional cost: re-distributing the new point y′.

  37. Example: k=3
  step | replication | degree | #vars | size
  start | 3 | 7 | O(n^{1/7}) | n
  reduction | 2 | 4 | O(n^{1/7}) | O(n^{4/7})
  conversion | 2 | 3 | O(n^{4/21}) | O(n^{4/7})
  reduction | 1 | 1 | O(n^{4/21}) | O(n^{4/21})
  Queries and answers ⇒ O(n^{4/21}) communication.

  38. In Search of the Missing Operator • Convert P (degree d, m variables) into P′ (degree d′ < d, m′ variables) together with a point map y ↦ y′ such that P(y) = P′(y′). • Must have m′ = Ω(m^{d/d′}). • Question: for which d′ < d can one get m′ = O(m^{d/d′})? • Possible when d′ | d; open otherwise. • A positive result ⇒ better PIR. • Simplest open case: d=3, d′=2, m′ = O(m^{3/2}).
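For the divisible case d′ | d, the conversion is simple: partition each degree-d monomial into d′ blocks of d/d′ variables and introduce one new variable per block, giving degree d′ in O(m^{d/d′}) variables. A sketch over a small prime field (my naming, illustrating only this divisible case):

```python
import secrets
from itertools import combinations

P_MOD = 101  # small prime field for illustration

def convert(subsets, d, dp):
    """Split each degree-d monomial (a size-d variable set) into dp blocks of
    d/dp variables; each block becomes one new variable of P'."""
    assert d % dp == 0
    b = d // dp
    new_subsets = []
    for S in subsets:
        S = sorted(S)
        new_subsets.append([tuple(S[t * b:(t + 1) * b]) for t in range(dp)])
    return new_subsets

def convert_point(y, new_subsets):
    """The point map y -> y': y'[B] = prod_{j in B} y[j] for each block B."""
    yp = {}
    for blocks in new_subsets:
        for B in blocks:
            if B not in yp:
                v = 1
                for j in B:
                    v = v * y[j] % P_MOD
                yp[B] = v
    return yp

def eval_orig(x, subsets, y):
    """P(y) = sum_t x_t * prod_{j in S_t} y_j."""
    total = 0
    for xt, S in zip(x, subsets):
        term = xt
        for j in S:
            term = term * y[j] % P_MOD
        total = (total + term) % P_MOD
    return total

def eval_new(x, new_subsets, yp):
    """P'(y') = sum_t x_t * prod_{B in blocks_t} y'[B]; degree dp < d."""
    total = 0
    for xt, blocks in zip(x, new_subsets):
        term = xt
        for B in blocks:
            term = term * yp[B] % P_MOD
        total = (total + term) % P_MOD
    return total
```

Since each product of d/d′ original variables is replaced by one new variable, P(y) = P′(y′) identically, and the number of new variables is at most the number of (d/d′)-subsets, i.e. O(m^{d/d′}).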

  39. A Workaround • We can't solve the polynomial conversion problem in general… • …but it is easy to solve given the promise weight(y) = const. • Stronger degree reduction, exploiting this promise. • Main technical lemma: good parameters for strong degree reduction.

  40. Open Problems • Better upper bounds • Known: O(n^{c·loglog k/(k·log k)}) • What is the true limit of our technique? • Generalize the best upper bound to t>1 • Tight bounds for polynomial conversion • Lower bounds • Known: c·log n • Simplest cases: • k=2 • k=3, a single answer bit per server
