
Generalizing PIR for Practical Private Retrieval of Public Data

Shiyuan Wang, Divyakant Agrawal, Amr El Abbadi. Department of Computer Science, UC Santa Barbara. DBSec 2010.


Presentation Transcript


  1. Generalizing PIR for Practical Private Retrieval of Public Data. Shiyuan Wang, Divyakant Agrawal, Amr El Abbadi. Department of Computer Science, UC Santa Barbara. DBSec 2010.

  2. Outline • The Problem • Practical private retrieval of public data • Main Challenges • Strong privacy, practical cost of retrieval • Our Proposal • Absolute privacy in a bounding box • Contributions • Private retrieval service charge model • Bounding-box PIR: generalizing k-Anonymity and PIR • Query by key in one round S.Wang, D.Agrawal and A.El Abbadi

  3. Problem: Private Retrieval of Public Data • Setting: a client queries public data hosted on an untrusted server. • Client: "I don't want to reveal my personal interest." Using a private query method, the client sends an obfuscated query instead of the plain query. • Server: "I can provide this private retrieval service, if you pay for it." • Other figure label: private data profile.

  4. Desiderata & Challenge • Desiderata • Practical • Minimize computation and communication costs • Flexible • Allow clients to specify their desired degree of privacy ρ and service charge budget µ. Satisfy ρ without exceeding µ. • Metrics of interest • Performance metrics • Computation Cost Ccomp • Communication Cost Ccomm • Quality of service metrics • Privacy Breach Probability Pbrh (Pbrh ≤ ρ) • Server Charge Csrv (Csrv ≤ µ) • Challenge • Difficult to achieve both strong privacy and practical retrieval cost at the same time

  5. Candidate Solution I: k-Anonymity • Principle • Blur a data value with a range or partition such that each value is indistinguishable among at least k values. [Sama98, Swee02] • Analysis: use k bits of data to anonymize 1 requested bit • E.g. k = 30, query "June 17, 1972" -> obfuscated query "June, 1972" • Ccomp = k, Ccomm = k + 1 • Pbrh = 1/k, Csrv = k • Pros • Flexible • Computationally cheap • Cons • Potential proximity breach for numeric data (due to a narrow anonymous range) [Li08] • Plain-text communication, subject to attack with background knowledge
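
As a rough illustration of the range-blurring idea on this slide, the following Python sketch hides a requested position inside a window of k consecutive positions and reports the cost figures listed above. The function names `k_anonymous_range` and `k_anonymity_costs`, and the random placement of the window, are assumptions made for illustration; they are not taken from the presentation.

```python
import random

def k_anonymous_range(index, n, k, rng=random):
    """Return (lo, hi): k consecutive positions in [0, n) that cover index."""
    if not (0 <= index < n) or not (1 <= k <= n):
        raise ValueError("need 0 <= index < n and 1 <= k <= n")
    # Pick the window start at random so the requested item does not
    # always sit at the same offset inside the window.
    lo_min = max(0, index - k + 1)
    lo_max = min(index, n - k)
    lo = rng.randint(lo_min, lo_max)
    return lo, lo + k - 1

def k_anonymity_costs(k):
    """Cost and quality figures for k-Anonymity as listed on the slide."""
    return {"Ccomp": k, "Ccomm": k + 1, "Pbrh": 1 / k, "Csrv": k}

lo, hi = k_anonymous_range(index=16, n=30, k=10)
print(lo, hi, k_anonymity_costs(10))
```

The server sees only the window, so it can guess the requested item with probability 1/k, which is the Pbrh figure above.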

  6. Candidate Solution II: Computational Private Information Retrieval (cPIR) • Protocol (from the figure) • The client's query is q = "give me the ith record"; it sends encrypted(q) to the server. • The server, holding the public data X, returns encrypted-result = f(X, encrypted(q)), from which the client recovers Xi. • Principle • Achieve computationally complete privacy by applying cryptographic computations over the entire public data [Kush97] • Pros • Complete privacy for clients • Secure communication • Cons • Orders of magnitude less efficient than simply transferring the entire data from the server to the client [Sion07]

  7. cPIR Theoretical Background • Quadratic Residue (QR) • x is a quadratic residue (QR) mod N if there exists a y such that y^2 = x mod N; otherwise x is a quadratic non-residue (QNR). • E.g. N = 35, 11 is a QR (9^2 = 11 mod 35), 3 is a QNR (no y exists for y^2 = 3 mod 35) • Essential properties: • QR × QR = QR • QR × QNR = QNR • Let N = p1 × p2, where p1 and p2 are large primes of m/2 bits. • Quadratic Residuosity Assumption (QRA) • Determining if a number is a QR or a QNR is computationally hard if p1 and p2 are not given.
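
The definitions above can be checked by brute force for the toy modulus N = 35. The sketch below is only meant to make the two product properties concrete; the helper name `is_qr` is an assumption, and the QNR list is copied from the example slides rather than derived here.

```python
from math import gcd

N = 35  # 5 * 7, the toy modulus used in the slides

def is_qr(x, n=N):
    """True if x is coprime to n and y*y = x (mod n) for some y."""
    return gcd(x, n) == 1 and any((y * y - x) % n == 0 for y in range(1, n))

QR = sorted(x for x in range(1, N) if is_qr(x))
print(QR)                    # [1, 4, 9, 11, 16, 29]
print(is_qr(11), is_qr(3))   # True False

QNR = [3, 12, 13, 17, 27, 33]  # QNR set as listed on the example slides

# QR x QR = QR and QR x QNR = QNR, checked exhaustively for N = 35.
assert all(is_qr(a * b % N) for a in QR for b in QR)
assert all(not is_qr(a * b % N) for a in QR for b in QNR)
```

Brute force works only because N is tiny; for an m-bit modulus with unknown factors, the QRA says no efficient test of this kind is known.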

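  8. Example of cPIR • Public data size: n = 16, organized in an s×t (4×4) binary matrix M with row axis e and column axis g • Goal: get M2,3 (e = 2, g = 3), with N = 35, m = 6 • QR = {1,4,9,11,16,29}, QNR = {3,12,13,17,27,33} • Matrix M, rows as listed in the figure: 0 1 1 0 / 1 1 1 0 / 0 1 1 0 / 0 1 1 1 • Query numbers sent by the client, one per column: 4 16 17 11 (the number for column g = 3, 17, is a QNR; the others are QRs) • Server results, labelled z4 z3 z2 z1 in the figure: 17 33 17 27 • Decoding: z2 = QNR => M2,3 = 1; z2 = QR => M2,3 = 0 • Adapted from Tan's presentation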
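
The mechanics of this example can be sketched in Python in the style of the single-database scheme cited as [Kush97]. This is a toy illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the primes 5 and 7 are far too small to be secure, and the matrix rows are taken in the order they appear in the transcript, so intermediate numbers may differ from the figure where its conventions were not preserved.

```python
import random
from math import gcd

P1, P2 = 5, 7          # secret primes, known only to the client (toy sizes)
N = P1 * P2            # public modulus, 35 as in the slides

def is_qr(x):
    """Client-side test using the factors: QR mod N iff QR mod both primes."""
    return all(pow(x, (p - 1) // 2, p) == 1 for p in (P1, P2))

UNITS = [x for x in range(1, N) if gcd(x, N) == 1]
QR = [x for x in UNITS if is_qr(x)]
# Non-residues modulo both primes: hard to tell from QRs without the factors.
QNR = [x for x in UNITS
       if all(pow(x, (p - 1) // 2, p) == p - 1 for p in (P1, P2))]

def make_query(g, t, rng=random):
    """One number per column: a QNR for column g (1-based), QRs elsewhere."""
    return [rng.choice(QNR if j == g else QR) for j in range(1, t + 1)]

def server_answer(M, y):
    """Per row: multiply in y_j where the bit is 1 and y_j^2 where it is 0."""
    z = []
    for row in M:
        acc = 1
        for bit, yj in zip(row, y):
            acc = acc * (yj if bit else yj * yj) % N
        z.append(acc)
    return z

def decode(z, e):
    """The bit in row e (1-based) of the queried column is 1 iff z_e is a QNR."""
    return 0 if is_qr(z[e - 1]) else 1

# Rows in transcript order (assumption about the figure's row numbering).
M = [[0, 1, 1, 0], [1, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 1]]
y = [4, 16, 17, 11]    # query numbers from the slide; 17 is the QNR (g = 3)
z = server_answer(M, y)
print(decode(z, 2))    # 1
```

Decoding works because every factor in a row's product is a QR except possibly the one from column g, which contributes a QNR exactly when the stored bit is 1.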

  9. Our Proposal: Bounding-Box PIR (bbPIR) • Principles • Rely on cPIR cryptographic operations to achieve strong privacy • Trade part of cPIR's privacy for practical performance • Adopt the flexible privacy principle of k-Anonymity • Basic idea • Bound expensive cryptographic computations in an r×c bounding box BB, a sub-matrix of M. • (1) Satisfy the client's privacy requirement: r×c = 1/ρ • (2) Minimize Ccomm -> minimize (c + b×r) • Properties • The bounding box contains both the data whose values are close to the query value and the data whose values are not close. • Unify k-Anonymity and cPIR by varying the dimensions of the bounding box
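
The two conditions on this slide amount to a small search over box shapes. The sketch below enumerates candidate row counts and keeps the shape with the smallest c + b×r. Several things here are assumptions rather than details from the presentation: the function name `choose_box`, the treatment of b as a plain weight on the row term (the transcript does not define b), rounding 1/ρ up when it does not factor exactly, and the optional `max_rows` cap, which is only a hypothetical stand-in for a charge budget.

```python
from math import ceil

def choose_box(rho, s, t, b=1, max_rows=None):
    """Pick (r, c) with r*c >= 1/rho, r <= s, c <= t, minimizing c + b*r.

    Assumptions: b is a weight on the row term, and max_rows is a
    hypothetical cap on r standing in for a service charge limit.
    """
    k = ceil(1 / rho - 1e-9)          # cells needed; epsilon guards float error
    best = None
    for r in range(1, s + 1):
        if max_rows is not None and r > max_rows:
            break
        c = ceil(k / r)               # narrowest box of height r with >= k cells
        if c > t:
            continue
        cost = c + b * r
        if best is None or cost < best[0]:
            best = (cost, r, c)
    if best is None:
        raise ValueError("no bounding box satisfies the constraints")
    return best[1], best[2]

print(choose_box(rho=0.25, s=4, t=4))   # (2, 2)
```

With r = 1 the box degenerates to a single row of 1/ρ cells, and with r = s and c = t it covers all of M, which is one way to read the slide's claim that varying the box dimensions spans the range between k-Anonymity and cPIR.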

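  10. Example of bbPIR • Goal: get M2,3 (e = 2, g = 3), with N = 35, m = 6 • QR = {1,4,9,11,16,29}, QNR = {3,12,13,17,27,33} • Matrix M, rows as listed in the figure: 0 1 1 0 / 1 1 1 0 / 0 1 1 0 / 0 1 1 1, with the bounding box BB marked around M2,3 • Query numbers y, restricted to the columns of BB: 16 17 • Server results z, restricted to the rows of BB: 17 27 • Decoding: z2 = QNR => M2,3 = 1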

  11. Example Comparisons of cPIR, bbPIR and k-Anonymity • Public data size: n = 16 • Query: retrieve the item with key 53 • Figure: three panels (cPIR, bbPIR, k-Anonymity), each showing the matrix with axes e and g and the bounding box used • k-Anonymity with k = 4: Ccomp = k = 4, Ccomm = k + 1 = 5, Pbrh = 1/k = 1/4, Csrv = k = 4

  12. Query by Address -> Query by Key • Limitation of the previous formulation: query by matrix address • Solution for query by key: find the address by key • Candidate solution I: third-party translation, as in Casper [Mokb07] • Cons: security depends on a third party • Candidate solution II: an index structure on the server mapping key to address [Chor97] • Cons: needs O(b × log n) times communication • Our proposal: the server publishes a histogram H on the key field to authorized clients. • The client calculates an address range for the queried entry by searching for the bin in which the entry falls. • Pros: if the bin size w ≤ s, only one round of bbPIR is needed

  13. Example of Query by Key • In the client's view, the server matrix M is a histogram matrix HM, so the address of the requested item x maps to an address range covering the items in the same bin as x. • Example with bin size w = 2: HM1,3 maps to (M1,3, M2,3). • Matrix M (row axis e, column axis g; rows listed top to bottom as drawn, row 1 at the bottom):
      13 40 70 93 138
      8 33 60 89 107
      7 26 54 80 101
      5 23 53 79 100
      1 16 45 72 94
     • Histogram matrix HM (each bin shown as highest key -- lowest key; row 1 at the bottom):
      13 -- 7, 40 -- 26, 70 -- 54, 93 -- 80, 138 -- 101
      5 -- 1, 23 -- 16, 53 -- 45, 79 -- 72, 100 -- 94
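
The client-side lookup in this example can be sketched with a binary search over the published bin boundaries. The code below uses the bin ranges from the example; the data layout, the function names `build_index` and `locate`, and the choice to return `None` for keys that fall between bins are assumptions for illustration only.

```python
from bisect import bisect_left

# Histogram matrix HM from the example (w = 2), rows numbered from the bottom.
# Each bin is (lowest_key, highest_key).
HM = {
    1: [(1, 5), (16, 23), (45, 53), (72, 79), (94, 100)],
    2: [(7, 13), (26, 40), (54, 70), (80, 93), (101, 138)],
}

def build_index(hm):
    """Flatten the bins into key order as (high, low, hm_row, hm_col)."""
    bins = [(hi, lo, r, c + 1)
            for r, row in hm.items()
            for c, (lo, hi) in enumerate(row)]
    return sorted(bins)

def locate(key, index):
    """Return (hm_row, hm_col) of the bin containing key, or None."""
    highs = [entry[0] for entry in index]
    i = bisect_left(highs, key)       # first bin whose highest key is >= key
    if i == len(index):
        return None
    hi, lo, r, c = index[i]
    return (r, c) if lo <= key <= hi else None

index = build_index(HM)
print(locate(53, index))   # (1, 3)
```

For key 53 the lookup lands in HM1,3, matching the slide; the client would then issue a single private query whose bounding box covers the rows of M that belong to that bin.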

  14. Experiment Setup • Implementation of three private retrieval methods • bbPIR, cPIR • k-Anonymity: anonymize the private query item by specifying a consecutive range that covers the item • Data set • Generated n = 10^6 data records with 3 attributes based on an Adult census data set with 32561 records of 15 attributes. • Only for the experiment on proximity privacy of numeric data, generated 10^6 numeric values following a Zipf distribution in [0.0, 1.0]. • Settings • Test bed: Intel 2.40GHz CPU, 3GB memory, Fedora Core 8 OS • Default parameter values: ρ = 0.001, µ = 50, k = 1000, m = 1024

  15. Experiment Result: Varying Modulus Bit Size m

  16. Experiment Result: Varying Privacy Degree ρ

  17. Experiment Result: Varying Charge Limit µ

  18. Experiment Result: Proximity Privacy of Numeric Data

  19. Experiment Result: Varying Histogram Bin Size w for Query by Key

  20. Conclusion • We proposed a practical, flexible and secure approach for private retrieval of public data in single-server settings, called Bounding-Box PIR (bbPIR). • bbPIR generalizes cPIR and k-Anonymity based private retrieval methods. • We incorporated the realistic assumption of charging clients for the exposed service data. • We achieved query by key without running additional rounds of bbPIR.

  21. References • [Sama98] P. Samarati et al. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report, 1998. • [Swee02] L. Sweeney. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5):557-570, 2002. • [Li08] J. Li et al. Preservation of proximity privacy in publishing numerical sensitive data. In SIGMOD 2008. • [Mokb07] M. Mokbel et al. The new Casper: A privacy-aware location-based database server. In ICDE 2007. • [Kush97] E. Kushilevitz et al. Replication is not needed: Single database, computationally-private information retrieval. In FOCS 1997. • [Sion07] R. Sion et al. On the computational practicality of private information retrieval. In NDSS 2007. • [Chor97] B. Chor et al. Private information retrieval by keywords. Technical Report TRCS 0917, Technion.
