Dive into the world of sublinear secure computation on big data, exploring hard tasks, benchmark metrics, PIR protocols, and functional limitations. Discover how to identify complex computational problems with inherent difficulty and efficiency metrics.
CRYPTO 2018: Limits of Practical Sublinear Secure Computation. Elette Boyle (IDC Herzliya), Yuval Ishai (Technion), Antigoni Polychroniadou (Cornell Tech)
Secure Two-Party Computation: two parties with inputs x1 and x2 compute f(x1, x2) = (y1, y2) • Goal: • Correctness: Everyone computes f(x1,x2) • Security: Nothing else but the output is revealed • Adversary: Semi-honest
The age of Big Data: Secure Computation on Big Data
Secure Computation on BIG DATA: Efficiency Metrics (where n is the number of bits in the database) • Almost all protocols with sublinear communication complexity suffer in computational complexity (e.g. FHE/PIR-based protocols)
Sublinear Communication 2PC • Linear computation: PIR [Chor-Goldreich-Kushilevitz-Sudan'95, Kushilevitz-Ostrovsky'97], FHE [Gentry09] • Sublinear computation: MST, Median, Convex Hull, Single source shortest distances, Approximate Set cover, All pairs shortest distance [Aggarwal-Mishra-Pinkas04, Brickell-Shmatikov05, Shelat-Venkitasubramaniam15]
Motivation Which functions can be securely computed with sublinear overhead? Secure Computation on Big Data
Our Results • Provide a framework for identifying “provably hard” sublinear secure computation tasks on big data. • Provide formal reductions showing that many natural problems are inherently “hard” (including variants of the problems in [AMP'04,BS'05,SV'15]); akin to NP-hardness. • Define intermediate hardness to capture natural problems that are neither “hard” nor “easy”.
Types of Functionalities • Two-sided Functionalities • One-sided Functionalities • Secret-Shared output Functionalities (useful for MPC composition) • Output forms shown: f(x1, x2) = (y,⊥), f(x1, x2) = ( [y] , [y] ), f(x1, x2) = ( [y] ,⊥), f(x1, x2) = (y, y)
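The output forms above can be sketched as ideal functionalities, that is, as what a trusted party would hand to each side. This is only an illustration: the function names are hypothetical, ⊥ is modeled as `None`, and the bracket notation [y] is assumed here to mean an XOR secret sharing of y (the slide does not specify the sharing scheme). No cryptographic protocol is implemented.

```python
import secrets

def two_sided(f, x1, x2):
    """Both parties receive y: f(x1, x2) = (y, y)."""
    y = f(x1, x2)
    return y, y

def one_sided(f, x1, x2):
    """Only the first party receives y: f(x1, x2) = (y, ⊥), with ⊥ modeled as None."""
    return f(x1, x2), None

def secret_shared(f, x1, x2, bits=32):
    """Each party receives one XOR share of y: f(x1, x2) = ([y], [y]).

    Assumes y is a non-negative integer below 2**bits, so that each share
    on its own is uniformly random.
    """
    y = f(x1, x2)
    share1 = secrets.randbits(bits)
    share2 = share1 ^ y
    return share1, share2

# Usage: with f = max, the shares recombine to the output.
s1, s2 = secret_shared(max, 3, 5)
print(two_sided(max, 3, 5), one_sided(max, 3, 5), s1 ^ s2)
```

Secret-shared outputs are what makes composition convenient: neither party sees an intermediate value, yet the shares can be fed into a later computation.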
One-Sided Functionalities • Sublinear computation vs. linear computation • PIR • One-sided Convex Hull, Median etc… • FHE • Secret-shared Convex Hull, Median etc… • Two-sided Convex Hull, Median, MST, Single source shortest Distances, Approximate Set cover, All pairs shortest distance
One-Sided Functionalities • Sublinear computation vs. linear computation • PIR • One-sided Convex Hull, Median etc… • FHE • Secret-shared Convex Hull, Median etc… • Two-sided Convex Hull, Median, MST, Single source shortest Distances, Approximate Set cover, All pairs shortest distance • Are these variants of problems hard?
Our Framework: Benchmark metric for measuring computational complexity in the sublinear communication regime: PIR
Private Information Retrieval (PIR) [Chor-Goldreich-Kushilevitz-Sudan'95,Kushilevitz-Ostrovsky'97] • The user requests entry i of the database D=D1D2...Dn and receives Di • Goal: • Correctness: User obtains Di • Privacy: Server learns nothing about i
Private Information Retrieval (PIR) [Chor-Goldreich-Kushilevitz-Sudan'95,Kushilevitz-Ostrovsky'97] • Trivial solution: the user sends “Hello, wake up” and the server returns all the entries in the database D=D1D2...Dn • Privacy is perfect but the overhead is prohibitively large. • Non-triviality requirement: • Communication cost must be in o(n)
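A minimal sketch of the trivial solution makes the non-triviality requirement concrete. The function name and the bit-counting convention are assumptions made for illustration; the database is modeled as a list of bits.

```python
def trivial_pir(database, i):
    """Trivial PIR: the server ignores i and ships the whole database.

    Returns the retrieved bit and the number of bits sent by the server.
    Privacy is perfect because the server's message does not depend on i.
    """
    transcript = list(database)   # server -> client: every entry of D
    bits_sent = len(transcript)   # communication is n, which is not o(n)
    return transcript[i], bits_sent

# Usage: retrieving one bit of a 5-bit database costs 5 bits of communication.
print(trivial_pir([0, 1, 1, 0, 1], 2))
```

Any protocol counted as non-trivial PIR has to beat this linear communication cost.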
1-server PIR: state-of-the-art efficiency (where n is the number of bits in the database) • Drawbacks: • PIR (without preprocessing) inherently requires linear computation. • Heavy public-key operations, slower than symmetric encryption by orders of magnitude (XPIR, SealPIR). • 1-server information-theoretic (IT) PIR is impossible. • Even with preprocessing, sublinear-time PIR protocols are slow [BIM00, BIPW17, CHS17]. • PIR forms a computational barrier for 2PC on big data
Our Framework (PIR Hardness): a problem is PIR-hard when any secure protocol for the problem implies nontrivial PIR on a large database.
Our Framework (PIR Hardness): a two-party functionality f with input size N is (n(N),1)-PIR-hard if there exists a single-server PIR protocol on a database of size n(N) that makes a single oracle call to f.
One-sided Median is PIR-Hard. Toy example: Database D=D0D1 such that D0 < D1; i∈ [n]. Input phase: the server inputs D0 and D1; if i=0, the client inputs min … Output phase: …
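One-sided Median is PIR-Hard. Toy example: Database D=D0D1...Dn such that D0 < D1; i∈ [n]. Input phase: the server inputs D0 and D1; if i=0, the client inputs min (a value below both entries); if i=1, the client inputs max (a value above both entries). Output phase: the median of (min, D0, D1) is D0, and the median of (D0, D1, max) is D1, so the client obtains Di.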
One-sided Median Protocol is PIR-Hard. Toy example: Database D=D0D1...Dn such that D0 < D1; i∈ [n]. Input phase: the server inputs D0 and D1; if i=0, the client inputs min; if i=1, the client inputs max. Output phase: for i=0 the median of (min, D0, D1) is D0. Fails for the 2-sided functionalities.
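The toy reduction can be sketched with the median computed by an ideal (trusted-party) functionality. This is not a secure protocol, only the input/output behavior the reduction relies on; the function names are hypothetical, and ±infinity stands in for the "min" and "max" inputs.

```python
import math
import statistics

def one_sided_median(client_input, server_inputs):
    """Ideal one-sided median: only the client learns the output."""
    return statistics.median([client_input, *server_inputs])

def pir_from_median(i, d0, d1):
    """Retrieve D_i from a two-entry database with a single median call.

    Assumes d0 < d1, as in the toy example.
    """
    assert d0 < d1
    # i = 0: input "min", so the median of (min, d0, d1) is d0.
    # i = 1: input "max", so the median of (d0, d1, max) is d1.
    client_input = -math.inf if i == 0 else math.inf
    return one_sided_median(client_input, (d0, d1))

# Usage: the client recovers either entry; the server sees no output.
print(pir_from_median(0, 3, 8), pir_from_median(1, 3, 8))
```

In the two-sided variant the server would also see the median, which equals Di and therefore reveals i, consistent with the note that the reduction fails for 2-sided functionalities.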
PIR-Hard One-Sided Functionalities • Median • Convex Hull • Single source shortest Distances • Approximate Set cover • All pairs shortest distance Utilize combinatorial notion of VC-dimension [Vapnik,Chervonenkis71]
One-sided functionalities are PIR-Hard Recall the ‘easy’ two-sided functionalities: Two-sided Convex Hull, Median, MST Single source shortest Distances, Approximate Set cover, All pairs shortest distance Are all two-sided functionalities ‘easy’?
Two-sided Nearest Neighbor Problem. Input phase: one party holds restaurant locations (a0,b0), …, (an,bn); the other party holds a location (x,y). Output phase: output to both parties the nearest restaurant to (x,y).
Our Framework (Semi-PIR Hardness): an intermediate level between PIR-hard and EASY. • Semi-PIR: the user requests entry i of the database D=D1D2...Dn • Correctness: User obtains Di • Privacy: Server learns nothing about i only if Di=1.
Two-sided Nearest Neighbor is Semi-PIR hard. Toy example: Database D=0101, i∈ [n]. Input phase: the server holds a restaurant c and, for each i, a restaurant (ai,bi): if Di=0, choose (ai,bi) on the circle; if Di=1, choose (ai,bi) outside the circle. The client holds a location (x,y) determined by i (the figure shows i=3, with (x,y) and (a3,b3)). Output phase: output to both parties the nearest restaurant to (x,y): if Di=1 output c, and if Di=0 output c and (ai,bi).
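One way to instantiate this toy example is sketched below. The concrete geometry is an assumption made for illustration, not taken from the slides: c is the center of a unit circle, point i lies at angle 2πi/n, "outside the circle" means radius 2, and the client stands halfway between c and the circle in direction i, so that c and an on-circle point tie. The nearest-neighbor oracle is an ideal functionality that returns all nearest restaurants.

```python
import math

CENTER = (0.0, 0.0)  # restaurant c (assumed to be the circle's center)

def server_points(bits, radius=1.0):
    """Place point j on the circle if D_j = 0, outside it if D_j = 1."""
    n = len(bits)
    points = []
    for j, bit in enumerate(bits):
        theta = 2 * math.pi * j / n
        r = radius if bit == 0 else 2 * radius
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

def client_location(i, n, radius=1.0):
    """Stand halfway between c and the circle, in the direction of point i."""
    theta = 2 * math.pi * i / n
    return (0.5 * radius * math.cos(theta), 0.5 * radius * math.sin(theta))

def nearest_neighbors(location, restaurants, tol=1e-9):
    """Ideal two-sided functionality: indices of all nearest restaurants."""
    dists = [math.dist(location, p) for p in restaurants]
    best = min(dists)
    return [k for k, d in enumerate(dists) if d - best <= tol]

def semi_pir_query(bits, i):
    """One oracle call; restaurant index 0 is c, index j + 1 is point j."""
    restaurants = [CENTER] + server_points(bits)
    winners = nearest_neighbors(client_location(i, len(bits)), restaurants)
    bit = 1 if winners == [0] else 0  # only c is nearest exactly when D_i = 1
    return bit, winners

# Usage with the database from the slide, D = 0101.
print([semi_pir_query([0, 1, 0, 1], i) for i in range(4)])
```

When Di=1 the shared output is just c, which is the same for every such i, so the server learns nothing about i. When Di=0 the output names (ai,bi) and therefore reveals i. That is the semi-PIR guarantee.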
Semi-PIR Hard Two-Sided Functionalities • Nearest Neighbor • Single Source Single Destination shortest path • Shortest list selection • Closest destination • ….?
Semi-PIR vs. PIR • Semi-PIR is not PIR-hard via 1 call. • Existence of polylogarithmic semi-PIR implies the existence of slightly sublinear PIR (via multiple adaptive calls to semi-PIR): • Reduction uses locally decodable codes (LDCs) • Polylogarithmic PIR from polylogarithmic semi-PIR? • If ‘dream’ LDCs* exist. * With constant query complexity and polynomial rate.
Polylogarithmic semi-PIR ⇒ weak PIR Via q-query LDCs and O(2q) adaptive calls Rand PIR Semi-PIR to Rand ½ PIR: Database Database D=D0D1...Dn Encode Database using LDCs … (i1,…,i5) PIR i∈ [n]
Conclusion • Introduce PIR-hardness for identifying “provably hard” sublinear secure computation tasks on big data. • Provide formal reductions showing that many natural problems are PIR-Hard (including variants of the problems in [AMP'04,BS'05,SV'15]); akin to NP-hardness. • Introduce semi-PIR hardness, a level between HARD and EASY.
Our Taxonomy: Easy problems, Semi-PIR hard problems, PIR-hard problems • PIR • One-sided Convex Hull, Median etc… • Two-sided Single Source Single Dest. shortest path, Nearest Neighbor, Shortest list selection, closest destination. • FHE • Secret-shared Convex Hull, Median etc… • Two-sided Convex Hull, Median, MST, Single source shortest Distances, Approximate Set cover, All pairs shortest distance • Two-sided local compressible MST, Median.
Future Directions • Hierarchy of hardness classes beyond PIR-hardness and Semi-PIR-hardness? (e.g. somewhat HE-hardness?) • Better understanding of the relation between semi-PIR and PIR? • VC-dimension analogue that captures PIR and semi-PIR-hardness for two-sided functionalities? • Multi-party functionalities?
PIR and VC-dimension [Vapnik,Chervonenkis71] • [BIKO12]: exploit this relation to construct PIR protocols • A one-sided functionality f is PIR-hard iff f has a certain efficiently computable VC-dimension. • Easy: Low VC-dimension • PIR-hard: High VC-dimension
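For intuition, the combinatorial notion of VC-dimension can be computed by brute force on small examples. The sketch below treats a functionality as a 0/1 table with one row per input of one party and one column per input of the other. It illustrates plain VC-dimension only, not the specific "efficiently computable" variant the slide refers to; the function name and the table encoding are assumptions.

```python
from itertools import combinations, product

def vc_dimension(rows):
    """Largest k such that some k columns are shattered by the rows.

    rows: list of equal-length 0/1 tuples. A set of k columns is shattered
    when all 2**k bit patterns appear on those columns across the rows.
    """
    if not rows:
        return 0
    m = len(rows[0])
    best = 0
    for k in range(1, m + 1):
        shattered = any(
            len({tuple(r[c] for c in cols) for r in rows}) == 2 ** k
            for cols in combinations(range(m), k)
        )
        if not shattered:
            break  # no shattered k-set implies no shattered (k+1)-set
        best = k
    return best

# Index selection f(D, i) = D_i: rows are all 3-bit databases, columns are i.
selection = list(product([0, 1], repeat=3))
# Threshold comparison f(t, y) = [y >= t] on a 4-point domain.
thresholds = [tuple(1 if y >= t else 0 for y in range(4)) for t in range(5)]

print(vc_dimension(selection), vc_dimension(thresholds))
```

The selection table, which is exactly what PIR computes, shatters every column set and so has VC-dimension equal to the database size, while the threshold table has VC-dimension 1.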