P4P: A Framework for Practical Server-Assisted Multiparty Computation with Privacy Yitao Duan Berkeley Institute of Design UC Berkeley Qualifying Exam April 18, 2005
Outline • Problem and motivation • Privacy issues examined • Privacy is never a purely tech issue • Derive some design principles • The P4P framework • Applications • Practical multiparty arithmetic computation with privacy • Service provision with privacy • Progress and future work
Applications and Motivation • “Next generation search” makes heavy use of personal data for customized search, context-awareness, expertise mining and collaborative filtering • E-commerce vendors (like Amazon) try to build user purchase profiles across markets. And user profiling is moving onto the desktop • Location based services, real-world monitoring …
Legal Perspectives • Privacy issues arise as a tension between two parties: one seeks info about the other • Identity of the seeker leads to different situations and precedents • E.g. individual vs. the press, vs. the employer • Power imbalance between the two • Loss of privacy often leads to real harm: e.g. loss of job, loss of rights, etc. [AK95]
Economic Perspectives • Market forces work against customer privacy • A company has to do extra work to get less info • A company can benefit from having user info • So companies lack the incentive to adopt privacy-enhancing technologies (PETs) • Power imbalance (again!) in e-commerce • But we, as users, can make a difference by flexing our collective muscles! • Users often underestimate the risk of privacy intrusion and are unwilling to pay for PETs [FFSS02, ODL02, A04]
Social Science Perspectives • Privacy is NOT minimizing disclosure • Maintaining a degree of privacy often requires disclosure of personal information [Altman 75] • E.g. faculty members put "Prospective students please read this before you email me …" on their web pages • Sociality requires free exchange of some information • PETs should not prevent normal exchange
Lessons for Designing Practical Systems • Almost all problems are preserved, or even exacerbated, in computing • Tension exists but court arbitration is not available • Power imbalance prevails with no protection of the weak – the client/server paradigm • Lack of incentive (to adopt PETs, to cooperate, etc.) • Design constraints for practical PETs • Cost of privacy must be close to 0, and the privacy scheme must not conflict with the powerful actor's needs
The P4P Philosophy You can't wait for privacy to be granted. You have to fight for it.
P4P: P²I² Principles • Prevention: Not deterrence • Incentive: Design should consider the incentives of the participants • Protection: Design should incorporate mechanisms that protect the weak parties • Independence: The protection should be effective even if some parties do not cooperate
Topologies (figure: a client-server topology with server S and users u, contrasted with a P2P topology)
Problems With the Two Paradigms • Client-server • Power imbalance • Lack of incentive • P2P • Doesn't always match the transaction model (e.g. buying PCs from Dell) • Hides the heterogeneity of the participants • Many computations that are efficient server-based become too expensive if done P2P
The P4P Architecture • A subset of users are elected as "privacy providers" (called privacy peers, PPs) within the group • PPs provide privacy when they are available, but can't access data themselves
P4P Basics • Server is (almost) always available but PPs aren't (though they should be periodically) – asynchronous or semi-synchronous protocols • Server provides data archival and synchronizes the protocol • Server only communicates with PPs occasionally (when they are online and lightly loaded, e.g. 2 AM) • Server can often be trusted not to bias the computation – but we have means to verify it • PPs and all other users are completely untrusted
The Half-Full/Half-Empty Glass • In a typical P2P system, 5% of the peers provide 70% of the services [GFS] • P2P view: 70% of the users are free riding • P4P view: 5+% of the users are serving the community – enough for P4P to work practically!
Roles of the Privacy Peers • Anonymizing communication • E.g. Anonymizer.com or Mix • Offloading the server • Sharing information • Participating in computation • Other infrastructure support
Tools and Services • Cryptographic tools: commitment, VSS, ZKP, anonymous authentication, eCash, etc. • Anonymous message routing • E.g. MIX network [CHAUM] • Data protection scheme [PET04] – Λ: the set of users who should have access to X • Anonymous SSL
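As a concrete instance of the commitment tool listed above, here is a minimal Pedersen-style commitment sketch in Python. The modulus, the generators G and H, and all parameter sizes are illustrative assumptions, not the deck's actual parameters; real deployments use groups of cryptographic size with independently generated H.

```python
# Toy Pedersen commitment: C = g^a * h^r mod p.
# Hiding comes from the random r; binding holds as long as log_g(h) is unknown.
import secrets

P = 2**127 - 1   # a Mersenne prime modulus (illustrative size only)
G, H = 5, 7      # generators assumed independent (hypothetical choice)

def commit(a, r=None):
    """Return (commitment, opening randomness) for value a."""
    if r is None:
        r = secrets.randbelow(P - 1)
    return (pow(G, a, P) * pow(H, r, P)) % P, r

def verify(c, a, r):
    """Check that c opens to value a with randomness r."""
    return c == (pow(G, a, P) * pow(H, r, P)) % P

c, r = commit(42)
assert verify(c, 42, r)
assert not verify(c, 41, r)          # binding: wrong value rejected

# The commitment is additively homomorphic:
# commit(a) * commit(b) opens to a + b under randomness ra + rb.
ca, ra = commit(10)
cb, rb = commit(32)
assert verify((ca * cb) % P, 42, ra + rb)
```

The homomorphic property in the last lines is what makes commitments combine naturally with the addition-based protocols later in the deck.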
Applications Practical Multiparty Arithmetic Computation with Privacy
Multiparty Computation • n parties with private inputs wish to compute some joint function of their inputs • Must preserve security properties, e.g. privacy and correctness • Adversary: participants or external • Semi-honest: follows the protocol but curious • Malicious: can behave arbitrarily
MPC – Known Results • Computational setting: trapdoor permutations • Any two-party function can be securely computed in the semi-honest model [Yao] • Any multiparty function can be securely computed in the malicious model, for any number of corrupted parties [GMW] • Information-theoretic setting: no complexity assumption • Any multiparty function can be securely computed in the malicious model if more than 2n/3 of the parties are honest [BGW, CCD] • With a broadcast channel, only more than n/2 honest parties are needed [RB]
A Solved Problem? • Boolean-circuit-based protocols are totally impractical • Arithmetic is better but still expensive: the best protocols have O(n³) complexity to deal with an active adversary • Can't be used directly in large-scale real systems: 10³–10⁶ users, each with 10³–10⁶ data items
Contributions to Practical MPC • P4P provides a setting where generic arithmetic MPC protocols can be run much more efficiently • Existing protocols (the best ones): O(n³) complexity (malicious model) • P4P allows reducing n without sacrificing security • Enables new protocols that make a whole class of computations practical
Arithmetic: Homomorphism vs. VSS • Homomorphism: E(a)E(b) = E(a+b) • Verifiable Secret Sharing (VSS): a → a₁, a₂, …, aₙ • Addition is easy for both • E(a)E(b) = E(a+b) • share(a) + share(b) = share(a+b) • Multiplication is more involved for both • HOMO-MPC: O(n³) with a big constant [CDN01, DN03] • VSS-MPC: O(n⁴) (e.g. [GRR98])
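The "share(a) + share(b) = share(a+b)" point can be seen in a few lines of Python. This is a sketch of plain additive n-of-n sharing only (full VSS adds the verifiability layer, omitted here), and the modulus Q is an arbitrary illustrative choice:

```python
# Additive (n-of-n) secret sharing over Z_Q: a value is split into
# random shares that sum to it mod Q.  Addition of secrets is "free":
# each party just adds its local shares, with no interaction.
import secrets

Q = 2**31 - 1  # share-space modulus (illustrative)

def share(a, n):
    """Split a into n additive shares mod Q."""
    parts = [secrets.randbelow(Q) for _ in range(n - 1)]
    parts.append((a - sum(parts)) % Q)
    return parts

def reconstruct(shares):
    return sum(shares) % Q

a, b = 123, 456
sa, sb = share(a, 3), share(b, 3)
# Each of the 3 parties adds its own shares locally.
sc = [(x + y) % Q for x, y in zip(sa, sb)]
assert reconstruct(sc) == (a + b) % Q   # share(a) + share(b) = share(a+b)
```

Any n-1 of the shares are uniformly random, which is why no proper subset of parties learns anything about a or b.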
Arithmetic: Homomorphism vs. VSS • HOMO-MPC • + Can tolerate t < n corrupted players as far as privacy is concerned • – Uses public-key crypto, 10,000x more expensive than normal arithmetic (even for addition) • – Requires large fields (e.g. 1024-bit) • VSS-MPC • + Addition is essentially free • + Can use any size field • – Can't tolerate t > n/2 corrupted players (can't do two-party multiplication)
Bridging the Two Paradigms • HOMO-MPC → VSS-MPC: • Inputs: c = E(a) (public) • Outputs: shareᵢ(a) = D_SKᵢ(c) (private) • VSS-MPC → HOMO-MPC: • Inputs: shareᵢ(a) (private) • Outputs: c = Πᵢ E(shareᵢ(a)) (public) • A hybrid protocol is possible
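To make the VSS → HOMO direction concrete, here is a toy Paillier instance (one standard additively homomorphic scheme; the deck does not commit to a specific one) showing both E(a)E(b) = E(a+b) and the conversion c = Πᵢ E(shareᵢ(a)). The primes are tiny and purely illustrative; real parameters are 1024-bit or larger, as the previous slide notes:

```python
# Toy Paillier cryptosystem: additively homomorphic public-key encryption.
import math
import secrets

p, q = 1_000_003, 1_000_033           # small primes (illustrative only)
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)          # Carmichael function of n
mu = pow(lam, -1, n)                  # valid because gcd(lam, n) == 1

def enc(m):
    """Encrypt m with fresh randomness r, using g = n + 1."""
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    """Standard Paillier decryption: L(c^lam mod n^2) * mu mod n."""
    return ((pow(c, lam, n2) - 1) // n * mu) % n

# Homomorphism: multiplying ciphertexts adds plaintexts.
assert dec((enc(17) * enc(25)) % n2) == 42

# VSS -> HOMO: encrypt each additive share of a, multiply the ciphertexts.
a = 1234
s0 = secrets.randbelow(n)
shares = [s0, (a - s0) % n]
c = 1
for s in shares:
    c = (c * enc(s)) % n2
assert dec(c) == a                    # c = E(share_0) * E(share_1) = E(a)
```

Each party can thus publish an encryption of its own share, and anyone can combine them into an encryption of the secret without learning it.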
Efficiency & Security Assumptions • Existing protocols: uniform trust assumption • All players are corrupted with the same probability • Damage caused by one corrupted player = damage caused by another • A common mechanism protects the weakest link against the most severe attacks • But players are heterogeneous in their trustworthiness, interests, incentives, etc. • Corporate servers behind firewalls vs. desktops maintained by high-school kids • The collusion example
Exploiting the Difference • Server is secure against outside attacks • Companies spend $$$ to protect their servers • The server often holds much more valuable info than what the protocol reveals • PPs won't collude with the server • Conflicts of interest, mutual distrust, laws • Server can't trust that clients will keep a conspiracy secret • Server won't corrupt client machines • Market forces and laws • Rely on the server for protection against outside attacks, and on PPs for defending against a curious server
How to Compute Any Arithmetic Function – P4P Style • Each player secret-shares her data between the server and one PP using (2, 2)-VSS • Server and PP convert to HOMO-MPC for multiplication; use VSS for addition. Result obtained by threshold decryption or secret reconstruction • Dealing with a malicious adversary: a cheating PP is replaced by another • 2 << n! • Communication independent of n • Computation on the talliers ~ fully distributed version
Addition-Only Algorithms • Although general computation is made more efficient in P4P, multiplication is still far more expensive than addition • A large number of practical algorithms can be implemented with addition-only aggregation • Collaborative filtering [IEEESP02, SIGIR02] • HITS, PageRank … • The E-M algorithm, HMMs, most linear algebra algorithms …
New Vector-Addition-Based MPC • User i has an m-dimensional vector dᵢ; want to compute [y, A′] = F(Σᵢ₌₁ⁿ dᵢ, A) • Goals • Privacy: no one learns dᵢ except user i • Correctness: the computation should be verified • Validity: ‖dᵢ‖₂ < L w.h.p.
Cost for Private Computation: Vector Addition Only (figure: total computation cost split into the cost of computation on obfuscated data and the cost of privacy/security) • Computation on obfuscated data: σ_C = O(mn) for both HOMO and VSS • Privacy/security overhead: O(n log m) • The hidden constant – HOMO: 10,000; VSS: 1 or 2
Basic Architecture (figure: each user i splits her vector as uᵢ + vᵢ = dᵢ, sending uᵢ to the server and vᵢ to a privacy peer; the server accumulates μ = Σuᵢ, the PP accumulates ν = Σvᵢ; finally [y, A′] = F(μ + ν, A))
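The vector-addition pipeline above can be sketched end to end in a few lines of Python. The modulus Q, the dimension M, the sample vectors, and the variable names are all illustrative; F is left as the identity on the aggregate, standing in for any addition-based algorithm:

```python
# P4P vector-addition sketch: each user splits her vector d_i into random
# shares u_i + v_i = d_i (mod Q); the server sums the u_i, the privacy
# peer sums the v_i; combining the two accumulators yields sum(d_i).
import secrets

Q = 2**31 - 1   # share-space modulus (illustrative)
M = 4           # vector dimension (tiny, for illustration)

def split(d):
    """Return shares (u, v) with u + v = d componentwise mod Q."""
    u = [secrets.randbelow(Q) for _ in d]
    v = [(x - y) % Q for x, y in zip(d, u)]
    return u, v

users = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5, 5, 5]]
mu = [0] * M    # server's accumulator: sees only uniformly random u_i
nu = [0] * M    # privacy peer's accumulator: sees only random v_i
for d in users:
    u, v = split(d)
    mu = [(a + b) % Q for a, b in zip(mu, u)]
    nu = [(a + b) % Q for a, b in zip(nu, v)]

total = [(a + b) % Q for a, b in zip(mu, nu)]
assert total == [16, 27, 38, 49]   # componentwise sum of the d_i
```

Neither accumulator alone reveals anything about an individual dᵢ, yet only one round of cheap modular additions is needed per user, matching the "addition is essentially free" cost claim.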
Adversary Models • Model 1: Any number of users can be corrupted by a malicious adversary; both the PP and the server can be corrupted by different semi-honest adversaries • Model 2: Any number of users and the PP can be corrupted by a malicious adversary; the server can be corrupted by another malicious adversary, who however does not stop the computation
An Efficient Proof of Honesty • Show that some random projections of the user's vector are small • If the user fails T out of the N tests, reject his data • One proof per user vector, with complexity O(log m)
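The statistical core of this validity test can be simulated as below, with the zero-knowledge layer omitted. Since E[⟨d, r⟩²] = ‖d‖₂² for a random ±1 vector r, an honest vector with ‖d‖₂ ≤ L rarely exceeds a threshold of a few L², while a grossly oversized vector fails most projections. N, T, and the threshold factor here are illustrative choices, not the scheme's actual parameters:

```python
# Norm check via random projections: project d onto N random +-1 vectors
# and reject if more than T projections are "too large".
import random

def passes_test(d, L, N=50, T=25, factor=4, seed=1):
    rng = random.Random(seed)   # fixed seed keeps the demo deterministic
    fails = 0
    for _ in range(N):
        r = [rng.choice((-1, 1)) for _ in d]
        x = sum(a * b for a, b in zip(d, r))
        if x * x > factor * L * L:   # projection exceeds a few L^2
            fails += 1
    return fails <= T                # reject if it fails > T of N tests

m, L = 1000, 100
honest = [1] * m        # ||d||_2 = sqrt(1000) ~ 31.6, well under L
cheater = [1000] * m    # ||d||_2 ~ 31623, vastly over L
assert passes_test(honest, L)
assert not passes_test(cheater, L)
```

Each projection is a single inner product, so proving N = O(log m)-many of them small is what keeps the per-user proof cost logarithmic rather than per-element.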
Success Probability (figure omitted)
Complexity and Cost • Only one proof for each user vector – no per-element proofs! • Proof computation: O(log m) • m = 10⁶, l = 20, N = 50: need 1420 exponentiations, ~5 s/user • Benchmark: http://botan.randombit.net/bmarks.html, 1.6 GHz AMD Opteron (Linux, gcc 3.2.2)
Applications Service Provision with Privacy
Existing Service Architecture (figure omitted)
Traditional Service Model • Requires or reveals private user info • Locations, IP addresses, the data downloaded • Requires user authentication • For subscription verification and billing purposes • The traditional client-server paradigm allows the server to link these two pieces of info • P4P keeps them separate
P4P's Service Model • PP: authenticates the user and anonymizes communication • Server: processes the transaction • The PP knows the user's identity but not his data • The server knows the user's transaction but not his ID • To the PP: transactions are protected with crypto • To the server: transactions are unlinkable to each other or to a particular user
Possible Issues • The scheme involves multiple parties – why would they cooperate? • Server's concerns and fears: privacy peers are assigned the task of user authentication, so how can the server trust them? • Can the server block the PPs? • How to motivate the privacy peers? • How do we detect and trace fraud?
Solutions • Mechanisms to detect fraud and trace faulty players • PP incentive: rely on altruism or a mechanism to credit the PPs • (An extreme) A fully P2P structure among the users and PPs • The server cannot isolate the PPs • Independence! • A partial P2P structure should work (e.g. 5% PPs)
Billing Resolution • Fraud detection together with billing resolution • Schemes exist for a number of billing models (flat-rate, pay-per-use) • No info about users' transactions (except those of the faulty players) is leaked • An extension: the PP is replaced by a commercial privacy provider who does it for a profit • Now you can use the service without being embarrassed by Amazon knowing which DVD titles you buy • http://www.cs.berkeley.edu/~duan/research/qual/submitted/trustbus05.pdf
Conclusions • System design guidelines drawn from legal, economic and social-science research • P4P argues for peer involvement, exploits the heterogeneity among the players, and provides a viable framework for practical collaborative computation with privacy • P4P allows private computation based on VSS – privacy in P4P is offered almost for free!