P4P: A Framework for Practical Server-Assisted Multiparty Computation with Privacy

P4P: A Framework for Practical Server-Assisted Multiparty Computation with Privacy • Yitao Duan • Berkeley Institute of Design, UC Berkeley • Qualifying Exam, April 18, 2005

Outline • Problem and motivation • Privacy issues examined • Privacy is never a purely tech issue • Derive some design principles • The P4P framework • Applications • Practical multiparty arithmetic computation with privacy • Service provision with privacy • Progress and future work

Problem Scenario

Applications and Motivation • “Next generation search” makes heavy use of personal data for customized search, context-awareness, expertise mining and collaborative filtering • E-commerce vendors (like Amazon) try to build user purchase profiles across markets. And user profiling is moving onto the desktop • Location based services, real-world monitoring …

Legal Perspectives • Privacy issues arise as a tension between two parties: one seeks info about the other • Identity of the seeker leads to different situations and precedents • E.g. individual vs. the press, vs. the employer • Power imbalance between the two • Loss of privacy often leads to real harm: e.g. loss of job, loss of rights, etc. [AK95]

Economic Perspectives • Market forces work against customer privacy • Company has to do extra work to get less info • Company can benefit from having user info • So they lack the incentive to adopt PETs • Power imbalance (again!) in e-commerce • But we, as users, can make a difference by flexing our collective muscles! • Users often underestimate the risk of privacy intrusion and are unwilling to pay for PETs [FFSS02, ODL02, A04]

Social Science Perspectives • Privacy is NOT minimizing disclosure • Maintaining a degree of privacy often requires disclosure of personal information [Altman 75] • E.g. faculty members put “Prospective students please read this before you email me …” on their web page • Sociality requires free exchange of some information • PET should not prevent normal exchange

Lessons for Designing Practical Systems • Almost all problems are preserved, or even exaggerated in computing • Tension exists but court arbitration not available • Power imbalance prevails with no protection of the weak – client/server paradigm • Lack of incentive (to adopt PET, to cooperate, etc) • Design constraints for practical PET • Cost of privacy must be close to 0. And the privacy scheme must not conflict with the powerful actor’s need

The P4P Philosophy You can’t wait for privacy to be granted. One has to fight for it.

P4P: Π² Principles • Prevention: Not deterrence • Incentive: Design should consider the incentives of the participants • Protection: Design should incorporate mechanisms that protect the weak parties • Independence: The protection should be effective even if some parties do not cooperate

Topologies • P2P • Client-server

Problems With the Two Paradigms • Client-server • Power imbalance • Lack of incentive • P2P • Doesn’t always match all the transaction models (e.g. buying PCs from Dell) • Hides the heterogeneity • Many efficient server-based computations are too expensive if done P2P

The P4P Architecture • Privacy Peer (PP) • A subset of users are elected as “privacy providers” (called privacy peers) within the group • PPs provide privacy when they are available, but can’t access data themselves

P4P Basics • Server is (almost) always available but PPs aren’t (but should be periodically) – asynchronous or semi-synchronous protocols • Server provides data archival and synchronizes the protocol • Server only communicates with PPs occasionally (when they are online and lightly loaded, e.g. 2 AM) • Server can often be trusted not to bias the computation – but we have means to verify it • PPs and all other users are completely untrusted

The Half-Full/Half-Empty Glass • In a typical P2P system, 5% of the peers provide 70% of the services [GFS] • P2P: 70% of the users are free riding • P4P: 5+% of the users are serving the community • Enough for P4P to work practically!

Roles of the Privacy Peers • Anonymizing Communication • E.g. Anonymizer.com or Mix • Offloading the Server • Sharing Information • Participating in Computation • Others Infrastructure Support

Tools and Services • Cryptographic tools: Commitment, VSS, ZKP, Anonymous authentication, eCash, etc. • Anonymous Message Routing • E.g. MIX network [CHAUM] • Data protection scheme [PET04] Λ: the set of users who should have access to X • Anonymous SSL

Applications Practical Multiparty Arithmetic Computation with Privacy

Applications Multiparty Computation • n parties with private inputs wish to compute some joint function of their inputs • Must preserve security properties. E.g., privacy and correctness • Adversary: participants or external • Semi-honest: follows the protocol but curious • Malicious: can behave arbitrarily

Applications MPC – Known Results • Computational Setting: Trapdoor permutations • Any two-party function can be securely computed in the semi-honest model [Yao] • Any multiparty function can be securely computed in the malicious model, for any number of corrupted parties [GMW] • Info-Theoretic Setting: No complexity assumption • Any multiparty function can be securely computed in the malicious model if more than 2n/3 of the n parties are honest [BGW, CCD] • With a broadcast channel, only more than n/2 honest parties are needed [RB]
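The information-theoretic thresholds on this slide can be turned into a small calculation. The sketch below is only an illustration: the function name `max_corrupted` is hypothetical, and it assumes the standard reading of the results, namely fewer than n/3 corrupted parties without broadcast [BGW, CCD] and fewer than n/2 with a broadcast channel [RB].

```python
def max_corrupted(n: int, broadcast: bool = False) -> int:
    """Largest number t of corrupted parties tolerated among n parties
    in the information-theoretic malicious model (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Without broadcast we need t < n/3; with broadcast, t < n/2.
    divisor = 2 if broadcast else 3
    # Largest integer t with divisor * t < n.
    return (n - 1) // divisor


if __name__ == "__main__":
    for n in (3, 4, 10, 100):
        print(n, max_corrupted(n), max_corrupted(n, broadcast=True))
```

For example, with 10 parties the protocols tolerate 3 corrupted parties without broadcast and 4 with it.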

Applications A Solved Problem? • Boolean circuit based protocols totally impractical • Arithmetic better but still expensive: the best protocols have O(n^3) complexity to deal with active adversary • Can’t be used directly in real systems with large scale: 10^3 ~ 10^6 users each with 10^3 ~ 10^6 data items

Applications Contributions to Practical MPC • P4P provides a setting where generic arithmetic MPC protocols can be run much more efficiently • Existing protocols (the best one): O(n^3) complexity (malicious model) • P4P allows n to be reduced without sacrificing security • Enables new protocols to make a whole class of computation practical
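To get a feel for why reducing n matters under cubic complexity, here is a rough back-of-the-envelope sketch. It treats the cost as exactly n^3 with no constant factor, which is an assumption made purely for illustration, and the helper name `cost_ratio` and the reduced party count used below are hypothetical.

```python
def cost_ratio(n_full: int, n_reduced: int) -> float:
    """Ratio of n^3 costs when the number of computing parties drops
    from n_full to n_reduced (constant factors ignored, an assumption)."""
    if n_full < 1 or n_reduced < 1:
        raise ValueError("party counts must be positive")
    return (n_full ** 3) / (n_reduced ** 3)


if __name__ == "__main__":
    # 10^3 users all acting as MPC players vs. a hypothetical 10 players.
    print(cost_ratio(10**3, 10))
```

Under these assumptions, going from 1,000 computing parties to 10 reduces the cubic term by a factor of one million.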

Applications Arithmetic: Homomorphism vs VSS • Homomorphism: E(a)E(b) = E(a+b) • Verifiable Secret Sharing (VSS): a → a1, a2, …, an • Addition easy • E(a)E(b) = E(a+b) • share(a) + share(b) = share(a+b) • Multiplication more involved for both • HOMO-MPC: O(n^3) w/ big constant [CDN01, DN03] • VSS-MPC: O(n^4) (e.g. [GRR98])
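The identity share(a) + share(b) = share(a+b) can be checked with plain Shamir secret sharing. The sketch below is a minimal illustration, not the scheme used in P4P: it omits the "verifiable" part of VSS (no commitments or proofs), and the prime modulus, the function names and the parameters are all assumptions chosen for the example. It needs Python 3.8+ for `pow(x, -1, p)`.

```python
import secrets

PRIME = 2**61 - 1  # hypothetical field modulus (a Mersenne prime)


def share(secret: int, n: int, t: int) -> list:
    """Split `secret` into n Shamir shares; any t+1 shares reconstruct it."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(t)]
    # Share i is the polynomial evaluated at x = i (x = 0 holds the secret).
    return [
        sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME
        for x in range(1, n + 1)
    ]


def reconstruct(points: list) -> int:
    """Lagrange interpolation at x = 0 from (x, y) pairs."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


if __name__ == "__main__":
    a, b = 123, 456
    sa, sb = share(a, n=5, t=2), share(b, n=5, t=2)
    # Each party adds its two shares locally; no communication is needed.
    sc = [(x + y) % PRIME for x, y in zip(sa, sb)]
    points = list(enumerate(sc, start=1))
    print(reconstruct(points[:3]))  # any 3 of the 5 shares suffice
```

Addition costs one field addition per party, which is why the slides treat it as cheap; multiplication of shared values is not covered by this sketch.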

Applications Arithmetic: Homomorphism vs VSS • HOMO-MPC + Can tolerate t < n corrupted players as far as privacy is concerned - Uses public key crypto, 10,000x more expensive than normal arithmetic (even for addition) - Requires large fields (e.g. 1024 bit) • VSS-MPC + Addition is essentially free + Can use any size field - Can’t tolerate t > n/2 corrupted players (can’t do two party multiplication)
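To make the cost of the homomorphic route concrete, the following toy sketch uses Paillier encryption, one well-known additively homomorphic scheme. Choosing Paillier is an assumption for illustration; the slides do not name a specific cryptosystem. The primes are tiny and completely insecure, and all names are hypothetical. Note that one plaintext addition turns into a multiplication modulo n^2, and encryption needs a modular exponentiation.

```python
import math
import secrets

# Toy parameters only: real deployments use primes of 512 bits or more.
P_TOY, Q_TOY = 10007, 10009
N = P_TOY * Q_TOY
N2 = N * N
LAM = (P_TOY - 1) * (Q_TOY - 1) // math.gcd(P_TOY - 1, Q_TOY - 1)  # lcm
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1


def encrypt(m: int) -> int:
    """Paillier encryption with g = N + 1: c = (1 + m*N) * r^N mod N^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """m = L(c^lambda mod N^2) * mu mod N, where L(x) = (x - 1) // N."""
    x = pow(c, LAM, N2)
    return (x - 1) // N * MU % N


if __name__ == "__main__":
    a, b = 1234, 5678
    # Multiplying ciphertexts adds the underlying plaintexts.
    c_sum = encrypt(a) * encrypt(b) % N2
    print(decrypt(c_sum))
```

With realistic key sizes every such operation works on numbers of a thousand bits or more, which is the overhead the slide contrasts with the essentially free share addition of VSS.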

Applications Efficiency & Security Assumptions • Existing protocols: uniform trust assumption • All players are corrupted with the same probability • Damages caused by one corrupted player = another • A common mechanism to protect the weakest link against the most severe attacks • But players are heterogeneous in their trustworthiness, interests, incentives, etc. • Corporate servers behind firewalls • Desktops maintained by high school kids • The collusion example

Applications Exploiting the Difference • Server is secure against outside attacks • Companies spend $$$ to protect their servers • The server often holds much more valuable info than what the protocol reveals • PPs won’t collude with the server • Conflicts of interest, mutual distrust, laws • Server can’t trust clients to keep a conspiracy secret • Server won’t corrupt client machines • Market forces and laws • Rely on server for protection against outside attacks, PPs for defending against a curious server

Applications Addition Only Algorithms • Although general computation made more efficient in P4P, multiplication still way more expensive than addition • A large number of practical algorithms can be implemented with addition only aggregation • Collaborative filtering [IEEESP02, SIGIR02] • HITS, PageRank … • E-M algorithm, HMM, most linear algebra algorithms …

Applications New Vector Addition Based MPC • User i has an m-dimensional vector d_i; we want to compute [y, A’] = F(Σ_{i=1}^{n} d_i, A) • Goals • Privacy: no one learns d_i except user i • Correctness: computation should be verified • Validity: ||d_i||_2 < L w.h.p.

Applications Cost for Private Computation: Vector Addition Only [Figure: total computation cost, split into cost for privacy/security and cost for computation on obfuscated data] • σC: O(mn) for both HOMO and VSS

Applications Cost for Private Computation: Vector Addition Only [Figure: total computation cost, split into cost for privacy/security and cost for computation on obfuscated data] • O(n log m) • σC: O(mn) for both HOMO and VSS • The hidden constant: HOMO: 10,000; VSS: 1 or 2

Applications Basic Architecture [Figure: each user vector d_i is split into shares u_i and v_i] • u_i + v_i = d_i

Applications Basic Architecture • μ = Σ u_i • ν = Σ v_i • u_i + v_i = d_i

Applications Basic Architecture [Figure: the partial sums μ and ν are combined] • μ = Σ u_i • ν = Σ v_i • u_i + v_i = d_i

Applications Basic Architecture [y, A’] = F(μ + ν, A)
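The share-and-sum flow on these architecture slides can be simulated in a few lines. This is a minimal sketch under stated assumptions: arithmetic is done modulo a hypothetical prime `FIELD`, inputs are small non-negative integers, the names `split`, `vector_sum` and `aggregate` are invented for the example, and the proofs of validity are left out. Which party receives which share is not spelled out in the slide text, so the comments only refer to "one party" and "the other party".

```python
import secrets

FIELD = 2**61 - 1  # hypothetical modulus; the slides do not fix one


def split(d: list) -> tuple:
    """Split vector d into random shares u, v with u + v = d (mod FIELD)."""
    u = [secrets.randbelow(FIELD) for _ in d]
    v = [(x - r) % FIELD for x, r in zip(d, u)]
    return u, v


def vector_sum(vectors: list) -> list:
    """Component-wise sum of equally long vectors, modulo FIELD."""
    return [sum(column) % FIELD for column in zip(*vectors)]


def aggregate(user_vectors: list) -> list:
    """Compute the sum of all d_i as mu + nu without pooling any raw d_i."""
    shares = [split(d) for d in user_vectors]
    mu = vector_sum([u for u, _ in shares])  # summed by one party
    nu = vector_sum([v for _, v in shares])  # summed by the other party
    return [(a + b) % FIELD for a, b in zip(mu, nu)]


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(aggregate(data))
```

Each share on its own is uniformly random, so neither party learns anything about an individual d_i as long as the two do not collude; only the combined sum μ + ν is fed into F.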

Applications Adversary Models • Model 1: Any number of users can be corrupted by a malicious adversary; both the PP and the server can be corrupted by different semi-honest adversaries • Model 2: Any number of users and the PP can be corrupted by a malicious adversary. The server can be corrupted by another malicious adversary who should not stop

Applications An Efficient Proof of Honesty • Show that some random projections of the user’s vector are small • If user fails T out of the N tests, reject his data • One proof per user vector, with complexity O(log m)
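The accept/reject logic of the projection test can be sketched in the clear. This is only a plaintext simulation under several assumptions: the challenge coefficients are drawn uniformly from {-1, +1}, the size bound is passed in by the caller, and the names `passes_projection_tests`, `bound` and `t_fail` are hypothetical. The actual protocol would perform these checks on committed or shared data with zero-knowledge proofs, which this sketch does not attempt.

```python
import random


def passes_projection_tests(d, bound, t_fail, n_tests=50, seed=None):
    """Accept d unless at least t_fail of n_tests random projections
    exceed `bound` in absolute value (plaintext simulation only)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_tests):
        # Hypothetical challenge distribution: one random sign per coordinate.
        challenge = [rng.choice((-1, 1)) for _ in d]
        projection = sum(c * x for c, x in zip(challenge, d))
        if abs(projection) > bound:
            failures += 1
    # "Fails T out of the N tests" means reject once failures reach T.
    return failures < t_fail


if __name__ == "__main__":
    print(passes_projection_tests([1, -1, 1], bound=3, t_fail=10))
    print(passes_projection_tests([10**9], bound=10, t_fail=10))
```

A vector with small entries passes every test, while a vector carrying one very large entry fails every test, because a ±1 projection cannot cancel it against the other coordinates.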

Applications Success Probability

Applications Complexity and Cost • Only one proof for each user vector – no per-element proofs! • Computation ∝ size of sk: O(log m) • m = 10^6, l = 20, with N = 50, need 1420 exponentiations • ~5 s/user • Benchmark: http://botan.randombit.net/bmarks.html, 1.6 GHz AMD Opteron (Linux, gcc 3.2.2)

Applications Service Provision with Privacy

Applications Existing Service Architecture

Applications Traditional Service Model • Requires or reveals private user info • Locations, IP addresses, the data downloaded • Requires user authentication • Subscription verification and billing purposes • Traditional client-server paradigm allows the server to link these two pieces of info • P4P keeps them separate

Applications P4P’s Service Model • Authenticates user • Anonymizes comm. • Processes the transaction • PP knows user’s identity but not his data • Server knows user’s transaction but not his ID • To the PP: Transactions protected w/ crypto • To the server: Transactions unlinkable to each other or to a particular user

Applications Possible Issues • The scheme involves multiple parties, why would they cooperate? • Server’s concerns and fears: Privacy peers are assigned the task of user authentication, how could the server trust the privacy peers? • Can the server block the PPs? • How to motivate the privacy peers? • How do we detect and trace any fraud?

Applications Solutions • Mechanism to detect fraud and trace faulty players • PP incentive: Rely on altruism or mechanism to credit the PPs • (An extreme) A fully P2P structure among the users and PPs • Server cannot isolate the PPs • Independence! • A partial P2P structure should work (e.g. 5% PPs)

Applications Billing Resolution • Fraud detection together with bill resolution • Have schemes for a number of billing models (flat-rate, pay-per-use) • No info about user’s transactions (except those of the faulty players) is leaked • An extension: PP replaced by a commercial privacy provider who does it for a profit • Now you can use its service and don’t have to be embarrassed by Amazon knowing the DVD title you buy • http://www.cs.berkeley.edu/~duan/research/qual/submitted/trustbus05.pdf

Conclusions • System design guidelines drawn from legal, economic and social science research • P4P argues for peer involvement, exploits the heterogeneity among the players, and provides a viable framework for practical collaborative computation with privacy • P4P allows for private computation based on VSS – privacy offered in P4P almost for free!

Progress So Far • Published work: • Data protection – PET04 • Link analysis – SIAM Link Analysis Workshop • Submitted: • Group Communication Cryptosystem • Service Provision with Privacy • In progress: • Practical Vector Addition Based Computation • Hybrid MPC • Anonymous SSL

Plan and Future Work • Finish the work at hand • Extend the practical computation to support multiplication? • Hybrid: Homomorphism and VSS based scheme • VSS: Efficient multiplication possible if we can have 3 non-colluding players (another server? Another PP?) • More applications? • Implementation • A P4P toolkit or lib that developers can use to build their applications • Time to graduate: 12 to 18 months
