Secure Multiparty Computation

1. Secure Multiparty Computation

2. Motivation A general framework for describing computation between parties who do not trust each other. Example: elections. Each of N parties has a "Yes" or "No" vote. Goal: determine whether the majority voted "Yes", but no voter should learn how other people voted. Example: auctions. Each bidder makes an offer. The offer should be committing (it can't be changed later). Goal: determine whose offer won without revealing the losing offers.

3. More Examples Example: distributed data mining. Two companies want to compare their datasets without revealing them, e.g., compute the intersection of two lists of names. Example: database privacy. Evaluate a query on the database without revealing the query to the database owner, or evaluate a statistical query on the database without revealing the values of individual entries. Many variations exist.

4. A Couple of Observations In all cases, we are dealing with distributed multi-party protocols. A protocol describes how parties are supposed to exchange messages on the network. All of these tasks can be easily computed by a trusted third party. The goal of secure multi-party computation is to achieve the same result without involving a trusted third party.

5. Problem (1/2)

6. Problem (2/2) A set of parties with private inputs wish to compute some joint function of their inputs. The parties wish to preserve some security properties, e.g., privacy and correctness. Example: a secure election protocol. Security must be preserved in the face of adversarial behavior by some of the participants, or by an external party.

7. How to Define Security? The definition must be mathematically rigorous. It must capture all realistic attacks that a malicious participant may try to stage. It should be "abstract": based on the desired "functionality" of the protocol, not on a specific protocol. Goal: define security for an entire class of protocols.

8. Ideal Model Intuitively, we want the protocol to behave "as if" a trusted third party collected the parties' inputs and computed the desired functionality. Computation in the ideal model is secure by definition!

9. Adversary Models Semi-honest model: all parties follow the protocol, but dishonest parties may be curious and try to violate others' privacy. Malicious model: some parties are dishonest and can deviate from the protocol and behave arbitrarily. We will focus on semi-honest adversaries and two-party protocols.

10. Correctness and Security How do we argue that the real protocol "emulates" the ideal protocol? Correctness: all honest participants should receive the correct result of evaluating function f, because a trusted third party would compute f correctly. Security: all corrupt participants should learn no more from the protocol than what they would learn in the ideal model. What does a corrupt participant learn in the ideal model? His input (obviously) and the result of evaluating f.

11. Simulation A corrupt participant's view of the protocol is the record of messages sent and received. In the ideal world, the view consists simply of his input and the result of evaluating f. How do we argue that the real protocol does not leak more useful information than the ideal-world view? Key idea: simulation. If the real-world view (i.e., messages received in the real protocol) can be simulated with access only to the ideal-world view, then the real-world protocol is secure. The simulation must be indistinguishable from the real view.

12. Yao's Theorem Secure multiparty computation was first introduced by Andrew Yao, a Turing Award recipient. For ANY efficiently computable function, there is a secure two-party protocol in the semi-honest model. This is the first completeness theorem for secure computation. In theory, there is no need to design protocols for specific functions. Surprising!

13. The Setting of Yao's Theorem Alice has an input x. Bob has an input y. We need a way to evaluate f(x,y) such that Alice and Bob learn only f(x,y).

14. Circuit Computation The design of Yao's protocol is based on circuit computation. Any computable function can be represented as a family of boolean circuits (a theorem from the theory of Turing machines). Such a circuit consists of AND, OR, and NOT gates.

15. Garbled Circuit (1/2)

16. Yao's Protocol First, convert the function into a boolean circuit.

17. Example: Convert x=y to a Boolean Circuit Alice's input: a bit value x. Bob's input: a bit value y.
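The circuit figure for this slide did not survive in the transcript. As a minimal sketch, one standard way to express single-bit equality with only the AND, OR, and NOT gates named on the Circuit Computation slide is (x AND y) OR (NOT x AND NOT y). The function names below are illustrative and do not come from the slides.

```python
def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def NOT(a):
    return 1 - a

def equal_bits(x, y):
    """Single-bit equality test built only from AND, OR, and NOT gates."""
    return OR(AND(x, y), AND(NOT(x), NOT(y)))

# Truth table of the circuit over all four input combinations.
truth_table = {(x, y): equal_bits(x, y) for x in (0, 1) for y in (0, 1)}
print(truth_table)
```

The circuit outputs 1 exactly when Alice's bit and Bob's bit agree, which is the function the following steps garble.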

18. 1: Pick Random Keys For Each Wire Next, evaluate one gate securely; later, generalize to the entire circuit. Alice picks two random keys for each wire: one key corresponds to "0", the other to "1". That gives 6 keys in total for a gate with 2 input wires.

19. 2: Encrypt Truth Table Alice encrypts each row of the truth table by encrypting the output-wire key with the corresponding pair of input-wire keys

20. 3: Send Garbled Truth Table Alice randomly permutes ("garbles") the encrypted truth table and sends it to Bob.

21. 4: Send Keys For Alice's Inputs Alice sends the key corresponding to her input bit. Keys are random, so Bob does not learn what this bit is.

22. How Does Bob Retrieve the Keys for His Input from Alice?

23. Oblivious Transfer (OT) Fundamental Secure Multiparty Computation primitive [Rabin 1981]

24. One-Way Trapdoor Functions Intuition: a one-way function F is easy to compute, but hard to invert (skip the formal definition for now). We will be interested in one-way permutations. Intuition: a one-way trapdoor function T is a one-way function that is easy to invert given extra information called the trapdoor. Example: if n = pq where p and q are large primes, e is relatively prime to φ(n), and de ≡ 1 (mod φ(n)), then F and T can be defined as F(x) = x^e mod n and T(x) = x^d mod n. We have T(F(x)) = (x^e)^d mod n = x mod n = x.
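The RSA example on this slide can be checked numerically. The sketch below uses deliberately tiny primes (p = 61, q = 53, e = 17), which are illustrative assumptions chosen so the arithmetic is easy to follow; they offer no security. It needs Python 3.8 or later for the modular inverse via pow.

```python
from math import gcd

# Toy parameters (assumed for illustration only; real primes are far larger).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # Euler's totient of n
e = 17                       # public exponent, must be coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)          # trapdoor: d * e = 1 (mod phi)

def F(x):
    """Forward direction: easy for anyone who knows (n, e)."""
    return pow(x, e, n)

def T(y):
    """Inverse direction: easy only with the trapdoor d."""
    return pow(y, d, n)

x = 1234
print(F(x), T(F(x)))
```

Anyone can compute F from the public pair (n, e), while inverting it without d is believed to be hard when p and q are large.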

25. Oblivious Transfer Protocol Assume the existence of some family of one-way trapdoor permutations

26. 5: Use Oblivious Transfer on Keys for Bob's Input Alice and Bob run the oblivious transfer protocol. Alice's input is the two keys corresponding to Bob's wire. Bob's input into OT is simply his 1-bit input on that wire.
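The transcript does not preserve the protocol diagram, so the following is only a sketch of one well-known 1-out-of-2 oblivious transfer built from a trapdoor permutation (an RSA-based construction in the style of Even, Goldreich, and Lempel), not necessarily the exact protocol shown on the slide. Both roles are simulated in one function, and the RSA parameters are toy values assumed for illustration; the two "keys" are small integers below the modulus.

```python
import secrets

# Toy RSA trapdoor permutation owned by Alice (assumed parameters, not secure).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # Alice's trapdoor

def oblivious_transfer(m0, m1, b):
    """Bob obtains m_b; Alice does not learn b, and Bob does not learn the other message."""
    # Alice -> Bob: two random values, one per message.
    x = (secrets.randbelow(N), secrets.randbelow(N))
    # Bob -> Alice: he blinds the value matching his choice bit with k^E.
    k = secrets.randbelow(N)
    v = (x[b] + pow(k, E, N)) % N
    # Alice inverts for both candidates; exactly one result equals k,
    # but she cannot tell which one.
    k0 = pow((v - x[0]) % N, D, N)
    k1 = pow((v - x[1]) % N, D, N)
    # Alice -> Bob: both messages, each masked with one candidate.
    c0, c1 = (m0 + k0) % N, (m1 + k1) % N
    # Bob can remove the mask only from the message he chose.
    return ((c0, c1)[b] - k) % N

key_for_0, key_for_1 = 1111, 2222   # stand-ins for Alice's two wire keys
print(oblivious_transfer(key_for_0, key_for_1, 1))
```

In the garbled-circuit setting, m0 and m1 are the two keys for Bob's input wire and b is his input bit. This sketch targets the semi-honest model the slides focus on.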

27. 6: Evaluate Garbled Gate Using the two keys that he learned, Bob decrypts exactly one of the output-wire keys Bob does not learn if this key corresponds to 0 or 1 Why is this important?
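Steps 1 through 6 for a single gate can be sketched as follows. This is a toy illustration under stated assumptions: the "encryption" is an XOR with a SHA-256 hash of the two input-wire keys, and a block of zero bytes is appended so Bob can recognize the one row that decrypts correctly. The slides do not specify the cipher or how Bob recognizes the right row, so both choices are stand-ins, as are all function names.

```python
import hashlib
import secrets

KEY_LEN = 16
TAG = bytes(KEY_LEN)   # zero padding that marks a correctly decrypted row

def pad(ka, kb):
    """One-time pad derived from a pair of input-wire keys (assumed construction)."""
    return hashlib.sha256(ka + kb).digest()   # 32 bytes = key + tag

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def garble_gate(gate):
    """Alice: pick two keys per wire, encrypt each truth-table row, shuffle the rows."""
    keys = {w: (secrets.token_bytes(KEY_LEN), secrets.token_bytes(KEY_LEN))
            for w in ("a", "b", "out")}
    table = [xor(pad(keys["a"][x], keys["b"][y]), keys["out"][gate(x, y)] + TAG)
             for x in (0, 1) for y in (0, 1)]
    secrets.SystemRandom().shuffle(table)
    return keys, table

def evaluate_gate(table, ka, kb):
    """Bob: with one key per input wire, exactly one row yields a valid output key."""
    for row in table:
        plain = xor(pad(ka, kb), row)
        if plain[KEY_LEN:] == TAG:
            return plain[:KEY_LEN]
    raise ValueError("no row decrypted; wrong keys or corrupted table")

keys, table = garble_gate(lambda x, y: int(x == y))
out_key = evaluate_gate(table, keys["a"][1], keys["b"][1])
print(out_key == keys["out"][1])
```

Bob ends up holding a random-looking output key. Because the rows are shuffled and the keys carry no label, he cannot tell whether that key stands for 0 or 1.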

28. 7: Evaluate Entire Circuit In this way, Bob evaluates the entire garbled circuit. For each wire in the circuit, Bob learns only one key. It corresponds to 0 or 1 (Bob does not know which). Therefore, Bob does not learn intermediate values (why?). Bob tells Alice the key for the final output wire and she tells him if it corresponds to 0 or 1. Bob does not tell her intermediate wire keys (why?).

29. Brief Discussion of Yao's Protocol The function must be converted into a circuit; for many functions, the circuit will be huge. If there are m gates in the circuit and n inputs, then 4m encryptions and n oblivious transfers are needed. Oblivious transfers for all inputs can be done in parallel. Yao's construction gives a constant-round protocol for secure computation of any function in the semi-honest model: the number of rounds does not depend on the number of inputs or the size of the circuit!

30. Privacy and Integrity Preserving Range Queries in Sensor Networks

31. Wireless Sensor Networks (1/2) One way to store the data and process users' queries: each sensor sends its data to the sink, and the sink stores the data and processes queries. Drawback: sending data or results from a sensor to the sink consumes a lot of power because of multi-hop transmission.

32. Wireless Sensor Networks (2/2) Another way to store the data and process users' queries: each sensor stores its own data, and upon receiving a query, the sink searches all sensors. Drawback: sensors need a large memory space.

33. Two-tiered Sensor Network A two-tier sensor network [Ratnasamy et al. 2003]. Benefits: power saving for sensors, memory saving for sensors, and efficient query processing. Several storage-node products, such as StarGate and RISE, are commercially available.

34. Storage nodes can be compromised Storage nodes are attractive targets for attack, because the sensitive data collected by sensors are stored in them. Two security problems arise if a storage node is compromised: How to preserve the privacy of sensor-collected data and sink-issued queries? How to preserve the integrity of query results?

35. Problem: Privacy and Integrity Preserving Range Queries (1/2) Assumption: all sensor nodes and storage nodes are loosely synchronized with the sink. We divide time into time slots with a fixed period. Each sensor collects and sends n data items per time slot, (i, t, {d1, d2, ..., dn}), where i is the sensor ID and t is the sequence number of the time slot.

36. Problem: Privacy and Integrity Preserving Range Queries (2/2) Preserving privacy: a compromised storage node cannot gain information from sensor-collected data and sink-issued queries, yet a storage node can still perform query processing. Preserving integrity: the sink can detect whether a query result from a storage node includes forged data items or excludes any data items that satisfy the query.

37. Privacy Preserving Scheme (1/2) To protect the privacy of sensor-collected data, encrypt each data item individually. How does a storage node process a query over encrypted data? Let us first consider a simplified problem: how to check whether two numbers are equal (x = y) in a privacy-preserving manner?

38. Privacy Preserving Scheme (2/2) How does a storage node process a query over encrypted data? Convert the problem of checking di ∈ [a, b] to that of checking x = y, using the prefix membership verification technique.
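The details of the technique were on a figure that the transcript lost, so the following is a sketch of prefix membership verification as it is commonly described: a value x lies in [a, b] exactly when the prefix family of x shares an element with the minimal set of prefixes covering [a, b]. Each comparison is then an equality test on keyed hashes. The 4-bit width, the key, the function names, and the use of HMAC-SHA256 (the evaluation slide mentions HMAC-MD5) are assumptions for illustration.

```python
import hashlib
import hmac

W = 4  # bit width of data values (assumed for the example)

def prefix_family(x):
    """All W+1 prefixes that match x, from the exact value up to the all-wildcard prefix."""
    bits = format(x, f"0{W}b")
    return {bits[:W - i] + "*" * i for i in range(W + 1)}

def range_to_prefixes(a, b):
    """Minimal set of prefixes whose union is exactly the range [a, b]."""
    out = set()
    while a <= b:
        size = 1
        # Grow the aligned block starting at a while it still fits inside [a, b].
        while a % (size * 2) == 0 and a + size * 2 - 1 <= b:
            size *= 2
        wildcards = size.bit_length() - 1
        out.add(format(a, f"0{W}b")[:W - wildcards] + "*" * wildcards)
        a += size
    return out

def keyed(prefixes, key):
    """Keyed hashes of prefixes; without the key they reveal nothing useful."""
    return {hmac.new(key, p.encode(), hashlib.sha256).hexdigest() for p in prefixes}

def in_range(x, a, b, key):
    """Range test reduced to testing equality between hashed prefixes."""
    return bool(keyed(prefix_family(x), key) & keyed(range_to_prefixes(a, b), key))

key = b"sensor-sink shared key (hypothetical)"
print(sorted(range_to_prefixes(4, 10)), in_range(9, 4, 10, key))
```

In the setting of the slides, the sensor would supply the hashed prefix family of each data item and the sink the hashed prefixes of the query range, so the storage node compares hashes without seeing either value.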

39. Integrity Preserving Scheme (1/2) Neighborhood Chaining Encrypt the data item with its neighbors

40. Integrity Preserving Scheme (2/2) Do we really need bi-directional chaining? No, we only need to build a one directional neighborhood chain.

41. What if the query result is empty? The storage node only knows that no data item satisfies the query. It doesn't know which item is the verification object.

42. Privacy Preserving Scheme V2 How does a storage node process a query over encrypted data?

43. Multi-dimensional Data (1/2) To preserve privacy, we apply our 1-dimensional privacy preserving techniques to each dimension of multi-dimensional data. To preserve integrity, we build a multi-dimensional neighborhood chain.

44. Multi-dimensional Data (2/2)

45. Range Queries in Event-driven Networks (1/2) We have assumed that at each time slot, a sensor sends data to a storage node. However, in event-driven networks, a sensor only reports data to a storage node when certain event happens.

46. Range Queries in Event-driven Networks (2/2) Our idea: sensors report their idle period to the storage node when one of the following two conditions holds: (1) the sensor submits data after an idle period; (2) the idle period is longer than a threshold.

47. Security Analysis Privacy: without knowing the keys used in the encryption and secure hashing, it is computationally infeasible to compute the actual values of sensor-collected data and the corresponding prefixes. Integrity: the query result and verification object should satisfy three properties. 1. Items in the query result and verification object form a chain; excluding any item in the middle or changing any item violates this property. 2. The first item contains the value of its left neighbor, which should be out of the query range on the smaller end. 3. The last item contains the value of its right neighbor, which should be out of the query range on the larger end.
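The three integrity properties can be sketched as a check run by the sink. This is one possible reading, using the one-directional chain from the Integrity Preserving Scheme slides: each item carries the value of its left neighbor, and the verification object is the first item to the right of the range. Several things are assumptions made for illustration: the sentinel bounds LOW and HIGH, the use of an HMAC tag in place of the scheme's encryption, and the fact that the storage node here sees plaintext values (in the real scheme it would select items via hashed prefixes).

```python
import hashlib
import hmac

LOW, HIGH = -1, 2**16   # assumed sentinels just outside the data domain [0, 2**16)

def seal(key, left, value):
    """Bind a value to its left neighbor (HMAC stands in for encryption here)."""
    tag = hmac.new(key, f"{left}|{value}".encode(), hashlib.sha256).digest()
    return (left, value, tag)

def build_chain(key, values):
    """Sensor side: sort the values and chain each one to its left neighbor."""
    chain = [LOW] + sorted(values) + [HIGH]
    return [seal(key, chain[i - 1], chain[i]) for i in range(1, len(chain))]

def answer(items, a, b):
    """Honest storage node: matching items plus the first item beyond b."""
    hits = [it for it in items if a <= it[1] <= b]
    closer = next(it for it in items if it[1] > b)   # verification object
    return hits + [closer]

def verify(key, reply, a, b):
    """Sink side: check authenticity, chaining, and both range boundaries."""
    if not reply:
        return False
    if any(not hmac.compare_digest(tag, seal(key, left, value)[2])
           for left, value, tag in reply):
        return False                                   # some item was altered
    linked = all(reply[i + 1][0] == reply[i][1] for i in range(len(reply) - 1))
    inside = all(a <= value <= b for _, value, _ in reply[:-1])
    return linked and inside and reply[0][0] < a and reply[-1][1] > b

key = b"sensor-sink shared key (hypothetical)"
items = build_chain(key, [7, 3, 12, 9])
print(verify(key, answer(items, 5, 10), 5, 10))
```

Dropping an item from the middle breaks the link check, and an empty result is still verifiable because the single returned item must straddle the whole query range.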

48. Complexity Analysis Given n z-dimensional data items that a sensor collects in a time slot, the computation cost, communication cost, and storage space of SafeQ are shown as follows.

49. Experimental Results: Evaluation Setup We implemented both the SafeQ and Sheng&Li (prior art) schemes using TOSSIM. We measured the efficiency of the SafeQ and Sheng&Li schemes on 1-, 2-, and 3-dimensional data. We conducted our experiments on the same data set that Sheng&Li used in their experiment. We used HMAC-MD5 with 128-bit keys as the hash function for hashing prefix numbers. We chose the number of hash functions to be 4, which guarantees that the false positive rate is less than 1%. We experimented with different sizes of time slots ranging from 10 minutes to 80 minutes. For each time slot, we generated 1,000 random range queries.
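For readers unfamiliar with the Bloom filter behind SafeQ-Bloom, here is a minimal sketch with 4 keyed hash functions. It is not the authors' implementation: the class, the keys, and the filter size are hypothetical, and HMAC-SHA256 is swapped in for the HMAC-MD5 mentioned above. The standard estimate of the false-positive rate, (1 - e^(-kn/m))^k for k hash functions, n items, and m bits, shows that the rate also depends on how many bits are allotted per item, which the slide does not state.

```python
import hashlib
import hmac
from math import exp

class BloomFilter:
    """Set membership with no false negatives and a tunable false-positive rate."""

    def __init__(self, num_bits, keys):
        self.num_bits = num_bits
        self.keys = keys                     # one HMAC key per hash function
        self.bits = bytearray(num_bits)      # one byte per bit, for simplicity

    def _positions(self, item):
        for key in self.keys:
            digest = hmac.new(key, item, hashlib.sha256).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

def expected_fp_rate(k, n, m):
    """Standard approximation of the false-positive rate."""
    return (1 - exp(-k * n / m)) ** k

keys = [b"k1", b"k2", b"k3", b"k4"]          # hypothetical keys, 4 hash functions
bf = BloomFilter(1200, keys)
for i in range(100):
    bf.add(str(i).encode())
print(b"42" in bf, expected_fp_rate(4, 100, 1200))
```

With 4 hash functions, the estimate falls below 1% once the filter has roughly 11 or more bits per stored item.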

50. Experimental Results: 3-dimensional Data (1/2) In terms of power consumption, for 3-dimensional data: SafeQ-Bloom consumes 184.9 times less power for sensors and 76.8 times less power for storage nodes; SafeQ-Basic consumes 59.2 times less power for sensors and 76.8 times less power for storage nodes.

51. Experimental Results: 3-dimensional Data (2/2) In terms of space consumption, for 3-dimensional data: SafeQ-Bloom consumes 182.4 times less space for storage nodes; SafeQ-Basic consumes 58.5 times less space for storage nodes.

52. Prior work (1/5) Sheng&Li scheme [Infocom 2008]

53. Prior work (2/5) Two major drawbacks of the Sheng&Li scheme [Infocom 2008]: (1) A compromised storage node can obtain fairly accurate estimates of data items and queries [Hore et al. VLDB 2004]. (2) Power and space consumption grow exponentially with the number of dimensions.

54. Prior work (3/5) Shi et al. scheme [Infocom 2009] To preserve privacy, they use the same privacy scheme as the Sheng&Li scheme. To preserve integrity, they propose a scheme that distributes the bucket vector of a sensor to its nearby sensors so that the sink can verify integrity using the bucket vector.

55. Prior work (4/5) Zhang et al. scheme [MobiHoc 2009] The Zhang et al. scheme extends the Shi et al. scheme to support multi-dimensional data.

56. Prior work (5/5) Two major drawbacks of Shi et al.'s scheme and Zhang et al.'s scheme: (1) A compromised sensor could easily compromise the integrity verification functionality of the network by sending falsified bucket vectors to other sensors and storage nodes. If Si is compromised, Si can send a faked Vi, i.e., Vi', to Sj. If Sj is compromised, Sj can change Vi (received from Si) to a faked Vi'. The sink cannot distinguish the above two cases. (2) A compromised storage node can obtain fairly accurate estimates of data items and queries [Hore et al. VLDB 2004].

57. Contributions Propose a novel privacy and integrity preserving range query protocol for two-tiered sensor networks Propose an optimization technique using Bloom filters to significantly reduce the communication cost between sensors and storage nodes Propose a solution for event-driven sensor networks

58. Questions
