1. Secure Multiparty Computation
2. Motivation General framework for describing computation between parties who do not trust each other
Example: elections
N parties, each one has a "Yes" or "No" vote
Goal: determine whether the majority voted "Yes", but no voter should learn how other people voted
Example: auctions
Each bidder makes an offer
Offer should be committing! (can't change it later)
Goal: determine whose offer won without revealing losing offers
3. More Examples Example: distributed data mining
Two companies want to compare their datasets without revealing them
E.g., compute the intersection of two lists of names
Example: database privacy
Evaluate a query on the database without revealing the query to the database owner
Evaluate a statistical query on the database without revealing the values of individual entries
Many variations
4. A Couple of Observations In all cases, we are dealing with distributed multi-party protocols
A protocol describes how parties are supposed to exchange messages on the network
All of these tasks can be easily computed by a trusted third party
The goal of secure multi-party computation is to achieve the same result without involving a trusted third party
5. Problem (1/2)
6. Problem (2/2) A set of parties with private inputs wish to compute some joint function of their inputs.
Parties wish to preserve some security properties. E.g., privacy and correctness.
Example: secure election protocol
Security must be preserved in the face of adversarial behavior by some of the participants, or by an external party.
7. How to Define Security? Must be mathematically rigorous
Must capture all realistic attacks that a malicious participant may try to stage
Should be abstract
Based on the desired functionality of the protocol, not a specific protocol
Goal: define security for an entire class of protocols
8. Ideal Model Intuitively, we want the protocol to behave as if a trusted third party collected the parties' inputs and computed the desired functionality
Computation in the ideal model is secure by definition!
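As a minimal sketch of the ideal model for the election example, assuming a trusted party that simply collects every vote and announces the majority result (the function names are illustrative, not from the slides):

```python
from typing import List, Tuple

def trusted_party_majority(votes: List[bool]) -> bool:
    """Ideal functionality f: True iff strictly more than half voted Yes."""
    return sum(votes) > len(votes) / 2

def ideal_world_views(votes: List[bool]) -> List[Tuple[bool, bool]]:
    """Each party's ideal-world view: (own input, announced result)."""
    result = trusted_party_majority(votes)
    return [(vote, result) for vote in votes]

views = ideal_world_views([True, False, True])
```

Each view contains only the party's own vote and the outcome, which is exactly what a secure protocol is allowed to reveal.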
9. Adversary Models Semi-honest model
All parties follow the protocol, but dishonest parties may be curious and try to learn others' private inputs.
Malicious model
Some parties are dishonest and can deviate from the protocol and behave arbitrarily.
We will focus on semi-honest adversaries and two-party protocols
10. Correctness and Security How do we argue that the real protocol emulates the ideal protocol?
Correctness
All honest participants should receive the correct result of evaluating function f
Because a trusted third party would compute f correctly
Security
All corrupt participants should learn no more from the protocol than what they would learn in the ideal model
What does a corrupt participant learn in the ideal model?
His input (obviously) and the result of evaluating f
11. Simulation A corrupt participant's view of the protocol = record of messages sent and received
In the ideal world, the view consists simply of his input and the result of evaluating f
How to argue that the real protocol does not leak more useful information than the ideal-world view?
Key idea: simulation
If the real-world view (i.e., messages received in the real protocol) can be simulated with access only to the ideal-world view, then the real-world protocol is secure
Simulation must be indistinguishable from the real view
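One common way to state this requirement formally, for a deterministic two-party function f in the semi-honest model, is sketched below. The notation is an assumption and is not taken from the slides: a protocol π is considered secure if there exist efficient simulators S_1 and S_2 such that

```latex
\{ S_1(x, f(x,y)) \}_{x,y} \stackrel{c}{\equiv} \{ \mathrm{view}_1^{\pi}(x,y) \}_{x,y},
\qquad
\{ S_2(y, f(x,y)) \}_{x,y} \stackrel{c}{\equiv} \{ \mathrm{view}_2^{\pi}(x,y) \}_{x,y}
```

Here view_i^π(x,y) is party i's record of the real execution, and the symbol over the equivalence denotes computational indistinguishability.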
12. Yao's Theorem Secure multiparty computation was first introduced by Andrew Yao, a Turing Award winner.
For ANY efficiently computable function, there is a secure two-party protocol in the semi-honest model.
The first completeness theorem for secure computation.
In theory, there is no need to design protocols for specific functions. Surprising!
13. The Setting of Yao's Theorem Alice has an input x.
Bob has an input y.
We need a way to evaluate f(x,y) such that
Alice and Bob only learn f(x,y)
14. Circuit Computation The design of Yao's protocol is based on circuit computation.
Any computable function can be represented as a family of boolean circuits.
(A standard result from the theory of Turing machines)
Such a circuit consists of AND, OR, and NOT gates.
15. Garbled Circuit (1/2)
16. Yao's Protocol First, convert the function into a boolean circuit
17. Example: Convert x=y to a Boolean Circuit Alice's input: a bit value x
Bob's input: a bit value y
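The circuit drawn on the original slide is not recoverable from this transcript. One possible circuit for single-bit equality, built only from AND, OR, and NOT gates, is sketched here as an assumption:

```python
def AND(a: int, b: int) -> int:
    return a & b

def OR(a: int, b: int) -> int:
    return a | b

def NOT(a: int) -> int:
    return 1 - a

def equals_circuit(x: int, y: int) -> int:
    """x == y for single bits: (x AND y) OR (NOT x AND NOT y)."""
    both_one = AND(x, y)
    both_zero = AND(NOT(x), NOT(y))
    return OR(both_one, both_zero)
```

Each gate in such a circuit is what gets garbled in the following steps.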
18. 1: Pick Random Keys For Each Wire Next, evaluate one gate securely
Later, generalize to the entire circuit
Alice picks two random keys for each wire
One key corresponds to 0, the other to 1
6 keys in total for a gate with 2 input wires
19. 2: Encrypt Truth Table Alice encrypts each row of the truth table by encrypting the output-wire key with the corresponding pair of input-wire keys
20. 3: Send Garbled Truth Table Alice randomly permutes (garbles) encrypted truth table and sends it to Bob
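Steps 1 to 3 can be sketched for a single AND gate as follows. This is a toy illustration under stated assumptions, not the slides' exact construction: the "encryption" is an XOR with a SHA-256 pad derived from both input keys, and 16 zero bytes are appended to the output key so that a correct decryption can be recognised later. All names are hypothetical.

```python
import hashlib
import random
import secrets

KEY_LEN = 16          # bytes per wire key (illustrative choice)
TAG = bytes(KEY_LEN)  # all-zero padding marks a correct decryption

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def encrypt_row(key_a: bytes, key_b: bytes, key_out: bytes) -> bytes:
    """Toy double-key encryption: XOR with a pad derived from both input keys."""
    pad = hashlib.sha256(key_a + key_b).digest()  # 32 bytes
    return xor_bytes(key_out + TAG, pad)

def garble_gate(gate):
    """Return (wire_keys, garbled_table) for a 2-input gate given as a function on bits."""
    # Step 1: two random keys per wire; index 0/1 is the bit the key stands for.
    keys = {w: (secrets.token_bytes(KEY_LEN), secrets.token_bytes(KEY_LEN))
            for w in ("a", "b", "out")}
    # Step 2: encrypt each row's output-wire key under that row's input-wire keys.
    table = [encrypt_row(keys["a"][x], keys["b"][y], keys["out"][gate(x, y)])
             for x in (0, 1) for y in (0, 1)]
    # Step 3: shuffle so that a row's position reveals nothing about the inputs.
    random.SystemRandom().shuffle(table)
    return keys, table

keys, table = garble_gate(lambda x, y: x & y)
```

Alice keeps `keys` to herself and sends only `table` to Bob.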
21. 4: Send Keys For Alice's Inputs Alice sends the key corresponding to her input bit
Keys are random, so Bob does not learn what this bit is
22. How to Retrieve Keys from Alice for Bob's Input
23. Oblivious Transfer (OT) Fundamental Secure Multiparty Computation primitive [Rabin 1981]
24. One-Way Trapdoor Functions Intuition: a one-way function F is easy to compute, but hard to invert (skip formal definition for now)
We will be interested in one-way permutations
Intuition: a one-way trapdoor function T is a one-way function that is easy to invert given extra information called the trapdoor
Example
If $n = pq$ where $p$ and $q$ are large primes, $e$ is relatively prime to $\varphi(n)$, and $de \equiv 1 \pmod{\varphi(n)}$, F and T can be defined as
$F(x) = x^e \bmod n$ and $T(x) = x^d \bmod n$. We have
$T(F(x)) = (x^e)^d \bmod n = x \bmod n = x$
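A small numeric check of this example, using toy primes chosen here purely for illustration (real parameters would be far larger):

```python
# Toy RSA parameters; p, q and e are illustrative assumptions.
p, q = 61, 53
n = p * q                  # 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # relatively prime to phi
d = pow(e, -1, phi)        # modular inverse of e (Python 3.8+)

def F(x: int) -> int:
    """Forward direction: easy for anyone who knows (n, e)."""
    return pow(x, e, n)

def T(y: int) -> int:
    """Inverse direction: easy only with the trapdoor d."""
    return pow(y, d, n)
```

With these values, applying T after F returns the original x for every x in the range 0 to n - 1.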
25. Oblivious Transfer Protocol Assume the existence of some family of one-way trapdoor permutations
26. 5: Use Oblivious Transfer on Keys for Bob's Input Alice and Bob run the oblivious transfer protocol
Alice's input is the two keys corresponding to Bob's wire
Bob's input into OT is simply his 1-bit input on that wire
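The protocol diagram from the slides is not present in this transcript. As a hedged sketch, here is a textbook-style 1-out-of-2 oblivious transfer built from the RSA trapdoor permutation, for the semi-honest setting only. The toy parameters and all function names are assumptions, and messages must be integers smaller than N:

```python
import secrets

# Toy RSA trapdoor permutation (illustrative sizes only).
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))  # Alice's trapdoor

def sender_round1():
    """Alice publishes two random values, one per message."""
    return secrets.randbelow(N), secrets.randbelow(N)

def receiver_round1(x0: int, x1: int, choice: int):
    """Bob blinds the value matching his choice bit with k^E."""
    k = secrets.randbelow(N)
    v = ((x1 if choice else x0) + pow(k, E, N)) % N
    return k, v

def sender_round2(x0: int, x1: int, v: int, m0: int, m1: int):
    """Alice unblinds v both ways; only one result equals Bob's k."""
    k0 = pow((v - x0) % N, D, N)
    k1 = pow((v - x1) % N, D, N)
    return (m0 + k0) % N, (m1 + k1) % N

def receiver_round2(c0: int, c1: int, choice: int, k: int) -> int:
    """Bob removes k from the ciphertext he chose."""
    return ((c1 if choice else c0) - k) % N

def oblivious_transfer(m0: int, m1: int, choice: int) -> int:
    x0, x1 = sender_round1()
    k, v = receiver_round1(x0, x1, choice)
    c0, c1 = sender_round2(x0, x1, v, m0, m1)
    return receiver_round2(c0, c1, choice, k)
```

In Yao's protocol, m0 and m1 would be the two keys for Bob's input wire and `choice` would be his input bit. Alice sees only v, which looks random whichever bit Bob chose.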
27. 6: Evaluate Garbled Gate Using the two keys that he learned, Bob decrypts exactly one of the output-wire keys
Bob does not learn if this key corresponds to 0 or 1
Why is this important?
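Bob's evaluation of one garbled gate can be sketched as below. The block repeats a compact toy garbling so that it runs on its own. The XOR-with-hash "encryption" and the zero-byte tag used to recognise the right row are illustrative assumptions, not the construction from the slides.

```python
import hashlib
import random
import secrets

KEY_LEN = 16
TAG = bytes(KEY_LEN)  # zero padding that marks a successful decryption

def pad(key_a: bytes, key_b: bytes) -> bytes:
    return hashlib.sha256(key_a + key_b).digest()

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def garble_and_gate():
    """Alice's side: keys for wires a, b, out and a shuffled encrypted table."""
    keys = {w: (secrets.token_bytes(KEY_LEN), secrets.token_bytes(KEY_LEN))
            for w in ("a", "b", "out")}
    table = [xor_bytes(keys["out"][x & y] + TAG, pad(keys["a"][x], keys["b"][y]))
             for x in (0, 1) for y in (0, 1)]
    random.SystemRandom().shuffle(table)
    return keys, table

def evaluate_gate(table, key_a: bytes, key_b: bytes) -> bytes:
    """Bob's side: only the row built from his two keys decrypts to key || TAG."""
    p = pad(key_a, key_b)
    for row in table:
        plain = xor_bytes(row, p)
        if plain[KEY_LEN:] == TAG:
            return plain[:KEY_LEN]
    raise ValueError("no row decrypted with the given keys")

keys, table = garble_and_gate()
out_key = evaluate_gate(table, keys["a"][1], keys["b"][0])
```

Bob ends up holding `out_key`, a random-looking string. Without Alice's `keys` mapping he cannot tell which bit it encodes.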
28. 7: Evaluate Entire Circuit In this way, Bob evaluates the entire garbled circuit
For each wire in the circuit, Bob learns only one key
It corresponds to 0 or 1 (Bob does not know which)
Therefore, Bob does not learn intermediate values (why?)
Bob tells Alice the key for the final output wire and she tells him if it corresponds to 0 or 1
Bob does not tell her intermediate wire keys (why?)
29. Brief Discussion of Yao's Protocol Function must be converted into a circuit
For many functions, the circuit will be huge
If the circuit has m gates and n inputs, the protocol needs 4m encryptions and n oblivious transfers
Oblivious transfers for all inputs can be done in parallel
Yao's construction gives a constant-round protocol for secure computation of any function in the semi-honest model
Number of rounds does not depend on the number of inputs or the size of the circuit!
30. Privacy and Integrity Preserving Range Queries in Sensor Networks
31. Wireless Sensor Networks (1/2) One way to store the data and process users' queries.
Each sensor sends data to the sink. The sink stores the data and processes queries.
Drawbacks
Sending data or result from sensor to the sink is power consuming due to multi-hop transmission
32. Wireless Sensor Networks (2/2) Another way to store the data and process users' queries.
Each sensor stores data. Upon receiving a query, the sink searches all sensors.
Drawbacks
Sensors should have large memory space.
33. Two-tiered Sensor Network A two-tier sensor network [Ratnasamy et al. 2003]
Benefits
Power saving for sensors
Memory saving for sensors
Query processing is efficient
Several products of storage nodes, such as StarGate and RISE, are commercially available
34. Storage nodes can be compromised Storage nodes are attractive targets for attack
Sensitive data collected by sensors are stored in storage nodes
It raises two security problems if a storage node is compromised
How to preserve the privacy of sensor-collected data and sink-issued queries?
How to preserve the integrity of query result?
35. Problem: Privacy and Integrity Preserving Range Queries (1/2) Assumptions:
Assume that all sensor nodes and storage nodes are loosely synchronized with the sink.
We divide time into time slots of a fixed period.
Each sensor collects and sends n data items per time slot, (i, t, {d1, d2, ..., dn}), where i is the sensor ID and t is the sequence number of the time slot.
36. Problem: Privacy and Integrity Preserving Range Queries (2/2) Preserving privacy
A compromised storage node cannot gain information from sensor-collected data and sink-issued queries
A storage node can perform query processing
Preserving integrity
The sink can detect whether a query result from a storage node
includes forged data items
excludes any data items that satisfy the query
37. Privacy Preserving Scheme (1/2) To protect the privacy of sensor collected data
Encrypt each data item individually
How does a storage node process a query over encrypted data?
Let us first consider a simplified problem:
How to check whether two numbers are equal (x = y) in a privacy-preserving manner?
38. Privacy Preserving Scheme (2/2) How does a storage node process a query over encrypted data?
Convert the problem of checking di ∈ [a, b] to that of checking x = y
Using prefix membership verification technique
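A generic sketch of prefix membership verification, under these assumptions: values are w-bit unsigned integers, prefixes are kept as strings such as "01**", and a keyed hash (HMAC-SHA256 here, chosen for illustration) hides them. The function names are hypothetical and the details of the scheme on the slides may differ.

```python
import hashlib
import hmac

def prefix_family(x: int, w: int) -> set:
    """All w+1 prefixes that match the w-bit value x, from exact to '*...*'."""
    bits = format(x, f"0{w}b")
    return {bits[:k] + "*" * (w - k) for k in range(w + 1)}

def range_to_prefixes(a: int, b: int, w: int) -> set:
    """Cover [a, b] with aligned power-of-two blocks, each written as a prefix."""
    prefixes = set()
    while a <= b:
        size = 1
        # Grow the block while it stays aligned at a and inside [a, b].
        while a % (size * 2) == 0 and a + size * 2 - 1 <= b and size * 2 <= 2 ** w:
            size *= 2
        k = w - (size.bit_length() - 1)  # number of fixed leading bits
        prefixes.add(format(a, f"0{w}b")[:k] + "*" * (w - k))
        a += size
    return prefixes

def hashed(prefixes: set, key: bytes) -> set:
    """Keyed hashes, so a party without the key learns nothing from the prefixes."""
    return {hmac.new(key, p.encode(), hashlib.sha256).hexdigest() for p in prefixes}

def in_range(value_hashes: set, range_hashes: set) -> bool:
    """x is in [a, b] iff some prefix of x equals some prefix of the range."""
    return not value_hashes.isdisjoint(range_hashes)
```

The range test becomes a series of equality tests on hashes, which a storage node can run without learning x, a, or b.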
39. Integrity Preserving Scheme (1/2) Neighborhood Chaining
Encrypt the data item with its neighbors
40. Integrity Preserving Scheme (2/2) Do we really need bi-directional chaining?
No, we only need to build a one directional neighborhood chain.
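A one-directional chain can be sketched as follows. This is a simplification under loud assumptions: values are distinct integers and stay in plaintext, an HMAC tag stands in for encrypting each item together with its left neighbour, and `lo` and `hi` are sentinel bounds below and above every possible query. All names are hypothetical.

```python
import hashlib
import hmac
from typing import List, Tuple

Entry = Tuple[int, int, bytes]  # (left neighbour, value, tag)

def tag(key: bytes, left: int, value: int) -> bytes:
    return hmac.new(key, f"{left}|{value}".encode(), hashlib.sha256).digest()

def build_chain(key: bytes, items: List[int], lo: int, hi: int) -> List[Entry]:
    """Sensor side: bind every value (and the upper sentinel) to its left neighbour."""
    ordered = [lo] + sorted(items) + [hi]
    return [(ordered[j - 1], ordered[j], tag(key, ordered[j - 1], ordered[j]))
            for j in range(1, len(ordered))]

def answer_query(chain: List[Entry], a: int, b: int) -> List[Entry]:
    """Storage node: matching entries plus the next entry as verification object."""
    reply = [entry for entry in chain if a <= entry[1] <= b]
    reply.append(next(entry for entry in chain if entry[1] > b))
    return reply

def verify(key: bytes, reply: List[Entry], a: int, b: int) -> bool:
    """Sink side: check tags, chain links, and both ends of the range."""
    if not reply:
        return False
    if any(not hmac.compare_digest(t, tag(key, left, value))
           for left, value, t in reply):
        return False  # forged or altered item
    if any(cur[0] != prev[1] for prev, cur in zip(reply, reply[1:])):
        return False  # an item in the middle was dropped
    if not (reply[0][0] < a and reply[-1][1] > b):
        return False  # an item at either end was dropped
    return all(a <= entry[1] <= b for entry in reply[:-1])
```

For an empty result the reply is just the verification object, whose left neighbour lies below a and whose own value lies above b.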
41. What if the query result is empty? Storage node only knows that no data item satisfies the query
It doesn't know which is the verification object
42. Privacy Preserving Scheme V2 How does a storage node process a query over encrypted data?
43. Multi-dimensional Data (1/2) To preserve privacy, we apply our 1-dimensional privacy preserving techniques to each dimension of multi-dimensional data.
To preserve integrity, we build a multi-dimensional neighborhood chain.
44. Multi-dimensional Data (2/2)
45. Range Queries in Event-driven Networks (1/2) We have assumed that at each time slot, a sensor sends data to a storage node.
However, in event-driven networks, a sensor only reports data to a storage node when a certain event happens.
46. Range Queries in Event-driven Networks (2/2) Our idea:
Sensors report their idle period to the storage node when one of the following two conditions holds:
Sensors submit data after an idle period
The idle period is longer than a given threshold
47. Security Analysis Privacy
Without knowing the keys used in the encryption and secure hashing, it is computationally infeasible to compute the actual values of sensor collected data and the corresponding prefixes.
Integrity
Query result and verification object should satisfy three properties:
1. Items in query result and verification object form a chain.
Excluding any item in the middle or changing any item violates this property.
2. The first item contains the value of its left neighbor, which should be out of the range query on the smaller end.
3. The last item contains the value of its right neighbor, which should be out of the range query on the larger end.
48. Complexity Analysis Given n z-dimensional data items that a sensor collects in a time slot, the computation cost, communication cost, and storage space of SafeQ are shown as follows.
49. Experimental Results: Evaluation Setup We implemented both SafeQ and Sheng&Li (prior art) schemes using TOSSIM
We measured the efficiency of SafeQ and Sheng&Li schemes on 1, 2, and 3 dimensional data.
We conducted our experiments on the same data set that Sheng&Li used in their experiment
We used HMAC-MD5 with 128-bit keys as the hash function for hashing prefix numbers.
We chose the number of hash functions to be 4, which guarantees that the false positive rate is less than 1%
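Whether 4 hash functions keep the false positive rate below 1% also depends on how many filter bits are used per stored item, which the slides do not state. A rough check using the standard Bloom filter approximation (1 - e^(-k/r))^k, where k is the number of hash functions and r is bits per item (the function name is an assumption):

```python
import math

def bloom_false_positive_rate(num_hashes: int, bits_per_item: float) -> float:
    """Standard approximation of a Bloom filter's false positive rate."""
    return (1.0 - math.exp(-num_hashes / bits_per_item)) ** num_hashes

rate_10_bits = bloom_false_positive_rate(4, 10)
rate_11_bits = bloom_false_positive_rate(4, 11)
```

Under this approximation, 4 hash functions need roughly 11 or more bits per item to stay under 1%.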
We experimented with different sizes of time slots ranging from 10 minutes to 80 minutes.
For each time slot, we generated 1,000 random range queries.
50. Experimental Results: 3-dimensional Data (1/2) In terms of power consumption, for 3-dimensional data
SafeQ-Bloom consumes 184.9 times less power for sensors and 76.8 times less power for storage nodes than the Sheng&Li scheme
SafeQ-Basic consumes 59.2 times less power for sensors and 76.8 times less power for storage nodes than the Sheng&Li scheme
51. Experimental Results: 3-dimensional Data (2/2) In terms of space consumption, for 3-dimensional data
SafeQ-Bloom uses 182.4 times less space for storage nodes than the Sheng&Li scheme
SafeQ-Basic uses 58.5 times less space for storage nodes than the Sheng&Li scheme
52. Prior work (1/5) Sheng&Li scheme [Infocom 2008]
53. Prior work (2/5) Two major drawbacks of Sheng&Li scheme [Infocom 2008]
A compromised storage node can estimate data items and queries fairly accurately [Hore et al. VLDB 2004]
Power and space consumption grows exponentially with the number of dimensions.
54. Prior work (3/5) Shi et al. scheme [Infocom 2009]
To preserve privacy, they use the same privacy scheme of Sheng&Li scheme
To preserve integrity, they propose a scheme that distributes the bucket vector of a sensor to its nearby sensors such that the sink can verify the integrity using the bucket vector
55. Prior work (4/5) Zhang et al. scheme [MobiHoc 2009]
The Zhang et al. scheme extends the Shi et al. scheme to support multi-dimensional data
56. Prior work (5/5) Two major drawbacks of the Shi et al. scheme and the Zhang et al. scheme
A compromised sensor could easily compromise the integrity verification functionality of the network by sending falsified bucket vectors to other sensors and storage nodes.
If Si is compromised, Si can send a faked Vi, i.e., Vi', to Sj
If Sj is compromised, Sj can change Vi (received from Si) to a faked Vi'
The sink cannot distinguish the above two cases.
A compromised storage node can estimate data items and queries fairly accurately [Hore et al. VLDB 2004]
57. Contributions Propose a novel privacy and integrity preserving range query protocol for two-tiered sensor networks
Propose an optimization technique using Bloom filters to significantly reduce the communication cost between sensors and storage nodes
Propose a solution for event-driven sensor networks
58. Questions