Plausible deniability in an interest-based P2P network Josep Pegueroles Universitat Politècnica de Catalunya
Interest-based searches • Searches of documents in P2P networks with these characteristics: • Based on the self-declared interests of a user. • Take advantage of the “small world” behavior of social networks. • Create clusters of users based on common interests.
Social model • A document can be described as a vector: • Bag of words: frequency of words inside a document [Manning09] • Bag of concepts: frequency of semantic descriptions of words inside a document [Thiagarajan08]
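A minimal sketch of the bag-of-words vector described above, assuming a fixed vocabulary agreed on in advance. The function name `bag_of_words`, the vocabulary, and the sample text are hypothetical and only serve the illustration.

```python
from collections import Counter
import re

def bag_of_words(text, vocabulary):
    """Return the word-frequency vector of `text` over a fixed vocabulary."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    # One component per vocabulary word; words outside the vocabulary are ignored.
    return [counts[w] for w in vocabulary]

# Hypothetical vocabulary and document.
vocabulary = ["network", "privacy", "music", "peer"]
doc = "Peer to peer network: privacy in a P2P network"
vector = bag_of_words(doc, vocabulary)
print(vector)
```

A bag of concepts would have the same shape, with each component counting a semantic category instead of a literal word.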
Social model: assumptions • There is a metric for affinities. This metric must include only additions and multiplications [Resnick94] • Users often ask for documents related to their profile. Queries could be far away from each other!
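One way to satisfy the "additions and multiplications only" requirement, sketched under the assumption that profiles are normalized to unit length beforehand: the cosine similarity then reduces to a plain dot product. The names `normalize` and `affinity` and the sample vectors are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (done once, before any comparison)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def affinity(p, q):
    """Dot product of two unit vectors: additions and multiplications only."""
    return sum(a * b for a, b in zip(p, q))

# Hypothetical profile and documents.
user = normalize([2, 1, 0, 2])
doc_close = normalize([4, 2, 0, 4])   # same direction as the user profile
doc_far = normalize([0, 0, 3, 0])     # no shared interests

print(affinity(user, doc_close), affinity(user, doc_far))
```

Keeping the metric free of divisions and square roots at comparison time is what later allows it to be evaluated on encrypted data.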
Network model: epidemic routing • It is not flooding!
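The slide gives no details of the routing rule, so the following is only a sketch of one common reading of epidemic (gossip) dissemination: each node forwards a message once, to a few randomly chosen neighbors, instead of to all of them as flooding does. The function `disseminate`, the `fanout` parameter, and the ring overlay are assumptions.

```python
import random
from collections import deque

def disseminate(neighbors, source, fanout=None, rng=None):
    """Spread a message from `source`; return (reached nodes, messages sent).

    fanout=None forwards to every neighbor (flooding); an integer forwards
    to at most that many randomly chosen neighbors (epidemic/gossip).
    """
    rng = rng or random.Random()
    reached = {source}
    queue = deque([source])
    messages = 0
    while queue:
        node = queue.popleft()
        peers = neighbors[node]
        if fanout is not None:
            peers = rng.sample(peers, min(fanout, len(peers)))
        for peer in peers:
            messages += 1
            if peer not in reached:      # each node forwards only once
                reached.add(peer)
                queue.append(peer)
    return reached, messages

# Hypothetical overlay: 30 nodes on a ring, each linked to 3 nodes on either side.
N = 30
neighbors = {i: [(i + d) % N for d in (-3, -2, -1, 1, 2, 3)] for i in range(N)}

flood_reached, flood_msgs = disseminate(neighbors, source=0)
gossip_reached, gossip_msgs = disseminate(neighbors, 0, fanout=2, rng=random.Random(1))
print(len(flood_reached), flood_msgs, len(gossip_reached), gossip_msgs)
```

The trade-off is visible in the counts: the epidemic variant sends far fewer messages but may not reach every node.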
Security problems • Privacy: • Profiles include a lot of information about a user. • Queries include profiles. • Legal: • Databases provide access to data. Case: RapidShare. • Intermediate nodes provide routes to data. Cases: Pirate Bay, ShareMula.
Objective • Searches based on interests where none of the nodes can be held liable for providing access to data.
Building blocks • Protection of users in clusters: consistent false negatives • Protection of profiles that are in clear: random projections • Protection of queries: homomorphic encryption • Protection of nodes: anonymous communications • Protection of databases: • Private Block Retrieval • Bloom filters
P2P network in clusters Documents of a cluster are stored in the other cluster.
Consistent profiles • If all profiles in a cluster are close to each other, an attacker could fake his profile and collect lots of users. • Solution: different profiles in the same cluster. Pa? Pb?
Consistent profiles: long term • Even if communications are anonymous, analyzing the content of the messages in the long term gives strong evidence about the neighbor's profile. • Solution: secure message content t0=Qa t1=Qa t2=Qb t3=Qa ... Pa Pb?
Homomorphic encryption • Given two numbers [x], [y] encrypted with K, and a number z in clear, anyone can calculate these encryptions without knowing K: [x+y], [zy] • Examples: ElGamal, Paillier.
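The two operations can be illustrated with a toy Paillier implementation. This is a sketch with deliberately tiny, insecure primes; the function names `keygen`, `encrypt`, `decrypt`, `add`, and `scale` are hypothetical, and the generator is fixed to g = n + 1, a common simplification.

```python
import math
import random

def keygen(p=293, q=433):
    """Toy Paillier key pair from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                                # valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt m < n with the public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

def add(n, c1, c2):
    """[x], [y] -> [x + y]: multiply the ciphertexts."""
    return c1 * c2 % (n * n)

def scale(n, c, z):
    """[y], z -> [z * y]: raise the ciphertext to the clear value z."""
    return pow(c, z, n * n)

n, priv = keygen()
cx, cy = encrypt(n, 17), encrypt(n, 25)
print(decrypt(n, priv, add(n, cx, cy)))    # 17 + 25
print(decrypt(n, priv, scale(n, cy, 3)))   # 3 * 25
```

Neither `add` nor `scale` touches the private key, which is the property the slide relies on.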
A secure metric for profiles • Given the projected profile encrypted with the private key of a user: • It is possible to calculate the cosine distance to an encrypted profile as: • Problem: the other profile must be in clear!
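The formulas of this slide did not survive extraction. As a hedged reconstruction of the idea only, assume an additively homomorphic scheme in which multiplying ciphertexts adds plaintexts (as in Paillier), profiles p and q of dimension m with integer-scaled components, and unit-length profiles so that the cosine similarity equals the dot product. The symbols below are assumptions, not the authors' notation.

```latex
\[
  [\,p \cdot q\,] \;=\; \Bigl[\,\sum_{i=1}^{m} p_i\, q_i \Bigr]
  \;=\; \prod_{i=1}^{m} [\,p_i\,]^{\,q_i}
\]
```

Each component q_i appears as an exponent applied to the ciphertext [p_i], so whoever evaluates the product must know q in clear, which matches the problem stated on the slide.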
P2P network in clusters Documents of a cluster are stored in the other cluster.
Simplified system • Anonymous routing in the cluster that searches, and epidemic routing in the cluster that stores information. Problems: • Oblivious queries to databases • Protection of database identity Objective: database deniability
Projections • If m<n/2, they are suitable for security: • If m<n/2, it is not possible to separate any component of the projection [Liu-Ryan 06] • Projections maintain distances: [Johnson-Lindenstrauss 99]
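The distance-preserving property can be checked numerically. The sketch below assumes a Gaussian random matrix, which is one standard choice; the slides do not say which matrices the authors use. The dimensions, seed, and function names are hypothetical.

```python
import math
import random

def random_projection_matrix(m, n, rng):
    """m x n matrix with independent N(0, 1/m) entries."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]

def project(matrix, v):
    """Map an n-dimensional profile to m dimensions."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
n, m = 400, 150                      # m < n/2, as required on the slide
R = random_projection_matrix(m, n, rng)
p = [rng.random() for _ in range(n)]
q = [rng.random() for _ in range(n)]

# Ratio of projected distance to original distance; expected to be near 1.
ratio = distance(project(R, p), project(R, q)) / distance(p, q)
print(round(ratio, 3))
```

Because R has fewer rows than columns, many different profiles map to the same projection, which is the intuition behind the non-invertibility claim.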
Projections: matrices • Wait: what about triangulations?
Guessing profiles: Monte Carlo • Dmax = 1.5, Hmax = 6.6 b, attacks at 0.1, H = 5.8 b
Guessing profiles • Dmax = 1.5, Hmax = 6.6 b, attacks at 0.2, H = 6.2 b
Guessing profiles • Dmax = 1.5, Hmax = 6.6 b, attacks at 0.5, H = 6.5 b
Guessing profiles • Two kinds of confidence in the result: • Confidence in the guess itself • Confidence in the result of the guess • Attackers have to be very close to a node to guess its profile. Brute-force attacks?
Databases • The DB stores pairs (projected profile, URL) for each document. • The projection cannot be easily inverted (plausible deniability). • But the database must return a URL! • Solution: Private Block Retrieval
Private Block Retrieval • Oblivious SELECT in a database [Gentry and Ramzan, 2005] • The user decides which block he wants, i • The user sends an “oblivious index” to the DB, I(i) • The DB calculates a special block B(I(i)) from every block in the DB. • From B(I(i)), the user can extract the block i.
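The four steps above can be sketched with a simpler, well-known construction based on additively homomorphic encryption. This is not the Gentry and Ramzan scheme: here the oblivious index I(i) is an encrypted one-hot vector, and the DB combines every block with it. The toy Paillier parameters are insecure, and all names and block values are hypothetical.

```python
import math
import random

# Toy Paillier with g = n + 1 (insecure key size, illustration only).
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def encrypt(m):
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            return (1 + m * N) * pow(r, N, N2) % N2

def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def oblivious_index(i, size):
    """User side: I(i), an encrypted one-hot vector selecting block i."""
    return [encrypt(1 if j == i else 0) for j in range(size)]

def db_answer(blocks, index):
    """DB side: B(I(i)), computed from every block without learning i."""
    answer = 1                                  # an encryption of 0
    for block, c in zip(blocks, index):
        answer = answer * pow(c, block, N2) % N2   # adds block * (0 or 1)
    return answer

blocks = [1111, 2222, 3333, 4444]   # hypothetical blocks, each an integer < N
wanted = 2
reply = db_answer(blocks, oblivious_index(wanted, len(blocks)))
recovered = decrypt(reply)          # user side: extract block i
print(recovered)
```

The DB touches every block and only ever sees ciphertexts, so it cannot tell which block was selected. The cost is a query as long as the database, which schemes such as Gentry and Ramzan's are designed to reduce.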
Private Block Retrieval Protocol • A user sends to the DB: • The DB calculates for each document • The DB sends: • The user decrypts the distances and picks the closest one, i • PBR protocol between the DB and the user for i
Analysis so far • The DB cannot invert projections. • Nodes in the path know nothing about queries. • The DB and the nodes in the path know nothing about the selected answer. • The last node in the path does know which DB answers.
Epidemic DB protection • Distributed Private Block Retrieval system: • Bloom filters to remove duplicates • Local permutations in each DB to prevent identification
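A minimal Bloom filter sketch showing how duplicate answers could be dropped without storing the answers themselves. How the authors integrate the filter into the distributed PBR system is not stated, so the `drop_duplicates` helper, the filter parameters, and the sample answers are assumptions.

```python
import hashlib

class BloomFilter:
    """Fixed-size bit array with several hash positions per item."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a counter.
        for k in range(self.num_hashes):
            digest = hashlib.sha256(f"{k}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def drop_duplicates(answers):
    """Keep the first occurrence of each answer, using only the filter as memory."""
    seen = BloomFilter()
    unique = []
    for answer in answers:
        if answer not in seen:
            seen.add(answer)
            unique.append(answer)
    return unique

# Hypothetical answers arriving from several databases.
answers = ["url-a", "url-b", "url-a", "url-c", "url-b"]
unique = drop_duplicates(answers)
print(unique)
```

A false positive would silently drop a genuinely new answer, so the filter size has to be chosen for the expected number of answers.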
Open questions • Number of different profiles in the cluster? • DPBR vs. efficiency? • Downloading?