On the Anonymity of Anonymity Systems Andrei Serjantov schnur@gmail.com (anonymous)
Outline • Anonymity informally • Anonymity Properties • Anonymity of Existing Implementations • Analysis • Probability, Entropy • Attacks • Low Latency • Intersection • Conclusion
What is Anonymity? • In practice, we assume humans are tied to computers and anonymize the computers • Anonymity does not hide the presence of the individuals/computers, just their identity
Anonymity System This guy does not even know he is on the internet(!)
Anonymity Properties I: Receiver Untraceability [Figure: sender A, receiver B] • Receivers are not observable, i.e. the attacker does not know whether B received a message • Senders are observable, i.e. the attacker knows that A sent a message to someone
Anonymity Properties II: Sender Untraceability [Figure: A and B] • Senders unobservable…
Anonymity Properties III: Unlinkability [Figure: A and B] • Senders and receivers are observable, but it is not clear who is talking to whom
Anonymous from Whom? (threat model) • The observer: • Can compromise (almost) everything but two users of the system • Observes and modifies all network traffic • Observes all network traffic (Global Passive Adversary) • Observes some network traffic • Is the service the user is accessing
Properties • A mix cascade guarantees that a global active attacker cannot distinguish two honest users who send one message each between time t and t’. • e.g. mixing votes • DC-net • (both sender and receiver anonymity) • Can be expressed formally
Anonymity of Existing Implementations Mixes Mix Systems
Mix System [Figure: a Sender passes the message (M, 0101011) through mix A and mix B to the Receiver R; the message is shown wrapped in layers labelled A, B and R. Legend: R - Receiver, A - Mix, B - Mix]
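The layering suggested by the figure can be sketched as follows. This is a minimal illustration, assuming the labels A, B, R in the figure denote nested layers addressed to each hop. The `Layer` class, `wrap` and `peel` are hypothetical names, and no real cryptography is performed: a layer only records who is allowed to open it.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Layer:
    """Stand-in for 'encrypted under the key of `node`' (no real crypto)."""
    node: str                      # the only party allowed to open this layer
    next_hop: str                  # where `node` forwards the inner payload ("" = final hop)
    inner: Union["Layer", bytes]   # the next layer, or the message itself


def wrap(message: bytes, path: List[str]) -> Union[Layer, bytes]:
    """Wrap `message` in one layer per hop; the outermost layer is for path[0]."""
    packet: Union[Layer, bytes] = message
    next_hop = ""
    for node in reversed(path):
        packet = Layer(node=node, next_hop=next_hop, inner=packet)
        next_hop = node
    return packet


def peel(node: str, packet: Layer) -> Tuple[str, Union[Layer, bytes]]:
    """Remove the layer addressed to `node`; return (next hop, inner payload)."""
    if packet.node != node:
        raise ValueError(f"{node} cannot open a layer addressed to {packet.node}")
    return packet.next_hop, packet.inner


# Sender wraps for mix A, then mix B, then receiver R (path assumed from the figure).
packet = wrap(b"M", ["A", "B", "R"])
hop_after_a, for_b = peel("A", packet)   # A learns only that the next hop is B
```

Each hop sees only its own layer and the name of the next hop, which is the property the mix system relies on.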
Doing Things Anonymously • Can provide guarantees for those who wish to send one message < 32K, and suffer the consequences of it not reaching the receiver • Real life is not like that • Anonymous email (Mixmaster, Mixminion) • Send and receive anonymous emails • Web Browsing (JAP, TOR, Tarzan, Morphmix) • Wide file size distribution • Low latency
Anonymity Analysis of Existing Systems • Define a system, and an adversary • Take inputs into the system • e.g. web request message stream • Email interaction • Compute the observation • Hence work out how vulnerable the anonymity of a given activity is to a particular adversary
Inputs, Model, Observation • System: mixes M1 and M2 (transition semantics model of the mixes) • Inputs: Sender 1, Sender 2, Sender 3 and receivers R1, R2, R3 • Observation: [Figure: the senders connect through M1 and M2 to the receivers; shown once for the inputs and once for the observation] • Attacker: Global Passive Adversary
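Mix Network [Figure: nodes A, B, C, D, Q, R] • Traditionally, the anonymity set is {A,B,C,D}
Timed Mix [Figure: inputs A, B, C, D] • Anonymity set: {A,B,C,D}
Mix Network [Figure: nodes A, B, C, D, Q, R] • Traditionally, the anonymity set is {A,B,C,D} • But the message arriving at R is much more likely to be from D than from A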
Pool Mix [Figure: N messages arrive and join the M in the pool (N+M in total); N are sent] • M messages stay in the mix at each round • Messages to be sent are picked from both the N and the M • A message might stay in the mix for a very long time (but the probability of this happening is very small) • The anonymity set of a message leaving at round i includes the senders who sent messages processed during previous rounds
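The round structure described on this slide can be sketched as a toy simulation. This is an illustration only, assuming messages to keep are chosen uniformly at random from the N+M candidates; `PoolMix` and its method names are hypothetical.

```python
import random


class PoolMix:
    """Toy pool mix: after each round, `pool_size` (M) messages stay behind."""

    def __init__(self, pool_size, rng=None):
        self.pool_size = pool_size
        self.rng = rng or random.Random()
        self.pool = []  # messages currently held in the mix

    def round(self, arrivals):
        """Add this round's arrivals, keep M messages, send the rest."""
        candidates = self.pool + list(arrivals)
        self.rng.shuffle(candidates)  # old and new messages are treated alike
        n_keep = min(self.pool_size, len(candidates))
        self.pool = candidates[:n_keep]
        return candidates[n_keep:]


# Messages are (round, index) pairs so that delayed messages are easy to spot.
mix = PoolMix(pool_size=10, rng=random.Random(0))
for r in range(5):
    sent = mix.round([(r, i) for i in range(100)])
    delayed = [m for m in sent if m[0] < r]  # arrived in an earlier round
```

Any message in `delayed` left the mix alongside messages from a later round, which is why the anonymity set of an outgoing message reaches back into previous rounds.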
Adding Probabilities • To each sender in the anonymity set, attach the probability that they were the one who sent the message • Call this the Anonymity Probability Distribution • So {A,B,C,D} could become: • {(A,¼), (B, ¼),(C, ¼),(D, ¼)} • Or, {(A,0.5), (B,0.1),(C,0.1),(D,0.3)} • The probability distribution you come up with will depend on your observation (+ knowledge, computational power…)
Entropy • Ok, what can we do with the probability distribution afterwards? • Entropy, from information theory, is the information content of a probability distribution • Can use this for: • Measuring anonymity • Expressing new attacks (ones which do not modify the set, but modify the distribution) • Comparing effectiveness of attacks
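As a small worked example, the Shannon entropy H = -Σ p·log2(p) can be computed for the two example distributions given on the "Adding Probabilities" slide. The function name `entropy_bits` is illustrative.

```python
import math


def entropy_bits(distribution):
    """Shannon entropy, in bits, of a {sender: probability} mapping."""
    total = sum(distribution.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing and are skipped.
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


uniform = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
skewed = {"A": 0.5, "B": 0.1, "C": 0.1, "D": 0.3}

print(entropy_bits(uniform))  # 2.0 bits
print(entropy_bits(skewed))   # about 1.69 bits
```

Both distributions have the same anonymity set {A,B,C,D}, yet the skewed one carries less entropy. This is the sense in which an attack can change the distribution without changing the set.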
Pool Mix Revisited • Could not previously compare a pool mix with other mixes • Now we can! • Compute the entropy of the geometric distribution • Pool mix with 100 inputs and 10 “feedbacks” is equivalent to a standard mix with 140 inputs(!!!) • But, average delay of a message going through a pool mix is greater • In the above example, 9% chance “of staying for another round”
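The comparison can be reproduced with a short calculation. It is a sketch under stated assumptions: the mix has been running long enough to be in steady state, every round brings N messages from N distinct senders, and messages to keep are chosen uniformly. Under those assumptions a message leaving the mix comes from one particular sender of k rounds ago with probability (1/(N+M))·(M/(N+M))^k. The function names are illustrative.

```python
import math


def pool_mix_entropy_bits(n_inputs, pool_size, max_rounds=10_000):
    """Entropy (bits) of the sender distribution of one outgoing message."""
    total = n_inputs + pool_size
    stay = pool_size / total  # chance that a message stays for another round
    entropy = 0.0
    for k in range(max_rounds):
        p = stay ** k / total  # one particular sender from k rounds ago
        if p == 0.0:           # remaining terms underflow and are negligible
            break
        entropy -= n_inputs * p * math.log2(p)
    return entropy


def equivalent_threshold_mix_size(n_inputs, pool_size):
    """Size of a uniform anonymity set with the same entropy."""
    return 2 ** pool_mix_entropy_bits(n_inputs, pool_size)


print(10 / (100 + 10))                         # about 0.09, the "9% chance"
print(equivalent_threshold_mix_size(100, 10))  # about 139.8
```

With N = 100 and M = 10 the result is roughly 139.8, consistent with the "140 inputs" quoted on the slide.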
Mix Networks [Figure: nodes A, B, C, D, Q, R] • Can also compute the anonymity probability distribution in mix networks • Model and details in [Ser04] • Example: {(A,0.125), (B,0.125), (C,0.25), (D,0.5)}
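One way such a distribution can arise is sketched below. The topology is an assumption made for illustration (a chain of three two-input mixes), not a reading of the figure or of the model in [Ser04]. Each mix is assumed to output any of its inputs with equal probability.

```python
def sender(name):
    """Distribution for a message that comes directly from `name`."""
    return {name: 1.0}


def mix_output(*inputs):
    """Sender distribution of one message leaving a mix.

    Each input is a {sender: probability} dict; the mix is assumed to pick
    among its inputs uniformly.
    """
    merged = {}
    for dist in inputs:
        for name, p in dist.items():
            merged[name] = merged.get(name, 0.0) + p / len(inputs)
    return merged


# Hypothetical chain: A and B enter the first mix, C joins at the second,
# D joins at the third.
first = mix_output(sender("A"), sender("B"))
second = mix_output(first, sender("C"))
third = mix_output(second, sender("D"))
print(third)  # {'A': 0.125, 'B': 0.125, 'C': 0.25, 'D': 0.5}
```

Senders who join later pass through fewer mixes and therefore keep more of the probability mass, which is why D is the most likely sender in this sketch.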
Impact of Low Latency and Repeated Communication • Packet Counting • Intersection
Connection-based Anonymity Systems • A number of nodes • Nodes do not mix, but do onion encryption • Packets are forwarded along links • All packets of a connection are forwarded via the same sequence of nodes [Figures: “Classical” network; P2P anonymity system]
The Packet Counting Attack I • Connection-based Anonymity Systems split the data up into many fairly small packets <1K • All packets of an anonymous connection travel down the same path • Thus, counting the packets may reveal which connections go where • Only coarse-grained packet counting is required
Packet Counting II • Observe the mix for time t and count packets on each link • Correlate incoming and outgoing links, e.g. 1075 and 1076 [Figure: a mix with per-link packet counts 3056, 2497, 2748, 2850, 1804, 1353, 1075, 1076] • Works if: • d (mix delay) << t • t is much smaller than the interval between new connections starting
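The correlation step can be sketched as a simple matching of per-link counts. The counts are the ones shown on the slide, but their assignment to incoming and outgoing links, the link names, and the `tolerance` parameter are all assumptions made for illustration.

```python
def correlate_links(incoming, outgoing, tolerance=5):
    """Pair each incoming link with outgoing links of near-equal packet count."""
    matches = {}
    for in_link, in_count in incoming.items():
        matches[in_link] = [
            out_link
            for out_link, out_count in outgoing.items()
            if abs(in_count - out_count) <= tolerance
        ]
    return matches


# Hypothetical split of the slide's counts into incoming and outgoing links.
incoming = {"in1": 3056, "in2": 2497, "in3": 1804, "in4": 1075}
outgoing = {"out1": 2748, "out2": 2850, "out3": 1353, "out4": 1076}

print(correlate_links(incoming, outgoing))
# {'in1': [], 'in2': [], 'in3': [], 'in4': ['out4']}
```

Only the link with 1075 packets finds a partner (1076). Links that carry several connections at once do not match any single outgoing link, so they are not exposed by this coarse counting.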
Packet Counting – Key Observation • Packet counting works if the whole connection is lone • i.e. if it is the only connection on all the links (from the client to the server) it passes through [Figure annotation: this case may be attackable, but we consider it not to be]
Packet Counting – Results • Hence, we need 2 or more connections on as many links as possible • In our paper (ESORICS 2003) we define this formally • Then simulate, showing that: • e.g. 100 nodes, 100 connections via 2-4 nodes: 92% of connections are lone (p2p scenario) • e.g. 20 nodes, 200 connections via 2-4 nodes: 2.5% of connections are lone (classic network)
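A rough simulation in the spirit of this slide is sketched below. It is not the model from the ESORICS 2003 paper: it assumes routes are chosen uniformly at random over distinct nodes, counts only node-to-node links (client and server links are left out), and is therefore not expected to reproduce the percentages quoted above. The function name and parameters are illustrative.

```python
import random
from collections import Counter


def lone_fraction(n_nodes, n_connections, min_hops=2, max_hops=4, seed=0):
    """Fraction of connections that are alone on every link they use."""
    rng = random.Random(seed)
    routes = []
    for _ in range(n_connections):
        length = rng.randint(min_hops, max_hops)
        nodes = rng.sample(range(n_nodes), length)  # distinct nodes, in route order
        routes.append(list(zip(nodes, nodes[1:])))  # directed node-to-node links
    load = Counter(link for links in routes for link in links)
    lone = sum(all(load[link] == 1 for link in links) for links in routes)
    return lone / n_connections


sparse = lone_fraction(n_nodes=100, n_connections=100)  # p2p-like setting
dense = lone_fraction(n_nodes=20, n_connections=200)    # classic-network-like setting
```

Even in this simplified model, spreading the same kind of traffic over many nodes leaves far more connections alone on their links than concentrating it on a few nodes.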
Repeated Communication [Figure, as seen by the attacker: Alice (sending to M) and B Steves (sending to N) feed a threshold mix with threshold B+1]
The Results (1000 rounds, B=10) [Plot: estimate of the probability of Alice sending to r; x-axis: receivers, r; y-axis: P(Estimate)]
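An attack of this kind can be sketched as follows. The slides do not name the estimator, so this uses a simple statistical-disclosure-style estimate as a stand-in. It assumes that Alice sends exactly one message per round to one of her usual receivers, that the B other senders pick receivers uniformly at random, and that the attacker knows this background behaviour. All names and the example parameters (20 receivers, Alice's receivers 3 and 7) are hypothetical.

```python
import random
from collections import Counter


def estimate_alice_profile(n_receivers, alice_friends, batch_others, rounds, seed=0):
    """Estimate, per receiver r, the probability that Alice sends to r."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(rounds):
        counts[rng.choice(alice_friends)] += 1       # Alice's one message
        for _ in range(batch_others):                # the B other senders
            counts[rng.randrange(n_receivers)] += 1
    # Expected number of background messages per receiver per round.
    background = batch_others / n_receivers
    # Observed average per round minus the known background.
    return [counts[r] / rounds - background for r in range(n_receivers)]


estimate = estimate_alice_profile(
    n_receivers=20, alice_friends=[3, 7], batch_others=10, rounds=1000
)
likely = sorted(range(20), key=lambda r: estimate[r], reverse=True)[:2]
```

In any single round Alice is hidden among B+1 senders, but over many rounds her usual receivers stand out above the background, which is the point of the plot on this slide.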
Conclusions • Anonymity is a security property • not just privacy • Analysis of anonymity properties important • Has been a neglected area • Uses tools from other fields (graph theory, probability) • Plenty of applications • Identity management • Electronic voting • Anonymous email (whistle blowing)