How Much Anonymity does Network Latency Leak?

How Much Anonymity does Network Latency Leak? Paper by: Nicholas Hopper, Eugene Vasserman, Eric Chan-Tin Presented by: Dan Czerniewski October 3, 2011

Goal of every anonymous communication scheme is to allow users to communicate while concealing information about who communicates with whom.

The Idea • Chaum proposed sending messages through a “Mix server” that mixes together messages from several senders before forwarding these messages to their destinations, concealing relationship between sender and receiver. • There are many schemes, yet all rely on “mixing” relays.

High-latency • Like Mixmaster and Mixminion • Deliver messages at a significant delay • Schemes implement countermeasures that increase delay • Pool Mixing • Consume Bandwidth • Cover Traffic

Low-Latency • Like Tor, I2P, AN.ON, Crowds, Anonymizer.com • Commercial proxy aggregators • Allows anonymous use of application services • Remote login and web browsing • Reduced anonymity guaranteed • Security against “local” adversary

Malicious servers acting as local adversaries can observe the network latency of a connection made over a Tor circuit • 3 experiments that measure the extent to which this information leakage compromises the anonymity of clients using a low-latency anonymity scheme • Analysis of noise-free anonymity leakage • A passive linkability attack • An active client-identification attack

Quick overview of Tor • Low-latency, bandwidth-efficient, anonymizing layer for TCP streams • With use of at least 3 nodes, no node knows the identities of both communicating parties

Latency without Noise • Anonymity circuit imposed no delay at all • Difference between connecting to a server normally and over the anonymity service is in the latter, the client’s IP address is missing • Best possible case for an attack based solely on Round-Trip Time (RTT) information. • Analyzed on MIT King Data Set and PlanetLab systems

Circuit Linking via Latency • 2 colluding servers both accept connections from the same exit node • The 2 servers try to determine whether they are communicating with different clients or the same client.

Are they the same? • The servers have to calculate the RTT between each node in the connection • The servers then have to calculate the “queueing” time for each node • Add everything to gether • If everything is equivalent, then it indicates probabilistically they are the same • If not, they come from different distributions

Attack by the server • Sends HTML with 1000 separate tags printing empty images. • Causes browser to make 1000 separate connections to server. • The amount of calls varies, but with a possible 24 concurrent connections, this requires about 42 “rounds” of connections.

2 different tests • Comparison of means • Confidence interval for the mean of each sample population with traversal at fixed time • Kolmogorov-Smirnov • Computes the largest difference in cumulative probability density between two sample sets • Receiver Operating Characteristic Curves • Each point corresponds to the true positive and false positive rates for one setting of the rejection thresholds

Client Location Via Latency • Adversary is three logical entities: Aserver, Aclient and Ator • Goal is to identify V’s network latency

The Attack • Basic Idea • Calculate the RTT between V and E • 3 steps • Measuring first hop latency • Estimating candidate RTTs • Eliminating Candidates

Measuring first hop latency • Aserver and Ator can determine circuit order after several iterations • Aclient connects using the circuit and calculates RTT • With information we can estimate the RTT from V to E

Estimating candidate RTTs • Need a method to obtain the RTT between two hosts without explicit cooperation of either • Vivaldi embedding algorithm • Ease of implementation • Disadvantage: in order to be accurate without cooperation, several nodes must be used for the service

Eliminating Candidates • Check all candidates to see if their RTTs are consistent with the estimated RTT between V and E • Accepting means the RTT is within 85% of the estimate RTT between V and E

Limitations • Limited data on conditional information gain • Client location attack assumes that a user repeatedly accesses a server from the same network location

Mitigation • Onion routing minimizes the success probability of Murdoch-Danezis’ clogging attack • Adding sufficient delays to make the RTT and timing characteristics of Tor servers independent of the underlying network topology

Thank you! • Questions?

How Much Anonymity does Network Latency Leak?