Understanding Tor Usage with Privacy-Preserving Measurement. T Wilson-Brown* (UNSW Canberra Cyber, University of New South Wales); Rob Jansen (U.S. Naval Research Laboratory); Akshaya Mani* (Georgetown University); Aaron Johnson (U.S. Naval Research Laboratory); Micah Sherr (Georgetown University). *Co-first authors
Tor is an Anonymity Network
Who uses Tor, and how do they use it?
Challenges in Measuring Tor: The naïve solution of gathering statistics directly poses privacy risks: machine compromise; compulsion through subpoena; published aggregates combined with background knowledge.
Differential Privacy (aggregate + noise): Safe Tor measurements with differential privacy. Minimizes and quantifies the privacy risk. Provides good accuracy. Systems: PrivEx [Elahi et al. CCS '14], PrivCount [Jansen et al. CCS '16], HisTorε [Mani et al. NDSS '17], and PSC [Fenske et al. CCS '17]. This work uses PrivCount and PSC.
Primer on PrivCount and PSC: Relays record statistics in encrypted counters. Aggregation parties perform crypto operations. The output (aggregate + noise) satisfies (ε, δ)-differential privacy guarantees. Proved secure in the UC framework.
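The "aggregate + noise" idea can be illustrated with a toy sketch. It is not the PrivCount or PSC protocol: it swaps in plain additive secret sharing for the encrypted counters, and the modulus, noise scale, relay counts, and function names are all assumptions made for illustration.

```python
import random

MODULUS = 2**61 - 1  # hypothetical modulus for the additive shares


def share_counter(value, n_parties, rng):
    """Split a counter into n_parties additive shares modulo MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def aggregate(per_relay_shares):
    """Each party sums the shares it holds; the party sums are then combined."""
    n_parties = len(per_relay_shares[0])
    party_sums = [
        sum(relay[i] for relay in per_relay_shares) % MODULUS
        for i in range(n_parties)
    ]
    total = sum(party_sums) % MODULUS
    # Map back to a signed value, since noise can make a count negative.
    return total - MODULUS if total > MODULUS // 2 else total


# random.Random is NOT cryptographically secure; it only keeps the toy repeatable.
rng = random.Random(0)
true_counts = [120, 45, 300]  # hypothetical per-relay counts
# Each relay adds noise before sharing (hypothetical scale of 10).
noisy_counts = [c + round(rng.gauss(0, 10)) for c in true_counts]
shared = [share_counter(c, 3, rng) for c in noisy_counts]
noisy_total = aggregate(shared)
print(noisy_total, sum(true_counts))
```

No single party's shares reveal a relay's counter, yet the combined sums recover the noisy total.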
PrivCount Queries: Supports counting queries, e.g., how many visits over Tor to Google, Amazon, and Facebook? Does not support count-distinct queries, e.g., how many unique destinations are visited over Tor?
PSC Queries: Private Set-union Cardinality. Relays hold item sets I1, I2, ..., In; the aggregation parties compute |I1 ∪ I2 ∪ ... ∪ In|. Supports a count of distinct values across relays, e.g., how many unique clients connected to Tor?
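The difference between a counting query and a set-union cardinality query can be seen in a plaintext sketch. It offers no privacy and is not the PSC protocol; the relay names and addresses are made up (documentation-range IPs).

```python
# Hypothetical per-relay sets of observed client IPs.
relay_items = {
    "relay_a": {"198.51.100.1", "198.51.100.2", "198.51.100.3"},
    "relay_b": {"198.51.100.2", "198.51.100.3", "198.51.100.4"},
    "relay_c": {"198.51.100.4", "198.51.100.5"},
}

# Summing per-relay counts double-counts clients seen at several relays.
sum_of_counts = sum(len(items) for items in relay_items.values())

# The set-union cardinality counts each distinct client once.
union_cardinality = len(set().union(*relay_items.values()))

print(sum_of_counts, union_cardinality)
```

Here the per-relay counts sum to 8, while only 5 distinct clients exist, which is why a count-distinct query needs a different primitive than summed counters.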
Safely Measuring Tor with PrivCount and PSC: PrivCount produces differentially private counts, and PSC produces a differentially private |I1 ∪ I2 ∪ ... ∪ In|; both are tuned through privacy parameters.
Tuning Privacy via Action Bounds: Proposed by Jansen et al. [CCS '16]. For an epoch (i.e., measurement period), an action bound limits the amount of network activity protected by differential privacy. Major challenge: coming up with “reasonable” bounds that produce accurate results.
Differential Privacy (aggregate + noise): Protecting “hypothetical” network-level users produces inaccurate results.
Coming Up With Reasonable Action Bounds: (1) Consider reasonable activities a Tor user might perform in an epoch, e.g., web browsing with Tor Browser. (2) Determine how this activity translates to observable actions in Tor: connecting to a domain; sending or receiving entry/exit data. (3) Compute the maximum amount of network action for a reasonable amount of this activity in an epoch (we use 24 hours).
Example: Web Browsing with Tor Browser. What is the maximum number of domains that a regular user might access over Tor in 24 hours? Action bound: 20 domains per day, which allows for 2-4 domains per hour for 5-10 hours (2 × 10 = 4 × 5 = 20).
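To see why the bound matters for accuracy, here is a minimal sketch that treats the action bound as the sensitivity of a classic Gaussian mechanism. The slides do not give ε, δ, or the exact calibration PrivCount uses, so `EPSILON`, `DELTA`, and `gaussian_sigma` are assumptions for illustration only.

```python
import math


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism (valid for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


ACTION_BOUND_DOMAINS = 20  # from the slide: 20 domains per day
EPSILON = 0.3              # hypothetical privacy parameter
DELTA = 1e-3               # hypothetical privacy parameter

sigma = gaussian_sigma(ACTION_BOUND_DOMAINS, EPSILON, DELTA)
# A tenfold looser bound (200 domains) needs tenfold more noise.
sigma_loose = gaussian_sigma(10 * ACTION_BOUND_DOMAINS, EPSILON, DELTA)

print(f"sigma for bound 20:  {sigma:.1f}")
print(f"sigma for bound 200: {sigma_loose:.1f}")
```

Under this calibration the noise grows linearly with the bound, so an overly generous bound directly costs accuracy.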
Deployment: PrivCount & PSC. 3 aggregation parties; 6 exit and 10 non-exit relays; 3 operators; countries: US, CA, FR.
Measuring How Users Use Tor (via PrivCount) [ACM IMC '18]. Measurement period: 4th – 5th Jan 2018. Exit weight: 1.5% of the total available exit weight in Tor. [Diagram: guard, exit, and web server, with streams and TCP connections.] Tor Browser uses a new circuit for each unique domain in the address bar (or a new tab). Destination ports requested are web ports (either 80 or 443). Result 1: The vast majority of Tor use is for web browsing.
Domain (Alexa Rank) Measurement (via PrivCount). Measurement period: 31st Jan – 1st Feb 2018. Exit weight: 2.2%. onionoo.torproject.org: 43.4% [chart label: 47.8]. The Android Tor client (Orbot) does an onionoo lookup for every relay in every circuit built by Tor. Alexa top sites: ~80%. Result 2: Alexa top sites are a reasonable representation of destinations visited by Tor users.
Summary: How Users Use Tor. Result 1: The vast majority of Tor use is for web browsing. Result 2: Alexa top sites are a reasonable representation of destinations visited by Tor users. Other results: Top-level domains (via PrivCount). Result 3: The three main TLDs (.com, .org, and .net) make up the majority of the primary domains accessed by Tor users. Unique SLDs & Alexa SLDs (via PSC). Result 4: A long tail exists in the distribution of sites accessed over Tor.
PSC Measurements: Statistical Analysis. Major challenge: extrapolating unique counts to the entire network. Approach: use information about the frequency distribution of observed items, e.g., domains visited follow a power-law distribution [Krashakov et al. 2006, Adamic et al. 2012]. [Plot: log(P(k)) against log(k).] Determine the distribution parameter and construct confidence intervals, using Monte Carlo simulations for complicated distributions.
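A small Monte Carlo sketch shows why unique counts cannot simply be scaled up linearly when popularity follows a power law. It is not the authors' analysis: the Zipf exponent, item and visit counts, observation probability, and function names are all hypothetical.

```python
import random


def simulate_unique_counts(n_items, exponent, n_visits, observe_prob, rng):
    """Draw visits from a Zipf-like popularity law; return (observed, network) unique counts."""
    weights = [1.0 / rank ** exponent for rank in range(1, n_items + 1)]
    visits = rng.choices(range(n_items), weights=weights, k=n_visits)
    network_unique = len(set(visits))
    # Each visit is seen by our relays independently with probability observe_prob.
    observed_unique = len({item for item in visits if rng.random() < observe_prob})
    return observed_unique, network_unique


def ratio_interval(trials, n_items, exponent, n_visits, observe_prob, seed=1):
    """Empirical 95% interval of network_unique / observed_unique over repeated trials."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        obs, net = simulate_unique_counts(n_items, exponent, n_visits, observe_prob, rng)
        if obs > 0:
            ratios.append(net / obs)
    ratios.sort()
    lo = ratios[int(0.025 * len(ratios))]
    hi = ratios[min(int(0.975 * len(ratios)), len(ratios) - 1)]
    return lo, hi


lo, hi = ratio_interval(trials=50, n_items=5000, exponent=1.0,
                        n_visits=10000, observe_prob=0.02)
print(f"scale-up factor between {lo:.1f} and {hi:.1f}; linear scaling would use {1 / 0.02:.0f}")
```

In this synthetic setup the scale-up factor falls well below 1/observe_prob, because popular items are seen even in a small sample; the interval from repeated trials is a crude stand-in for the confidence intervals mentioned on the slide.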
Measuring Distinct Tor Users (via PSC). Measurement period: 12th – 15th Apr 2018. Perform measurements using relays of different sizes (0.42% and 0.88%). Suppose clients connect to a single guard: then scaling the smaller measurement gives 0.0088 × 148,174 / 0.0042 ≈ 310,460 unique clients, which exceeds the 269,795 observed. Conclusion: Client IPs connect to multiple guards.
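The single-guard check is plain arithmetic and can be reproduced directly. The variable names are illustrative, and reading 148,174 and 269,795 as the unique-client counts at the smaller and larger relay sizes is an interpretation of the slide.

```python
small_weight = 0.0042   # smaller relay set, as a fraction
large_weight = 0.0088   # larger relay set, as a fraction
unique_small = 148_174  # unique clients seen at the smaller size
unique_large = 269_795  # unique clients seen at the larger size

# If every client used exactly one guard, unique counts would scale with weight.
expected_large = large_weight * unique_small / small_weight

print(round(expected_large))          # 310460
print(expected_large > unique_large)  # True: linear scaling overshoots
```

The observed count growing more slowly than the weight is what suggests that some client IPs are seen through more than one guard.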
Tor Client Model. Captures the behavior of Tor bridges, etc., using simulation. Model: a set of p promiscuous clients connect to all guards; the remaining clients connect to g guards. Yields ~8 million clients. Result 1: Tor has approximately 8 million daily users, a factor of four more than reported by the Tor Metrics Portal (which uses heuristics).
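One way to make the model concrete is a closed-form sketch. It assumes that each of a selective client's g guards is an independent draw that lands on the measuring relays with probability equal to their weight fraction; the slides do not state this, and the authors used simulation, so the sketch is not their method and is not claimed to reproduce the ~8 million figure. The function names and the choice g = 3 are hypothetical.

```python
def expected_unique(n_clients, p, g, weight):
    """Expected unique clients seen by relays holding the given weight fraction."""
    seen_prob = 1 - (1 - weight) ** g  # a selective client hits us at least once
    return p + (n_clients - p) * seen_prob


def solve_clients(obs1, w1, obs2, w2, g):
    """Solve for (n_clients, p) from two measurements, for a fixed g."""
    a1 = 1 - (1 - w1) ** g
    a2 = 1 - (1 - w2) ** g
    # obs = p + selective * a is linear in p and in the selective-client count.
    selective = (obs2 - obs1) / (a2 - a1)
    p = obs1 - selective * a1
    return p + selective, p


# Measurements from the previous slide; g = 3 is a hypothetical choice.
n_clients, p = solve_clients(148_174, 0.0042, 269_795, 0.0088, g=3)
print(f"n_clients = {n_clients:,.0f}, promiscuous = {p:,.0f}")
```

With only two measurements, g has to be fixed by hand; the estimate is sensitive to that choice, which is one reason a fuller simulation is preferable.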
Geopolitical Distribution of Tor Clients (via PrivCount). Result 2: The United States (US), Russia (RU), and Germany (DE) use Tor the most.
Geopolitical Distribution of Tor Clients (via PrivCount), continued. However, the Tor Metrics Portal ranks the United Arab Emirates (AE) second. This seems to be an overestimate: it implies ~4% of the AE population uses Tor daily. Possibly the majority of Tor clients from AE are partially blocked from using Tor.
Network Diversity of Tor Clients (via PrivCount). Uses IPv4 and IPv6 datasets from CAIDA. Major challenge: per-AS differentially private noise was large for some ASes. Hosting providers (e.g., Hetzner, DigitalOcean): probably from Tor bridges, onion services, and Tor network scanners.
Client Results. Result 1: Tor has approximately 8 million daily users. Result 2: The United States (US), Russia (RU), and Germany (DE) use Tor the most. Other results: Client IP churn (via PSC), which has never been measured before. Result 3: The client IP churn rate decreases the longer we observe. Unique country and AS count (via PSC).
Very Brief Overview of Onion Services. Allows a user to offer a service without revealing its location (IP address). [Diagram: Alice, Bob, abc.onion, and the distributed hash table (DHT).]
Onion Services Results. Result 1: 90.9% (out of 134 million) of onion service descriptor fetches failed. Result 2: The Ahmia onion service search engine contains 56.8% of the 12.2 million successfully fetched descriptors.
Challenges Faced: Coming up with reasonable action bounds. Extrapolating PSC measurements to network-wide counts. 24 hours of waiting time between different measurements. Differentially private noise can overwhelm the actual count, which requires repeating the measurement for multiple rounds.
Major Findings: Tor is predominantly used for web browsing. Tor users visit Alexa sites as regular Internet users do. Tor has approximately 8 million daily users. Tor's client IP churn rate decreases the longer we observe. 90.9% of onion address fetches fail. Data available at https://security.cs.georgetown.edu/measurement-study/ Understanding Tor Usage with Privacy-Preserving Measurement. Akshaya Mani, T Wilson-Brown, Rob Jansen, Aaron Johnson, Micah Sherr