Quantifying Location Privacy

Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, Jean-Pierre Hubaux
May 2011

A location trace is not only a set of positions on a map. The contextual information attached to a trace reveals a lot about our habits, interests, activities, and relationships.

envisioningdevelopment.net/map

Location-Privacy Protection: distort location information before exposing it to others.

Location-Privacy Protection
• Anonymization (pseudonymization)
  • Replacing the actual username with a random identity
• Location obfuscation
  • Hiding the location, adding noise, reducing precision
A common formal framework is MISSING.
• How to evaluate and compare various protection mechanisms?
• Which metric to use?
[Figure: an original location shown next to low-accuracy and low-precision versions. Pictures from Krumm 2007.]

Location Privacy: A Probabilistic Framework

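[Figure: overview of the framework.
• Users u1 … uN produce actual traces (vectors of actual events) over the timeline 1 … T.
• The Location-Privacy Preserving Mechanism (LPPM) applies obfuscation and anonymization, producing observed traces (vectors of observed events) under pseudonyms (nyms) 1 … N.
• Attacker knowledge construction (KC) turns past traces (vectors of noisy/missing events) into users' mobility profiles: Markov chain (MC) transition matrices with entries Pij between regions ri and rj.
• The attack combines the observed traces with the mobility profiles and outputs reconstructed traces.]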

Location-Privacy Preserving Mechanism (LPPM)
• Location-obfuscation function: hiding, reducing precision, adding noise, location generalization, …
• A probabilistic mapping of a location to a set of locations
[Figure: Alice's actual location is mapped by the LPPM to a set of possible observed locations.]
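The obfuscation function described above can be sketched as a random map from one region to a set of regions. This is only an illustration: the name `obfuscate`, the hiding probability `p_hide`, the neighbor table, and the rule that each neighbor joins the reported set with probability 1/2 are all assumptions, not part of the slides.

```python
import random

def obfuscate(region, neighbors, p_hide=0.2, rng=random):
    """Probabilistically map an actual region to an observed set of regions.

    Returns None when the location is hidden; otherwise a frozenset holding
    the actual region plus zero or more nearby regions (generalization).
    """
    if rng.random() < p_hide:
        return None  # location hiding: nothing is exposed for this event
    # Assumed rule: each neighbor is added to the reported set w.p. 1/2.
    extra = [r for r in neighbors.get(region, []) if rng.random() < 0.5]
    return frozenset([region, *extra])

# Hypothetical three-region neighborhood.
neighbors = {6: [7, 8], 7: [6, 8], 8: [6, 7]}
rng = random.Random(0)
observed = [obfuscate(6, neighbors, rng=rng) for _ in range(5)]
```

Each call returns either `None` or a set that still contains the true region, so the same actual trace can yield many different observed traces.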

Location-Privacy Preserving Mechanism (LPPM)
• Anonymization function: replace real usernames with random pseudonyms (e.g., integers 1 … N)
• A random permutation of usernames
[Figure: example in which the LPPM maps Alice to 3, Bob to 1, and Charlie to 2.]
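A minimal sketch of such an anonymization function, assuming pseudonyms are the integers 1 … N and that a uniformly random permutation is drawn once for the whole set of users. The function name `anonymize` is hypothetical.

```python
import random

def anonymize(usernames, rng=random):
    """Assign each user a distinct pseudonym in 1..N via a random permutation."""
    nyms = list(range(1, len(usernames) + 1))
    rng.shuffle(nyms)  # uniformly random permutation of the pseudonyms
    return dict(zip(usernames, nyms))

mapping = anonymize(["Alice", "Bob", "Charlie"], random.Random(1))
```

With N users there are N! equally likely mappings, which is what later makes a brute-force attack on the permutation expensive.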

Location-Privacy Preserving Mechanism (LPPM)
• Spatiotemporal event: <Who, When, Where>
• Actual trace of user u → location obfuscation (for user u) → anonymization → observed trace of user u, with pseudonym u'

Adversary Model
• Observation: anonymized and obfuscated traces
• Knowledge: users' mobility profiles, and the LPPM (PDF of anonymization, PDF of obfuscation)

Learning Users' Mobility Profiles (adversary knowledge construction)
• From prior knowledge, the attacker creates a mobility profile for each user.
• Mobility profile: a Markov chain (MC) on the set of locations.
• Task: estimate the MC transition probabilities P^u.
[Figure: knowledge construction (KC) turns the past traces of users u1 … uN (vectors of noisy/missing past events) into users' profiles (MC transition matrices with entries Pij between regions ri and rj).]

Example: Simple Knowledge Construction
• Prior knowledge for Alice (in this example: 100 training traces) yields the mobility profile for Alice.
• How to consider noisy/partial traces? E.g., knowing only the user's location in the morning (her workplace) and her location in the evening (her home).
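For complete training traces, the simple case above reduces to counting transitions and normalizing each row. The sketch below assumes traces are lists of region labels and fills rows for never-visited regions uniformly; both choices, and the name `estimate_transitions`, are illustrative assumptions. It does not handle the noisy/partial case raised in the second bullet.

```python
from collections import Counter

def estimate_transitions(traces, regions):
    """Estimate P[i][j] = Pr(next = j | current = i) by counting and normalizing."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):  # consecutive pairs of regions
            counts[(a, b)] += 1
    P = {}
    for i in regions:
        total = sum(counts[(i, j)] for j in regions)
        # Assumption: a region never left in training gets a uniform row.
        P[i] = {j: (counts[(i, j)] / total if total else 1 / len(regions))
                for j in regions}
    return P

# Two hypothetical training traces for one user.
traces = [["home", "work", "work", "home"],
          ["home", "work", "gym", "home"]]
P = estimate_transitions(traces, ["home", "work", "gym"])
```

Here every observed departure from "home" goes to "work", so that row puts all its mass on "work", while the three departures from "work" split evenly.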

Learning Users' Mobility Profiles: Our Solution
• Use a Monte Carlo method, Gibbs sampling, to estimate the probability distribution of the users' mobility profiles from noisy/missing past traces.

Inference Attack Examples
• Localization attack: "Where was Alice at 8pm?" What is the probability distribution over the locations for user Alice at time 8pm?
• Tracking attack: "Where did Alice go yesterday?" What is the most probable trace (trajectory) for user Alice for the time period "yesterday"?
• Meeting disclosure attack: "How many times did Alice and Bob meet?"
• Aggregate presence disclosure: "How many users were present at restaurant x at 9pm?"
The adversary answers these from its observation (anonymized and obfuscated traces) and its knowledge (users' mobility profiles and the LPPM's anonymization and obfuscation PDFs).

Inference Attacks
• Computationally infeasible in general: the anonymization permutation can take N! values.
• Our solution: decouple de-anonymization from de-obfuscation.

De-anonymization
1 - Compute the likelihood of observing trace i from user u, for all i and u, using the Hidden Markov Model (HMM) Forward-Backward algorithm. Cost: O(R^2 N^2 T).
2 - Compute the most likely assignment using a maximum weight assignment algorithm (e.g., the Hungarian algorithm). Cost: O(N^4).
[Figure: bipartite graph matching users u1 … uN to nyms 1 … N.]
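The two steps above can be sketched on a toy instance. The forward pass below computes the likelihood of one observed trace under one user's profile in O(R^2 T). For step 2 this sketch swaps in a brute-force search over all N! permutations instead of the Hungarian algorithm, which is only viable for very small N. All names, profiles, and traces are hypothetical.

```python
from itertools import permutations
import math

def likelihood(obs, pi, P, emit):
    """Forward pass: Pr(observed trace | one user's Markov profile).

    pi[r]: initial probability, P[r][s]: transition probability,
    emit(r, o): probability that the LPPM reports o when the truth is r.
    """
    regions = list(pi)
    alpha = {r: pi[r] * emit(r, obs[0]) for r in regions}
    for o in obs[1:]:
        alpha = {s: emit(s, o) * sum(alpha[r] * P[r][s] for r in regions)
                 for s in regions}
    return sum(alpha.values())

def best_assignment(L):
    """Brute-force stand-in for a maximum weight assignment algorithm:
    pick the permutation maximizing the product of likelihoods."""
    n = len(L)
    return max(permutations(range(n)),
               key=lambda s: math.prod(L[u][s[u]] for u in range(n)))

def exact(r, o):
    """Assumed emission model for this toy case: no obfuscation."""
    return 1.0 if r == o else 0.0

profiles = [  # (initial distribution, transition matrix) per user
    ({"A": 0.9, "B": 0.1}, {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}),
    ({"A": 0.1, "B": 0.9}, {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.1, "B": 0.9}}),
]
observed = [["B", "B", "B"], ["A", "A", "A"]]  # traces indexed by pseudonym

# L[u][i]: likelihood that observed trace i was produced by user u.
L = [[likelihood(o, pi, P, exact) for o in observed] for pi, P in profiles]
sigma = best_assignment(L)  # sigma[u] = index of the trace assigned to user u
```

In this toy case the user who mostly stays in A is matched to the all-A trace and the other user to the all-B trace. For realistic N, a polynomial-time assignment algorithm would replace `best_assignment`.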

De-obfuscation
• Localization attack: given the most likely assignment, the localization probability can be computed with a Hidden Markov Model, using the Forward-Backward algorithm. Cost: O(R^2 T).
• Tracking attack: given the most likely assignment, the most likely trace for each user can be computed with the Viterbi algorithm. Cost: O(R^2 T).
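Both attacks can be sketched for a single user once her observed trace is known. The code below is a plain, unscaled Forward-Backward and Viterbi implementation on dictionaries; the three-region profile, the emission model `emit` (hide with probability 0.25, otherwise report a fixed cell of regions), and the observed trace are all assumptions made for illustration.

```python
def forward_backward(obs, pi, P, emit):
    """Localization: Pr(region at time t | whole observed trace), for each t."""
    R, T = list(pi), len(obs)
    fwd = [{r: pi[r] * emit(r, obs[0]) for r in R}]
    for t in range(1, T):
        fwd.append({s: emit(s, obs[t]) * sum(fwd[-1][r] * P[r][s] for r in R)
                    for s in R})
    bwd = [{r: 1.0 for r in R}]
    for t in range(T - 1, 0, -1):  # build backward messages from the end
        bwd.insert(0, {r: sum(P[r][s] * emit(s, obs[t]) * bwd[0][s] for s in R)
                       for r in R})
    post = []
    for f, b in zip(fwd, bwd):
        z = sum(f[r] * b[r] for r in R)  # assumed nonzero (trace is possible)
        post.append({r: f[r] * b[r] / z for r in R})
    return post

def viterbi(obs, pi, P, emit):
    """Tracking: the single most probable sequence of regions."""
    R = list(pi)
    score = {r: pi[r] * emit(r, obs[0]) for r in R}
    back = []
    for o in obs[1:]:
        prev = {s: max(R, key=lambda r: score[r] * P[r][s]) for s in R}
        score = {s: score[prev[s]] * P[prev[s]][s] * emit(s, o) for s in R}
        back.append(prev)
    path = [max(R, key=lambda r: score[r])]
    for prev in reversed(back):  # follow back-pointers to time 0
        path.append(prev[path[-1]])
    return path[::-1]

def emit(r, o):
    """Assumed LPPM: hide w.p. 0.25, else report the cell containing r."""
    if o is None:
        return 0.25
    return 0.75 if r in o else 0.0

pi = {6: 0.5, 7: 0.25, 8: 0.25}
P = {6: {6: 0.1, 7: 0.8, 8: 0.1},
     7: {6: 0.1, 7: 0.1, 8: 0.8},
     8: {6: 0.8, 7: 0.1, 8: 0.1}}
obs = [frozenset({6, 7}), None, frozenset({8})]  # middle event is hidden

posterior = forward_backward(obs, pi, P, emit)
trace = viterbi(obs, pi, P, emit)
```

For long traces the products underflow, so a practical version would rescale the messages or work in log space.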

Location-Privacy Metric

Assessment of Inference Attacks
In an inference attack, the adversary estimates the true value of some random variable X (e.g., the location of a user at a given time instant). Let x_c (unknown to the adversary) be the actual value of X.
Three properties of the estimation's performance:
• How accurate is the estimate? Confidence level and confidence interval.
• How focused is the estimate on a single value? The entropy of the estimated random variable.
• How close is the estimate to the true value (the real outcome)?
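The difference between the second and third properties shows up in a two-location example: a posterior can be sharply focused and still be wrong. The sketch below uses a hypothetical posterior and a 0/1 distance; the function names are illustrative.

```python
import math

def entropy(post):
    """Shannon entropy in bits: how spread out the adversary's estimate is."""
    return -sum(p * math.log2(p) for p in post.values() if p > 0)

def expected_error(post, actual, dist):
    """Expected distance between the adversary's estimate and the true value."""
    return sum(p * dist(r, actual) for r, p in post.items())

def miss(r, actual):
    """Assumed 0/1 distance: 0 if the guess is right, 1 otherwise."""
    return 0.0 if r == actual else 1.0

post = {"cafe": 0.9, "office": 0.1}  # a confident (low-entropy) posterior

h = entropy(post)
err_if_cafe = expected_error(post, "cafe", miss)      # truth matches the mode
err_if_office = expected_error(post, "office", miss)  # truth is the unlikely value
```

The entropy is identical in both scenarios because it depends only on the posterior, whereas the expected error changes with the true outcome x_c.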

Location-Privacy Metric
• The true outcome of a random variable is what users want to hide from the adversary.
• Hence, the incorrectness of the adversary's inference attack is the metric that defines the privacy of users.
• The location privacy of user u at time t with respect to the localization attack is the incorrectness of the adversary, i.e., the expected estimation error.
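One way to write this expected estimation error, as a sketch under assumed notation: \hat{p}_{u,t}(r) is the adversary's posterior probability that user u is in region r at time t given the observed traces, \mathcal{R} is the set of regions, d is a distance between locations, and x_c is the actual location. The symbol names are assumptions, not taken from the slides.

```latex
LP_u(t) \;=\; \sum_{r \in \mathcal{R}} \hat{p}_{u,t}(r)\, d(r, x_c)
```

With a 0/1 distance this reduces to the adversary's probability of error; with a geographic distance it is the expected distance between the adversary's guess and the user's true position.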

Location-Privacy Meter A Tool to Quantify Location Privacy http://lca.epfl.ch/projects/quantifyingprivacy

Location-Privacy Meter (LPM)
• You provide the tool with
  • Some traces to learn the users' mobility profiles
  • The PDF associated with the protection mechanism
  • Some traces to run the tool on
• LPM provides you with
  • The location privacy of users with respect to various attacks: localization, tracking, meeting disclosure, aggregate presence disclosure, …

LPM: An Example
• CRAWDAD dataset
  • N = 20 users
  • R = 40 regions
  • T = 96 time instants
• Protection mechanism:
  • Anonymization
  • Location obfuscation
    • Hiding location
    • Precision reduction (dropping low-order bits from the x, y coordinates of the location)
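Precision reduction by dropping low-order bits can be sketched with a bit mask, assuming locations are given as non-negative integer grid coordinates. The function name and the sample coordinates are hypothetical.

```python
def reduce_precision(x, y, bits):
    """Zero out the `bits` low-order bits of integer grid coordinates."""
    mask = ~((1 << bits) - 1)  # e.g. bits=2 clears the two lowest bits
    return x & mask, y & mask

coarse = reduce_precision(13, 6, 2)     # 13 = 0b1101, 6 = 0b0110
unchanged = reduce_precision(13, 6, 0)  # dropping zero bits changes nothing
```

Dropping k bits merges each 2^k by 2^k block of grid cells into one reported location, so larger k means coarser observations.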

LPM: Results for the Localization Attack
[Figure: localization-attack results; one plot label reads "No obfuscation".]

Assessment of Other Metrics
[Figure: results for two other metrics, k-anonymity and entropy.]

Conclusion
• A unified formal framework to describe and evaluate a variety of location-privacy preserving mechanisms with respect to various inference attacks
• Modeling LPPM evaluation as an estimation problem
  • Throw attacks at the LPPM
• The right metric: expected estimation error
• An object-oriented tool (Location-Privacy Meter) to evaluate and compare location-privacy preserving mechanisms
http://people.epfl.ch/reza.shokri

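Hidden Markov Model
[Figure: Alice's trace on a grid of regions (6, 7, 8 and 11 to 20), annotated with the initial probability P_Alice(11), the transition probabilities P_Alice(11→6) and P_Alice(6→14), and the obfuscation probability P_LPPM(6→{6,7,8}).]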
