
Privacy Vulnerability of Published Anonymous Mobility Traces

Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University); Nageswara S. V. Rao (Oak Ridge National Laboratory)


Presentation Transcript


  1. Privacy Vulnerability of Published Anonymous Mobility Traces. Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University); Nageswara S. V. Rao (Oak Ridge National Laboratory)

  2. Motivation: Collecting mobility traces • Mobile network applications • Traffic monitoring, road surface sensing, radiation and chemical detection • Mobility traces are collected and published to assist the design, analysis, and evaluation of mobile networks • E.g., Crawdad

  3. Motivation: Privacy vulnerability • Measures are carried out to protect the privacy of the participants • Traces are identified using a random but consistent and unique identifier that is not correlated with the real ID • Spatial and temporal granularities are reduced • Example: the record <11:32:12, Chris Ma, (41.89840,-87.61999)> is published as <11:30~11:35, ID-271, (41.89~41.90,-87.62~-87.61)>
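
The two protection measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by any particular trace publisher: the function names `make_pseudonyms` and `coarsen`, the 5-minute window, and the 0.01-degree cell are assumptions chosen to reproduce the example record above.

```python
import math
import random


def make_pseudonyms(real_ids, seed=0):
    """Map each real ID to a random but consistent and unique label (hypothetical helper)."""
    ids = sorted(set(real_ids))
    labels = list(range(len(ids)))
    random.Random(seed).shuffle(labels)  # label order carries no information about the real ID
    return {rid: f"ID-{n}" for rid, n in zip(ids, labels)}


def coarsen(t_s, lat, lon, window_s=300, cell_deg=0.01):
    """Reduce temporal and spatial granularity.

    Returns the time window (start, end) in seconds since midnight and the
    integer indices of the latitude/longitude cell containing the point.
    """
    t_idx = int(t_s // window_s)
    window = (t_idx * window_s, (t_idx + 1) * window_s)
    return window, math.floor(lat / cell_deg), math.floor(lon / cell_deg)


# 11:32:12 is 41532 seconds after midnight; it falls in the 11:30~11:35 window.
pseudonyms = make_pseudonyms(["Chris Ma", "R.B."])
record = (pseudonyms["Chris Ma"],) + coarsen(41532, 41.89840, -87.61999)
print(record)
```

Cell index 4189 corresponds to latitudes 41.89~41.90 and index -8762 to longitudes -87.62~-87.61, matching the published record on the slide.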

  4. Motivation: Privacy vulnerability • These measures are not enough! • Participants can be openly observed • Participants may leak their location information (snapshots of time and location pairs, termed side information) • Web blogs, status in social networks, tweets, casual conversations, etc. • An adversary who tries to identify the complete trace (movement history) of one or more participants may succeed with high probability

  5. Our contributions • Comprehensive study of attack strategies • Various ways of collecting side information • Analytical proof of the optimality of the attack strategy • Quantitative simulation results • Privacy implications of characteristics of real traces and synthetic traces • Synthetic nodes are more sparsely placed • More easily identified but more difficult to meet

  6. Agenda • Problem formulation • Analytical derivation • Experimental analysis • Conclusion

  7. Problem formulation: trace sampling and publication • A sampled record <t, R.B., (x,y)> is published as <t', ID_i, (x',y')>

  8. Problem formulation • An adversary tries to identify the complete movement history of the participant(s) • collects side information and compares with the published traces • Possible attack scenarios • Adversary infers the location of a victim indirectly (passive adversary) • Adversary observes the movement of the victims physically (active adversary)

  9. Passive adversary: infers snapshots of victim • Special case: reference times are sampling times

  10. Passive adversary: infers snapshots of victim • General case: reference times are not sampling times

  11. Passive adversary: infers snapshots of victim • General case: reference times are not sampling times • Infers the possible location of the node at reference times using a general mobility model • Preference of the nodes, physical constraints

  12. Passive adversary: infers snapshots of victim • General case: reference times are not sampling times • Infers the possible location of the node at reference times using a general mobility model

  13. Passive adversary: infers snapshots of victim • General case: reference times are not sampling times

  14. Attack approaches of passive adversary • Use of a Bayesian approach to determine the trace that best matches the inferred location information (figure labels: published traces, noisy side information)

  15. Attack approaches of passive adversary • For the special case (reference time = sampling time), with the assumption that noise is i.i.d., • For the general case, with the assumptions that noise is i.i.d. and movement is Markovian,
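
For the special case, a Bayesian matching rule under i.i.d. noise can be written out as follows. This is a sketch under stated assumptions, not a reproduction of the slide's formula: the symbols $s_k$ (side-information location at reference time $t_k$), $x_i(t_k)$ (published location of trace $i$ at $t_k$), $f$ (noise density), and the uniform prior over the $N$ traces are all introduced here for illustration.

```latex
% Posterior that trace i belongs to the victim, given K pieces of side information
P(i \mid s_1, \dots, s_K)
  = \frac{P(i)\,\prod_{k=1}^{K} f\bigl(s_k - x_i(t_k)\bigr)}
         {\sum_{j=1}^{N} P(j)\,\prod_{k=1}^{K} f\bigl(s_k - x_j(t_k)\bigr)}

% With a uniform prior P(i) = 1/N, the best match maximizes the likelihood
\hat{\imath} = \arg\max_{i} \prod_{k=1}^{K} f\bigl(s_k - x_i(t_k)\bigr)
```

The product form relies on the i.i.d. assumption: each piece of side information contributes an independent factor, so the per-snapshot errors can be scored separately and combined.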

  16. Attack approaches of passive adversary • Maximum Likelihood Estimator (MLE) approach • Minimum Square (MSQ) approach • Basic (BAS) approach • Weighted Exponential (EXP) approach • When noise is Gaussian, MLE and MSQ are equivalent
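
The equivalence of MLE and MSQ under Gaussian noise can be checked numerically. The sketch below assumes isotropic two-dimensional Gaussian noise with a known standard deviation `sigma` and side information taken at sampling times; the function names and the trace layout (a dict from time to `(x, y)`) are illustrative assumptions, and the BAS and EXP approaches are not covered because the transcript does not define them.

```python
import math


def msq_score(trace, side_info):
    """Sum of squared distances between side information and a trace (lower is better)."""
    return sum(
        (trace[t][0] - x) ** 2 + (trace[t][1] - y) ** 2
        for t, (x, y) in side_info.items()
    )


def gaussian_log_likelihood(trace, side_info, sigma):
    """Log-likelihood of the side information under isotropic 2-D Gaussian noise."""
    n = len(side_info)
    return -msq_score(trace, side_info) / (2 * sigma**2) - n * math.log(
        2 * math.pi * sigma**2
    )


def identify(traces, side_info, sigma=1.0):
    """Return the trace IDs picked by MSQ and by Gaussian MLE."""
    msq = min(traces, key=lambda tid: msq_score(traces[tid], side_info))
    mle = max(
        traces, key=lambda tid: gaussian_log_likelihood(traces[tid], side_info, sigma)
    )
    return msq, mle


traces = {
    "A": {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)},
    "B": {0: (0.0, 5.0), 1: (1.0, 5.0), 2: (2.0, 5.0)},
}
side_info = {0: (0.2, 0.1), 2: (1.8, -0.3)}  # noisy snapshots of the victim
print(identify(traces, side_info))
```

Because the Gaussian log-likelihood is the negated squared error scaled by a positive constant plus a term that is the same for every trace, both rules rank the traces identically. MSQ does not need `sigma` at all.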

  17. Active adversary: observes victims physically • Adversary is one of the participants

  18. Active adversary: observes victims physically • Adversary stays at a (popular) position

  19. Active adversary: observes victims physically • Adversary travels between popular locations

  20. Problem formulation • Why the two different cases? • Active • Needs to consider how to collect the side information physically as time evolves • Adversary tries to identify as many victims as possible: plot of k-anonymity as a function of time • Passive • Snapshots of the victim are inferred (not collected) and are less accurate in general • Adversary tries to identify one victim only: plot of correctness as a function of the number of pieces of side information

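  21. Attack strategy of active adversary • Algorithm of the attack (in action) • Figure: three tables of candidate trace IDs per real ID, with time labels t1 and t2; in transcript order they read (1: A, B, C; 2: A, B, C; 3: A, B, C), (1: A, B, C; 2: A, B, C; 3: A, B, C), and (1: A, B; 2: A, B; 3: A, B, C)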
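
One way to read the figure on this slide is as candidate-set elimination: every real ID starts with all published trace IDs as candidates, and each physical observation removes the traces that were not in the adversary's cell at that time. The sketch below follows that reading; it is an assumption about the algorithm, not the authors' exact procedure, and the names `update_candidates` and `k_anonymity` are hypothetical.

```python
def update_candidates(candidates, published, t, cell, observed_ids):
    """Narrow the candidate trace IDs of every victim seen in `cell` at time `t`."""
    present = {tid for tid, trace in published.items() if trace.get(t) == cell}
    for rid in observed_ids:
        candidates[rid] &= present  # only traces present at the same cell remain
    return candidates


def k_anonymity(candidates):
    """Number of traces each real ID could still correspond to."""
    return {rid: len(tids) for rid, tids in candidates.items()}


# Published traces: trace ID -> {time step: cell}
published = {
    "A": {1: (0, 0), 2: (1, 1)},
    "B": {1: (0, 0), 2: (2, 2)},
    "C": {1: (5, 5), 2: (5, 5)},
}
candidates = {rid: {"A", "B", "C"} for rid in (1, 2, 3)}

# At time 1 the adversary is in cell (0, 0) and sees real IDs 1 and 2.
update_candidates(candidates, published, 1, (0, 0), [1, 2])
print(k_anonymity(candidates))  # real IDs 1 and 2 drop to 2 candidates, 3 stays at 3

# At time 2 the adversary is in cell (1, 1) and sees real ID 1 only.
update_candidates(candidates, published, 2, (1, 1), [1])
print(candidates[1])
```

The first update reproduces the last table in the figure (1: A, B; 2: A, B; 3: A, B, C). A fuller attack could also propagate constraints between victims, for example assigning C to real ID 3 once A and B are taken; that step is omitted here.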

  22. Experimental analysis • Basic information • Real traces • 536 San Francisco taxicabs • 2348 Shanghai Grid buses • Synthetic traces • Using map size and average speed computed from taxi cab traces • Random waypoint (with different maximum trip lengths) • Random walk • Spatial granularity = 1 km • Temporal granularity = 1 minute (unless stated otherwise)
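
A random waypoint trace with a maximum trip length can be generated along the following lines. This is a generic sketch of the mobility model, not the generator used in the study: the square map, the constant speed, the uniform choice of trip length and direction, and all parameter names are assumptions. Positions are reported as 1 km cells, one sample per time step.

```python
import math
import random


def random_waypoint(steps, size_km, speed_km_per_step, max_trip_km, seed=0):
    """Generate one random waypoint trace as a list of (cell_x, cell_y) per time step."""
    rng = random.Random(seed)
    x, y = rng.uniform(0, size_km), rng.uniform(0, size_km)  # random initial location
    tx, ty = x, y  # current waypoint; equal to the position, so a new one is drawn first
    trace = []
    for _ in range(steps):
        if math.hypot(tx - x, ty - y) < 1e-9:
            # Draw the next waypoint within max_trip_km, retrying until it is on the map.
            while True:
                angle = rng.uniform(0, 2 * math.pi)
                d = rng.uniform(0, max_trip_km)
                tx, ty = x + d * math.cos(angle), y + d * math.sin(angle)
                if 0 <= tx <= size_km and 0 <= ty <= size_km:
                    break
        dist = math.hypot(tx - x, ty - y)
        if dist <= speed_km_per_step:
            x, y = tx, ty  # waypoint reached within this step
        else:
            x += (tx - x) * speed_km_per_step / dist
            y += (ty - y) * speed_km_per_step / dist
        trace.append((int(x), int(y)))  # spatial granularity of 1 km
    return trace


trace = random_waypoint(steps=200, size_km=10.0, speed_km_per_step=0.5, max_trip_km=3.0, seed=1)
print(trace[:5])
```

Shorter maximum trip lengths keep each node near its random starting point, which is consistent with the later observation that such traces lie further apart from each other.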

  23. Characteristics of the traces: Distance between traces • Real traces are closer to each other on average • Bus traces have a broader range • For synthetic traces, the shorter the trip length, the further away they are from each other in general

  24. Significant observations • Synthetic traces lack preferred locations and start from random initial locations • Nodes are more sparsely distributed in the network • Implications: • For an adversary in general • Can easily identify the trace of a synthetic node, since no other trace shares a similar path • For an active adversary • May take longer to meet each synthetic node

  25. Attack performance: Passive adversary (special case) • Special case: side information inferred at sampling times of traces • Correct assumption of noise (Gaussian) • Cab traces • Observations • MLE and MSQ perform equally well • BAS gives the fewest wrong conclusions initially

  26. Attack performance: Passive adversary (special case) • Random waypoint traces • Most efficient attack • Traces have very different paths

  27. Attack performance: Passive adversary (special case) • Incorrect assumption of noise • Assumption: Uniform • Actual: Gaussian • Cab traces • Observations • MLE performs much worse

  28. Attack performance: Passive adversary (general case) • General case: side information at times different from trace sampling times • Worst-case scenario: all times are different • Infer the location of the victim using the mobility model • Gaussian noise (no noise as best performance bound) • Cab traces

  29. Summary: Passive adversary • For a passive adversary • MLE and MSQ give the best performance among the four approaches in terms of the fraction of correct conclusions • Since MLE relies on knowledge of the type of noise and its magnitude, MSQ is the preferred, more robust attack approach

  30. Attack performance: Active adversary as one of the mobile nodes • Higher attack efficiency for real traces • Mobile nodes are more likely to visit the same set of locations at the same time • Synthetic nodes are more sparsely distributed in the network (1 time step = 1 minute)

  31. Attack performance: Active adversary who stays at one of the cells • Observations • Comparing real traces and synthetic traces • Attacks on real traces are more efficient: k-anonymity drops more quickly • Popular cells in real traces and random waypoint traces are more aggregated together • Being at a popular cell does not necessarily result in higher attack efficiency (figure panels: cabs, buses, random walk, random waypoint)

  32. Attack performance: Active adversary who moves among popular cells • The ability to move among popular cells improves attack efficiency • Improvement is more significant if node movements are more localized • Visiting more cells does not necessarily improve efficiency (figure panels: cabs, buses, random walk, random waypoint)

  33. Conclusion • Study how privacy leaks through trace publication • Under different adversary strategies to collect side information • Using different mobile traces with different characteristics • Experimentally show that the adversary is able to identify the trace of a victim from the published set with high probability
