Csci 8735: Advanced DBMS, Spring 2011. Paper presentation by Vivek. Privacy-Preserving Data Mining on Moving Object Trajectories. Gyözö Gidófalvi, Xuegang Huang, Torben Bach Pedersen. MDM 2007: 60-68. Previously in this course… “Personalized Web search with location preferences”
Csci 8735: Advanced DBMS, Spring 2011. Paper presentation by Vivek
Privacy-Preserving Data Mining on Moving Object Trajectories. Gyözö Gidófalvi, Xuegang Huang, Torben Bach Pedersen. MDM 2007: 60-68
Previously in this course…
• “Personalized Web search with location preferences”
• “Location recommendation for location-based social networks”
• “GeoLife: A Collaborative Social Networking Service among User, Location and Trajectory”
Obvious reason: potential threats (based on location history, e.g. the ability to find a home address). Compare the Facebook privacy concerns.
Motivation
• Privacy has not always been addressed when dealing with moving object trajectories
• “Challenge of obtaining detailed, accurate patterns from anonymized location and trajectory data”
What is this paper all about? How can we collect trajectory data without compromising users’ privacy while still allowing effective data mining?
Contributions
• Anonymization model for preserving location privacy
• Grid-based framework for data collection and mining
• Client-server architecture that implements the above
• Techniques for finding dense spatio-temporal areas and frequent routes (classic data mining tasks)
Existing privacy protection: a trusted middleware (anonymizer) encloses the query location in a “cloaking” rectangle that also includes the locations of k-1 other users (k-anonymity)
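The cloaking idea above can be sketched in a few lines of Python. This is only an illustration, not the method of any particular anonymizer: the function name `cloak` and the choice of the k-1 nearest users are assumptions made for the example.

```python
from math import hypot

def cloak(query, others, k):
    """Return (xmin, ymin, xmax, ymax) enclosing the query location and
    the locations of k-1 other users.

    Assumption for this sketch: the k-1 users are simply the nearest ones.
    """
    if len(others) < k - 1:
        raise ValueError("not enough users to reach k-anonymity")
    # Sort the other users by Euclidean distance to the query location.
    nearest = sorted(
        others, key=lambda p: hypot(p[0] - query[0], p[1] - query[1])
    )[: k - 1]
    pts = [query] + nearest
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical user positions, for illustration only.
users = [(1.0, 1.0), (2.0, 3.0), (9.0, 9.0), (0.0, 2.0)]
rect = cloak((1.0, 2.0), users, k=3)
```

The server only ever sees `rect`, so the query could have come from any of the k users inside it.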
Spatio-Temporal Anonymization
• “Anonymize the trajectory by reducing the spatio-temporal resolution of the 2D space”
• Anonymization rectangle satisfying (areasize, maxLocProb) is (R, ts, te)
• Can we enclose the whole trajectory in one rectangle?
• Proposal: Provide an anonymized trajectory by cutting it into pieces and enclosing each piece in a rectangle R
Practical “Cut-Enclose” implementation
• Split the whole trajectory into a set of polylines
• Time delay factor
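A minimal sketch of cut-enclose, under simplifying assumptions: the trajectory is cut every `piece_len` points (a hypothetical parameter), each piece is enclosed in its bounding box, and the box is widened until its area is at least `min_area`, which stands in for areasize. The maxLocProb requirement is not modelled here.

```python
def _widen(lo, hi, side):
    """Grow the interval [lo, hi] symmetrically until it is at least `side` long."""
    if hi - lo < side:
        pad = (side - (hi - lo)) / 2
        return lo - pad, hi + pad
    return lo, hi

def cut_enclose(traj, piece_len, min_area):
    """traj: list of (x, y, t) tuples ordered by time.

    Returns a list of ((xmin, ymin, xmax, ymax), ts, te), one per piece.
    """
    side = min_area ** 0.5  # each side >= sqrt(min_area) guarantees the area
    pieces = []
    for i in range(0, len(traj), piece_len):
        piece = traj[i : i + piece_len]
        xs = [p[0] for p in piece]
        ys = [p[1] for p in piece]
        xmin, xmax = _widen(min(xs), max(xs), side)
        ymin, ymax = _widen(min(ys), max(ys), side)
        # ts and te are the timestamps of the first and last point in the piece.
        pieces.append(((xmin, ymin, xmax, ymax), piece[0][2], piece[-1][2]))
    return pieces

# Hypothetical trajectory, for illustration only.
traj = [(0, 0, 1), (1, 0, 2), (2, 0, 3), (5, 5, 4), (6, 5, 5)]
pieces = cut_enclose(traj, piece_len=3, min_area=4.0)
```

A piece can only be reported once its last point is known, which is one way to read the time delay factor on the slide.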
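A non-deterministic way of constructing anonymization rectangles will lead to loss of privacy! Different rectangles reported for the same location overlap, and their intersection narrows down where the user actually is.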
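The privacy loss can be illustrated with a small sketch. The rectangles and the helper names `intersect` and `area` are made up for the example: if the same location is reported twice inside differently placed rectangles, an observer can intersect them.

```python
def intersect(r1, r2):
    """Intersection of two rectangles given as (xmin, ymin, xmax, ymax).

    Returns None when the rectangles do not overlap.
    """
    xmin, ymin = max(r1[0], r2[0]), max(r1[1], r2[1])
    xmax, ymax = min(r1[2], r2[2]), min(r1[3], r2[3])
    if xmin > xmax or ymin > ymax:
        return None
    return (xmin, ymin, xmax, ymax)

def area(r):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])

# Two hypothetical reports that both contain the true location (3.5, 3.5).
first = (0, 0, 4, 4)
second = (3, 3, 7, 7)
leak = intersect(first, second)
```

Each report covers an area of 16, but together they confine the user to `leak`, a region of area 1.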
Grid-based solution
• Deterministic way to anonymize location, avoiding the “overlapping” scenario
• Build the rectangle based on a single, predefined 2D grid
Grid-based solution continued… (a,t1), (b,t2)…(h,t7) => (p4,t1,t3), (p5,t4,t5), (p2,t6,t7)
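The mapping from timestamped points to (cell, ts, te) triples can be sketched as follows. The coordinates, the square cells of size `cell_size`, and the cell naming by (column, row) index are assumptions for the example; the sketch does not reproduce the grid in the slide's figure.

```python
from itertools import groupby

def cell_of(x, y, cell_size):
    """Index (column, row) of the grid cell containing the point (x, y)."""
    return (int(x // cell_size), int(y // cell_size))

def grid_anonymize(traj, cell_size):
    """traj: list of (x, y, t) tuples ordered by time.

    Consecutive points that fall into the same cell are merged into one
    (cell, ts, te) triple, so only the cell and the time interval are reported.
    """
    keyed = [(cell_of(x, y, cell_size), t) for x, y, t in traj]
    result = []
    for cell, group in groupby(keyed, key=lambda ct: ct[0]):
        times = [t for _, t in group]
        result.append((cell, times[0], times[-1]))
    return result

# Hypothetical trajectory with timestamps 1..7, for illustration only.
traj = [
    (0.5, 1.2, 1), (0.8, 1.9, 2), (0.9, 1.1, 3),
    (1.4, 1.5, 4), (1.9, 1.2, 5),
    (1.5, 0.4, 6), (1.2, 0.1, 7),
]
result = grid_anonymize(traj, cell_size=1.0)
```

Because every point always maps to the same predefined cell, repeated reports of a location yield identical rectangles and cannot be intersected to narrow it down.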
• Common Regular Partitioning
• Individual Regular Partitioning
• Individual Irregular Partitioning
Finding dense spatio-temporal areas
Dense ST-area query: D = {c_i : c_i.count ≥ min_count ∧ c_i.prob ≥ min_prob}
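The query above is a filter over spatio-temporal cells. A minimal sketch, assuming each cell already carries its `count` and `prob` values (how these are derived from the anonymized trajectories is not shown on the slide, and the `Cell` class and sample values are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    cell_id: str   # hypothetical identifier of a spatio-temporal cell
    count: int     # precomputed c_i.count
    prob: float    # precomputed c_i.prob

def dense_cells(cells, min_count, min_prob):
    """Return D = {c_i : c_i.count >= min_count and c_i.prob >= min_prob}."""
    return [c for c in cells if c.count >= min_count and c.prob >= min_prob]

# Made-up values, for illustration only.
cells = [
    Cell("c1", count=12, prob=3.5),
    Cell("c2", count=12, prob=0.4),
    Cell("c3", count=2, prob=4.0),
]
dense = dense_cells(cells, min_count=10, min_prob=1.0)
```

Both conditions must hold, so `c2` (too little probability) and `c3` (too few objects) are excluded.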
We now have a way to anonymize trajectory data! We also know how to mine data!
What is the cost of modifying existing data mining algorithms to work with anonymized data instead of “actual” data?
References and other papers
• Protecting moving trajectories with dummies
• Generalized based approach towards trajectory anonymization
• Uncertainty-aware path cloaking algorithm
• Privacy in Location-based Services: A System Architecture Perspective