Overview of Privacy Preserving Techniques
This is a high-level summary of the state-of-the-art privacy preserving techniques and research areas • Focus on problems and the basic ideas
Outline • Privacy problem in computing • Major techniques • Data perturbation • Data anonymization • Cryptographic methods • Privacy in different application areas • Data mining • Data publishing • Databases • Data outsourcing • Social network • Mobile computing
Privacy vs. Security • Network security • Assumption: the two parties trust each other, but the communication network is not trusted. • [Figure: Alice encrypts the data, sends it over the communication channel, and Bob decrypts it.] • Bob knows the original data that Alice owns.
Privacy problems • Information about a person or a single party • Parties do not trust each other: curious parties (including malicious insiders) may look at sensitive contents • Parties follow protocols honestly (semi-honest assumption) • [Figure: Alice delivers “sanitized” data to Bob.] • Bob is an untrusted party. He may try to infer private information from the sanitized data.
Two categories • (1) Transformation-based methods • [Figure: Alice sends transformed data over the communication channel to Bob, a “curious party”.] • Bob works on the transformed data only • Bob does not know the original data.
(2) Cryptographic protocol methods • [Figure: Party 1, Party 2, ..., Party n each hold their own data and run a protocol built on cryptographic primitives. Labels: statistical info / intermediate results; info from other parties.]
Computing scenarios • Web model: users (user 1, ...) give private info to web apps • Collaboration model: Party 1, Party 2, ..., Party n each hold their own data • Outsourcing model: the data owner exports data to a service provider in order to use the service
Issues with data transformation • Techniques performing the transformation • Transformation should preserve important information • How much information is lost • How to recover the information from the transformed data • Threat model • Attacks reconstructing the original data from the transformed data • Attacks finding significant additional information • The cost • Transforming data • Recovering the important information
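The trade-off between information loss and recoverability can be illustrated with a minimal sketch of additive perturbation. It assumes the data owner adds independent zero-mean Gaussian noise with a published standard deviation; the function names, the noise level, and the sample data are illustrative assumptions and do not come from the slides. Individual values are masked, while the mean and (after subtracting the known noise variance) the variance remain approximately recoverable.

```python
import random
import statistics

def perturb(values, noise_std, rng):
    """Return the values with independent zero-mean Gaussian noise added."""
    return [v + rng.gauss(0.0, noise_std) for v in values]

def estimate_original_variance(perturbed, noise_std):
    """Var(X + N) = Var(X) + Var(N) for independent noise, so subtract Var(N)."""
    return statistics.pvariance(perturbed) - noise_std ** 2

rng = random.Random(0)  # fixed seed so the sketch is repeatable
ages = [rng.randint(20, 70) for _ in range(20000)]  # hypothetical sensitive column
noise_std = 10.0  # assumed noise level, published together with the data
released = perturb(ages, noise_std, rng)

# Aggregate statistics survive the transformation approximately.
print(statistics.mean(ages), statistics.mean(released))
print(statistics.pvariance(ages), estimate_original_variance(released, noise_std))
```

A larger `noise_std` hides individual values better but makes the recovered statistics noisier, which is the information-loss question listed above.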
Transformation techniques • Data Perturbation • Additive perturbation • Multiplicative perturbation • Randomized responses • Data Anonymization • k-anonymization • l-diversity • t-closeness • m-invariance
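As a small illustration of k-anonymization, the sketch below generalizes two quasi-identifiers and then checks that every quasi-identifier combination appears at least k times. The table, the column names, and the generalization rule (10-year age bands, 3-digit ZIP prefixes) are hypothetical; real anonymizers search for a generalization that minimizes information loss.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

def generalize(record):
    """Hypothetical generalization: age -> 10-year band, ZIP -> 3-digit prefix."""
    low = record["age"] // 10 * 10
    return {
        "age": f"{low}-{low + 9}",
        "zip": record["zip"][:3] + "**",
        "disease": record["disease"],  # the sensitive attribute is left unchanged
    }

raw = [
    {"age": 23, "zip": "45401", "disease": "flu"},
    {"age": 27, "zip": "45409", "disease": "cold"},
    {"age": 34, "zip": "45320", "disease": "flu"},
    {"age": 38, "zip": "45324", "disease": "asthma"},
]
released = [generalize(r) for r in raw]
qi = ["age", "zip"]

print(is_k_anonymous(raw, qi, 2))       # False: every raw record is unique
print(is_k_anonymous(released, qi, 2))  # True: two groups of two records each
```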
Attacks on transformation techniques • Data reconstruction and noise reduction techniques (on data perturbation) • random matrix theory • spectral analysis • Inference attacks (on data anonymization) • Utilizing background knowledge
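A minimal sketch of an inference attack that uses background knowledge: the adversary knows a target's quasi-identifier values and looks up which sensitive values remain possible in the released table. The table and the helper names are hypothetical. When a group contains only one sensitive value, the attribute is disclosed even though the table is 2-anonymous, which is the weakness that l-diversity is meant to address.

```python
from collections import defaultdict

def sensitive_values_by_group(records, quasi_identifiers, sensitive):
    """Map each quasi-identifier combination to the set of sensitive values in it."""
    groups = defaultdict(set)
    for r in records:
        groups[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return dict(groups)

def infer(records, quasi_identifiers, sensitive, background):
    """Sensitive values still consistent with the adversary's background knowledge."""
    key = tuple(background[q] for q in quasi_identifiers)
    groups = sensitive_values_by_group(records, quasi_identifiers, sensitive)
    return groups.get(key, set())

# Hypothetical 2-anonymous release.
released = [
    {"age": "20-29", "zip": "454**", "disease": "flu"},
    {"age": "20-29", "zip": "454**", "disease": "flu"},
    {"age": "30-39", "zip": "453**", "disease": "flu"},
    {"age": "30-39", "zip": "453**", "disease": "asthma"},
]
qi = ["age", "zip"]

# The adversary knows the target is in their twenties and lives in a 454xx ZIP code.
leaked = infer(released, qi, "disease", {"age": "20-29", "zip": "454**"})
uncertain = infer(released, qi, "disease", {"age": "30-39", "zip": "453**"})
print(leaked)     # {'flu'}: a single candidate, so the value is disclosed
print(uncertain)  # two candidates remain for the second group
```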
Cryptographic approaches • Using the following cryptographic primitives • Secure multiparty computation (SMC) • Yao’s millionaires’ problem • Alice wants to know whether she has more money than Bob • Neither Alice nor Bob learns how much money the other has; Alice learns only the result • Oblivious transfer • Bob holds n items. Alice wants to learn the i-th item. • Bob cannot learn i – Alice’s privacy • Alice learns nothing except the i-th item • Homomorphic encryption • Allows computation on encrypted data • E.g., in an additively homomorphic scheme, E(X)*E(Y) = E(X+Y)
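The identity E(X)*E(Y) = E(X+Y) can be seen in a toy version of the Paillier cryptosystem, used here as one example of an additively homomorphic scheme (the slides do not name a specific scheme). The primes are far too small to be secure, the seeded generator stands in for a cryptographically secure one, and the function names are illustrative. The modular inverse via `pow(x, -1, n)` assumes Python 3.8 or later.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng):
    """E(m) = g^m * r^n mod n^2 with g = n + 1 and random r coprime to n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

rng = random.Random(7)  # a real system would need a cryptographically secure source
n, priv = keygen(1009, 1013)
cx = encrypt(n, 1234, rng)
cy = encrypt(n, 4321, rng)

# Multiplying ciphertexts adds the underlying plaintexts.
c_sum = (cx * cy) % (n * n)
print(decrypt(n, priv, c_sum))  # 5555
```

Whoever multiplies the ciphertexts never sees 1234 or 4321; only the key holder can decrypt the sum.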
Characteristics: • Pro: strong privacy guarantees (parties learn nothing beyond the agreed output) • Con: expensive, limited number of parties • Applications: for distributed datasets (the collaboration model) • Protocols for data mining algorithms • Statistical analysis (matrix, vector computation) • Often discussed in two-party (or a small number of parties) scenarios.
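For the distributed setting, a minimal sketch of a secure sum based on additive secret sharing shows how several parties can compute a statistic without revealing their inputs. This is one common textbook construction, not a protocol taken from the slides; it assumes semi-honest parties, private pairwise channels, and a public modulus larger than any possible sum. All names and values are illustrative.

```python
import random

MODULUS = 2 ** 61 - 1  # assumed public modulus, larger than any possible sum

def share(secret, n_parties, rng):
    """Split a secret into n random shares that add up to it modulo MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs, rng):
    """Simulate all parties in one process; only partial sums are ever published."""
    n = len(private_inputs)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(x, n, rng) for x in private_inputs]
    # Party j adds the shares it received and publishes only this partial sum.
    partial = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial) % MODULUS

rng = random.Random(42)  # a real protocol would need a cryptographically secure source
salaries = [52000, 61000, 48000]  # hypothetical private inputs, one per party
total = secure_sum(salaries, rng)
print(total)  # 161000
```

Each share on its own is uniformly random, so no single message reveals anything about an input. The number of messages grows quadratically with the number of parties, which hints at the cost noted above.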
Privacy-preserving data mining • Purpose • Mining the models without leaking the information about individual records • Topics • Basic statistics (mean, variance, etc.) • Data classification • Data clustering • Association rule mining • Privacy of mined models
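As a sketch of estimating a basic statistic without trusting any individual answer, the example below uses a randomized-response variant: each respondent answers truthfully with probability `p_truth` and otherwise reports a fair coin flip. The probability, the population size, and the function names are assumptions made for illustration. The analyst recovers the population proportion by inverting the known randomization.

```python
import random

def randomized_response(truth, p_truth, rng):
    """Report the true answer with probability p_truth, else a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5

def estimate_proportion(responses, p_truth):
    """Invert observed = p_truth * pi + (1 - p_truth) * 0.5 to recover pi."""
    observed = sum(responses) / len(responses)
    return (observed - (1 - p_truth) * 0.5) / p_truth

rng = random.Random(1)  # fixed seed so the sketch is repeatable
p_truth = 0.75
true_answers = [rng.random() < 0.3 for _ in range(50000)]  # hypothetical population
responses = [randomized_response(t, p_truth, rng) for t in true_answers]

actual = sum(true_answers) / len(true_answers)
estimate = estimate_proportion(responses, p_truth)
print(actual, estimate)
```

Any single response is deniable, yet the aggregate estimate stays close to the true proportion when the sample is large.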
Privacy preserving database applications [Du&Atallah2000] • Statistical databases • Private information retrieval • Outsourced databases
Social Network Privacy • Publishing social network structure • Attacks can be applied to reveal the mapping [163,167] • Characteristics of subgraph • Adversarial background knowledge • Anonymization is a popular method
Social network privacy • Privacy settings of SN • Help users set/tune privacy settings • Understand the relationship between privacy and functionalities of SN • They are a pair of conflicting factors
Privacy in Mobile computing • Preserving location privacy • User-defined or system-supplied privacy policies [Bamba&Liu2008, Beresford&Stajano2003] • Extending k-anonymity techniques to location cloaking [Gedik&Liu2008, Gruteser&Grunwald2002] • Pseudonymity of user identities – frequently changing internal ID [Beresford&Stajano2003]
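The idea of extending k-anonymity to location cloaking can be sketched with a simple quadtree subdivision: the reported region is shrunk quadrant by quadrant as long as it still contains at least k users. This is a simplified illustration in the spirit of quadtree-based cloaking, not the exact algorithm of any of the cited papers; the coordinates, the unit-square map, and the function name are assumptions.

```python
def cloak(user, others, k, region=(0.0, 0.0, 1.0, 1.0), max_depth=16):
    """Return the smallest quadtree cell (x0, y0, x1, y1) that contains the user
    and at least k users in total, or None if even the full region has fewer."""
    points = [user] + list(others)

    def inside(p, r):
        # Half-open cells, so every point belongs to exactly one quadrant.
        return r[0] <= p[0] < r[2] and r[1] <= p[1] < r[3]

    if not inside(user, region) or sum(inside(p, region) for p in points) < k:
        return None
    for _ in range(max_depth):
        x0, y0, x1, y1 = region
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quadrants = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                     (x0, ym, xm, y1), (xm, ym, x1, y1)]
        child = next(q for q in quadrants if inside(user, q))
        if sum(inside(p, child) for p in points) < k:
            break  # the next subdivision would violate k-anonymity
        region = child
    return region

me = (0.1, 0.1)
nearby = [(0.2, 0.2), (0.3, 0.4), (0.8, 0.9), (0.6, 0.1)]  # hypothetical users
print(cloak(me, nearby, 3))  # (0.0, 0.0, 0.5, 0.5)
print(cloak(me, nearby, 2))  # (0.0, 0.0, 0.25, 0.25)
```

A larger k yields a larger reported region, trading location accuracy for anonymity.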