Location Privacy Protection based on Differential Privacy Strategy for Big Data in Industrial Internet-of-Things
Published in: IEEE Transactions on Industrial Informatics
Henrique Potter
Overview • Privacy risks in IoT • Privacy protection techniques • k-anonymity • Differential Privacy • How to protect privacy
Privacy risks in IoT • Unauthorized access to private data • Data stored in remote storage • Personal devices • Inferring information from device/user profiling, messaging patterns, and public data • Statistical and machine-learning techniques
Privacy risks in IoT • Privacy leaks: the Netflix Prize competition • Released 100M ratings from 480K users over 18K movies • Claimed to have anonymized the data • 96% of users could be uniquely identified by cross-referencing the data with IMDb data (Narayanan & Shmatikov, 2006)
Privacy risks in IoT • How to protect privacy against both threats: • Unauthorized access to private data • Inference of information from device/user profiling, messaging patterns, and public data
Differential Privacy • Developed by Cynthia Dwork in 2006 • A formal definition of privacy • Offers a framework for developing privacy solutions • Constrained to aggregate data analysis: averages, profiling techniques, machine-learning models, etc. • Assumes that the attacker has maximum auxiliary information about the target
Differential Privacy - Scenario Example • A database is used to compute the average income of residents • If you knew that Bob is about to move away • You could execute the algorithm A to compute the average before and after he moves • D = database state with Bob's record • D′ = database state without Bob's record
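A minimal sketch of this differencing attack (my own illustration; the incomes are made up), showing why exact aggregate answers leak an individual's value:

```python
# Differencing attack: recover Bob's income from two exact
# average queries, issued before and after his record leaves.
incomes_with_bob = [52_000, 61_000, 48_000, 75_000]   # D (Bob is the last record)
incomes_without_bob = incomes_with_bob[:-1]           # D'

def avg(db):
    return sum(db) / len(db)

# The attacker only sees the two aggregate answers...
a_d = avg(incomes_with_bob)
a_d_prime = avg(incomes_without_bob)

# ...yet reconstructs Bob's income exactly from them:
bob_income = a_d * len(incomes_with_bob) - a_d_prime * len(incomes_without_bob)
print(bob_income)  # 75000.0
```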
Differential Privacy • Adds random noise to the answer of A • Makes the database D indistinguishable from D′ by a factor of ε
Differential Privacy • For D and D′ that differ in at most one element (sample) • A random noise n (illustrated in the slides with a uniform distribution) is added to the true answer y: A(D) = y + n • Over every range of outputs S, the ratio between the output probabilities of A(D) and A(D′) is bounded: Pr[A(D) ∈ S] ≤ e^ε · Pr[A(D′) ∈ S] • An algorithm A satisfying this bound is ε-differentially private
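A short sketch (my own check, not from the paper) of this bound for a mechanism that adds Laplace noise of scale Δf/ε: the density ratio at any output y stays within e^ε. The query answers and sensitivity are assumed values:

```python
import math

eps = 0.5
sensitivity = 1.0            # assumed Δf of the query
scale = sensitivity / eps    # Laplace scale λ = Δf/ε

def laplace_pdf(y, mu, b):
    # Density of Laplace(mu, b) at point y.
    return math.exp(-abs(y - mu) / b) / (2 * b)

f_d, f_d_prime = 10.0, 11.0  # true answers on D and D' (differ by ≤ Δf)

for y in [8.0, 10.0, 10.5, 14.0]:
    ratio = laplace_pdf(y, f_d, scale) / laplace_pdf(y, f_d_prime, scale)
    print(f"y={y}: ratio={ratio:.3f}  (bound e^eps={math.exp(eps):.3f})")
    assert ratio <= math.exp(eps) + 1e-9  # the ε-DP guarantee holds
```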
Differential Privacy • For all choices of D, D′, and S an attacker can make • The attacker can't tell the difference between D and D′ (ε-differential privacy) • What happens as ε gets smaller? The aggregate information becomes less reliable, but you have more privacy • What happens as ε gets bigger? The aggregate information becomes more reliable, but you have less privacy
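To make the trade-off concrete, a small sketch (values assumed for illustration) comparing noisy answers to the same average query across ε values:

```python
import numpy as np

rng = np.random.default_rng(0)
true_avg = 54_000.0      # assumed true average income
sensitivity = 100.0      # assumed Δf of the average query

# Smaller ε -> larger Laplace scale -> noisier answers -> more privacy.
for eps in [0.01, 0.1, 1.0, 10.0]:
    scale = sensitivity / eps
    answers = true_avg + rng.laplace(0.0, scale, size=3)
    print(f"eps={eps:<5}: {np.round(answers, 1)}")
```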
Differential Privacy • How to choose an acceptable ε? • It depends on the application • The baseline depends on the sensitivity of the function
Differential Privacy - Sensitivity • Sensitivity captures the maximum variation in the output of A when the single record with the most "impact" differs between D and D′
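As a worked example (my addition, assuming incomes are clipped to a bound B): for the average-income query over n records, changing one record moves the sum by at most B, so

```latex
% Sensitivity of the average-income query over n records,
% assuming each income lies in the range [0, B]:
\Delta f \;=\; \max_{D,\,D'} \bigl| f(D) - f(D') \bigr| \;=\; \frac{B}{n}
```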
Differential Privacy - Theorem • If you add random Laplacian noise with "width" λ = Δf/ε to a function f(D), "it will enjoy ε-differential privacy" • Add the random noise to the true answer y: A(D) = y + Lap(Δf/ε)
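A minimal sketch of this Laplace mechanism in Python (numpy's Laplace sampler stands in for Lap(λ); the query, bound B, and database are assumptions for illustration):

```python
import numpy as np

def laplace_mechanism(db, query, sensitivity, eps, rng=None):
    """Return query(db) + Lap(sensitivity/eps) noise (ε-DP per the theorem)."""
    rng = rng or np.random.default_rng()
    y = query(db)                # true answer
    lam = sensitivity / eps      # Laplace "width" λ = Δf/ε
    return y + rng.laplace(0.0, lam)

# Example: average income with values clipped to [0, B], so Δf = B/n.
B = 200_000
db = [52_000, 61_000, 48_000, 75_000]
avg = lambda d: sum(min(max(x, 0), B) for x in d) / len(d)
print(laplace_mechanism(db, avg, sensitivity=B / len(db), eps=1.0))
```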
Differential Privacy - Mechanisms • Laplacian Mechanism • Adds Laplacian noise scaled to the sensitivity • Exponential Mechanism • Randomly selects elements to participate in the aggregate analysis
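A sketch of an exponential mechanism (my illustration; the candidates and quality score are assumed), which samples a candidate with probability proportional to exp(ε·score/(2·Δscore)):

```python
import math, random

def exponential_mechanism(candidates, score, eps, sensitivity, rng=random):
    """Sample a candidate with probability ∝ exp(eps * score / (2 * sensitivity))."""
    weights = [math.exp(eps * score(c) / (2 * sensitivity)) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Example: privately pick the "most visited" location label.
visits = {"cafe": 130, "office": 410, "gym": 55}
pick = exponential_mechanism(list(visits), lambda c: visits[c],
                             eps=0.5, sensitivity=1)
print(pick)  # usually "office", but not always
```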
LPT-DP-K Algorithm • Designed for location data • Adds noise proportional to how frequently each location is visited • Noise can't be added to every record, since the records define the actual position of something
[Figure: Location privacy tree — each node stores a number, location information, and an access count]
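A minimal sketch of such a tree node (field names assumed from the figure labels; not the paper's actual implementation):

```python
from dataclasses import dataclass, field

@dataclass
class LPTNode:
    """One node of the location privacy tree (fields per the figure)."""
    number: int                  # node identifier
    location: str                # location information
    access_count: int = 0        # how often this location was visited
    children: list["LPTNode"] = field(default_factory=list)

root = LPTNode(0, "city")
root.children.append(LPTNode(1, "district-A", access_count=130))
root.children.append(LPTNode(2, "district-B", access_count=410))
```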
Weighted Selection • Select K records at random, weighted by their access frequency
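A sketch of this weighted selection (my own, assuming the records carry access counts from the LPT):

```python
import random

def weighted_select_k(records, counts, k, rng=random):
    """Pick k distinct records; selection probability follows access frequency."""
    pool, weights = list(records), list(counts)
    chosen = []
    for _ in range(min(k, len(pool))):
        i = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(i))  # remove so each record is picked once
        weights.pop(i)
    return chosen

locations = ["cafe", "office", "gym", "park"]
visits = [130, 410, 55, 20]
print(weighted_select_k(locations, visits, k=2))
```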
Noise Enhancement based on Laplace • Adds Laplacian noise to the K selected records: y′ = y + n, with n the random Laplacian noise
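Putting the last two steps together, a sketch (parameters assumed; a simplified stand-in for the paper's algorithm) that perturbs only the K selected access counts:

```python
import numpy as np

def perturb_selected(counts, selected_idx, sensitivity, eps, rng=None):
    """Add Lap(sensitivity/eps) noise to the selected access counts only."""
    rng = rng or np.random.default_rng()
    noisy = list(counts)
    for i in selected_idx:
        noisy[i] = counts[i] + rng.laplace(0.0, sensitivity / eps)
    return noisy

visits = [130, 410, 55, 20]
print(perturb_selected(visits, selected_idx=[0, 1], sensitivity=1.0, eps=0.5))
```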