Probabilistic Privacy Analysis of Published Views
Hui (Wendy) Wang, Laks V.S. Lakshmanan
University of British Columbia, Vancouver, Canada
Probabilistic Privacy Analysis of Published Views, WPES'06
Motivation
• Publishing relational data containing personal information
• Sensitive information: private associations
  • E.g., Bill has AIDS
• The published data
  • Usage: data analysis
    • E.g., find out at which ages people are more likely to have heart disease
  • Privacy concern: hide the private associations
Protection Approach 1
• Generalization of the base table (k-anonymity, e.g., [Bayardo05], [LeFevre05])
• The generalized data is USELESS for data analysis!
  • Revisit the example: at which ages are people more likely to have heart disease?
Protection Approach 2
• Publishing views (e.g., [Yao05], [Deutch05], [Miklau04])
[Tables: V1, V2]
• Private associations may be revealed
  • E.g., joining V1 and V2 gives Prob(“Bill”, “AIDS”) = 1
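The join attack above can be sketched in a few lines of Python. The contents of V1 and V2 are not recoverable from the slide, so the tables, the column names (Name, Age, Disease), and the helper `candidate_values` below are hypothetical. When the join leaves a single candidate value for a person, the association is certain, which is the Prob = 1 case.

```python
# Hypothetical view tables; the slide's actual V1 and V2 are not recoverable.
V1 = [("Bill", 30), ("Alice", 45), ("Carol", 45)]        # assumed columns: (Name, Age)
V2 = [(30, "AIDS"), (45, "Flu"), (45, "Heart disease")]  # assumed columns: (Age, Disease)

def candidate_values(v1, v2, name):
    """Join V1 and V2 on Age and return the diseases that may belong to `name`."""
    ages = {age for n, age in v1 if n == name}
    return {disease for age, disease in v2 if age in ages}

bill = candidate_values(V1, V2, "Bill")    # a single candidate: the association is certain
alice = candidate_values(V1, V2, "Alice")  # two candidates: the association stays ambiguous
```

In this toy data Bill is the only person aged 30, so the join pins down his disease, while Alice shares her age with Carol and remains ambiguous.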
Problem Set
Given a view scheme, what is the probability that it leaks a private association?
Our Contributions
• Define two attack models
• Propose the connectivity graph as a synopsis of the database
• Based on the connectivity graph, derive the probability of information leakage for each attack model
Security Model
• Private association
  • Form: (ID=i, P=p)
  • E.g., (Name=“Bill”, Disease=“HIV”)
  • Can be expressed in SQL
• Assumption
  • For every private association, every ID value is associated with exactly one p value in the base table
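As a rough illustration of the two points above, the sketch below uses Python's built-in `sqlite3` module to express a private association as an SQL query and to check the uniqueness assumption. The table `T`, its columns, its rows, and the helper names `association_holds` and `assumption_violations` are all hypothetical; the slides do not give the actual schema or queries.

```python
import sqlite3

# Hypothetical base table; schema and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Name TEXT, Age INTEGER, Disease TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [("Bill", 30, "HIV"), ("Alice", 45, "Flu"), ("Carol", 45, "Heart disease")],
)

def association_holds(conn, name, disease):
    """Return True if the association (Name=name, Disease=disease) appears in T."""
    row = conn.execute(
        "SELECT COUNT(*) FROM T WHERE Name = ? AND Disease = ?", (name, disease)
    ).fetchone()
    return row[0] > 0

def assumption_violations(conn):
    """Return the Name values associated with more than one Disease value."""
    rows = conn.execute(
        "SELECT Name FROM T GROUP BY Name HAVING COUNT(DISTINCT Disease) > 1"
    ).fetchall()
    return [r[0] for r in rows]
```

An empty result from `assumption_violations` means every ID value maps to exactly one p value, which is the assumption the restricted attack model relies on.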
Attack Model 1: Unrestricted Model
• The attacker has no background knowledge
• The attacker has access to the view definitions and the view tables
• The attack approach
  • Construct the candidate base tables
  • Pick the ones that contain the private association
Example of Unrestricted Model
• Attacker knows: V1 = π_{A,B}(T), V2 = π_{B,C}(T)
• Attacker constructs the possible worlds (candidate base tables T): there are 7 such possible worlds
• For (A=a1, C=c1), the attacker picks the possible worlds that contain the association: there are 5 such interesting worlds
• Prob. of privacy breach of (A=a1, C=c1): 5/7
[Figure: base table T and possible worlds #1 (√), #2 (X), #3 (√), ...]
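The 5/7 figure can be reproduced by brute force. The sketch below assumes the view contents V1 = {(a1, b1), (a2, b1)} and V2 = {(b1, c1), (b1, c2)}, set semantics, and a uniform distribution over possible worlds; the function names are illustrative and this is not the closed-form formula from the paper. It relies on the fact that any base table consistent with both projections must be a subset of the join of V1 and V2.

```python
from fractions import Fraction
from itertools import combinations

# Assumed view contents (V1 over A,B and V2 over B,C).
V1 = {("a1", "b1"), ("a2", "b1")}
V2 = {("b1", "c1"), ("b1", "c2")}

def possible_worlds(v1, v2):
    """Enumerate every base table T whose projections equal v1 and v2."""
    join = sorted((a, b, c) for (a, b) in v1 for (b2, c) in v2 if b == b2)
    worlds = []
    for k in range(1, len(join) + 1):
        for subset in combinations(join, k):
            proj_ab = {(a, b) for a, b, _ in subset}
            proj_bc = {(b, c) for _, b, c in subset}
            if proj_ab == v1 and proj_bc == v2:
                worlds.append(set(subset))
    return worlds

def breach_probability(worlds, a, c):
    """Fraction of possible worlds that contain the association (A=a, C=c)."""
    hits = [w for w in worlds if any(t[0] == a and t[2] == c for t in w)]
    return Fraction(len(hits), len(worlds))

worlds = possible_worlds(V1, V2)
p = breach_probability(worlds, "a1", "c1")
print(len(worlds), p)
```

Exhaustive enumeration is exponential in the size of the join, so it is only useful for checking small examples like this one.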
Attack Model 2: Restricted Model
• The attacker knows the assumption that, for every private association, every ID value is associated with exactly one p value in the base table
• Similar approach
  • Construct the candidate base tables that meet the assumption
  • Pick the ones that contain the private association
Example of Restricted Model
• Attacker knows: V1 = π_{A,B}(T), V2 = π_{B,C}(T)
• For (A=a1, C=c1), the attacker constructs the possible worlds that meet the assumption
• The attacker picks the possible worlds that contain the association
• Prob. of privacy breach of (A=a1, C=c1): 1/2
[Figure: base table T and possible worlds #1 to #4, two marked √ and two marked X]
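A brute-force sketch of the restricted model follows, again assuming V1 = {(a1, b1), (a2, b1)}, V2 = {(b1, c1), (b1, c2)}, set semantics, and a uniform distribution over the surviving worlds. The helper names are illustrative. Note that this enumeration only reproduces the 1/2 probability; the number of worlds it lists need not match the four worlds drawn in the slide's figure, which cannot be recovered from this transcript.

```python
from fractions import Fraction
from itertools import combinations

# Assumed view contents (V1 over A,B and V2 over B,C).
V1 = {("a1", "b1"), ("a2", "b1")}
V2 = {("b1", "c1"), ("b1", "c2")}

def satisfies_assumption(world):
    """True if every A value is associated with exactly one C value."""
    seen = {}
    for a, _, c in world:
        if seen.setdefault(a, c) != c:
            return False
    return True

def restricted_worlds(v1, v2):
    """Enumerate base tables that match both views and meet the assumption."""
    join = sorted((a, b, c) for (a, b) in v1 for (b2, c) in v2 if b == b2)
    worlds = []
    for k in range(1, len(join) + 1):
        for subset in combinations(join, k):
            matches = ({(a, b) for a, b, _ in subset} == v1
                       and {(b, c) for _, b, c in subset} == v2)
            if matches and satisfies_assumption(subset):
                worlds.append(set(subset))
    return worlds

worlds = restricted_worlds(V1, V2)
hits = [w for w in worlds if ("a1", "b1", "c1") in w]
p = Fraction(len(hits), len(worlds))
print(p)
```

Compared with the unrestricted model, the extra knowledge removes every world in which a1 or a2 is paired with both c1 and c2, which changes the breach probability from 5/7 to 1/2 in this example.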
A Further Step
• Question
  • Given a view scheme and two view tables, how can the probability be calculated efficiently?
• Our contributions
  • For each attack model, we derived formulas to quantify the probability
  • Details can be found in the paper
Conclusion
• We defined a general framework to measure the likelihood of privacy breach
• We proposed two attack models
• For each model, we derived formulas to calculate the probability of privacy breach
Future Work
• Find an appropriate approximation for the breach-probability formulas
• Extend the work to the k-view-table case, where k > 2
Q & A
More Slides
Example of Probability Calculation
• View scheme: V1 = π_{A,B}(T), V2 = π_{B,C}(T)
• Unrestricted model vs. unrestricted cover
  • Example of unrestricted cover
• Restricted model vs. restricted cover
  • Example of restricted cover
[Figure: base table T and its connectivity graph, with nodes 1 = (a1, b1), 2 = (a2, b1), 3 = (b1, c1), 4 = (b1, c2)]
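One plausible reading of the connectivity graph is sketched below: each view tuple is a node, and a V1 tuple is linked to a V2 tuple when they agree on B. Under that reading, an "unrestricted cover" would be a set of edges touching every node, that is, an edge cover. Both the graph definition and this interpretation of "cover" are assumptions made for illustration; the paper's definitions may differ.

```python
from itertools import combinations

# View tuples from the example: nodes 1, 2 come from V1 and nodes 3, 4 from V2.
V1 = [("a1", "b1"), ("a2", "b1")]
V2 = [("b1", "c1"), ("b1", "c2")]

def connectivity_graph(v1, v2):
    """Link a V1 tuple to a V2 tuple when they share the same B value."""
    return [(("V1", t1), ("V2", t2)) for t1 in v1 for t2 in v2 if t1[1] == t2[0]]

def count_edge_covers(v1, v2):
    """Count edge subsets that touch every node (assumed meaning of 'cover')."""
    edges = connectivity_graph(v1, v2)
    nodes = {("V1", t) for t in v1} | {("V2", t) for t in v2}
    count = 0
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            if {node for edge in subset for node in edge} == nodes:
                count += 1
    return count

edges = connectivity_graph(V1, V2)
covers = count_edge_covers(V1, V2)
print(len(edges), covers)
```

Each edge corresponds to one joined tuple (a, b, c), so under this reading the number of covers for the example equals the number of possible worlds in the unrestricted model.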