The Database Tracker Problem
EECS710: Information Security and Assurance
Professor H Saiedian
From: Denning et al., “The Tracker: A Threat to Statistical Database Security,” ACM TODBS, 1978

A statistical database
• Queries are built from a characteristic formula C
  • A logical formula over attribute values, with operators AND, OR, NOT (~)
• Common queries
  • count (C): number of records satisfying C
  • sum (C; j): sum of field j over the records satisfying C
• Examples
  • count (M AND CS) = 3, short for count (Sex=‘M’ AND Dept=‘CS’)
  • sum (M OR ~CS; Salary) = $176K
  • sum (Salary <= 15K; Contributions) = $180
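
A minimal sketch of how count (C) and sum (C; j) might be evaluated, assuming a characteristic formula is modeled as a predicate over one record. The records, field names, and the helper names `AND`, `OR`, `NOT`, `count`, and `total` are all hypothetical; the data is a made-up toy table, not the one behind the numbers on the slide.

```python
# Hypothetical toy records (NOT the table used in the slides or the paper).
RECORDS = [
    {"sex": "M", "dept": "CS",   "salary": 20},
    {"sex": "M", "dept": "CS",   "salary": 30},
    {"sex": "M", "dept": "EE",   "salary": 25},
    {"sex": "F", "dept": "CS",   "salary": 15},
    {"sex": "F", "dept": "EE",   "salary": 22},
    {"sex": "F", "dept": "Math", "salary": 18},
]

# A characteristic formula is modeled as a predicate on a single record.
def AND(a, b):
    return lambda r: a(r) and b(r)

def OR(a, b):
    return lambda r: a(r) or b(r)

def NOT(a):
    return lambda r: not a(r)

def count(c):
    """count(C): number of records satisfying C."""
    return sum(1 for r in RECORDS if c(r))

def total(c, field):
    """sum(C; j): sum of field j over the records satisfying C."""
    return sum(r[field] for r in RECORDS if c(r))

M = lambda r: r["sex"] == "M"
CS = lambda r: r["dept"] == "CS"

males_in_cs = count(AND(M, CS))                     # count(M AND CS)
pay_m_or_not_cs = total(OR(M, NOT(CS)), "salary")   # sum(M OR ~CS; Salary)
print(males_in_cs, pay_m_or_not_cs)
```

Composing predicates with small combinators keeps the query syntax close to the slide notation; a real statistical database would of course evaluate such formulas inside its query engine.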

Compromise
• Occurs when confidential information is deduced from query responses
  • Positive: deduce a value
  • Negative: learn that a value is not in a given field (e.g., Baker did not contribute $200)
• Secure: no compromise is possible
• Example: a person knows that Dodd is a female CS professor
  • count (F AND CS AND Prof) = 1, so the formula identifies Dodd uniquely
  • count (F AND CS AND Prof AND Salary <= 15K) = 1, so Dodd’s salary is <= $15K (positive compromise)
  • Had the count been 0, Dodd’s salary would not be <= $15K (negative compromise)
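
Setting a lower bound?
• Refusing queries whose query set is smaller than a lower bound helps, but not always
• We know count (~C) = n – count (C), so a small query set can be reached through its complement
• Ask a tautology to obtain n
  • count (Prof OR ~Prof) = 12
  • count (~(F AND CS AND Prof)) = 11
  • 12 – 11 = 1 female CS professor
• The same works for sums
  • sum (Prof OR ~Prof; Salary) = $194K
  • sum (~(F AND CS AND Prof); Salary) = $179K
  • Dodd’s salary = $194K – $179K = $15K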

Need an upper bound also
• Respond to query (C) if k <= count (C) <= n – k; reject otherwise
• Note: k <= n/2 (otherwise all queries will be unanswerable)
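
The effect of the two bounds can be sketched as follows. This is an illustration under stated assumptions: the eight records are a hypothetical toy table (not the slides' 12-record database), and `answer_count` with its `use_upper_bound` flag is an invented helper that returns `None` for a rejected query.

```python
# Hypothetical toy records (NOT the table used in the slides or the paper).
RECORDS = [
    {"sex": "F", "dept": "CS",   "pos": "Prof"},
    {"sex": "F", "dept": "CS",   "pos": "Student"},
    {"sex": "F", "dept": "EE",   "pos": "Prof"},
    {"sex": "F", "dept": "Math", "pos": "Admin"},
    {"sex": "M", "dept": "CS",   "pos": "Prof"},
    {"sex": "M", "dept": "CS",   "pos": "Student"},
    {"sex": "M", "dept": "EE",   "pos": "Prof"},
    {"sex": "M", "dept": "Math", "pos": "Admin"},
]
N = len(RECORDS)
K = 2

def answer_count(c, use_upper_bound):
    """Return count(C), or None if the query-set size is out of bounds."""
    size = sum(1 for r in RECORDS if c(r))
    too_small = size < K
    too_large = use_upper_bound and size > N - K
    return None if (too_small or too_large) else size

target = lambda r: r["sex"] == "F" and r["dept"] == "CS" and r["pos"] == "Prof"
not_target = lambda r: not target(r)
everyone = lambda r: True  # a tautology, e.g. Prof OR ~Prof

# Lower bound only: the direct query is rejected, but the complement leaks it.
direct = answer_count(target, use_upper_bound=False)
leaked = (answer_count(everyone, use_upper_bound=False)
          - answer_count(not_target, use_upper_bound=False))

# Both bounds: the tautology and the complement are rejected as well.
blocked_all = answer_count(everyone, use_upper_bound=True)
blocked_complement = answer_count(not_target, use_upper_bound=True)

print(direct, leaked, blocked_all, blocked_complement)
```

With only the lower bound, the complement trick recovers a query-set size of 1 even though the direct query is refused; adding the upper bound n – k closes that particular route.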

What value for k?
• If a questioner knows (from external sources) that individual I is uniquely characterized by C, the questioner will try to find out whether I has characteristic α
• Assume k = 2
• Because count (C AND α) <= count (C) = 1 < k, both queries are rejected, so the questioner cannot ask them directly as in the earlier example
• Instead, the questioner may divide C into two parts to calculate count (C AND α)

The database tracker
• How? Divide C into C = C1 AND C2 such that count (C1 AND ~C2) and count (C1) are answerable
• T = C1 AND ~C2 is called a tracker of I
  • It tracks down additional characteristics of I

Calculating the tracker
• count (C) = count (C1) – count (T)
• count (C AND α) = count (T OR (C1 AND α)) – count (T)
• If count (C AND α) = 0: negative compromise (I does not have α)
• If count (C AND α) = count (C): positive compromise (I has α)
• If count (C) = 1, arbitrary statistics about I can be computed from query (C) = query (C1) – query (T)
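
Both tracker identities follow from splitting a query set into disjoint parts. A short derivation, using only the definitions C = C1 AND C2 and T = C1 AND ~C2 from the slides:

```latex
\begin{align*}
C_1 &= (C_1 \wedge C_2) \vee (C_1 \wedge \neg C_2) = C \vee T,
      \qquad C \wedge T = \text{false} \\
\Rightarrow\quad \mathrm{count}(C_1) &= \mathrm{count}(C) + \mathrm{count}(T) \\
\Rightarrow\quad \mathrm{count}(C) &= \mathrm{count}(C_1) - \mathrm{count}(T) \\[6pt]
C_1 \wedge \alpha &= (C \wedge \alpha) \vee (T \wedge \alpha) \\
T \vee (C_1 \wedge \alpha) &= T \vee (C \wedge \alpha),
      \qquad T \wedge (C \wedge \alpha) = \text{false} \\
\Rightarrow\quad \mathrm{count}\bigl(T \vee (C_1 \wedge \alpha)\bigr)
      &= \mathrm{count}(T) + \mathrm{count}(C \wedge \alpha) \\
\Rightarrow\quad \mathrm{count}(C \wedge \alpha)
      &= \mathrm{count}\bigl(T \vee (C_1 \wedge \alpha)\bigr) - \mathrm{count}(T)
\end{align*}
```

The same disjointness argument applies to sum queries, which is why query (C) = query (C1) – query (T) holds for any additive statistic.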

A tracker example
• Suppose k = 2
• With n = 12, query (C) is answerable if 2 <= count (C) <= 10
• Questioner believes C = F AND CS AND Prof identifies Dodd
• Constructs T = C1 AND ~C2, where
  • C1 = F
  • C2 = CS AND Prof

To verify the tracker
• count (F AND CS AND Prof) = count (F) – count (F AND ~(CS AND Prof)) = 5 – 4 = 1
To find Dodd’s salary, apply query (C) = query (C1) – query (T)
• sum (F AND CS AND Prof; Salary) = sum (F; Salary) – sum (F AND ~(CS AND Prof); Salary) = $90K – $75K = $15K

Negative compromise also possible
• count (F AND CS AND Prof AND Salary > $15K) = count ((F AND ~(CS AND Prof)) OR (F AND Salary > $15K)) – count (F AND ~(CS AND Prof)) = 4 – 4 = 0
• So Dodd’s salary is not greater than $15K
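
The whole attack can be sketched end to end against a database that enforces k <= count (C) <= n – k. Everything here is an assumption made for illustration: the eight records are a hypothetical toy table (so the intermediate numbers differ from the 5, 4, $90K, and $75K on the slides), and `Refused`, `count`, and `total` are invented names.

```python
# Hypothetical toy records (NOT the table used in the slides or the paper).
# Salaries are in thousands of dollars.
RECORDS = [
    {"sex": "F", "dept": "CS",   "pos": "Prof",    "salary": 15},
    {"sex": "F", "dept": "CS",   "pos": "Student", "salary": 3},
    {"sex": "F", "dept": "EE",   "pos": "Prof",    "salary": 24},
    {"sex": "F", "dept": "Math", "pos": "Admin",   "salary": 11},
    {"sex": "M", "dept": "CS",   "pos": "Prof",    "salary": 22},
    {"sex": "M", "dept": "CS",   "pos": "Student", "salary": 4},
    {"sex": "M", "dept": "EE",   "pos": "Prof",    "salary": 30},
    {"sex": "M", "dept": "Math", "pos": "Admin",   "salary": 12},
]
N, K = len(RECORDS), 2

class Refused(Exception):
    """Raised when the query-set size is outside [K, N - K]."""

def _query_set(c):
    rows = [r for r in RECORDS if c(r)]
    if not (K <= len(rows) <= N - K):
        raise Refused(len(rows))
    return rows

def count(c):
    return len(_query_set(c))

def total(c, field):
    return sum(r[field] for r in _query_set(c))

# C = C1 AND C2 identifies one person; T = C1 AND ~C2 is the tracker.
c1 = lambda r: r["sex"] == "F"
c2 = lambda r: r["dept"] == "CS" and r["pos"] == "Prof"
c = lambda r: c1(r) and c2(r)
tracker = lambda r: c1(r) and not c2(r)
alpha = lambda r: r["salary"] > 15

# The direct query is rejected because its query set has size 1 < K.
try:
    count(c)
    direct_refused = False
except Refused:
    direct_refused = True

# count(C) = count(C1) - count(T)
count_c = count(c1) - count(tracker)
# sum(C; Salary) = sum(C1; Salary) - sum(T; Salary)
salary = total(c1, "salary") - total(tracker, "salary")
# count(C AND alpha) = count(T OR (C1 AND alpha)) - count(T)
count_c_alpha = (count(lambda r: tracker(r) or (c1(r) and alpha(r)))
                 - count(tracker))

print(direct_refused, count_c, salary, count_c_alpha)
```

Every query actually sent to the database has a query set of size 3 or 4, so none is rejected, yet the differences reveal the target's exact salary and the negative fact that it does not exceed 15.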