Privacy Issues in Disclosing Averages Susmit Sarkar (CMU)
Non-Interference • Non-Interference : Observable actions of programs are not influenced by sensitive data • Too restrictive in practice! • Think of password security
Safe Relaxation of Non-Interference • Passwords are sensitive data • Checking passwords violates non-interference • This is still okay [Volpano] if passwords are chosen randomly • The interaction is carefully controlled
Generalizing to Averages • Idea: restrict access to allow us to answer interesting queries • Also, we can measure information loss • We want to calculate averages on private data • Generalize the notion of averages
Content Host’s Problem • A content host serves multiple content providers • The number of hits each provider receives is sensitive information • Clients often ask for the average hits over a specified set of clients
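Example: Sports Site • You want to know how the redesign of your sports portal worked • Complication: it happens to be Super Bowl Sunday, so raw hit counts alone are misleading • We want the average over all sports sites for comparison • What if there are only 2 sports sites? Then the average, together with your own hits, reveals the other site's hits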
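Formal Model • Data: private values d1, d2, ..., dn • Query: a 0/1 vector selecting the entries to sum, e.g. (1 0 1 0 1) asks d1 + d3 + d5 = ? • Problem: what about the queries (1 0 1 1 0) and (1 0 1 1 1)? Their difference is (0 0 0 0 1), which isolates d5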
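A minimal sketch of the problem on this slide, assuming hypothetical hit counts for d1 to d5 (the values and the helper name `answer` are illustrative, not from the talk). It shows that two individually harmless sum queries combine to reveal a single private entry.

```python
# Hypothetical private hit counts d1..d5 (illustrative values only).
data = [120, 45, 300, 80, 57]


def answer(query, values):
    """Answer a 0/1 sum query: the dot product of the query vector and the data."""
    return sum(q * v for q, v in zip(query, values))


q_a = [1, 0, 1, 1, 0]  # asks d1 + d3 + d4
q_b = [1, 0, 1, 1, 1]  # asks d1 + d3 + d4 + d5

# Subtracting the query vectors gives a unit vector ...
derived = [b - a for a, b in zip(q_a, q_b)]

# ... so subtracting the two answers gives exactly d5.
leaked = answer(q_b, data) - answer(q_a, data)

print(derived, leaked)
```

Each query on its own sums at least three entries, yet the pair discloses d5 exactly.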
Query Model • Solution : Maintain history • Idea : add current query to set, decide if “bad” vectors are derivable • We restrict attention to weighted sums
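The history idea can be sketched as follows. This is a simplified illustration, not the talk's method: it treats only exact unit vectors as "bad" and refuses a query when one becomes derivable as a linear combination of the history plus the new query. The names `rref`, `exposes_single_entry`, and `QueryAuditor` are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over the rationals (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]  # the nonzero rows


def exposes_single_entry(history, query):
    """True if some unit vector lies in the span of history + [query].

    A unit vector is in the row space exactly when it appears as a row
    of the reduced row echelon form.
    """
    return any(
        sum(1 for x in row if x != 0) == 1
        for row in rref(history + [list(query)])
    )


class QueryAuditor:
    """Keeps the history of answered queries and refuses unsafe ones."""

    def __init__(self):
        self.history = []

    def ask(self, query):
        if exposes_single_entry(self.history, query):
            return False  # refuse: a single data entry would be derivable
        self.history.append(list(query))
        return True


auditor = QueryAuditor()
first = auditor.ask([1, 0, 1, 1, 0])   # allowed
second = auditor.ask([1, 0, 1, 1, 1])  # refused: difference isolates d5
```

The check only catches exact disclosure; vectors that are dominated by one entry without being unit vectors need a graded measure.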
Issues Ignored in Model • Answers of queries (Right Hand Sides) • Data values • Extraneous information: correlation between data • Some of these are left to future work
Characterizing Bad Vectors • (0 1 0 0 0 0 0 0 0 0 0) • (1 $10^6$ 1 1 1 1 1 1 1 1 1) • We want a measure that indicates when all entries are of similar magnitude
Idea: Entropy • We use the entropy function: $-\sum_i p_i \lg p_i$ • Normalize entries so that magnitudes sum to one • Then treat the magnitudes as probabilities in the entropy definition • Entropy is low when data is skewed
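A minimal sketch of this measure, assuming base-2 logarithms and the usual convention that zero entries contribute nothing. The function name `magnitude_entropy` and the three test vectors are illustrative assumptions.

```python
import math


def magnitude_entropy(vector):
    """Entropy in bits of the entries' magnitudes, normalized to sum to one."""
    mags = [abs(x) for x in vector]
    total = sum(mags)
    if total == 0:
        raise ValueError("entropy is undefined for the zero vector")
    # Zero entries are skipped: 0 * lg 0 is taken to be 0.
    probs = [m / total for m in mags if m > 0]
    return -sum(p * math.log2(p) for p in probs)


unit = [0, 1] + [0] * 9          # all weight on one entry
skewed = [1, 10**6] + [1] * 9    # one entry dominates the rest
uniform = [1] * 11               # all entries of equal magnitude

for name, vec in [("unit", unit), ("skewed", skewed), ("uniform", uniform)]:
    print(name, magnitude_entropy(vec))
```

The unit and skewed vectors score at or near zero, while the uniform vector attains the maximum for its length, lg 11.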
Formal Problem Statement • $m$ query vectors $Q_i = (q_{i1}, q_{i2}, \ldots, q_{in})$ • Unknown linear combination $U = c_1 Q_1 + c_2 Q_2 + \cdots + c_m Q_m$ • Variables $u_i = \sum_j c_j q_{ji}$ • Variables $u'_i \ge u_i$ and $u'_i \ge -u_i$, so $u'_i \ge |u_i|$
Calculating Entropy • Entropy: $-\sum_i \frac{u'_i}{\sum_j u'_j} \lg \frac{u'_i}{\sum_j u'_j} \ge T$ • Minimize: $\sum_i u'_i$ • Notice that this is a convex program
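The quantities in the problem statement can be evaluated for one candidate coefficient vector as sketched below. This only computes the combination and its entropy for given coefficients; it does not solve the optimization, and how the threshold T is chosen is not covered here. The function names and the example coefficients are assumptions.

```python
import math


def combine(coeffs, queries):
    """Components of the combination: entry i is sum_j c_j * q_ji."""
    n = len(queries[0])
    return [sum(c * q[i] for c, q in zip(coeffs, queries)) for i in range(n)]


def entropy_of_combination(coeffs, queries):
    """Entropy in bits of |u_i| normalized to sum to one."""
    mags = [abs(u) for u in combine(coeffs, queries)]
    total = sum(mags)
    if total == 0:
        raise ValueError("the combination is the zero vector")
    probs = [m / total for m in mags if m > 0]
    return -sum(p * math.log2(p) for p in probs)


queries = [[1, 0, 1, 1, 0], [1, 0, 1, 1, 1]]

revealing = entropy_of_combination([-1, 1], queries)  # U = (0 0 0 0 1)
spread = entropy_of_combination([1, 1], queries)      # U = (2 0 2 2 1)

print(revealing, spread)
```

The first combination has zero entropy because it isolates one entry; the second spreads its weight over four entries and scores much higher.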
Convex Programming • The algorithm of [Vempala] solves convex programs efficiently • It allows us to solve our problem in polynomial time
Future Work • Extend our measure to take into account the Right Hand Sides • Change the model to maximize queries we can answer
Bibliography • [Volpano] “Verifying Secrets and Relative Secrecy”, Volpano and Smith, POPL ’00 • [Vempala] “Solving Convex Programs by Random Walks”, Bertsimas and Vempala, STOC ’02