Hippocratic Databases • Paper by Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu • CS 681 • Presented by Xi Hua • March 1st, Spring 05
Outline • Introduction of Current Database Systems • Concept of Hippocratic Database • Principles of Hippocratic Database • Strawman Design • Problems • Conclusion
Fundamental Properties and Capabilities of Current Databases • Managing persistent data. • Accessing a large amount of data efficiently. In addition, the following capabilities are found universally: • Support for at least one data model. • Support for certain high-level languages. • Transaction management • Access control • Resiliency
Statistical Databases • Goal: providing statistical information without compromising sensitive information about individuals • Two broad classes of techniques • Query restriction • Data perturbation • Common ground with Hippocratic databases: preventing disclosure of private information
Secure Databases • Goal: sensitive information must be transmitted over a secure channel and stored securely. • Comparison with Hippocratic databases: Hippocratic databases benefit from the work on secure databases and draw much of their inspiration from it.
Principles of a Hippocratic Database • Privacy Regulations and Guidelines • OECD Guidelines (Organisation for Economic Co-operation and Development) • Best known • Set out 8 principles for data protection: collection limitation, data quality, purpose specification, use limitation, security safeguards, openness, individual participation and accountability.
Ten Principles Rooted in the privacy regulations and guidelines. • Purpose Specification • Consent • Limited Collection • Limited Use • Limited Disclosure • Limited Retention • Accuracy • Safety • Openness • Compliance
Strawman Design • A Use Scenario: Mississippi, Alice, Bob, Trent, Mallory • Architecture (figure not reproduced)
Strawman Design • Privacy Metadata: for each purpose, and for each piece of information collected for that purpose, define: -external-recipients -retention-period -authorized-users
Strawman Design • Data Collection -Matching Privacy Policy with User Preference -Data Insertion -Data Preprocessing
Strawman Design • Queries -Before Query Execution -During Query Execution -After Query Execution
Strawman Design • Retention: delete data items that have outlived their purpose. If an item was collected for more than one purpose, it is kept for the longest of the corresponding retention periods, e.g. Alice’s information in the order table will be deleted after 1 month, while Bob’s information will be kept for 10 years.
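The longest-retention rule can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the purpose name `recommendations`, the day counts standing in for "1 month" and "10 years", and the helper names `expiry_date` and `is_expired` are all assumptions made for the example.

```python
from datetime import date, timedelta

# Retention period per purpose, in days. The values approximate the
# "1 month" and "10 years" figures; the "recommendations" purpose name
# is a hypothetical stand-in for Bob's additional purpose.
RETENTION_DAYS = {
    "purchase": 30,
    "recommendations": 3650,
}

def expiry_date(collected_on, purposes):
    """Date after which an item has outlived every one of its purposes.

    Raises ValueError if `purposes` is empty, since an item with no
    purpose has no retention period to apply.
    """
    longest = max(RETENTION_DAYS[p] for p in purposes)
    return collected_on + timedelta(days=longest)

def is_expired(collected_on, purposes, today):
    """True when the item should be deleted on `today`."""
    return today > expiry_date(collected_on, purposes)

collected = date(2005, 3, 1)
alice_purposes = {"purchase"}
bob_purposes = {"purchase", "recommendations"}

print(expiry_date(collected, alice_purposes))  # 2005-03-31
print(is_expired(collected, bob_purposes, date(2006, 1, 1)))  # False
```

Taking the maximum over all purposes means a record is never deleted while any purpose still needs it, which is the behavior described for Bob’s record.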
Strawman Design For the purchase purpose: • All the attributes have a retention period of 1 month • The name and shipping-address are given to the delivery company • The name and credit-card-info are given to the credit-card company
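As a rough sketch, the privacy metadata for the purchase purpose could be held in a small lookup structure keyed by (purpose, attribute). The class name `AttributePolicy`, the helper `may_disclose`, the 30-day approximation of 1 month, and the authorized-user names (`shipping`, `charge`) are assumptions for illustration; only the retention period and the external recipients come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttributePolicy:
    """Privacy metadata for one attribute collected for one purpose."""
    retention_days: int
    external_recipients: frozenset = frozenset()
    authorized_users: frozenset = frozenset()  # hypothetical values below

# Keyed by (purpose, attribute). Retention is 1 month (about 30 days)
# for every attribute collected for the purchase purpose.
PRIVACY_METADATA = {
    ("purchase", "name"): AttributePolicy(
        30,
        frozenset({"delivery-company", "credit-card-company"}),
        frozenset({"shipping", "charge"}),
    ),
    ("purchase", "shipping-address"): AttributePolicy(
        30,
        frozenset({"delivery-company"}),
        frozenset({"shipping"}),
    ),
    ("purchase", "credit-card-info"): AttributePolicy(
        30,
        frozenset({"credit-card-company"}),
        frozenset({"charge"}),
    ),
}

def may_disclose(purpose, attribute, recipient):
    """True if the attribute may be given to the external recipient
    for this purpose; unknown (purpose, attribute) pairs are denied."""
    policy = PRIVACY_METADATA.get((purpose, attribute))
    return policy is not None and recipient in policy.external_recipients

print(may_disclose("purchase", "name", "delivery-company"))              # True
print(may_disclose("purchase", "credit-card-info", "delivery-company"))  # False
```

Denying by default for unknown pairs keeps the check conservative: nothing is disclosed unless the metadata explicitly lists the recipient.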
P3P • Platform for Privacy Preferences -Developed by the World Wide Web Consortium -Motivation: enable users to gain more control over their personal information. -Technology: a site’s data-collection practices are encoded in an XML format known as a P3P policy, which can be programmatically compared against a user’s privacy preferences. -Problem: no mechanism for making sure sites act according to their stated policies.
P3P and Hippocratic Databases • Similarity: Hippocratic databases rely on concepts similar to P3P’s purpose and retention. • How to implement in Hippocratic databases? Take P3P policies, process them through the privacy metadata processor, and generate the corresponding data structures in the Hippocratic database system.
Problems • Language -Are P3P formats sufficient for specifying policies and preferences in Hippocratic databases? P3P was developed for web shopping, but Hippocratic databases would be used in many fields, e.g. finance and insurance. Hence, we need to develop a policy specification language, using the work done for P3P as the starting point. -Tradeoff between expressibility and usability
Problems • Efficiency -Cost of privacy checking: techniques for reducing the cost of each check, e.g. encode the set of purposes associated with each record by setting bits in a word. The record access control check then requires a bit-wise AND of two words and a check of the result. -Impact on disk space and the complexity of adding checks: e.g. the strawman design chooses an alternate implementation in which only the records in the customer table are tagged with purpose. When scanning records in the order table, we do a join on customer-id to get the purpose for those records.
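Both ideas on this slide can be sketched together in Python: purposes are encoded as bits in an integer, and only customer records carry the purpose word, so order records obtain it through a lookup on customer-id. The purpose names, the sample rows, and the helpers `permits` and `visible_orders` are hypothetical; a real system would perform the join inside the query engine rather than in application code.

```python
# One bit per purpose (purpose names are illustrative assumptions).
PURCHASE = 1 << 0
REGISTRATION = 1 << 1
RECOMMENDATIONS = 1 << 2

# Only the customer table is tagged with a purpose word.
customer_purposes = {
    1: PURCHASE,                    # e.g. Alice
    2: PURCHASE | RECOMMENDATIONS,  # e.g. Bob
}

# Order records carry no purpose tag, just a customer-id.
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 2},
]

def permits(record_mask, query_purpose):
    """Record-level check: bit-wise AND of two words, then test the result."""
    return (record_mask & query_purpose) != 0

def visible_orders(query_purpose):
    """Scan the order table, joining on customer-id to fetch the purpose word.

    Orders whose customer is unknown get an empty mask and are hidden.
    """
    return [
        order for order in orders
        if permits(customer_purposes.get(order["customer_id"], 0), query_purpose)
    ]

print([o["order_id"] for o in visible_orders(PURCHASE)])         # [100, 101]
print([o["order_id"] for o in visible_orders(RECOMMENDATIONS)])  # [101]
```

The tradeoff shown here matches the slide: tagging only the customer table saves space in the order table, at the cost of one extra lookup per scanned order.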
Problems • Limited Collection -Principle: a query accesses only the data values needed to fulfill its purpose, and the database stores the minimal information necessary to fulfill all the purposes. -Problems • Access analysis • Granularity analysis • Minimal query generation
Problems • Limited Disclosure -Dynamically determining the set of recipients makes limited disclosure a challenge. -Solution: borrow from public/private key technology.
Problems • Limited Retention We can delete a record from a Hippocratic database when there is no longer any purpose associated with it. But how do we delete a record or field from the logs and past checkpoints without affecting recovery?
Problems • Safety -The storage media on which the tables are stored might suffer from attacks. -Solution: encryption of database files on disk or selective encryption of fields might help
Problems • Openness How does the user access the information he needs? How does the database know he is really that user and not someone else?
Problems • Compliance -Universal logging -Tracking Privacy Breaches
Conclusion • Enunciated the key privacy principles that Hippocratic databases should support. • Presented a strawman design for a Hippocratic database. • Identified the technical challenges and problems.