Relational Semantic Hiding Databases (RSHDB): Protecting Data Privacy and Integrity in Clouds • By Jyh-haw Yeh, Computer Science, Boise State University
Cloud Computing • The cloud computing paradigm provides a new model of IT management. • Businesses purchase IT services from clouds • Cost savings • Effectively unlimited computing power • Charged by usage • More secure? • Better resource utilization, thus greener computing
Cloud Computing • Cloud computing also has some known problems • Trust issues • Data privacy and integrity • Non-transparency of data locations • Liability issue
Outsourcing Databases • Database-as-a-service is an emerging service that is starting to appear in the cloud industry. • Clients have the flexibility to design an application as a database that suits their business. • Clients outsource the database to clouds. • Clouds are able to execute queries over the database upon the client's requests. • Clouds (which may not be trusted) have total control of the data. • Data privacy/integrity is a big concern.
Encrypted Databases • An extreme approach to protecting data privacy: • Encrypt the whole database and then outsource the encrypted database to clouds. • This approach works only if a practical fully homomorphic encryption (FHE) algorithm exists. • FHE: arithmetic and relational comparisons can be applied directly to ciphertexts. • No practical and efficient FHE algorithm exists.
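To make "operations applied directly to ciphertexts" concrete, the sketch below uses textbook RSA, which is only partially homomorphic: multiplying two ciphertexts yields a ciphertext of the product, but there is no matching way to add. It is not FHE and not what RSHDB uses; the tiny primes and the helper names `encrypt` and `decrypt` are assumptions chosen for illustration.

```python
# Toy textbook RSA with tiny primes. Insecure; for illustration only.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < n."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext produced by encrypt()."""
    return pow(c, d, n)


a, b = 7, 12
# The untrusted party multiplies ciphertexts without seeing a or b.
c_product = (encrypt(a) * encrypt(b)) % n
# The key holder decrypts and obtains a * b (valid while a * b < n).
assert decrypt(c_product) == a * b
```

A fully homomorphic scheme would need the same property for addition and for comparisons, which is what makes it so much more expensive.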
RSHDB • RSHDB (relational semantic hiding database) is a proposed database system that is able to hide semantics from DBAs. • Suitable for businesses that outsource their applications as an RSHDB instance to clouds. • Enables the DBAs or DBMS in clouds to operate on RSHDB databases without knowing private business information.
RSHDB: Idea of Hiding Semantics • Idea of semantic hiding in RSHDB: • An XYZ company has a PAYROLL database, in which a record in table EMPLOYEE shows that John Smith's SALARY is 63,000. • A ? company has a ? database, in which a record in table ? shows that ?'s ? is 63,000.
RSHDB: Basic Operations • Basic database operations: • Arithmetic: add or multiply numeric data. • Equality test: test the equality of two data items. • Relational comparison: decide whether A > B or A < B. • Substring matching: decide whether a string A is a substring of another string B. • Other database operations (sorting, searching, aggregate functions, set operations) are extensions or combinations of the basic operations.
RSHDB: Data Types • Data types: • NC-type: Numeric with Comparison only. • NCA-type: Numeric with both Comparison and Arithmetic. • SC-type: String with Comparison only. • SCS-type: String with both Comparison and Substring matching.
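As a rough sketch, the four types can be read as a table of which basic operations a semantic-hiding-aware DBMS would permit on a column. The names `Op`, `ALLOWED_OPS`, and `is_allowed` are hypothetical, and grouping the equality test under "comparison" is an assumption, since the slides do not say so explicitly.

```python
from enum import Enum


class Op(Enum):
    COMPARISON = "comparison"   # equality and relational tests (assumed grouping)
    ARITHMETIC = "arithmetic"
    SUBSTRING = "substring"


# Operations each RSHDB data type is meant to support.
ALLOWED_OPS = {
    "NC": {Op.COMPARISON},
    "NCA": {Op.COMPARISON, Op.ARITHMETIC},
    "SC": {Op.COMPARISON},
    "SCS": {Op.COMPARISON, Op.SUBSTRING},
}


def is_allowed(type_name: str, op: Op) -> bool:
    """Return True if the given data type supports the operation."""
    return op in ALLOWED_OPS[type_name]


# A salary column (NCA) can be summed; a birth-date column (NC) cannot.
assert is_allowed("NCA", Op.ARITHMETIC)
assert not is_allowed("NC", Op.ARITHMETIC)
```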
RSHDB: Design Goal • Partially encrypt the database so that the cloud is still able to execute queries over the encrypted data. • Encrypt enough information (but not all) to hide semantics from data operators. • Minimize the impact on the DBMS, SQL, the hosting clouds, and the clients.
RSHDB: Encryption Strategy • Use a secure deterministic encryption for all semantics-revealing information: database, table, and attribute names. • String data is also semantics-revealing: always encrypted. • SC-type: order-preserving encryption (less secure) • SCS-type: • char-by-char order-preserving encryption (less secure). • word-by-word order-preserving encryption.
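The point of a deterministic scheme for names is that the same identifier always maps to the same hidden token, so joins and lookups still line up on the server. The sketch below stands in a keyed HMAC-SHA256 tag for that scheme. This is a swapped-in technique, not the encryption the slides propose: an HMAC is one-way, so the client would have to keep its own token-to-name table to reverse it. The function name `hide_name` and the `N` prefix are hypothetical.

```python
import hashlib
import hmac


def hide_name(key: bytes, name: str) -> str:
    """Map an identifier to a deterministic, keyed token.

    The leading 'N' keeps the token a legal SQL identifier even when
    the hex digest starts with a digit.
    """
    digest = hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return "N" + digest[:16]


key = b"client-secret-key"           # hypothetical key held only by the client
t1 = hide_name(key, "EMPLOYEE")
t2 = hide_name(key, "EMPLOYEE")
assert t1 == t2                       # deterministic: same name, same token
assert t1 != hide_name(key, "DEPARTMENT")
```

Determinism is also the weakness: anyone who sees the tokens learns which names repeat, even without learning what they are.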
RSHDB: Encryption Strategy • Numeric data by itself reveals less semantics. • NC-type: order-preserving encryption. • Example: bdate data • NCA-type: no practical homomorphic encryption is available for this type of data. • Leave the data in the clear • Homomorphic encoding (not much help for security) • Example: salary data
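For NC-type data, order preservation means that x < y implies Enc(x) < Enc(y), so the server can compare and sort ciphertexts directly. The sketch below builds a toy keyed, strictly increasing mapping over a small integer domain by accumulating pseudorandom positive gaps. It is an assumption-laden illustration, not a vetted order-preserving encryption scheme; the class name `ToyOPE`, the use of birth years as the domain, and the gap size are all hypothetical.

```python
import hashlib
import hmac
from bisect import bisect_left


class ToyOPE:
    """Toy order-preserving mapping over the integers lo..hi (inclusive)."""

    def __init__(self, key: bytes, lo: int, hi: int, max_gap: int = 1000):
        self.lo = lo
        self.table = []          # table[i] is the ciphertext of lo + i
        total = 0
        for x in range(lo, hi + 1):
            mac = hmac.new(key, str(x).encode(), hashlib.sha256).digest()
            # Each gap is at least 1, so the table is strictly increasing.
            total += 1 + int.from_bytes(mac[:4], "big") % max_gap
            self.table.append(total)

    def encrypt(self, x: int) -> int:
        return self.table[x - self.lo]

    def decrypt(self, c: int) -> int:
        i = bisect_left(self.table, c)
        if i == len(self.table) or self.table[i] != c:
            raise ValueError("not a valid ciphertext")
        return self.lo + i


ope = ToyOPE(b"client-secret-key", 1900, 2100)
# The server can evaluate "bdate < X" on ciphertexts alone.
assert ope.encrypt(1970) < ope.encrypt(1985)
assert ope.decrypt(ope.encrypt(1985)) == 1985
```

Because order leaks by design, an observer with enough ciphertexts can estimate where a value sits in the domain, which is why the slides mark this family as less secure.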
Impacts • The DBMS: Needs to be semantic-hiding aware • The SQL: New data types for DDL • The hosting clouds: • More storage space for encrypted data. • Install a semantic-hiding-aware DBMS • The clients: Install a query API that will: • Perform encryption • Convert the SQL query to a semantic hiding query • Perform decryption • Return the result to the client
Semantic Hiding Query (SHQ) • The sensitive information or data is encrypted in an SHQ. • To query an RSHDB, the SQL query must be an SHQ. • Example • Retrieve the name and salary of each employee in the 'Research' department whose salary is more than $50,000, and sort the report in ascending order of names.
SHQ Example
select EMPLOYEE.NAME, EMPLOYEE.SALARY
from EMPLOYEE, DEPARTMENT
where EMPLOYEE.DEPT_NO = DEPARTMENT.DEPT_NO
  and DEPT_NAME = 'Research'
  and EMPLOYEE.SALARY > 50000
order by EMPLOYEE.NAME asc;
---------------------------------------------------------------------------
select T.A1, T.A6
from T, R
where T.A3 = R.B2
  and R.B1 = Y21
  and T.A6 > 50000
order by T.A1 asc;
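The conversion step in the client-side query API can be sketched as a single-pass token rewrite. The mapping tables below reuse the placeholder names from the example (T, R, A1, Y21, and so on), which stand in for real ciphertexts; the function `to_shq` and the table names are hypothetical, and the query uses the standard `order by ... asc` form. Salary stays in the clear because it is NCA-type.

```python
import re

# Placeholder mappings taken from the example; real values would be ciphertexts.
QUALIFIED = {
    "EMPLOYEE.NAME": "T.A1",
    "EMPLOYEE.SALARY": "T.A6",
    "EMPLOYEE.DEPT_NO": "T.A3",
    "DEPARTMENT.DEPT_NO": "R.B2",
}
BARE = {"EMPLOYEE": "T", "DEPARTMENT": "R", "DEPT_NAME": "R.B1"}
LITERALS = {"'Research'": "Y21"}

# String literal | qualified name | bare identifier, matched in that order.
TOKEN = re.compile(r"'[^']*'|[A-Za-z_]\w*\.[A-Za-z_]\w*|[A-Za-z_]\w*")


def to_shq(sql: str) -> str:
    """Rewrite identifiers and string literals; leave keywords and numbers."""
    def replace(match):
        tok = match.group(0)
        if tok.startswith("'"):
            # Fail loudly rather than send an unmapped plaintext string.
            return LITERALS[tok]
        return QUALIFIED.get(tok) or BARE.get(tok, tok)
    return TOKEN.sub(replace, sql)


sql = ("select EMPLOYEE.NAME, EMPLOYEE.SALARY from EMPLOYEE, DEPARTMENT "
       "where EMPLOYEE.DEPT_NO = DEPARTMENT.DEPT_NO and DEPT_NAME = 'Research' "
       "and EMPLOYEE.SALARY > 50000 order by EMPLOYEE.NAME asc;")
shq = to_shq(sql)
```

Rewriting in one pass matters: a second pass could mistake an already-substituted token such as `T` or `R` for a plaintext name.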
SHQ Result • The query API decrypts the result and returns it to the client.
Research Issues • Storage requirements. • Is order-preserving encryption secure enough? • More secure encryption + order-preserving hashing? • Guessing the semantics from the range and format of NCA-type data left in the clear. • Adding noise? • RSHDB's DBMS has weaker domain constraint enforcement. • All encrypted data are of bit-string type
Research Issues • Char-by-char versus word-by-word encryption for SCS-type data. • Flexibility, security, and space. • Who should develop the query API? • Performance degradation: • Implementation and simulation • Real-world databases and queries
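The flexibility side of the char-by-char versus word-by-word trade-off can be seen in a small sketch: if every unit (character or word) is replaced by a deterministic token, substring matching becomes a search for a contiguous run of tokens. The code uses truncated HMAC tags as stand-in tokens and ignores order preservation entirely, so it is an assumption made for illustration, not the proposed scheme; all function names are hypothetical.

```python
import hashlib
import hmac


def token(key: bytes, unit: str) -> str:
    """Deterministic keyed token for one character or one word."""
    return hmac.new(key, unit.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def enc_chars(key: bytes, s: str) -> list:
    return [token(key, ch) for ch in s]


def enc_words(key: bytes, s: str) -> list:
    return [token(key, w) for w in s.split()]


def contains(haystack: list, needle: list) -> bool:
    """True if needle occurs as a contiguous run inside haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


key = b"client-secret-key"
# Char-by-char: any substring can be matched on ciphertext.
assert contains(enc_chars(key, "Research Lab"), enc_chars(key, "sear"))
# Word-by-word: only whole words match, but less structure is exposed.
assert contains(enc_words(key, "Research Lab"), enc_words(key, "Research"))
assert not contains(enc_words(key, "Research Lab"), enc_words(key, "sear"))
```

Char-by-char tokens also cost more space and expose letter frequencies, which is the security and space half of the same trade-off.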
Future Work • Designing algorithms for data integrity protection for outsourced databases. • Completeness • Non-forgery • Freshness • Adding data integrity protection to RSHDB is challenging.