1 / 35

Lecture 17: Executing SQL over Encrypted Data in Database-Service-Provider Model Professor Chen Li

Lecture 17: Executing SQL over Encrypted Data in Database-Service-Provider Model Professor Chen Li. Executing SQL over Encrypted Data in Database-Service-Provider Model. Hakan Hacigumus University of California, Irvine Bala Iyer IBM Silicon Valley Lab. Chen Li

Download Presentation

Lecture 17: Executing SQL over Encrypted Data in Database-Service-Provider Model Professor Chen Li

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 17: Executing SQL over Encrypted Data in Database-Service-Provider Model Professor Chen Li

  2. Executing SQL over Encrypted Data in Database-Service-Provider Model Hakan Hacigumus University of California, Irvine Bala Iyer IBM Silicon Valley Lab. Chen Li University of California, Irvine Sharad Mehrotra University of California, Irvine SIGMOD 2002, Madison, Wisconsin, USA

  3. We want to store the data on “a server” User What do we want to do? Server User Data Encrypted User Database Untrusted Administrator • But the problem is we do not trust “the server” for sensitive information! • encrypt the data and store it • but still be able to run queries over the encrypted data • do most of the work at the server • If the server is trusted, ICDE 2002

  4. Application Service Provider (ASP) Model for Database DB management transferred to service provider for backup, administration, restoration, space management, upgrades etc. use the database “as a service” provided by an ASP use SW, HW, human resources of ASP, instead of your own User Why is it important anyway? Server Encrypted User Database Untrusted Administrator (Untrusted) Application Service Provider

  5. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  6. Query Executer Temporary Results Encrypted User Database Query Translator Metadata User System Architecture Client Site Server Site Encrypted Results Client Side Query ? Server Side Query Service Provider Original Query ? Actual Results ?

  7. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  8. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  9. Relational Encryption Server Site • Store an encrypted string – etuple – for each tuple in the original table • This is called “row level encryption” • Any kind of encryption technique can be used • Blowfish encryption algorithm is used for this work • Create an index for each (or selected) attribute(s) in the original table

  10. Partition (Bucket) ids 2 7 5 1 4 0 200 400 600 800 1000 • Identification function assigns a partition id to each partition of attribute A • e.g. identR.A( (200,400] ) = 7 • Any function can be use as identification function, e.g., hash functions Domain Values Building the Index:Partition and Identification Functions • Partition function divides domain values into partitions (buckets) Partition (R.A) = { [0,200], (200,400], (400,600], (600,800], (800,1000] } • partitioning function has an impact on performance as well as privacy

  11. Partition (Bucket) ids 2 7 5 1 4 0 200 400 600 800 1000 Domain Values Mapping Functions • Mapping function maps a value v in the domain of attribute A to the id of the partition which value v belongs to • e.g. MapR.A( 250 ) = 7, MapR.A( 620 ) = 1

  12. Storing Encrypted Data R = < A, B, C >  RS = < etuple, A_id, B_id, C_id > etuple = encrypt ( A | B | C ) A_id = MapR.A( A ), B_id = MapR.B( B ), C_id = MapR.C( C ) Table: EMPLOYEES Table: EMPLOYEE

  13. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  14. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  15. Mapping Conditions Q: SELECT name, pname FROM emp, proj WHERE emp.pid=proj.pid AND salary > 100k • Server stores attribute indices determined by mapping functions • Client stores metadata and utilizes that to translate the query Conditions: • Condition  Attribute op Value • Condition  Attribute op Attribute • Condition  (Condition Condition) | (Condition Condition) | (not Condition)

  16. 2 7 5 1 4 Partition Ids 0 200 400 600 800 1000 Domain Values Mapping Conditions (2) Example: • Attribute = Value • Mapcond( A = v )  AS = MapA( v ) • Mapcond( A = 250 )  AS = 7

  17. Mapping Conditions (3) • Attribute1 = Attribute2 • Mapcond( A = B ) N (AS = identA( pk )  BS = identB( pl )) where N is pkpartition (A), plpartition (B), pkpl  C : A = B  C’ : (AS = 2 BS = 9)  (AS = 4 BS = 9) (AS = 3 BS = 8)

  18. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  19. Relational Operators over Encrypted Relations • Partition the computation of the operators across client and server • Compute (possibly) superset of answers at the server • Filter the answers at the client • Objective : minimize the work at the client and process the answers as soon as they arrive without requiring storage at the client Operators studied: • Selection • Join • Grouping and Aggregation • Sorting • Duplicate Elimination • Set Difference • Union • Projection

  20. 2 7 5 1 4 0 200 400 600 800 1000 A=250 A=250 Client Query D TABLE A_id = 7 E_TABLE Server Query Selection Operator c( R ) = c( D (SMapcond(c)(RS)) Example:

  21. Client Query A=B D C’ C E_EMP E_PROJ Server Query EMP PROJ Join Operator R c T = c( D ( RS SMapcond(c) TS) Example: C : A = B  C’ :(A_id = 2 B_id = 9) (A_id = 4 B_id = 9) (A_id = 3 B_id = 8)

  22. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  23. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  24. name,pname name,pname e.pid = p.pid e.pid = p.pid salary >100k D salary >100k PROJ D Encrypted(PROJ) EMP Encrypted(EMP) Server Query Query Decomposition Q: SELECT name, pname FROM emp, proj WHERE emp.pid=proj.pid AND salary > 100k Client Query

  25. Query Decomposition (2) Client Query name,pname Client Query name,pname e.pid = p.pid salary >100k D e.pid = p.pid salary >100k D D D E_PROJ s_id = 1 v s_id = 2 E_PROJ E_EMP E_EMP Server Query Server Query

  26. Query Decomposition (3) Client Query Client Query name,pname name,pname salary >100k  e.pid = p.pid e.pid = p.pid salary >100k D D D e.p_id = p.p_id s_id = 1 v s_id = 2 E_PROJ s_id = 1 v s_id = 2 E_PROJ E_EMP E_EMP Server Query Server Query

  27. Query Decomposition (4) Client Query name,pname Q:SELECT name, pname FROM emp, proj WHERE emp.pid=proj.pid AND salary > 100k QS: SELECT e_emp.etuple, e_proj.etuple FROM e_emp, e_proj WHERE e.p_id=p.p_id AND s_id = 1 OR s_id = 2 QC: SELECT name, pname FROM temp WHERE emp.pid=proj.pid AND salary > 100k salary >100k  e.pid = p.pid D e.p_id = p.p_id E_PROJ s_id = 1 v s_id = 2 E_EMP Server Query

  28. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  29. Talk Outline • System Architecture • How to create Metadata: Relational Encryption and Storage Model • Query Decomposition and Relational Operators • Query Decomposition – Examples • Experimental Results • Conclusion

  30. Experimental Evaluation • Data • TPC-H database, scale factor 0.1 • Queries • Based on TPC-H Queries Q#6 and Q#3 • Partitioning Strategy • Equi-depth histograms for the first set of experiments • Equi-width histograms for the second set of experiments

  31. Effect of Number of Buckets in Non-Join Query • Client and communications costs decreases with increasing number of buckets due to better filtering at the server • Server cost doesn’t decrease as much, table scan remains best choice in the optimizer

  32. Effect of Number of Buckets in Non-Join Query • Single Server: Server is trusted and performs all operations including decryption on site • Shows that proposed query execution protocol doesn’t introduce significant overhead

  33. Effect of Number of Buckets in Join Query • Sharp decrease in query response time with increase in the number of buckets due to better filtering at the server • Client side query response time is greater than server side query response time due to dominant decryption cost on the query (second graph)

  34. Effect of Number of Buckets in Join Query • Single Server: Server is trusted and performs all operations including decryption on site • Consistent with the previous results showing proposed query execution protocol doesn’t introduce significant overhead

  35. Conclusion • ASP model is a promising solution for enterprise computing in Internet era • We studied data privacy problem • in the context of ASP model • when the ASP is not trusted • Proposed solution • encrypts data, creates “coarse indexes” and stores the data at ASP • allows only data owner to decrypt the data • With query decomposition • most of query execution performed at ASP • client only performs encryption/decryption, filtering and continues to benefit from ASP model

More Related