Data Security

Data Security. Ram Gopal, University of Connecticut (Robert Garfinkel, Paulo Goes, Manuel Nunez, Daniel Rice, Steven Thompson). Research Objectives. Overview. Security of Microdata with Individual Identifiers.


Presentation Transcript


  1. Data Security • Ram Gopal, University of Connecticut (Robert Garfinkel, Paulo Goes, Manuel Nunez, Daniel Rice, Steven Thompson)

  2. Research Objectives • Overview • Security of Microdata with Individual Identifiers. Maximize the utility of information provided to users while maintaining the security of confidential information.

  3. Security Considerations • Disclosure of Confidential Information • Identity Disclosure (Diagram labels: Confidentiality-Related, Identity-Related, Confidential)

  4. Protection of Confidential Information • Query Restriction • Perturbation • Camouflage • Hybrid Methods

  5. Query Restriction • Answer queries from users if confidentiality is not violated

  6. Query Answering Process • User Query → Compute Query Answer → Confidentiality Check (against the User Log) → Compromise? • No: Answer Query and Update the User Log • Yes: Reject Query
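
The flow on slide 6 can be sketched as a small audit loop. This is a minimal sketch under stated assumptions, not the presenters' implementation: the class name `QueryAuditor`, the method `ask`, and the data values are hypothetical (the values are chosen only to agree with the query answers shown on later slides), and the confidentiality check is a deliberately simple one that rejects a SUM query if it covers a single record or differs from an already answered query by exactly one record.

```python
class QueryAuditor:
    """Toy query-restriction loop for SUM queries (hypothetical sketch)."""

    def __init__(self, data):
        self._data = list(data)
        self._log = {}  # user -> list of index sets already answered

    def _compromises(self, user, idx):
        # A single-record query discloses that record directly.
        if len(idx) == 1:
            return True
        # If two query sets differ by exactly one record, subtracting
        # the two answers discloses that record.
        return any(len(idx ^ past) == 1 for past in self._log.get(user, []))

    def ask(self, user, indices):
        idx = frozenset(indices)
        answer = sum(self._data[i] for i in idx)      # compute query answer
        if self._compromises(user, idx):              # confidentiality check
            return None                               # reject query
        self._log.setdefault(user, []).append(idx)    # update user log
        return answer                                 # answer query


# Hypothetical confidential values, consistent with the slide answers.
auditor = QueryAuditor([102, 78, 49, 121])
print(auditor.ask("alice", [0, 1, 2, 3]))  # answered
print(auditor.ask("alice", [0, 2]))        # answered
print(auditor.ask("alice", [0, 1, 2]))     # rejected: one record away from the first query
```

A real confidentiality check would need to catch inferences that combine more than two queries, which is what the Min/Max formulation on the next slide addresses.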

  7. Illustration • Query1: x1+x2+x3+x4 (answer=350). Min/Max xi subject to x1+x2+x3+x4 = 350, 20 ≤ xi ≤ 150 • Query2: x1+x3 (answer=151). Min/Max xi subject to x1+x2+x3+x4 = 350, x1+x3 = 151, 20 ≤ xi ≤ 150 • Query3: x1+x2 (answer=180). Min/Max xi subject to x1+x2+x3+x4 = 350, x1+x3 = 151, x1+x2 = 180, 20 ≤ xi ≤ 150
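
The Min/Max problems in this illustration are small linear programs: after each answered query, the tightest range a user can infer for each xi is its minimum and maximum over all values consistent with the answers and the bounds. The sketch below computes those ranges exactly by enumerating the vertices of the feasible region, which is a swapped-in technique that is only practical for tiny examples and is not taken from the slides. The function names `value_ranges` and `compromised` are hypothetical, and the code assumes the answered queries are linearly independent.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(rows, rhs):
    """Solve a square linear system exactly; return None if it is singular."""
    n = len(rows)
    m = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * c for a, c in zip(m[r], m[col])]
    return [row[n] for row in m]


def value_ranges(queries, n, lo, hi):
    """Exact (min, max) of each x_i given answered SUM queries and box bounds.

    queries is a list of (index_set, answer). Assumes the query rows are
    linearly independent, so each vertex pins n - len(queries) variables
    to a bound.
    """
    eq_rows = [[1 if j in idx else 0 for j in range(n)] for idx, _ in queries]
    eq_rhs = [ans for _, ans in queries]
    free = n - len(queries)
    vertices = []
    for fixed in combinations(range(n), free):
        unit_rows = [[1 if j == i else 0 for j in range(n)] for i in fixed]
        for bounds in product((lo, hi), repeat=free):
            x = solve_square(eq_rows + unit_rows, eq_rhs + list(bounds))
            if x is not None and all(lo <= v <= hi for v in x):
                vertices.append(x)
    if not vertices:
        raise ValueError("no feasible vertex found")
    return [(min(v[i] for v in vertices), max(v[i] for v in vertices))
            for i in range(n)]


def compromised(ranges):
    """A value is exactly disclosed when its inferred range is a single point."""
    return any(low == high for low, high in ranges)


q1 = ({0, 1, 2, 3}, 350)
q2 = ({0, 2}, 151)
q3 = ({0, 1}, 180)
for answered in ([q1], [q1, q2], [q1, q2, q3]):
    r = value_ranges(answered, n=4, lo=20, hi=150)
    print([(int(a), int(b)) for a, b in r], compromised(r))
```

Under these assumptions the inferred range for x1 narrows from [20, 150] to [20, 131] and then to [30, 131], and no value is pinned to a single point by the three queries. Whether a narrowed range counts as a compromise depends on the protection rule in force, which the transcript does not state.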

  8. Query Restriction • Binary Data with COUNT queries • Other Query Types (MIN, VAR) • Maximal Subset Selection is NP-Hard • Example query: min{x1,x2,x3,x4} (answer=49), which implies xi ≥ 49 for every i and that at least one of {x1,x2,x3,x4} equals 49

  9. Perturbation • Data Swapping/Shuffling • Binning (Example values shown: 87.89, 52.07, 134.60, 101.95, 71.15, 24.76, 46.81, 94.79, 74.24, 49.57; also 82.32, -19.68)
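
The two perturbation styles named on this slide can be sketched in a few lines. This is an illustrative sketch only: the function names `swap_values` and `bin_values` and the bin width of 50 are assumptions, and the slide does not say how the presenters swap or bin. The ten input values are the ones that appear on the slide.

```python
import random

values = [87.89, 52.07, 134.60, 101.95, 71.15, 24.76, 46.81, 94.79, 74.24, 49.57]


def swap_values(xs, rng):
    """Data swapping/shuffling: reassign the same values to different records.

    The multiset of values (and so the mean and other aggregates) is
    preserved, but the link between a record and its value is broken.
    """
    out = list(xs)
    rng.shuffle(out)
    return out


def bin_values(xs, width):
    """Binning: replace each value with the midpoint of its fixed-width bin."""
    return [(x // width) * width + width / 2 for x in xs]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
swapped = swap_values(values, rng)
binned = bin_values(values, width=50)
print(swapped)
print(binned)
```

Swapping leaves aggregate statistics of the column intact while binning trades precision for protection, so the choice between them depends on which queries users are expected to ask.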

  10. Perturbation

  11. Camouflage • Interval Answers • Answer Guarantee • Interval Protection • Storage Efficiency • Computational Efficiency • “Good” Query Answers (Figure axes: Record 1, Record 2)
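
The idea of interval answers with an answer guarantee can be illustrated with a simplified sketch. It assumes, as a modelling choice not confirmed by the transcript, that the true data vector is hidden among a small number of decoy vectors and that a SUM query is answered with the smallest and largest total over all stored vectors. The function name `interval_answer` and both decoy vectors are hypothetical; the true vector is chosen only to agree with the query answers on the earlier slides.

```python
def interval_answer(vectors, indices):
    """Answer a SUM query with an interval over all stored vectors.

    Because the true vector is one of the stored vectors, the returned
    interval always contains the true answer (the answer guarantee).
    """
    totals = [sum(v[i] for i in indices) for v in vectors]
    return min(totals), max(totals)


true_vector = [102, 78, 49, 121]   # hypothetical confidential values
decoys = [
    [85, 95, 64, 110],             # hypothetical camouflage vector
    [118, 62, 35, 132],            # hypothetical camouflage vector
]
stored = [true_vector] + decoys

print(interval_answer(stored, [0, 1, 2, 3]))  # interval around the true total 350
print(interval_answer(stored, [0, 2]))        # interval around the true answer 151
print(interval_answer(stored, [0]))           # a single record is protected by an interval
```

In this sketch every query is answered, so nothing is rejected, and storage grows only with the number of decoy vectors. How the decoys are chosen determines how tight, and so how useful, the intervals are.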

  12. Camouflage

  13. Summary of Techniques

  14. Hybrid Techniques Query Restriction and Perturbation

  15. Insider Threats • What if a user knows that x3 = 49? • Without insider knowledge: Min/Max xi subject to x1+x2+x3+x4 = 350, x1+x3 = 151, 20 ≤ xi ≤ 150 • With insider knowledge: Min/Max xi subject to x1+x2+x3+x4 = 350, x1+x3 = 151, 20 ≤ xi ≤ 150, x3 = 49 (Slide labels: Query Restriction, Perturbation, Camouflage)
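
The effect of the insider's extra knowledge on this slide can be worked out directly from the two answered queries; the short derivation below uses only the numbers shown in the transcript.

```latex
\begin{aligned}
x_1 + x_3 &= 151, \quad x_3 = 49 \;\Rightarrow\; x_1 = 151 - 49 = 102, \\
x_2 + x_4 &= 350 - 151 = 199, \quad 20 \le x_i \le 150 \;\Rightarrow\; 49 \le x_2,\, x_4 \le 150 .
\end{aligned}
```

So a pair of queries that is safe for an ordinary user discloses x1 exactly to a user who already knows x3, while x2 and x4 remain protected only by their ranges.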

  16. Emerging Application Areas • Data Mining (cf. Sarkar, UT-Dallas) • ‘Inverse’ Problem (cf. Sarkar, UT-Dallas) • Trading Privacy • Privacy in Distributed Settings (J. Canny, Berkeley)

  17. Identity Disclosure Strip Explicit Identifiers

  18. Security of Microdata with Individual Identifiers • Can we provide microdata in the following form? (Slide fragments: "Along with", "As much as possible")

  19. Channel View • Input Channel 1 (N,N,N): 2 • Input Channel 2 (L,H,L): 1 • Input Channel 3 (N,N,H): 2 • Input Channel 4 (N,H,N): 3 • Input Channel 5 (L,N,V): 1 • Input Channel 6 (H,H,L): 2

  20. Option 1: Full Revelation • Input Channel 1 (N,N,N): 2 • Input Channel 2 (L,H,L): 1 • Input Channel 3 (N,N,H): 2 • Input Channel 4 (N,H,N): 3 • Input Channel 5 (L,N,V): 1 • Input Channel 6 (H,H,L): 2

  21. Option 2: Minimal Revelation • Input Channel 1 (N,N,N): 2 • Input Channel 2 (L,H,L): 1 • Input Channel 3 (N,N,H): 2 • Input Channel 4 (N,H,N): 3 • Input Channel 5 (L,N,V): 1 • Input Channel 6 (H,H,L): 2 • Output Channel 1 (ALL): 11

  22. Option 3: Partial Revelation • Input Channel 1 (N,N,N): 2 • Input Channel 2 (L,H,L): 1 • Input Channel 3 (N,N,H): 2 • Input Channel 4 (N,H,N): 3 • Input Channel 5 (L,N,V): 1 • Input Channel 6 (H,H,L): 2 • Output Channel 1: 6 • Output Channel 2: 5
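
The three revelation options differ only in how input channels are mapped to output channels, and the released count for an output channel is the sum of the input counts mapped to it. The sketch below shows that bookkeeping using the six input counts from the slides. The function name `release` is hypothetical, and so is the grouping used for the partial option: the transcript gives the output totals 6 and 5 but not which input channels feed each output, so the grouping here is just one assignment that reproduces those totals.

```python
from collections import Counter

# Record counts per input channel, as listed on the Channel View slide.
input_counts = {1: 2, 2: 1, 3: 2, 4: 3, 5: 1, 6: 2}


def release(counts, assignment):
    """Sum input-channel counts into the output channels they are assigned to."""
    out = Counter()
    for channel, n in counts.items():
        out[assignment[channel]] += n
    return dict(out)


# Option 1: every input channel is released as its own output channel.
full = release(input_counts, {c: c for c in input_counts})
# Option 2: all input channels collapse into one "ALL" output channel.
minimal = release(input_counts, {c: "ALL" for c in input_counts})
# Option 3: hypothetical grouping that yields the totals shown on the slide.
partial = release(input_counts, {1: 1, 2: 1, 4: 1, 3: 2, 5: 2, 6: 2})

print(full)
print(minimal)
print(partial)
```

Merging channels hides small counts at the cost of detail, which is the trade-off the formulation slides go on to optimize.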

  23. Formulation: Inputs

  24. Formulation: Risk Tolerance

  25. Formulation: Risk Tolerance

  26. Formulation: Constraints • Assign All: x11+x12 = 75 (Input Channel 1: 75, split as x11 and x12) • Safety Net: ALL Output Channel • ≥ 1 safe channel • Truthful Random Assignment • Non-Intuitive: (L,L,L), (V,V,V) (Diagram labels: Input Channels 1 to 4, Output Channel (2 and 3))

  27. Formulation: Objective • Minimize Information Loss • Diagram: Input Channel 1 (45) → x11 → Output Channel 1; Input Channel 2 (89) → x21 → Output Channel 1 and x22 → Output Channel 2; Input Channel 3 (75) → x32 → Output Channel 2 • Expressions shown: 2×(x11+x21), (45+89)×(x11+x21)

  28. Formulation

  29. Illustration • Input Channels: 1: 2,098; 2: 1,994; 3: 4,055; 4: 3,002; 5: 4,921; 6: 9,875; 7: 9,442; 8: 506 • Output Channels: 1: 8,755; 2: 7,446; 3: 3,002; 4: 727; 5: 6,521; 6: 9,442

  30. Simulation Analysis • 1 million records • 200 input channels/2048 output channels

  31. Results

  32. Results

  33. Results

  34. Concluding Remarks • Useful data can be released even when the risk is high • Consider Insider Threats • Integrate with Perturbation/Camouflage
