A model for data revelation


  1. A model for data revelation
     Poorvi Vora, Dept. of Computer Science, George Washington University

  2. “Security” frameworks: binary
     • Divide the world into trusted and untrusted parties
     • Provide complete revelation of information or complete protection
     E.g. multiparty computation, encrypted data
     Poorvi Vora/CS/GWU

  3. Even a statistic or aggregate reveals “private” information
     Secure multiparty computation reveals f(x1, x2, ..., xn) and nothing more. Yet this reveals information about all xi. Thus, typical security assurances are not enough.

  4. What is privacy?
     • Control over information
     • Extent of information revelation
     Tensions between:
     • access to aggregate information for the community vs. individual control
     • reputation vs. prejudice

  5. Individual control requires more than binary security of personal information
     Information is often given up for something in return:
     • Safeway card
     • Monthly charge to be kept out of phone books
     • Information for community statistics:
       • Health statistics
       • Collaborative filtering/personalization in virtual communities

  6. A model: introduce uncertainty
     Maximum uncertainty (i.e. secrecy) corresponds to crypto protocols.
     • Alice and Bob determine:
       • a binary data point from Alice’s personal information, x
       • a probability of truth, p
       • a return, y
     • Alice reveals a variable z, where z = x with probability p
     • Bob provides, in return, y
     • z exists in the ether as Alice’s value x with probability p
     This is not mutually exclusive with cryptographic protection (p = 0.5 is cryptographic).
     Used in the public health community for twenty-odd years.
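The revelation step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation. It assumes that when Alice does not tell the truth she reveals the flipped bit 1 - x, and the names `reveal` and `estimate_proportion` are hypothetical. The second function shows why such answers are still useful for community statistics: the true proportion can be recovered from many noisy answers as long as p is not 0.5.

```python
import random

def reveal(x: int, p: float, rng: random.Random) -> int:
    """Return z = x with probability p, otherwise the flipped bit 1 - x (assumed)."""
    return x if rng.random() < p else 1 - x

def estimate_proportion(answers, p: float) -> float:
    """Estimate the fraction of true 1s from noisy answers; requires p != 0.5.

    The expected fraction of 1s among the answers is pi*p + (1 - pi)*(1 - p),
    which is solved here for pi.
    """
    observed = sum(answers) / len(answers)
    return (observed - (1 - p)) / (2 * p - 1)

# Hypothetical population: 30% of respondents hold the sensitive attribute.
rng = random.Random(0)
truth = [1] * 3000 + [0] * 7000
answers = [reveal(x, 0.8, rng) for x in truth]
estimate = estimate_proportion(answers, 0.8)
print(round(estimate, 3))
```

At p = 0.5 the answers carry no information about x, which matches the slide's remark that p = 0.5 corresponds to cryptographic secrecy.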

  7. Outcome
     • The protocol is a mathematical game between Alice and Bob
     • The optimal situation is not when no information is revealed, but when Alice gets maximum benefit for her information
     Think about this: should women in Africa test for HIV when they will certainly not obtain any treatment for it?

  8. An analogy
     • The protocol is a communication channel
     • The sender is Alice, the receiver (malicious?) Bob
     • The probability of error is the probability of a lie

  9. Security properties of randomization
     • Repeated queries: Error → 0 as n → ∞, and n → ∞ as Error → 0
     • Cost to attacker increases without bound if error is not bounded above zero
     • This is a repetition code over the channel
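The repetition-code view can be made concrete with a short calculation. The sketch below assumes that Bob asks the same question n times (n odd), that each answer is independently truthful with probability p > 0.5, and that he decodes by majority vote. The function name `majority_error` is hypothetical.

```python
from math import comb

def majority_error(p: float, n: int) -> float:
    """Probability that a majority vote over n independent answers is wrong.

    Each answer is truthful with probability p; n is assumed odd so that
    there are no ties. The vote fails when more than half the answers are lies.
    """
    q = 1 - p  # probability of a lie
    return sum(comb(n, k) * q**k * p**(n - k) for k in range(n // 2 + 1, n + 1))

for n in (1, 3, 5, 11, 21):
    print(n, majority_error(0.8, n))
```

The error shrinks as n grows but never reaches zero for finite n, so driving the error toward zero requires an unbounded number of queries.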

  10. Other attacks
      • Query 1: Graying?
      • Query 2: Balding?
      • Query 3: Weight?
      • Query 4: Sports?
      Really asking about age and gender.
      How does one characterize all such attacks? What can one say about security with respect to such attacks?

  11. An analogy
      • The protocol is a communication channel
      • The sender is Alice, the receiver (malicious?) Bob
      • The probability of error is the probability of a lie
      • The attributes that Bob wants to determine form the message

  12. A simple attack
      • Query 1: Female?
      • Query 2: Over 40?
      • Query 3: Losing calcium?
      Query 3 checks the answers to Queries 1 and 2. It is a parity-check bit.
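A small sketch shows how a check query helps Bob. It is an illustration under strong assumptions: the third answer is treated as a deterministic function of the first two (here a hypothetical `and_check`, "losing calcium" if and only if female and over 40), whereas the real relationship is only a statistical correlation. The decoder `decode` is also a hypothetical name. It returns the attribute pairs whose codeword is closest to the received answers in Hamming distance.

```python
from itertools import product

def and_check(x1: int, x2: int) -> int:
    """Hypothetical check bit: 1 only if both attributes are 1."""
    return x1 & x2

def decode(received, check):
    """Return (candidates, distance) for the nearest codewords (x1, x2, check(x1, x2))."""
    best, best_dist = [], None
    for x1, x2 in product((0, 1), repeat=2):
        codeword = (x1, x2, check(x1, x2))
        dist = sum(a != b for a, b in zip(codeword, received))
        if best_dist is None or dist < best_dist:
            best, best_dist = [(x1, x2)], dist
        elif dist == best_dist:
            best.append((x1, x2))
    return best, best_dist

print(decode((1, 1, 1), and_check))  # consistent answers: distance 0
print(decode((1, 1, 0), and_check))  # inconsistent answers: at least one lie
```

A nonzero distance tells Bob that at least one answer was a lie, which is information he would not have had from Queries 1 and 2 alone.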

  13. An analogy
      • All attacks are communication over the channel
      • Good attacks are codes
      • What Bob queries is a codeword bit
      • What he receives is the transmitted codeword, which he decodes

  14. Shannon’s theorems apply
      In fact, assuming any functions of Alice’s data points as queries (adaptive, related queries) and error probability → 0 as n → ∞:
      • The number of queries required per bit of entropy is asymptotically tightly bounded below by the inverse of the channel capacity
      • Above this bound, error tends exponentially to 0
      • Below it, it increases exponentially with n
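To get a feel for the bound, the sketch below computes the capacity of the channel and its inverse. It assumes the protocol behaves as a binary symmetric channel with crossover probability 1 - p, where p is the probability of truth from the model. The function names are hypothetical.

```python
from math import log2

def binary_entropy(p: float) -> float:
    """Entropy in bits of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p_truth: float) -> float:
    """Capacity in bits per query of a binary symmetric channel (assumed model)."""
    return 1.0 - binary_entropy(p_truth)

def min_queries_per_bit(p_truth: float) -> float:
    """Asymptotic lower bound on queries per bit of entropy: 1 / capacity."""
    capacity = bsc_capacity(p_truth)
    return float("inf") if capacity == 0 else 1.0 / capacity

for p in (0.5, 0.6, 0.8, 0.95):
    print(p, bsc_capacity(p), min_queries_per_bit(p))
```

Under this assumption, p = 0.8 gives a capacity of roughly 0.28 bits per query, so Bob needs at least about 3.6 queries per bit of entropy, and at p = 0.5 no number of queries suffices.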

  15. Questions
      • How does one determine the entropy of a particular data set, or a general data set?
      • What kinds of attacks are computationally feasible?
      • This was a very powerful attacker. What are reasonable limits on the attacker’s abilities?
      • Result in itself, independent of model.
      • Partly published at Int. Symp. Info. Theory, 2003
      • Journal paper in review, at website

  16. Value-free model
      • Human rights aspects covered through crypto protocols
      • Necessary health information and community information can be gathered
      • Consumer behaviour treated through this game
      • Criticism: very adversarial model

  17. Another application: anonymous delivery
      Crowds: Reiter and Rubin/Lucent and AT&T
      • At node i+1: node i more likely than any other
      • Receiver: Node i+1
      • Message: sending node
      • Received symbol: Node i
      • Channel characteristic:
        • Probability that true sender is Node i
        • Probability that other nodes are senders
      • Traffic analysis/data mining: correlations among senders (communication across channel, less efficient than some error-correcting code)
      [Figure: nodes A, B, C, D, E. N nodes; pf is the probability of forwarding]
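The channel characteristic can be sketched numerically. The formula below is the one from the published Crowds analysis by Reiter and Rubin, not from this slide: with n crowd members, c collaborating observers and forwarding probability pf, the node that hands a message to a collaborator is the true sender with probability (n - pf(n - c - 1))/n. The parameter c and the function names are assumptions, since the slide considers only a single observing node.

```python
def predecessor_is_sender(n: int, c: int, pf: float) -> float:
    """Probability that the observed predecessor (Node i) is the true sender.

    Formula from the published Crowds analysis (assumed here, not on the slide);
    requires n > c + 1.
    """
    return (n - pf * (n - c - 1)) / n

def other_node_is_sender(n: int, c: int, pf: float) -> float:
    """Probability that one particular other honest node is the true sender.

    The remaining probability is spread evenly over the n - c - 1 other
    honest nodes, which simplifies to pf / n.
    """
    return (1 - predecessor_is_sender(n, c, pf)) / (n - c - 1)

n, c, pf = 20, 1, 0.75  # hypothetical crowd
print(predecessor_is_sender(n, c, pf), other_node_is_sender(n, c, pf))
```

With these hypothetical numbers the predecessor is the sender with probability 0.325, against 0.0375 for any other single node, which is the sense in which node i is more likely than any other.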

  18. An example of model use to measure the value of information
      with Yu-An Sun and Sumit Joshi
      • Auction bids reveal much about an individual’s profile
      • Consider the Vickrey – sealed second highest bid – auction
      • Optimal strategy: to bid one’s valuation
      • Bids (and hence valuations) can be protected with secure multiparty computation
      • But, bids allow determination of market demand (efficient markets)
      • Need for an aggregate value, not well-defined at the moment of the auction
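For reference, the standard Vickrey rule mentioned above fits in a few lines. This is a minimal sketch of the ordinary sealed-bid second-price auction only, not of the variably private variant. The function name `vickrey` and the bidder names are hypothetical, and ties are broken by sort order.

```python
def vickrey(bids: dict) -> tuple:
    """Return (winner, price): the highest bidder wins and pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

winner, price = vickrey({"alice": 120, "bob": 95, "carol": 110})
print(winner, price)
```

Because the winner's payment does not depend on her own bid, bidding one's true valuation is optimal, which is exactly why the bids reveal so much about the bidders.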

  19. Variably Private Vickrey – Bidding Round
      Introduce uncertainty
      • The seller announces a minimum sale price and a maximum randomization setting.
      • Each bidder submits a sealed interval containing her bid. The size of the interval is her choice.
      • In the running with high end, committed to low

  20. Variably Private Vickrey – Revealing Round
      • Bidders not in the running will reveal no more information on their valuations.
      • Largest of the others will reveal which half of their interval contains their valuation

  21. Sale Price
      [Figure labels: Buyer pays; Seller gets; a bracketed amount is divided among all bidders proportional to the interval width]

  22. Properties?
      • Provides various demand statistics
      • In general, accuracy of future bid estimation is lower for more uncertainty
      • Allows the bidder to vary uncertainty, and pay for it
      • Allows the seller to obtain more than regular Vickrey, depending on how much information is valued
      • Bidder with highest valuation still wins the auction as long as she can tolerate revealing her valuation to the extent required.

  23. Summary
      A model that we hope will:
      • Provide choices not currently typically available to users
      • Extend the security framework to include problems like those in statistical databases
      • Provide a means of measuring uncertainty in situations where there is some uncertainty, rather than none or complete uncertainty
      • Include other leakage from security-related protocols such as anonymous delivery and ciphers
      • Be useful for measuring the economic value of information
