Privacy Preserving Infrastructural Requirements • Brad Rosen • Professor Joan Feigenbaum • TA: Ganghua Sun • CS 557: Sensitive Information in the Wired World
Presentation Outline • Definition and Redefinition of Terms • Privacy Enhancement vs Privacy Preservation • Background on PETs [Taxonomy et al] • Current Infrastructure • Next-Generation Infrastructure • Trusted Proxies • Case Study: iPrivacy • Conclusions
Terms • PII: Personally Identifiable Information • Trusted Platform: Either its outputs for a given input are known, or the authenticity of its software can be verified. • Privacy Enhancing Technology: A program or programme of action which increases a user's control over what happens to their data. • Proxy: A server that redirects internet requests.
What is a Privacy "Enhancing" Technology? • As we've seen [Ashley/Bobby], PETs "enhance" user control over "private" data. • Some of these data are intrinsically hard to protect due to the nature of the current Internet. [Server-Side IP Address "Cookies"] • All of these require a certain degree of "trust" in another party. [Proxies vs Cookies] • Paradigm shift in how we think about privacy…
Background on PET: Taxonomy • Within our discussions of PETs, we have seen a number of active [Cookie Cutters, Proxies] and passive [Policy Tools] technologies. • Encryption Tools are quickly becoming ubiquitous in all aspects of modern computing and are a prerequisite to, not an enhancement for, protection of “sensitive” data. • Filtering tools [SPAM] merely seek to fix a (heavily) broken mail standard (SMTP/ESMTP)
Privacy Enhancing or Privacy Preserving? • Reasonable Expectations • When a user performs certain actions on the internet [like shopping or browsing], they have an inherent, reasonable expectation of privacy. [Real life library] • These expectations were not design goals of the original internet. [p3p…] • Rather than tack on privacy "enhancements" to an architecture that was not designed for them, we need to look at building privacy preservation in from the ground up.
Background on PET: Taxonomy Policy Tools • P3P is somewhat ambiguous, but suppose there were an unambiguous policy language… • "Perfect P3P" [P4P] data may be very large in size. • Users will need semi-regular updates of P4P data. • Without proxied browsing, even connecting to a site may violate the user's preferences. [IP tracking] …
Background on PET: Taxonomy Pop-Up Blockers • Pop-up windows are a vestige of arguably terrible design decisions in JavaScript/ECMAScript. • While pop-up windows are an annoyance and an inconvenience, they are by no stretch of the imagination an invasion of privacy. [They may be directed at sites which attempt to perform such an act – which should concomitantly be blocked by other tools.] • Pop-up blockers are not a privacy enhancing technology.
Background on PET: Taxonomy Cookie Cutters • Cookies were originally a “hack” to get around the fact that HTTP is a stateless protocol. • Anything that can be accomplished with cookies can be accomplished server-side with a unique identifier [such as IP address] • In the absence of proxy tools, cookie cutters are not enough to prevent tracking of click data, etc. • Blocking cookies is trivial.
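As an illustration of the mechanism (a minimal sketch, not from the original slides; example.com is just a reachable placeholder host), the server hands state to the client in a Set-Cookie header, which the browser returns on later requests:

    import http.client

    # HTTP itself is stateless; a server that wants to recognize a returning
    # client sends a Set-Cookie header and expects it back on later requests.
    conn = http.client.HTTPConnection("example.com")  # placeholder host
    conn.request("GET", "/")
    response = conn.getresponse()
    print(response.getheader("Set-Cookie"))  # None if the server sets no cookie
    conn.close()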
Background on PET: Taxonomy Proxy Tools • Any device which masks a user's IP address may arguably be a proxy tool. [NAT home routing] • While proxy tools [and blocked cookies] ensure no tracking data is leaked, they require trust in the organization running the proxy. • These organizations may have unclear, vague, or unfavorable terms for those who use their proxies. • The question of "anonymity" vs "identity protection" remains. [crime, traceability, etc]
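A minimal sketch of proxied browsing, assuming a hypothetical proxy at proxy.example.net: the destination site sees only the proxy's IP address, while the proxy operator sees everything the user does.

    import urllib.request

    # Route the request through a proxy so the destination server sees the
    # proxy's address rather than the user's. The proxy host is hypothetical.
    proxy = urllib.request.ProxyHandler({"http": "http://proxy.example.net:8080"})
    opener = urllib.request.build_opener(proxy)
    response = opener.open("http://example.com/")
    print(response.status)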
Background on PET: Taxonomy SpyWare • The collection of data by a program resident on a computer. Often installed with "freebies" (like Kazaa) – but could be installed via a buffer overflow in any web-enabled program. [IE] • There are those in the community who consider spyware to be an actual intrusion/trespassing. • SpyWare removal tools [AdAware, etc] blur the line of PETs, simply because no user has a reasonable expectation of SpyWare being installed on his/her computer in the first place.
Current Infrastructure • The status quo of the internet relies on a number of standards: IPV4, (E)SMTP, HTTP, JavaScript/ECMAScript, Applets/ActiveX. • One of the reasons that good PETs have failed to materialize is that the current infrastructure was designed with quick spread of information in mind – not protection thereof. [Lack of Palladium…] • Looking at the end-to-end traffic, we can see a number of holes… • DISCUSSION: WHAT IS THE FIRST HOLE???
Current Infrastructure: HTTP End-To-End • When a user points their browser at a website, that name must be translated into the IP address where that website is hosted. This is done by the Domain Name System [DNS]. • DNS requests are sent in clear text. • Most ISPs provide their own domain name servers [caching servers] for their users – and thus could track website visit data trivially. • DNS hijacking is a problem – setting up a website which looks like Amazon or eBay tricks people plenty already – imagine if it was actually at Amazon.com or eBay.com.
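A minimal sketch of the lookup in question: one call resolves the name, but the query and answer travel unencrypted to the configured resolver (usually the ISP's), which can therefore log every name the user visits.

    import socket

    # An ordinary forward DNS lookup; the query goes out in clear text over
    # UDP port 53 to whatever resolver the operating system is configured with.
    print(socket.gethostbyname("example.com"))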
Current Infrastructure: HTTP End-To-End • After DNS resolution, the HTTP request is sent to the website. • The request is carried by a number of intermediary hops [routers], all of which know the source and destination IP address. • Routers operate on a store-and-forward basis – they store the packet locally in case it is lost further down the chain. • There is no user assurance that the intermediate routers actually discard those packets. • A human could perform a reverse-DNS lookup or simply visit the IP address to see what someone else has been doing…
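The reverse-DNS lookup mentioned above is equally trivial; the address below is a well-known public resolver, used purely for illustration.

    import socket

    # Map a logged IP address back to a hostname (requires a PTR record).
    hostname, aliases, addresses = socket.gethostbyaddr("8.8.8.8")
    print(hostname)  # e.g. "dns.google"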
Current Infrastructure: HTTP End-To-End • As users navigate the website, both cookies and server-side storage [indexed by IP, username, or some other identifier] may be used to track their browsing habits. • The status quo of web servers [Apache and IIS] is to store IP addresses for every object served – enabling the proliferation of "trackers" and 1x1-pixel "web bugs."
Current Infrastructure: HTTP End-To-End • Transfer of "sensitive" data may be done via HTTPS [HTTP over the Secure Sockets Layer]. • Any data sent without HTTPS can be snooped at any point between the user and the ultimate server. [It is most often used to protect credit card data.] • Danger: Not all data is sent via HTTPS, tracking is still available, but most of all… [discussion]
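A small sketch of the contrast: the same page fetched over HTTP and over HTTPS. Only the HTTPS payload is protected in transit; the destination address, and therefore the fact of the visit, is visible either way.

    import urllib.request

    # Fetch the same resource with and without TLS; only the second request's
    # contents are protected from snooping along the path.
    for url in ("http://example.com/", "https://example.com/"):
        with urllib.request.urlopen(url) as response:
            print(url, response.status)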
Current Infrastructure: HTTP End-To-End • Once users have submitted data, they simply have no control whatsoever over that data. • Sites may post a privacy policy… …and then violate it later. [JetBlue, eToys] • There are no assurances as to the security of, and access to, the data [by internal or external parties]. • "Sharing" with "affiliated parties" has become common – yet typically there is no mention of how these parties are bound.
Current Infrastructure: IPV4 • Due to the size of the address space, NAT is common. • IP fragmentation/NAT "mucks" with many common encryption tools. [Kerberos, IPSEC] • No end-to-end security measures are truly available. • "Spoofing" is common. [Spoofing may be used to defeat credit card companies' statistical monitoring of the purchasing IP address]
Current Infrastructure: JavaScript/ECMAScript • Originally invented to allow for client-side functionality, a number of "cute hacks" can be used to collect data. • Most notably, "new frames" can be opened that jump around the screen with pornography, Viagra ads, etc., and may or may not contain web bugs.
Current Infrastructure: Applets/ActiveX Controls • Applets [and to an extent ActiveX controls, despite being MSIE-only] have a much greater range of operation than simple web bugs and pop-ups. • Tools already exist for permission control of Applets and ActiveX controls [bugs in their implementation aside]. • This is a problem of user education, not a technological problem! MANY USERS BLINDLY CLICK YES! [Control Discussion]
Current Infrastructure: (E)SMTP • (E)SMTP does not require authentication when sending an email. • Spammers can simply connect to "open relays" that allow them to freely send mail. • SHOW DEMO: HELO, MAIL FROM, RCPT TO, DATA, QUIT. [sketch below] • IP Tracking/Spoofing • Cross-layer flaws lead to fundamental problems.
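A sketch of the demo sequence (HELO, MAIL FROM, RCPT TO, DATA, QUIT), assuming a hypothetical open relay at relay.example.net; nothing in the exchange proves who the sender actually is.

    import smtplib

    # The classic unauthenticated (E)SMTP exchange; an open relay accepts it
    # from anyone, with any claimed sender. Host and addresses are made up.
    server = smtplib.SMTP("relay.example.net")
    server.helo("spoofed-host.example")
    server.mail("anyone@forged-sender.example")
    server.rcpt("victim@destination.example")
    server.data(b"Subject: unauthenticated mail\r\n\r\nNo proof of sender identity.\r\n")
    server.quit()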
Next-Generation Infrastructure (Privacy Preserving) • Many of the tools are "on the horizon" or "have been discussed". • There are significant barriers to adoption. • Numerous trade-offs are involved in this paradigm shift. • Information is ubiquitous. • Information is ubiquitously controlled by the person or entity it concerns.
Next-Generation Infrastructure: Necessary Assumptions • Assume that the vagueness of P3P has been eliminated, producing P4P. • Assume that "Palladium-like" features are available on all platforms. [Including routers!] • Assume there are [at least] a few "signing bodies" – bonded entities that are willing to certify [via their own remote attestation] that certain websites are indeed running the software they claim to be running. [Randomized Testing] • Assume there are [at least] a few "verifying bodies" – bonded entities that are willing to certify [via code inspection, mathematical induction, or exhaustive proof] that the program certified by a signing body does indeed conform to the P4P profile espoused by the program's owners. • Assume there are "trusted proxies". [Discussion Later]
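To make the "signing body" assumption concrete, here is a toy sketch in which a keyed MAC stands in for a real public-key signature over the hash of the software a site claims to run; every key and name in it is invented for illustration.

    import hashlib
    import hmac

    # Toy attestation check: the signing body publishes a MAC over the hash of
    # the claimed software image; clients recompute it and compare. A real
    # scheme would use public-key signatures and hardware-rooted attestation.
    signing_key = b"signing-body key (placeholder)"
    claimed_image = b"webstore-v1.2.3 binary image"
    published = hmac.new(signing_key, hashlib.sha256(claimed_image).digest(),
                         hashlib.sha256).hexdigest()

    def verify(image: bytes, attestation: str) -> bool:
        expected = hmac.new(signing_key, hashlib.sha256(image).digest(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, attestation)

    print(verify(claimed_image, published))  # True only if the image is unchanged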
Next-Generation Infrastructure: HTTP End-To-End • The proposed solution to hijacking is DNSSEC – which again requires digital signing and distribution of verification keys. • DNS requests should be encrypted [SSL] to prevent snooping [by other users on the same machine, or neighbors on a home cable/DSL network or university subnet]. • Users not wanting their ISP to be able to associate DNS lookups with them should tunnel all DNS resolution requests through a trusted proxy first.
Next-Generation Infrastructure: HTTP End-To-End • After DNSSEC resolution, the user's computer must "test the path" [like traceroute] [secure ICMP hand-waving]. • All routers must attest to: running a known routing algorithm which will destroy packets after their acknowledgement by the next hop; not permitting access to those packets by a local user; not storing the packets in an unencrypted form; not forwarding the packets to any router which does not meet these same requirements. • Again, the privacy-obsessive will need to tunnel these requests through a proxy if they want to prevent the destination website from knowing their IP address.
Next-Generation Infrastructure: HTTP End-To-End • What if a user does not want to even visit sites that do not meet their P4P guidelines? [Users not behind a proxy, fearing IP address use for statistics] • P4P profiles should be stored at a central authority [or authorities], OR piggybacked onto the secure-DNS system. • Pop a warning: "This site does not follow your specified…… do you still want to connect?"
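A minimal sketch of the pre-connection check, using a made-up "P4P" profile format: the site's declared policy (fetched from a central authority or piggybacked on secure DNS) is compared against the user's preferences before any packets are sent to the site.

    # Hypothetical P4P profiles represented as plain dictionaries.
    user_prefs = {"max_retention_days": 30, "third_party_sharing": False}
    site_policy = {"max_retention_days": 365, "third_party_sharing": True}

    def acceptable(policy: dict, prefs: dict) -> bool:
        # A site is acceptable only if it is at least as restrictive as the user asks.
        if policy.get("max_retention_days", float("inf")) > prefs["max_retention_days"]:
            return False
        if policy.get("third_party_sharing", True) and not prefs["third_party_sharing"]:
            return False
        return True

    if not acceptable(site_policy, user_prefs):
        print("This site does not follow your specified policy. Do you still want to connect?")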
Next-Generation Infrastructure: HTTP End-To-End • Already, many modern browsers prevent "remote-loading" of images on user request. • Part of the P4P and trust conventions is that websites cannot serve pages that remote-load images/applets from servers with less-stringent P4P policies. This is possible to do at page-generation time. • Tracking of click data could be prevented either in a P4P policy or by a zealous user with a proxy.
Next-Generation Infrastructure: HTTP End-To-End • Secure data transfer is still required to prevent snooping on the first leg of the connection. • Some mechanism [encryption or parts of IPV6] must be used to ensure that packets cannot be sniffed during inter-router movement. • Any change in path must re-initiate the attestation check.
Next-Generation Infrastructure: HTTP End-To-End • Now, once users have submitted data, they know it resides on a trusted platform and that no application other than the signed one may access that data. • Assuming the user has accepted the posted P4P policy, the user will know exactly what the site can and cannot do with their data. This reverses control of the data. • This does not allow for customized privacy at per-user granularity. Discussion of requirements: per-person-per-site storage of privacy! • Assuming that employee-access privileges are specified in the policy, the user need not fear "errant browsing". [Inside jobs can never be stopped.] • "Sharing" with "affiliated parties" is not a relevant problem when specified in advance with the caveat of "no sharing with less-restrictive policies" – or even "no sharing".
Next-Generation Infrastructure: IPV6 • IPV6 provides end-to-end security. • IPV6 also gives better network management. • IPSec and Kerberos authentication can be used much more often. • "Spoofing" becomes almost impossible. • No fragmentation: prevention of common buffer-overflow exploits, DDoS attacks, etc.
Next-Generation Infrastructure: JavaScript/ECMAScript • This is the last word on pop-ups, ever. • Disable them in the browser, or require the P4P policy to assure that the site does not serve pages that contain code for pop-ups. • AGAIN: POP-UPS ARE MORE OF A NUISANCE THAN A THREAT TO PRIVACY!
Next-Generation Infrastructure: Applets/ActiveX Controls • With trusted computing, strict hardware enforcement can substitute for much of the current user-policing requirement. • P4P policies can also specify the types of applets/controls that may be served.
Next-Generation Infrastructure: New Mail Structure • Scrap the system entirely; run the legacy system in parallel. Two mailboxes: one filled with junk, one not. • Users must authenticate themselves [over an encrypted channel]. • Mail servers will only accept "trusted mail" from servers they can verify are only accepting mail from their own authenticated clients. • Advertising/spam is a social problem – there will not be a 100% technical solution. Requiring authenticated email provides enough accountability for human intervention to finish the job.
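A sketch of the proposed authenticated path, with placeholder host, account, and addresses: the client must authenticate over an encrypted channel before the server will accept mail, so every message is attributable to an account.

    import smtplib

    # Authenticated submission: TLS first, then login, then send. Contrast
    # with the open-relay exchange shown earlier. All values are placeholders.
    server = smtplib.SMTP("mail.example.net", 587)
    server.starttls()
    server.login("alice", "app-password")
    server.sendmail("alice@example.net", ["bob@example.org"],
                    "Subject: accountable mail\r\n\r\nSent by an authenticated client.\r\n")
    server.quit()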
Trusted Proxies • Two desires: seclusion and anonymity. • Anonymity: hard to justify in the face of child pornography, national security, and illegal activities; desirable to protect freedom of speech, freedom of expression, freedom of religion, etc. • Seclusion: websites cannot track you for purposes of gathering statistical data; you don't want to be in a customer database.
Trusted Proxies • The only ways to achieve either are via dial-up or VPN connections to the proxy. [Must be encrypted if not dial-up] • Discussion of trade-offs: dial-up vs VPN. [Both offered by Anonymizer]
Case Study: iPrivacy • Discussion of paper. • Most offerings already available. • Difficulties/realities of providing: Anonymous Discount Coupons (easy), Anonymous Delivery Services (hard), Anonymous Browsing Services (easy).
Case Study: iPrivacy – Connections • Connections: iPrivacy – Merchant, iPrivacy – Credit Card Company, iPrivacy – Delivery Agent. • One-way hash or constant lookup? If algorithmic, then it can be computed offline; if constant lookup, then iPrivacy must retain that info.
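A sketch of the "algorithmic" option, with an invented key and record format: if the pseudonym shared with the merchant, card company, and delivery agent is a keyed one-way hash of the real identity, it can be computed offline and iPrivacy does not need to retain a per-customer lookup table.

    import hashlib
    import hmac

    # Derive a stable pseudonym from the real identity with a keyed one-way
    # hash; no stored mapping is needed to recompute it. Key and data invented.
    iprivacy_key = b"deployment secret (placeholder)"
    identity = b"Jane Doe, 123 Main St, New Haven CT"
    pseudonym = hmac.new(iprivacy_key, identity, hashlib.sha256).hexdigest()[:16]
    print(pseudonym)  # the same value every time, usable across merchant and shipper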
Case Study: iPrivacy – Statements • iPrivacy is "traceable" but anonymous. Discussion of doublespeak. • Who will police iPrivacy and its employees? • Getting the ball rolling (barriers to entry): must be used by major credit card firms (it isn't); must be associated with many merchants (it isn't); must be used by many consumers to be viable for the above two. [Chicken-and-egg problem]
Case Study: iPrivacy – Discussion • What DOES iPrivacy really provide that is novel? • What are the implications of a one-way mirror? • What are the advantages of having an integrated service [iPrivacy] over non-integrated services [using Anonymizer, Yahoo, Google Toolbar in combination]? • What are the disadvantages of having an integrated service over non-integrated services? [SPOF, etc]
Conclusions • The trade-off between Privacy and Convenience remains. ("The website does not meet your P4P….") • A fundamental shift is required to ensure "user control" of data – be it sensitive or otherwise. • PETs can only "hack" a limited amount of control into the current architecture – and only the semblance of true control. [JetBlue redux] • Despite a number of concerns about "Trusted" computing, it does solve a number of problems. • New problems: randomized attestation, hijacking port connections on a local machine, low-level packet sniffing. [Introducing errors into the JVM via magnetic fields.]