Automated Tracking of Online Service Policies
J. Trent Adams (1), Kevin Bauer (2), Asa Hardcastle (3), Dirk Grunwald (2), Douglas Sicker (2)
(1) The Internet Society, (2) University of Colorado, (3) OpenLiberty.org
38th Research Conference on Communication, Information and Internet Policy

What They Know
• Possible medical conditions
• Social relationships
• Offline behaviors
• Personal interests
• Search queries
• Web browsing habits
• Shopping habits
• Financial status
TPRC 2010: Automated Tracking of Online Service Policies

User Tracking is Easy and Common
When a user visits a website (example shown: dictionary.reference.com)…
• Implicit information revealed:
- IP address
- HTTP request headers (user-agent, operating system, local time and language, referrer)
• This information alone can be used to construct an identifying, trackable profile [EFF's Panopticlick, PETS '10]
• Additional tracking elements: sites often embed cookies and other tools to explicitly identify and track users
Source: http://blogs.wsj.com/wtk
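
To make the "implicit information" point concrete, here is a minimal sketch of how a server could derive a repeatable profile key from nothing but the IP address and a few request headers. It is not the Panopticlick method; the function name `profile_key` and the choice of headers are assumptions made for illustration.

```python
import hashlib

# Headers that tend to stay the same across requests from one browser.
# This selection is an assumption for illustration only.
STABLE_HEADERS = ("user-agent", "accept-language")


def profile_key(ip_address, headers):
    """Return a repeatable identifier built only from implicit request data."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    parts = [ip_address] + [normalized.get(name, "") for name in STABLE_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

Two requests from the same browser map to the same key even though no cookie was set, which is why the slide calls such a profile trackable.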

The Need for Clear Policy Articulation
• Given the inherent privacy risks in ordinary web browsing, most sites explicitly explain how they handle sensitive user data (PII) in a human-readable, natural language privacy policy or terms of service document
• Pros of natural language policies:
- Near universal deployment
• Cons of natural language policies:
- Users must find, read, and comprehend the policies
- Comprehension is poor for natural language policies [McDonald et al., PETS '09]

Structured Policy Formats: P3P
• The Platform for Privacy Preferences (P3P) is a machine-readable XML schema for encoding:
- What kind of user information is collected
- How any collected user information is used
- How long user information is stored
• P3P files can be automatically parsed and semantically analyzed by the web browser
• Users can specify their own preferences and interact only with sites with compatible policies
• Policy information can be transformed into "standardized" formats to improve policy comprehension
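
As a rough illustration of "automatically parsed and semantically analyzed", the sketch below reads a heavily simplified P3P-style fragment and checks it against a user's preferences. The sample policy is invented, the XML namespace and most required elements are omitted for brevity, and `summarize` and `is_compatible` are hypothetical helper names rather than part of any browser or P3P toolkit.

```python
import xml.etree.ElementTree as ET

# Invented, simplified fragment: real P3P policies carry an XML namespace
# and additional required elements that are left out here.
SAMPLE_POLICY = """<POLICY name="sample">
  <STATEMENT>
    <PURPOSE><admin/><telemarketing/></PURPOSE>
    <RECIPIENT><ours/></RECIPIENT>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP><DATA ref="#dynamic.clickstream"/></DATA-GROUP>
  </STATEMENT>
</POLICY>"""


def summarize(policy_xml):
    """Collect what is gathered, why it is used, and how long it is kept."""
    root = ET.fromstring(policy_xml)
    summary = {"collected": set(), "purposes": set(), "retention": set()}
    for statement in root.iter("STATEMENT"):
        summary["collected"].update(d.get("ref") for d in statement.iter("DATA"))
        for key, tag in (("purposes", "PURPOSE"), ("retention", "RETENTION")):
            for parent in statement.findall(tag):
                summary[key].update(child.tag for child in parent)
    return summary


def is_compatible(summary, unwanted_purposes):
    """True when the policy declares none of the purposes the user rejects."""
    return summary["purposes"].isdisjoint(unwanted_purposes)
```

A user who rejects telemarketing would see this sample flagged as incompatible, while a user who only rejects the `contact` purpose would not.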

P3P and Standardized Policy Formats
• Structured policy formats (like P3P) can be summarized and displayed to users in standardized, easy-to-read formats
[Figure: "Privacy Finder" P3P search engine result]

Slow Adoption for P3P
• A study by Cranor et al. found that the most popular web sites tend to be more likely to offer P3P, but overall deployment is very low
- 2006: Only 10.25% offer P3P
- 2008: Only 13.59% offer P3P
Source: Cranor et al., Electronic Commerce Research and Applications 2008

Our Goal: Make Interacting with Natural Language Policies Easier
• P3P adoption is limited, but human-readable policies are prevalent
• This is a stop-gap measure: until a structured policy format is widely adopted, we must interact with natural language policies
[Figure labels: P3P; Natural language policy tracking; New structured policy format?]
• Our contribution: design and implement the Policy Audit System
- Aggregates natural language policies for a wide variety of websites
- Periodically checks these policy documents for updates
- Enables distribution of policies to interested users
- Notifies users about specific changes in policies

Policy Audit System: Architecture
Key Components:
- Policy Monitor: Periodically fetches known policy documents for a large set of websites; checks policies for changes
- Policy Library: The collection of policy documents for each site over time
- Policy Library Mirrors: Copies of the policy library hosted by third parties
- Clients: Offer a way for users to obtain current or past policy information

Policy Monitor
• Periodically fetches a set of policy document URLs
• Extracts relevant policy text using standard text parsing techniques
• Compares the latest version to the previously seen version to detect changes
• Records the latest version (if changed)
• Based on the EFF's TOSBack service (http://www.tosback.org)
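
The extract, compare, and record steps above can be sketched as follows. This is an illustration under stated assumptions, not the Policy Monitor or TOSBack code: fetching is left out, text extraction simply strips tags and scripts, and all function names are hypothetical.

```python
import difflib
import hashlib
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_policy_text(html):
    """Reduce a fetched page to whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def fingerprint(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_change(previous_text, latest_html):
    """Return (changed, latest_text, diff_lines) for one policy document."""
    latest_text = extract_policy_text(latest_html)
    if previous_text is not None and fingerprint(previous_text) == fingerprint(latest_text):
        return False, latest_text, []
    # Naive sentence split; enough to show which statements moved.
    old = (previous_text or "").split(". ")
    new = latest_text.split(". ")
    diff = [line for line in difflib.ndiff(old, new) if line.startswith(("+ ", "- "))]
    return True, latest_text, diff
```

A caller would store `latest_text` as a new snapshot only when `changed` is true, and could pass `diff_lines` on to whatever notification channel a client uses.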

Policy Library
• The Policy Monitor produces a library of policy documents as they change over time
• The Policy Library is a directory structure available via the web:
- A list of tracked websites
- Policy text snapshots, or previous versions
- Various metadata to help find the latest document version
• The master library is hosted by the University of Colorado
• Currently tracking 76 distinct policies (more coming soon)
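
The slides do not specify the directory layout, so the sketch below assumes a purely hypothetical one, `<site>/<policy>/<YYYY-MM-DD>.txt`, to show how dated snapshots let a client find the latest version of a document. The function names are likewise illustrative.

```python
from datetime import date
from pathlib import PurePosixPath


def snapshot_path(site, policy, day):
    """Build the (assumed) library path for one dated policy snapshot."""
    return PurePosixPath(site) / policy / f"{day.isoformat()}.txt"


def latest_snapshot(paths):
    """Return the most recent snapshot among paths named by ISO date."""
    if not paths:
        return None
    return max(paths, key=lambda p: date.fromisoformat(p.stem))
```

Because the date is encoded in the file name, previous versions stay addressable while the newest one can be located without extra state; the real library uses separate metadata for this purpose.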

Policy Library Mirrors
• Policy Library copies that are distributed among trusted parties
• The Electronic Frontier Foundation (EFF), the Center for Democracy and Technology (CDT), and the University of Colorado host Policy Library mirrors

Clients
• Generically, a client offers an interface to the Policy Library, providing access to policy data
• A client could offer the ability to search the library, or automate change notification via Twitter, Atom, RSS, or e-mail
• We developed a client as a Firefox plug-in that displays policy information (and notification of changes) for the site the user is currently visiting

Example Client: Firefox Browser Plug-in*
• Accesses the Policy Library and alerts the user when they visit a website that publishes a policy that the Policy Monitor is tracking
Alert Icons:
- Visiting a site that's not tracked
- Visiting a tracked site with an unread policy
- Visiting a tracked site with an updated policy since last visit
- Visiting a tracked site, but no change in policy since last visit
* sponsored by
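
The four alert icons amount to a small decision rule. The sketch below expresses it in Python for illustration only; the actual plug-in is a Firefox extension, and the `Alert` enum, `alert_for` function, and version arguments are assumptions rather than its real interface.

```python
from enum import Enum


class Alert(Enum):
    NOT_TRACKED = "site is not tracked"
    UNREAD = "tracked site with an unread policy"
    UPDATED = "policy updated since last visit"
    UNCHANGED = "no change in policy since last visit"


def alert_for(latest_version, last_seen_version):
    """Pick the icon state for the site in the current tab.

    latest_version: newest snapshot id in the Policy Library, or None if
        the site is not tracked.
    last_seen_version: snapshot id the user last read, or None if never read.
    """
    if latest_version is None:
        return Alert.NOT_TRACKED
    if last_seen_version is None:
        return Alert.UNREAD
    if last_seen_version != latest_version:
        return Alert.UPDATED
    return Alert.UNCHANGED
```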

Plug-in: Visiting a Tracked Site
[Screenshot: menu lists tracked policies]

Plug-in: Visiting a Tracked Site with Policy Changes

Plug-in: Discovering Third Party Information Disclosure
[Screenshot of a visited page, www.apple.com/itunes]
• Current policies for a visited page
• Notify user of third-party page elements
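
One plausible way to find third-party page elements is to compare the host of each embedded resource with the host of the page itself. The sketch below does that with the standard library; it is not the plug-in's implementation, the names are hypothetical, and treating the last two host labels as the "site" is a deliberate simplification that misclassifies suffixes such as `.co.uk`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _ResourceCollector(HTMLParser):
    """Records the URLs of embedded scripts, images, frames, and links."""

    URL_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def _site(host):
    # Naive approximation of the registrable domain (last two labels).
    return ".".join(host.split(".")[-2:])


def third_party_hosts(page_url, html):
    """Return sorted hosts of embedded elements served from another site."""
    page_host = urlparse(page_url).hostname or ""
    collector = _ResourceCollector()
    collector.feed(html)
    collector.close()
    hosts = set()
    for raw in collector.urls:
        # Resolve relative URLs against the page before comparing hosts.
        host = urlparse(urljoin(page_url, raw)).hostname
        if host and _site(host) != _site(page_host):
            hosts.add(host)
    return sorted(hosts)
```

Each host returned could then be looked up in the Policy Library so the user sees the third party's policy alongside the visited site's own.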

Summary and Conclusion
• Given the absence of a widely adopted structured policy format, we argue that steps should be taken to make natural language policies easier for users to understand
• To this end, we present the Policy Audit System to track natural language policy documents and notify users of policy updates
• Our hope is that this work helps individuals make sense of natural language policies while we wait for a structured policy data format to be widely adopted
For more information:
- Project overview: http://www.policymonitor.org/about
- Development community: http://www.policymonitor.org/sourcecode
- Firefox plug-in download: http://www.policymonitor.org/auditplugin
Thank you
kevin.bauer@colorado.edu