Benefits and Risks of De-Identification in Digital Marketing

Explore the benefits, methodologies, and risks of de-identification in digital marketing. Learn how it supports research, analytics, security, product development, and marketing. Understand the FTC standard and the importance of context and cell size thresholds.

aviles

Presentation Transcript


  1. DIGITAL MARKETING and De-Identification (benefits, methodologies, and risks) October 13, 2016

  2. Future of Privacy Forum • The Future of Privacy Forum is a non-profit organization that serves as a catalyst for privacy leadership and scholarship, advancing principled data practices in support of emerging technologies. • Jules Polonetsky, CEO julespol@fpf.org Stacey Gray, Policy Counsel sgray@fpf.org • www.fpf.org • facebook.com/futureofprivacy • @futureofprivacy

  3. Traditional Cookie Model • Strings of data stored by the browser and requested by servers when a user visits a website • Includes a maximum expiration time • User can clear the data at any time (all or some)
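The cookie round trip described in slide 3 can be sketched with Python's standard `http.cookies` module. This is a minimal illustration, not anything taken from the presentation: the cookie name `visitor_id`, its value, and the 30-day lifetime are assumptions chosen for the example.

```python
from http.cookies import SimpleCookie

def build_set_cookie(name, value, max_age_seconds):
    """Build the value of a Set-Cookie header with a maximum lifetime."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # browser discards it after this
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()

def parse_cookie_header(header):
    """Parse the Cookie header a browser sends back on a later request."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {key: morsel.value for key, morsel in cookie.items()}

# Hypothetical identifier and lifetime, for illustration only.
set_cookie = build_set_cookie("visitor_id", "abc123", 30 * 24 * 3600)

# On the next visit the same browser returns the stored string.
returned = parse_cookie_header("visitor_id=abc123; theme=dark")
```

Because the string lives in one browser's storage, a different browser or an app on the same device presents no such header, which is the limitation slide 6 picks up.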

  4. Offline Data Appending

  5. Understanding Ad Effectiveness

  6. Crumbling Cookies
     The cookie is increasingly ineffective because:
     • Cookies can only identify a user within the same browser
     • An increasing % of browsing is on mobile browsers (many do not process cookies)
     • An increasing % of web behavior occurs in apps
     Consumers today access the web via an expanding array of devices and platforms:
     Platforms: Operating Systems • Browsers • App Stores • Ad Networks • Social Plug-Ins • Analytics
     Devices: Phones • Tablets • PCs • eReaders • Media Streaming / Gaming • Wearables • Virtual Reality • Home Control
     Consumer Software: Search Engines • Location Services • Speech Recognition • Office Suites • Email Services • Mobile Messaging • Social Networks • Cloud Services • Photos • Video / Music Players

  7. Branded data aggregators providing data to the BlueKai Exchange (Oracle Data as a Service (DaaS) for Marketing)

  8. Geo-Location Data
     Traditional Methods:
     • OS’s Location Services (aggregated sources: GPS, Wi-Fi, Cell Towers, etc.)
     • Nearby Cell Tower IDs (with publicly available look-up databases)
     • Carrier Triangulation of Cell Tower IDs (mobile OS only)
     Other Sources:
     • Nearby Wi-Fi SSIDs
     • Proximity Data – Bluetooth Beacons, LED Lighting, Audio Beacons, Magnetic Mapping, Accelerometer
     • Mobile Location Analytics (device’s MAC address)

  9. The Benefits of De-Identification • In every sector, de-identification supports beneficial uses of data with reduced privacy risk for a wide range of purposes, including: • Research • Analytics • Security • Product Development and Improvement • Marketing

  10. Substantial Benefits to a De-Identification Approach that Tracks the FTC Standard • Aggregation is only one method for de-identification; there are many other techniques that are considered acceptable by the disclosure control community (including FTC and NIST). • Process-based de-identification regulations are best suited to stand the test of time (as compared to listing specific data types as included or excluded). By requiring that data cannot be “reasonably linked” to an individual or device, the FTC’s standard requires those de-identifying data to take evolving methods into account. • Devices should only be considered identifiable when the device is reasonably linkable to a specific individual (excluding, for example, a factory sensor).

  11. The NPRM’s Definition of Aggregation was Overinclusive
      • Examples of direct identifiers: name, address, telephone number, fax number, insurance identification number, license plate number, email address, photograph, biometrics, SSN
      • Examples of indirect identifiers: sex, date of birth or age, geographic locations (such as zip codes, census geography, information about proximity to known or unique landmarks), language spoken at home, ethnic origin, total years of schooling, marital status, criminal history, total income, visible minority status, profession, event dates, number of children
      • The NPRM’s proposed definition of “Aggregation” implied aggregation of all identifiers in a data set, without distinguishing between aggregating direct identifiers and aggregating indirect identifiers.
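The distinction in slide 11 can be made concrete with a small sketch in which direct identifiers are dropped outright while indirect identifiers are generalized rather than removed. The field names, the ten-year age bands, and the three-digit zip prefix are illustrative assumptions; the slides do not prescribe any particular treatment.

```python
# Hypothetical field names; a real schema would differ.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "ssn"}

def generalize_age(age):
    """Replace an exact age with a ten-year band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def generalize_zip(zip_code):
    """Keep only the first three digits of a five-digit zip code."""
    return zip_code[:3] + "**"

def deidentify(record):
    """Drop direct identifiers and coarsen selected indirect identifiers."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # direct identifiers are removed entirely
        if field == "age":
            out["age_band"] = generalize_age(value)
        elif field == "zip":
            out["zip3"] = generalize_zip(value)
        else:
            out[field] = value  # other fields pass through unchanged
    return out

sample = {
    "name": "A. Example",
    "email": "a@example.com",
    "age": 34,
    "zip": "20005",
    "marital_status": "single",
}
result = deidentify(sample)
```

Whether generalization at this granularity is sufficient depends on the data set and the release context, which is the point the later slides on group size and context make.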

  12. The NPRM’s Definition of Aggregation was Incomplete • The NPRM’s definition of aggregate customer information would not provide a relevant group size or a range, despite a large body of precedents to draw from. • Depending on the context, this can leave the risk of re-identification either too high or unnecessarily low. • Aggregation itself must be process-based; one size does not fit all. • There are methodologies for selecting an appropriate group size.
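One common way to apply a group size, shown here as a hedged sketch, is small-cell suppression: records are counted per combination of indirect identifiers, and any cell with fewer members than a chosen minimum is withheld. The function name, the field names, and the threshold of 2 used below are assumptions for illustration; the slides do not state a specific threshold.

```python
from collections import Counter

def aggregate_with_threshold(records, keys, min_cell_size):
    """Count records per cell and suppress cells below min_cell_size.

    Returns (released_cells, suppressed_record_count).
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    released = {cell: n for cell, n in counts.items() if n >= min_cell_size}
    suppressed = sum(n for n in counts.values() if n < min_cell_size)
    return released, suppressed

# Toy data: six records over two indirect identifiers.
records = (
    [{"zip3": "200", "age_band": "30-39"}] * 3
    + [{"zip3": "200", "age_band": "40-49"}] * 1
    + [{"zip3": "221", "age_band": "30-39"}] * 2
)

# The threshold is a parameter so it can vary with the release context.
released, suppressed = aggregate_with_threshold(
    records, ["zip3", "age_band"], min_cell_size=2
)
```

Keeping the threshold as a parameter rather than a constant reflects the slide's point that one size does not fit all: a public release would typically call for a larger minimum than a controlled internal use.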

  13. Spectrum of Cell/Group Sizes

  14. Context should be considered • For de-identification, context includes: • the security, privacy, and contractual controls that exist in relation to the data; and • who will have access to the data (e.g., a public release, controlled release, or internal use); not all datasets are released into the “wild” without any protection. • Applicable controls should be taken into account when assessing proper de-identification techniques and cell size thresholds. • De-identification is a risk management exercise.

  15. De-identification Standards & Guidelines

  16. Re-identification Risk Is Low With Proper De-Identification Methods • The re-identification attacks cited in the NPRM or by commenters to justify an “aggregated or identifiable” rule are based on incidents where: • data was not properly de-identified to any established or expert standard (e.g., re-identifying Netflix users based on movie ratings or the re-identification of individuals in the Personal Genome Project); or • data was not actually re-identified (e.g., financial transaction data study and the anonymity of home/work location pairs).

  17. Definition of Personally Identifiable Information • The proposed definition of personally identifiable information combines and confuses two different concepts: identifiability and addressability. • It is possible to have addressable information that is not identifiable. • There is a long tradition of using techniques like double blind matching or trusted third party intermediaries to support the utility of data for a range of purposes while minimizing the risk of re-identification.
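The matching idea in slide 17 can be illustrated with a simplified sketch: two parties each replace an identifier with a keyed hash, and an intermediary joins the records on the resulting tokens without seeing the underlying identifiers. This is an assumption-laden toy, not a description of any specific vendor's process; the key, the email addresses, and the function names are hypothetical, and real deployments add contractual and technical controls beyond what is shown.

```python
import hashlib
import hmac

def blind(identifier, key):
    """Turn an identifier into a keyed-hash token (HMAC-SHA256)."""
    normalized = identifier.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

def third_party_match(tokens_a, tokens_b):
    """Join two token-keyed data sets; the matcher sees tokens only."""
    common = tokens_a.keys() & tokens_b.keys()
    return {token: (tokens_a[token], tokens_b[token]) for token in common}

# Hypothetical key known to the two data holders but not to the matcher.
key = b"shared-secret-for-illustration"

advertiser = {
    blind("Ann@example.com", key): {"purchased": True},
    blind("bob@example.com", key): {"purchased": False},
}
publisher = {
    blind("ann@example.com ", key): {"saw_ad": True},
    blind("cat@example.com", key): {"saw_ad": True},
}

matched = third_party_match(advertiser, publisher)
```

The matched output supports ad-effectiveness analysis on a token that can be addressed but that does not, by itself, reveal who the person is.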

  18. Distinguishing between Sensitive and Non-Sensitive Data • Privacy frameworks in the U.S. and globally make this distinction: • FTC • Canada • EU General Data Protection Regulation • An FCC rule recognizing this distinction is consistent with consumer expectations. • Various uses of ISP data do not rely on the collection, use or inspection of sensitive data.
