‘Big Data’ and the Challenge to Informed Consent as a Basis for Privacy Protection Talk for IEEE and CLPC, 22 May 2014, UNSW. David Vaile Co-convenor Cyberspace Law and Policy Community Faculty of Law, University of New South Wales http://cyberlawcentre.org/2014/IEEE/.
Outline • About Big Data, Consent • Big Data: distinguishing characteristics • Context • Challenges for consent: good and bad consent; zombie consent; difficulties with scale; need for consent rejected?; no purpose, causation?; manipulation of consent? • Lessons
Welcome • I’ll give a talk touching on Consent issues raised by Big Data. It’s a first iteration, feedback welcome! • Lyria Bennett Moses will respond, and add observations from her research in technology regulation • Holly Raiche will explain the impact of the recent EU court decision which threw out the Data Retention directive • Questions of fact or clarification are OK in the talk or after; main discussion at the end?
What is ‘Big Data’, after the Hype Cycle? • Partly hype and marketing, but real differences beyond scale • Many facets • A Technology, or combination of data and functionality, with certain technical features and characteristics • A ‘Frame’ or brand, a ‘Meme’ with its own rhetorical character and assumptions • Some of the key relevant uses and tools came from marketers with software engineering genius: • Google (core MapReduce tool) • Facebook (now reinventing Big Data data centre hardware)
Big Data as technology: distinguishing features cf. old dBs • PR: Velocity, volume, variety, variability, value • Huge, fast/near real time, heterogeneous… • Omnivorous: • Complete data set, not a sample • Every data set, not just one • All data types, not just obvious records • Adaptable: metadata as well as content • Low integrity: data need not be accurate, current, fit • Purposeless: no need for prior purpose to be designed in • Association not causation
Omnivorous, hungry for data for its own sake? • When “too much data is never enough”? • Many methods are based on access to every record in a data set, not a sample or slice • Data takeup is very flexible, so more data sets are low cost to ingest, and thus comparatively attractive • Can work on both metadata and content data (in comms terms), doing pattern recognition on, say, movement and photographs
Dirty data is OK… until you send in the drones • There are clever means for dealing with both incomplete data and dirty, incorrect data • This is OK for some purposes (weather) but potentially not where an individual is identified and targeted for individual treatment • The less risky end of this is marketing: if dirty data means that some ads are a few % less persuasive than otherwise, little is lost. Key tools and uses came from this industry, or those with no Personal info link. • However, personally serious outcomes such as being refused health insurance at a viable price, or becoming a drone target,
‘Purposeless’: Outputs not designed in, self-modifying rules? • NOT: collection for a purpose, limited to that purpose, destroyed when purpose is over, stored in a silo secure for that purpose. [Ebay] • Assumption of a ‘fishing expedition’: something will come up, some new association, new insight, cannot pre-specify • ‘We want everything because we want everything…’ • Machine learning from any given data set, generates own rules, new associations • Exploitation of new functionality using old collections of data • Prefers longitudinal data retention in lakes to transient silos
Expertise not applicable? (after a certain point) • Is expertise superseded by Big Data systems (if they work)? • The scale of data is beyond human comprehension or analysis • The Algorithms similarly: Machine learning rules are written by machines, not programmers, using scale and probabilistic inputs which are beyond our ken • A good Big Data system is iteratively self improving (if you have the feedback correct), so may get better than any expert • At this point the expert’s view of what it is doing may become unreliable, and any possibility of auditing or correction lost • Deus ex Machina? Computer says no? Black box must be obeyed?
Context: Ask Forgiveness not Permission • Meme arising from early days of IT: Grace Hopper? • Chips Ahoy magazine, US Navy, July 1986, • Appropriate for fast, ‘Agile’, ‘Extreme’ software development to bypass bureaucracy • Assumes truly ‘disposable’, ‘throwaway’ prototypes. Fixed by v2 • Also works for innovative business models, where failure is OK, test limits • FAIL: for personally significant information: v2 does not help the victim of unintended disclosure, publication or exposure • FAIL: if there is no effective enforcement (FB wrist slap 2011)
Context: ‘Cult of Disruption’ in key data driven firms • ‘Forgiveness not permission’ (Google, many others) • ‘Move fast and break things’ (FB, fudged last week) • Attractive to small start-ups and to the online giants • Often implies the key disruption is cool new technology • But often also relies on traditional risk-shifting, cost-evading and side-stepping obligations • Reluctance to accept obligations re Tax (Google, Apple…), Insurance (Uber), Wages, License fees, Compliance, etc. • Essentially inimical to idea of compliance
Consent • One legal basis for data processing is “freely-given, unambiguous and informed consent of the data subject to the specific processing operation.” • Article 2 (h), EU Data Protection Directive • Consent also works as the basis for entry into a contract • Consumer protection recognises contract law is often unfair to consumers because of gross disparity of knowledge and bargaining power with a big business • Precautionary Principle: if there is compelling info to suggest a path has an irrevocable step into a situation with real risk of serious harm, don’t proceed until you can clarify the risk and know it is OK.
Good and bad consent (thinking as a subject) • Informed, not ignorant (info suits your needs) • Unbundled, not bundled (holding you hostage to something essential, all or nothing) • Before the fact, not after • Explicit not implied • Revocable not permanent – this is your insurance (Google likes to think you get a chance to say yes, until you do, and no way back)
Consent needs proper information to reveal the ‘price’ • A business assesses ‘cost and risk’ against benefit • Due diligence needs specific info to work out who to trust • You need info to help you appreciate risks, not just benefits, and assess probability and impact • Different people at different times need different info - Not beyond the power of Big Data firms • Potential reluctance to be specific, across the board: Information asymmetry: they know you, but not the reverse
Zombie consent: click that nice blue button • Many consumers, given little real choice, bundled consent, confusing and meaningless info, just click the online consent button • Trained like rats or birds to click the button to get the reward • It says: “I have read and understood and agree” • It means: “I haven’t read and couldn’t understand, whatever” • The role of consent may be limited by both consumer behaviour (lying about their agreement) and the complicity of operators (who could offer
Recent developments • US: two reports to Obama – minimal consideration of consent • EU • ECJ ruling invalidating the Data Retention Directive - Holly • EDPS report, Privacy and competitiveness in the age of big data, March 2014 • ECJ ruling requiring Google to offer a ‘right to be forgotten’, Spanish bankruptcy – spent convictions model – revocation?
Vast aggregations are difficult to explain for consent purposes • The complexity and extent of the functionality may present issues, especially if there is no constraint on use or purpose • But it could be done… If it mattered • Google is a master of translating complexity to comprehensible chunks • Data visualisation could help, key big data tool • Conscious decision not to try, to seek obfuscation? • Reluctance to accept transparency? Hiding behind complexity
Claims it’s too hard, Privacy is over, Consent irrelevant • From the cult of Disruption: We are new, fast, smart, cool, so just get out of the way! • Respecting your wishes would cramp our style, so don’t make us ask • (Real issue: we don’t want to have to obey a refusal to consent) • Bundled consent: if we have to ask consent, ‘the terrorists will win’, or ‘you won’t have any friends’, or ‘no new toys’ • Is this a real objection, or framing the question to get No? • Potential reluctance to learn from e-commerce, micro-transactions, Bitcoin, other new technologies, or even Big Data itself?
Consent v. Unequal bargaining power of Big Data ops • Have we stepped back into a contract-first world, before consumer protection stepped in to redress the imbalances? • Unilateral, non-negotiable, incomplete contracts • Swedish Data Protection Board 2013: Google refuses to negotiate on a contract that omits key data about who, where and for what purposes your personal info can be used • Compliance impossible to ascertain • So: Not suitable to sign! • The absence of key information is presented as a bluff. The Swedes called it, everyone else takes the sucker’s option • Role for consumer protection law to redress the balance?
‘Forgiveness, not permission’ = No consent? • The ‘forgiveness’ slogan appears to be fundamentally hostile in principle to the idea that the data subject might have the prior right over what is done with their data – possession? • Conflates external regulation with personal permission and consent • Permission in this case is permission from the individual in the form of informed consent • Forgiveness is often sought from someone other than the affected subject, or only sought if caught • Hostility to any form of prior permission seeking? • When consent is reluctantly sought, it is formalistic, not aimed at enabling due diligence or real understanding
Association not Causation: should you ever consent to this? • Association, uncertainty, incompleteness, out of dateness, inapplicability to the purpose may all not be fatal flaws for the original task of marketing tweaks • But as soon as real decisions and risks are linked to an individual, the reality that possibly random associations are at the core, not a falsifiable evidence-based understanding of a true deep causal connection, raises questions about whether anyone should be expected to accept this level of uncertainty • Especially when the means for auditing or verification or correction are absent
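The point about possibly random associations can be illustrated with a minimal sketch. It assumes nothing about any real Big Data system: the function names, the attribute counts and the use of Gaussian noise are all hypothetical choices made for illustration. Every attribute is generated as pure noise, so any correlation found between two attributes is an association with no causal link behind it.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_association(n_people, n_attributes, seed=0):
    """Largest |correlation| between any two attributes that are pure noise.

    Hypothetical illustration: each attribute is independent random noise,
    so whatever association is reported here arose by chance alone.
    """
    rng = random.Random(seed)
    # One column per attribute, one value per person, generated column by column.
    columns = [
        [rng.gauss(0, 1) for _ in range(n_people)]
        for _ in range(n_attributes)
    ]
    return max(
        abs(pearson(a, b)) for a, b in itertools.combinations(columns, 2)
    )


if __name__ == "__main__":
    # More attributes means more pairs to search, so a stronger chance "finding".
    for n_attributes in (5, 50, 200):
        print(n_attributes, strongest_chance_association(20, n_attributes))
```

Searching more attribute pairs can only raise the strongest association found, which is one reason an "omnivorous" system that ingests every data set will tend to surface impressive-looking links even where none exist.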
No purpose = No information for consent? • OECD Principles-based Privacy law is based on permitting any reasonable use of your personal data, not getting in the way of specific necessary tasks • But it assumes you must be told the purpose for a collection, use and/or disclosure • This is so you understand what it’s for, and can decline if you are not happy with that purpose or use (even at some cost) • Search warrants are also issued for a specific purpose, and not for ‘fishing expedition’ • Big data purposes are often made up as you go along, precisely a ‘fishing expedition’ with machine learning and new associations, re-identification, new algorithms
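The purpose-specification idea above can be sketched as a simple data structure. This is a hypothetical illustration, not a description of any existing system or of what privacy law requires in detail: the `ConsentRecord` class, the `may_process` function and the purpose strings are all invented names. The sketch assumes that processing is allowed only for a purpose the subject was told about and agreed to, and that consent can be revoked.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    """Hypothetical record of what one data subject has agreed to."""
    subject_id: str
    purposes: set = field(default_factory=set)  # purposes explicitly consented to
    revoked: bool = False

    def grant(self, purpose):
        """Record explicit, before-the-fact consent for one named purpose."""
        self.purposes.add(purpose)

    def revoke(self):
        """Withdraw consent; nothing may be processed afterwards."""
        self.revoked = True


def may_process(record, purpose):
    """Allow processing only for a purpose that was specified and consented to."""
    return not record.revoked and purpose in record.purposes


if __name__ == "__main__":
    record = ConsentRecord("subject-1")
    record.grant("order delivery")
    print(may_process(record, "order delivery"))  # consented purpose
    print(may_process(record, "ad profiling"))    # purpose made up later
    record.revoke()
    print(may_process(record, "order delivery"))  # after revocation
```

A purpose "made up as you go along" has no entry in the record, so a check of this kind has nothing to test against, which is the difficulty the slide describes.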
Consent v. Deep understanding of what’s in your head • Psychographic profiling aims to ‘get inside your head’ by extracting insights from associations from data surveillance • A/B testing and other techniques are used to refine understanding of all the factors which affect the choice to click ‘yes’ • Capacity to understand you, predict your behaviour or reactions • Capacity to persuade you, find neuro-linguistic keys to you • Capacity to frame a message irresistible to YOU • Flies under the radar, like subliminal advertising (illegal manipulation) • Potentially undermines basis for real consent?
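For readers unfamiliar with A/B testing, the following is a minimal sketch of how two versions of a consent prompt might be compared. It uses a standard two-proportion z-test; the function name and the click counts are hypothetical, and the slide does not say which statistical method any particular firm uses.

```python
import math


def two_proportion_z(yes_a, shown_a, yes_b, shown_b):
    """z statistic for the difference in 'yes' rates between variants B and A."""
    rate_a = yes_a / shown_a
    rate_b = yes_b / shown_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (yes_a + yes_b) / (shown_a + shown_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    return (rate_b - rate_a) / std_err


if __name__ == "__main__":
    # Hypothetical counts: each wording of the button shown to 1000 people.
    z = two_proportion_z(yes_a=100, shown_a=1000, yes_b=130, shown_b=1000)
    # |z| above about 1.96 is conventionally read as significant at the 5% level.
    print(z, abs(z) > 1.96)
```

Run repeatedly over many wordings, colours and layouts, a loop of this kind optimises for the click rather than for the subject's understanding, which is the concern raised about manipulation of consent.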
Lessons • Too early to tell - real challenges for consent from Big Data? • Some may arise from the technology, or the business model • But some from the old-fashioned ‘cult of disruption’: uses technology to distract from unwillingness to meet obligations • Awareness of implications and risks is hard • There is a reluctance to assist understanding of this: denial, obfuscation, missing info, incomplete contracts • A poor basis for consumer friendly negotiation? • A poor basis for trust?
Questions? David Vaile Cyberspace Law and Policy Community Faculty of Law, UNSW http://cyberlawcentre.org/2014/IEEE/ d.vaile@unsw.edu.au 0414 731 249