Data Monitoring Confidentiality and the Grid. Mark Elliot Confidentiality And Privacy Group (www.ccsr.ac.uk/capri) University of Manchester. Overview. Data Data Everywhere…. The Grid and its potential New confidentiality problems and opportunities Data Environment Analysis.
Data Monitoring Confidentiality and the Grid Mark Elliot Confidentiality And Privacy Group (www.ccsr.ac.uk/capri) University of Manchester
Overview • Data Data Everywhere…. • The Grid and its potential • New confidentiality problems and opportunities • Data Environment Analysis
Data Data Everywhere… • Massive and exponential increase in data; Mackey and Purdam (2002); Purdam and Elliot (2002). • These studies have led to the setting up of the data monitoring service. • Singer (1999) noted three behavioural tendencies: • Collect more information on each population unit • Replace aggregate data with person-specific databases • Given the opportunity, collect personal information • Purdam and Elliot add: • Link data whenever you can
The Grid • “Integrated infrastructure for high-performance distributed computation” Cannataro and Talia (2002) • Grid middleware handles the technical issues: communication, security, access/authentication, etc. Cole et al. (2002) • Data grid • Knowledge grid
A Blurring of Concepts • The boundaries between data and processes become less distinct • Non-static datasets • One person’s output is another person’s data
Combining and Enhancing Data • Record linkage • Data fusion • Simulation • Verification • Of data • Of output
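Record linkage, the first item on the slide above, can be illustrated with a minimal sketch. It assumes exact matching on a shared set of key variables; the function name `link_records`, the field names, and the toy records are all hypothetical and are not taken from the talk. Real linkage systems often use probabilistic or fuzzy matching instead.

```python
from collections import defaultdict


def link_records(left, right, keys):
    """Exact-match linkage: pair records that agree on every key field."""
    # Index the right-hand dataset by its key values for fast lookup.
    index = defaultdict(list)
    for rec in right:
        index[tuple(rec[k] for k in keys)].append(rec)

    linked = []
    for rec in left:
        for match in index.get(tuple(rec[k] for k in keys), []):
            # Merge the two records into one enriched record.
            linked.append({**rec, **match})
    return linked


# Hypothetical toy data: an anonymised survey and an identified register.
survey = [
    {"age": 34, "sex": "F", "postcode": "M13", "income_band": "C"},
    {"age": 51, "sex": "M", "postcode": "M14", "income_band": "A"},
]
register = [
    {"age": 34, "sex": "F", "postcode": "M13", "name": "Person A"},
    {"age": 29, "sex": "M", "postcode": "M15", "name": "Person B"},
]

linked = link_records(survey, register, keys=("age", "sex", "postcode"))
print(linked)
```

In this toy case one survey record matches a register record, so a name becomes attached to an income band that was previously held without it.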
Data Mining and the Grid • Traditional Data Mining examines and identifies patterns in single (if massive) datasets. • But Data Mining is really a method/approach/technology that has been waiting for the grid to happen. Multi-dataset mining is now becoming a reality.
Agents • AI concept • Active programs capable of directed ‘intelligent’ search and manipulation, e.g. web crawlers • Building blocks of a dynamic grid?
A Look Over the Horizon • Absolute Seamlessness. • The ability to sit at a computer/terminal and request the information one requires. • In natural language. • Real-time dynamic modelling and simulation.
But………… • Human issues • Closer to artificial consciousness • Admit machines into our moral universe • Technological Interdependence • Confidentiality and privacy
Confidentiality issues and opportunities • Data Linkage increases disclosure risk BUT • Indirect Data Access allows a new method of controlling disclosure and increases analytical power.
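The first point, that linkage increases disclosure risk, can be sketched with a simple indicator: the fraction of records that are unique on a set of key variables. This is only one of many possible risk measures and is an assumption of this example, as are the function name `unique_fraction` and the toy data. Here `occupation` stands in for a variable gained by linking to a second source.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Fraction of records whose combination of key values occurs only once."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    uniques = sum(1 for r in records if counts[tuple(r[k] for k in keys)] == 1)
    return uniques / len(records)


# Hypothetical toy data; "occupation" is imagined as added by linkage.
records = [
    {"age_band": "30-39", "sex": "F", "occupation": "nurse"},
    {"age_band": "30-39", "sex": "F", "occupation": "teacher"},
    {"age_band": "30-39", "sex": "M", "occupation": "nurse"},
    {"age_band": "30-39", "sex": "M", "occupation": "nurse"},
]

before = unique_fraction(records, ("age_band", "sex"))
after = unique_fraction(records, ("age_band", "sex", "occupation"))
print(before, after)
```

With the two original key variables no record is unique; once the linked variable is included, half of the records are.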
Tentative architecture for a complete system for disclosure control in remote access systems • [Diagram; labelled components: Firewall, Raw Data, Pre-access DQI Monitor, Pre-access SDRA/SDC, Treated Data, Pre-output DQI Monitor, Pre-output SDRA/SDC, Data Intrusion Sentry, Analytical Requests, Analytical Output]
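The general idea of indirect access, in which an analytical request goes in and only checked output comes out, can be sketched as follows. This is not the architecture from the slide: the minimum cell count rule, the threshold value, and the names `run_request`, `pre_output_check` and `remote_access` are all assumptions chosen for illustration, and the sketch covers only a single pre-output check on frequency tables.

```python
from collections import Counter

MIN_CELL_COUNT = 3  # assumed threshold, for illustration only


def run_request(data, variables):
    """Analytical request: a frequency table over the chosen variables."""
    return Counter(tuple(row[v] for v in variables) for row in data)


def pre_output_check(table, min_count=MIN_CELL_COUNT):
    """Suppress (replace with None) any cell whose count is below the threshold."""
    return {cell: (n if n >= min_count else None) for cell, n in table.items()}


def remote_access(data, variables):
    """The analyst receives only checked output, never the underlying rows."""
    return pre_output_check(run_request(data, variables))


# Hypothetical treated data held behind the access system.
treated_data = (
    [{"region": "North", "sex": "F"} for _ in range(4)]
    + [{"region": "North", "sex": "M"} for _ in range(3)]
    + [{"region": "South", "sex": "F"}]
)

output = remote_access(treated_data, ("region", "sex"))
print(output)
```

The single-record cell is suppressed while the larger cells are released, so the analyst keeps most of the analytical value without seeing a potentially identifying count.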
Data Environment Analysis • Need to move with the technology: • From one-shot analyses of individual datasets • To ongoing analyses of the data environment • The question is not “how safe is my data?” but “how disclosive is the data environment?” • A process of data monitoring is one aspect of this.
What sort of society? • Informational Transparency? • Human-Computer Interdependence? • Individualism vs Collectivism • A choice: • More legislation or less? • Personal information: a commodity or a public good?