Disclosure Control in Practice: issues and approaches. Andy Sutherland Health and Social Care Information Centre. Outline. Background – transparency, open data, confidentiality, Code of Practice and other requirements Basics of disclosure control Approaches used Issues Reflections.
Disclosure Control in Practice: issues and approaches Andy Sutherland Health and Social Care Information Centre
Outline • Background – transparency, open data, confidentiality, Code of Practice and other requirements • Basics of disclosure control • Approaches used • Issues • Reflections
Background • Transparency, open data • Publish in as much detail as possible • Make machine readable • Allow and encourage re-use • Confidentiality • Data Protection Act, Common Law requirements etc.
Code of Practice • Principle 5, practice 1 • “Ensure that official statistics do not reveal the identity of an individual, or any private information relating to them, taking into account other relevant sources of information.” • Principle 5, practice 4 • “Ensure that arrangements for confidentiality are sufficient to protect the privacy of individual information, but not so restrictive as to limit unduly the practical utility of official statistics.” • National Statistician’s Guidance
Other Guidance • ONS work on health http://www.ons.gov.uk/ons/guide-method/best-practice/disclosure-control-of-health-statistics/index.html • Scottish Government guidance http://www.scotland.gov.uk/Topics/Statistics/About/Methodology/Glossary • Various consultations ongoing http://www.ico.gov.uk/news/latest_news/2012/ico-consults-on-new-anonymisation-code-of-practice-31052012.aspx • DH v ICO [abortion statistics case] http://www.ico.gov.uk/foikb/PolicyLines/FOIPolicyPersonaldata-anonymisedstatistics.htm
User comment • “…Basically ONS and IC only care about disclosure control and don't give a toss as to whether data are any use to users.”
Why is disclosure control needed? • Basic revision class! • Number of A&E consultants by hospital, March 2012
Why is disclosure control needed? • Basic revision class! • Number of A&E consultants by hospital and ethnicity, March 2012
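The tables from these two slides did not survive in the text, so the point can be sketched with a few invented records: counting staff by hospital alone gives unremarkable totals, while adding a second dimension (here ethnicity) can produce cells of one, which may identify an individual. The records, hospital names and group labels below are hypothetical and are not the figures from the original slides.

```python
from collections import Counter

# Invented (hospital, ethnicity) records, purely for illustration.
staff = [
    ("Hospital A", "Group 1"),
    ("Hospital A", "Group 1"),
    ("Hospital A", "Group 2"),
    ("Hospital B", "Group 1"),
    ("Hospital B", "Group 1"),
]

# One-way table: count by hospital only.
by_hospital = Counter(hospital for hospital, _ in staff)

# Two-way table: count by hospital and ethnicity.
by_hospital_ethnicity = Counter(staff)

# Cells containing a single person are the potentially disclosive ones.
single_person_cells = {cell: n for cell, n in by_hospital_ethnicity.items() if n == 1}

print(dict(by_hospital))
print(single_person_cells)
```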
HSCIC process and approaches • 150 publications per year • Other releases • Ad-hoc queries • Data access or analysis systems • Standard risk assessment process • “Small Numbers Panel” assesses complex cases
Small Numbers Panel • Head of Profession for Statistics (Chair) • Head of Information Governance • Programme Manager, Information Services • Together these provide statistical, legal and business/user input.
Issues (1) • Understanding of scope • Distinguishing cases where disclosure control is needed (“I don’t want inadvertently to release identifiable information”) from those where different legal approaches are needed (“I know this is identifiable but I need to do it anyway”).
Issues (2) • Seeing the wider context • Proposal to publish practice-level prescribing data • Legality • Level of granularity and frequency of publication • Feasibility • Costs, benefits and risks • Perverse outcomes
Issues (3) – Maternity tables • Enhanced, easier for users to interpret • Overview of main delivery types • Easy to compare (in one table) • Available as automated reports to provider level - http://www.hesonline.nhs.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=1815 • Unexpected consequences • More suppression due to tables within tables • ‘Unknown’ values were used for secondary suppression, but they are also needed to calculate rates; we now try to avoid using them for secondary suppression.
Method of delivery (2008-09) • Unable to aggregate to SHA level • Unable to aggregate delivery types (e.g. Spontaneous), therefore cannot calculate rates
Method of delivery (2009-10) • Unable to calculate rates where many ‘Unknowns’ are suppressed • Able to use aggregated data (SHA level) • Able to use aggregated data (delivery types), therefore can calculate rates • Rate = Spontaneous / (Total – Unknown)
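The rate formula on this slide can be written as a small function, which also shows why a suppressed ‘Unknown’ count blocks the calculation. This is a minimal sketch: the function name, the use of `None` to stand for a suppressed cell, and the example figures are assumptions for illustration and are not taken from the published tables.

```python
def spontaneous_rate(spontaneous, total, unknown):
    """Rate = Spontaneous / (Total - Unknown).

    A suppressed input is represented by None (an assumption of this sketch);
    if any input is suppressed, the rate cannot be calculated.
    """
    if spontaneous is None or total is None or unknown is None:
        return None
    known = total - unknown
    if known <= 0:
        return None  # no deliveries with a known method
    return spontaneous / known


# Hypothetical figures.
rate_when_published = spontaneous_rate(600, 1000, 40)
rate_when_unknown_suppressed = spontaneous_rate(600, 1000, None)
print(rate_when_published, rate_when_unknown_suppressed)
```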
Suppression Example Table D: Method of delivery – example (2009-10) • Primary suppression • All values equal to 5 or less (excluding unknowns)
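The primary suppression rule on this slide can be sketched as follows. The table layout (a dictionary of rows), the `None` marker for a suppressed cell and the example counts are assumptions made for illustration. The slide does not say how zeros are treated, so this sketch takes "5 or less" literally and suppresses them as well.

```python
SUPPRESSED = None  # marker for a suppressed cell (an assumption of this sketch)


def primary_suppress(table, threshold=5, exempt=("Unknown",)):
    """Suppress every count at or below the threshold, except exempt columns.

    table maps row label -> {column label -> count}.
    """
    result = {}
    for row_label, cells in table.items():
        result[row_label] = {}
        for col_label, count in cells.items():
            if col_label not in exempt and count <= threshold:
                result[row_label][col_label] = SUPPRESSED
            else:
                result[row_label][col_label] = count
    return result


# Hypothetical counts, not from Table D.
example = {"Provider A": {"Spontaneous": 120, "Other": 3, "Unknown": 2}}
protected = primary_suppress(example)
print(protected)
```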
Suppression Example Table D: Method of delivery – example (2009-10) • Secondary suppression • All values corresponding to primary suppressed values • Row and column, effectively four tables • ‘Other’ suppressed, therefore also ‘Unknown’, so unable to calculate the rate
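One common way to picture row-and-column secondary suppression is the heuristic sketched below: wherever a row or column has exactly one suppressed cell (so it could be recovered from a published total), the smallest remaining cell in that line is suppressed too, and the pass repeats until nothing changes. This is a simplified illustration, not necessarily the procedure HSCIC applies; it assumes row and column totals are published, the counts are hypothetical, and it does not protect against every possible combination of totals.

```python
def secondary_suppress(values, mask):
    """Return a new mask with complementary suppressions added.

    values: list of rows of counts.
    mask:   same shape; True means the cell is already suppressed.
    """
    mask = [row[:] for row in mask]  # work on a copy
    n_rows, n_cols = len(values), len(values[0])
    # Every row and every column, expressed as a list of (row, col) positions.
    lines = [[(r, c) for c in range(n_cols)] for r in range(n_rows)]
    lines += [[(r, c) for r in range(n_rows)] for c in range(n_cols)]

    changed = True
    while changed:
        changed = False
        for line in lines:
            hidden = [(r, c) for r, c in line if mask[r][c]]
            visible = [(r, c) for r, c in line if not mask[r][c]]
            # A lone suppressed cell could be recovered by subtraction.
            if len(hidden) == 1 and visible:
                r, c = min(visible, key=lambda rc: values[rc[0]][rc[1]])
                mask[r][c] = True
                changed = True
    return mask


# Hypothetical counts; columns are Spontaneous, Other, Unknown.
counts = [[120, 3, 9],
          [200, 40, 15]]
primary = [[False, True, False],
           [False, False, False]]
final_mask = secondary_suppress(counts, primary)
print(final_mask)
```

In this invented example the primary suppression of ‘Other’ in the first row pulls in ‘Unknown’ in the same row, and then both cells in the second row, which mirrors the slide's observation that the rate can no longer be calculated.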
Suppression Example Table D: Method of delivery – example (2010-11) • Suppression • Similar primary and secondary suppression values • ‘Other’ no longer suppressed as not disclosive • Therefore ‘Unknown’ not suppressed, can calculate rate
Issues (4) • Blanket protocols • Can be difficult to adapt in light of changing environment, and act as a brake on wider release • Often need to suppress as a whole rather than just where disclosure is an issue • Often needed as individual manual suppression can be time consuming
Issues (5) • Implications of providing “systems” and machine readable files, rather than just reports • Allows potentially disclosive cross classifications to be produced • Standard primary and secondary suppression approach breaks down • Record swapping (cf census) is a possibility • For our less critical applications we prefer a combination of primary suppression and rounding
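A combination of primary suppression and rounding might look like the sketch below. The slide does not give the rounding base or the threshold used for these applications; the threshold of 5 is borrowed from the suppression example, and rounding to the nearest 5 is an assumption made for illustration.

```python
def protect_count(count, threshold=5, base=5):
    """Suppress small counts; round the rest to the nearest multiple of base.

    Returns None for a suppressed value (a convention of this sketch).
    """
    if count <= threshold:
        return None
    # Integer arithmetic avoids the round-half-to-even behaviour of round().
    return (count + base // 2) // base * base


# Hypothetical counts.
protected_counts = [protect_count(n) for n in (3, 7, 13, 122)]
print(protected_counts)
```

Because each published value is rounded independently, differences between overlapping cross classifications reveal less about any single small cell than exact counts would.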
Issues (6) • Understanding the data and risks • Clinical Audits • Classic disclosure control problem with sensitive data overlaid by incomplete (but improving) data collection. • Risk management approach likely to change over time, and may become more difficult when the data are better!
Reflections • No approach is infallible – it is a matter of assessing risk • Important to consider user needs • This is one (important) component of the release process • Don’t assume more information will be more helpful! • Blanket protocols should allow some “flex” • “Jigsaw identification” remains a worry
Final word • Our approaches and their outcomes are on our website. Feel free to inspect and comment. www.ic.nhs.uk andy.sutherland@ic.nhs.uk