It’s confidential. How to avoid the Brickwall of Confidentiality when linking micro data
European Conference on Quality in Official Statistics, Vienna 2014
Jon Mortensen (jmo@dst.dk) and Søren Burman (sbu@dst.dk)
Outline
• Background
• The Brickwall of Confidentiality
• Challenges with confidentiality
• The TEC and S-TEC pilot studies
• Conclusion
Background
• The paper is a specific response to developments toward very (we say, overly) detailed tables in Trade by Enterprise Characteristics (TEC) and its services equivalent, S-TEC
• But we think the conclusions generalize to areas other than trade
• Written by two compilers as “a voice from the floor”, so basically “air venting” from two middle-aged grumpy men
The Brickwall of Confidentiality
• The disclosure/confidentiality dilemma:
  • Access to high-quality data versus maintaining confidentiality
• Why confidentiality?
  • Legal and ethical concerns, and trust
• Bottom line: The brickwall ensures that disclosed data is of high quality
Challenges with confidentiality
• Identifying “risky” cells is relatively easy
• Ensuring that they are sufficiently concealed is the challenge
• The number of ways to suppress a single cell is (V-1) × (H-1)
• The difficulty of applying optimal secondary confidentiality increases with the level of detail
• An automated process is often not up to the task
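To make the (V-1) × (H-1) count concrete, here is a minimal Python sketch. It rests on assumptions the slides do not spell out: that V and H are the numbers of rows and columns of interior cells in a two-dimensional table whose row and column totals are published, and that a "way" means a minimal rectangular pattern, that is, the risky cell plus one other cell in its row, one in its column, and the cell at the opposite corner. The function name `rectangle_patterns` is hypothetical and only for illustration.

```python
from itertools import product


def rectangle_patterns(n_rows, n_cols, risky_row, risky_col):
    """Enumerate minimal rectangular suppression patterns for one risky cell.

    Assumption (not from the slides): row and column totals are published,
    so hiding the risky cell alone is not enough. Each pattern hides four
    cells forming a rectangle, so no single row or column total can be used
    to recover the risky value by subtraction.
    """
    patterns = []
    for r, c in product(range(n_rows), range(n_cols)):
        if r == risky_row or c == risky_col:
            continue  # the partner corner must differ in both row and column
        patterns.append(frozenset({
            (risky_row, risky_col),  # the primary (risky) cell
            (risky_row, c),          # partner in the same row
            (r, risky_col),          # partner in the same column
            (r, c),                  # opposite corner closing the rectangle
        }))
    return patterns


if __name__ == "__main__":
    V, H = 4, 5  # hypothetical table size
    found = rectangle_patterns(V, H, risky_row=0, risky_col=0)
    print(len(found), (V - 1) * (H - 1))
```

The count is for a single risky cell in a single two-dimensional table. With several risky cells and more dimensions the candidate patterns interact, which is one way to read the slide's point that the difficulty grows with the level of detail.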
The TEC and S-TEC pilot studies
• A leap in the level of detail compared to the established TEC tables
  • More dimensions and a higher level of detail
• Pilot tables suffer from extensive confidentiality issues
  • Sometimes more than half of the trade value is suppressed
• In the end: Many hours spent producing and subsequently “destroying” the tables
Conclusion
• More information from existing data is a win-win, but…
• …highly disaggregated, multidimensional tables are neither cost-effective nor useful due to confidentiality
• We suggest:
  • Less detailed, preferably two-dimensional tables
  • Cross-border access to micro data