Explore the evolution of disclosure control strategies for the 2001 Census in the UK, including setting output area sizes, record swapping, and small cell adjustment. Learn about challenges, solutions, and the future of data confidentiality.
2001 Census Disclosure Control: UK variations
Frank Thomas, GROS
Disclosure control for 2001: Scotland
• Setting a target or average size for output areas (50 households)
• Setting a minimum size of areas for key output (e.g. 20 households and 50 residents for Census Area Statistics)
• Creating only one set of output areas
• Limiting the detail in classifications used in tables
• Record swapping before tabulation
• Small Cell Adjustment (workplace tables)
Red: UK differences
ONS decided to have an average size of around 120 households
• Average size for Scottish 1991 output areas was around 55 households
So
• There would have been much discontinuity in Census geography in Scotland (not a consideration in E&W, NI)
ONS increased minimum size of areas (from 20 households to 40)
• Not much benefit, e.g. a lone Chinese household is still a lone Chinese household
FOR
• Much discontinuity in geography: over 10% of OAs would need to be merged
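The cost of raising the minimum can be sketched by counting how many output areas fall below each threshold. This is only an illustration: the area records, the function names and the figures are hypothetical, and the 50-resident minimum is held fixed as an assumption.

```python
def needs_merging(area, min_households=20, min_residents=50):
    """True if an output area falls below either minimum size."""
    return (area["households"] < min_households
            or area["residents"] < min_residents)


def share_to_merge(areas, min_households, min_residents=50):
    """Fraction of output areas that would have to be merged."""
    flagged = [a for a in areas
               if needs_merging(a, min_households, min_residents)]
    return len(flagged) / len(areas)


# Hypothetical output areas, not real census figures.
areas = [
    {"code": "OA1", "households": 22, "residents": 55},
    {"code": "OA2", "households": 35, "residents": 80},
    {"code": "OA3", "households": 52, "residents": 120},
    {"code": "OA4", "households": 61, "residents": 140},
    {"code": "OA5", "households": 18, "residents": 60},
]

share_at_20 = share_to_merge(areas, min_households=20)
share_at_40 = share_to_merge(areas, min_households=40)
print(share_at_20, share_at_40)
```

With these made-up areas, only OA5 fails the 20-household minimum, while OA1 and OA2 also fail at 40. The merged areas would still contain the same lone households, which is the "not much benefit" side of the trade-off.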
Small cell adjustment
• Record swapping => intruder can't be sure
BUT
• ONS worried about perception of identification (1s)
BUT
• Actual disclosure a matter of 0s not 1s
BUT
• SCA increases the perceived number of 0s
Decreasing perception of identification increases perception of disclosure
AND
Upsets users
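The effect on 0s can be seen in a toy version of small cell adjustment. The rule below is an assumption made for illustration only: counts of 1 or 2 are randomly rounded to 0 or 3, with probabilities chosen so the expected value is unchanged. The slides do not describe the rule ONS actually applied, and the table values are hypothetical.

```python
import random


def small_cell_adjust(counts, rng):
    """Toy small cell adjustment (assumed rule, not the ONS method).

    Counts of 1 or 2 become 0 or 3; a count c becomes 3 with
    probability c/3, so the expected value stays at c.
    """
    adjusted = []
    for c in counts:
        if c in (1, 2):
            adjusted.append(3 if rng.random() < c / 3 else 0)
        else:
            adjusted.append(c)
    return adjusted


rng = random.Random(2001)           # fixed seed for repeatability
table = [0, 1, 2, 5, 1, 0, 7, 2, 1]  # hypothetical cell counts
adjusted = small_cell_adjust(table, rng)

zeros_before = table.count(0)
zeros_after = adjusted.count(0)
print(table, adjusted, zeros_before, zeros_after)
```

Under this assumed rule no 1s or 2s survive, so apparent identification disappears, but the published table can never show fewer 0s than the true one and will usually show more.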
What about 2011? (personal view)
• Geographical continuity
• No SCA
• No record swapping
What's wrong with record swapping?
It is ineffective for
• population bases other than the geographical variable being swapped
    • Had to use SCA for workplace tables
• populations for geographies within which records are swapped
    • Population uniques still at risk in SARs
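Both weaknesses show up in a minimal sketch of record swapping. It assumes that swapping simply exchanges the area of residence between randomly paired household records. Real swapping would normally pair similar households, and the field names and records here are hypothetical.

```python
import random
from collections import Counter


def swap_residence_areas(records, swap_rate, rng):
    """Exchange 'home_oa' between randomly paired records (toy version)."""
    swapped = [dict(r) for r in records]
    n_pairs = int(len(swapped) * swap_rate / 2)
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    for i, j in zip(chosen[::2], chosen[1::2]):
        swapped[i]["home_oa"], swapped[j]["home_oa"] = (
            swapped[j]["home_oa"], swapped[i]["home_oa"])
    return swapped


# Hypothetical household records within one wider area.
records = [
    {"home_oa": "OA1", "work_zone": "W1", "ethnic_group": "White"},
    {"home_oa": "OA1", "work_zone": "W2", "ethnic_group": "Chinese"},
    {"home_oa": "OA2", "work_zone": "W1", "ethnic_group": "White"},
    {"home_oa": "OA2", "work_zone": "W3", "ethnic_group": "White"},
    {"home_oa": "OA3", "work_zone": "W2", "ethnic_group": "White"},
    {"home_oa": "OA3", "work_zone": "W3", "ethnic_group": "Indian"},
]

swapped = swap_residence_areas(records, swap_rate=1.0, rng=random.Random(1))


def workplace_table(rows):
    """Cross-tabulation on the workplace base, not the residence base."""
    return Counter((r["work_zone"], r["ethnic_group"]) for r in rows)


work_before = workplace_table(records)
work_after = workplace_table(swapped)
print(work_before == work_after)
```

Because only the area of residence moves, the workplace table is identical before and after swapping, and so is any table for the wider area that contains all the swapped records.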
Use Over-imputation
• It won't amend the area of residence/enumeration (but we could do record swapping as well perhaps)
BUT
• It can focus on particular areas or variables
• It can be pegged back a bit for areas or variables where other processing has wrought much change in the data as collected
• It is better than record swapping for non-household populations
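A rough sketch of how over-imputation could be targeted follows. It assumes that over-imputation means deliberately replacing some collected values with values drawn from other records, at a rate that can differ by area. The function, the field names and the rates are all hypothetical, and the donor rule is a simple stand-in for whatever imputation system would actually be used.

```python
import random


def over_impute(records, variable, rate_by_area, rng):
    """Replace `variable` with a donor value at an area-specific rate.

    Areas missing from `rate_by_area` are left untouched (rate 0),
    which is one way the method can be "pegged back".
    """
    donors = [r[variable] for r in records]
    result = []
    for r in records:
        new = dict(r)
        if rng.random() < rate_by_area.get(r["area"], 0.0):
            new[variable] = rng.choice(donors)
        result.append(new)
    return result


# Hypothetical person records.
people = [
    {"area": "A", "occupation": "nurse"},
    {"area": "A", "occupation": "teacher"},
    {"area": "A", "occupation": "farmer"},
    {"area": "B", "occupation": "clerk"},
    {"area": "B", "occupation": "driver"},
]

# Focus on area A; leave area B as collected.
perturbed = over_impute(people, "occupation", {"A": 0.5},
                        random.Random(2011))
print(perturbed)
```

The area of residence is never altered, matching the first bullet, while the rate table lets the perturbation concentrate on chosen areas. The same function could be called once per variable with a different rate table.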