The 2011 Census: Estimating the Population Alexa Courtney
Overview • Background • New topics on questionnaire • Internet Data Collection • Delivery of data • Overview of 2011 processing • Discuss “downstream” processes
2011 Census • 27th March 2011 • Census Rehearsal: 11th October 2009 • Most complex UK Census • More questions and topics than previous censuses • Range of delivery and completion options • Similar to 2001 – keep what worked • Some new/modified methodologies • Operational and statistical • Wide range of outputs
Questionnaire • New topics • Citizenship • Second address • National identity • Language • Full census returns from short-term migrants • In UK for 3 months or more • Identified through intention to stay question
Questionnaire completion • Internet completion being offered for first time • Internet Access Code provided on front of paper questionnaire • Offers opportunities to improve data quality and reduce respondent burden • Automatic routing • Validation rules • Use of radio buttons • No unnecessary changes from paper questionnaire to minimise modal bias • Advantages and disadvantages • Reduces amount of editing required • Increases possibility of multiple responses
Data delivery • Can be split into three groups • Questionnaires returned within 6 weeks of Census day • Majority of data • Fully processed across UK • Matched to CCS • Questionnaires returned within 10 weeks of Census day • Fully processed in England, Wales & Northern Ireland • Questionnaires returned more than 10 weeks after Census day • May be used in coverage adjustment
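A minimal sketch of the three-way split above, assuming a return is classified only by its receipt date relative to Census day (27 March 2011). The function name `delivery_group` and the inclusive week boundaries are assumptions for illustration, not details from the slides.

```python
from datetime import date, timedelta

CENSUS_DAY = date(2011, 3, 27)


def delivery_group(receipt_date: date) -> int:
    """Return 1, 2 or 3 for the delivery group of a returned questionnaire."""
    elapsed = receipt_date - CENSUS_DAY
    if elapsed <= timedelta(weeks=6):
        return 1  # within 6 weeks: fully processed across UK, matched to CCS
    if elapsed <= timedelta(weeks=10):
        return 2  # within 10 weeks: fully processed in England, Wales & NI
    return 3      # later returns: may be used in coverage adjustment
```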
Removing false persons • Problem identified in 2001 Census • Records created in error • Pages crossed out • Dust on scanner • “Two of Five” rule: a record is treated as a genuine person only if it contains • Name (from individual questions) or Date of Birth, AND • At least one other of: Name (from individual questions), Date of Birth, Sex, Marital Status, or Name (from household members table) • Important for data quality and matching
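One reading of the “Two of Five” rule can be sketched as follows: at least two of the five items must be present, and at least one of them must be the name from the individual questions or the date of birth. The field names and the definition of "present" (not empty, not whitespace) are assumptions made for this sketch.

```python
FIVE_ITEMS = (
    "name_individual",       # name from the individual questions
    "date_of_birth",
    "sex",
    "marital_status",
    "name_household_table",  # name from the household members table
)
KEY_ITEMS = ("name_individual", "date_of_birth")


def is_present(value) -> bool:
    """Treat None and blank strings as missing (an assumption)."""
    return value is not None and str(value).strip() != ""


def passes_two_of_five(record: dict) -> bool:
    """True if the record looks like a genuine person under this reading."""
    present = [item for item in FIVE_ITEMS if is_present(record.get(item))]
    has_key_item = any(item in present for item in KEY_ITEMS)
    return has_key_item and len(present) >= 2
```

Records that fail the check would be candidates for removal as false persons, for example a row created by a stray scanner mark that only populates Sex.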
Multiple response resolution • Overcount • Several types of multiple response • Two questionnaires from same household • Two paper questionnaires • Paper and Internet • Person on same questionnaire twice (or more!) • Person on Household and Individual questionnaire • Person on Household and Internet questionnaire • Needs to be a quick process
Multiple response resolution • Duplicate households identified when receipted • Questionnaire tracking for England, Wales & Northern Ireland • Matched questionnaire IDs and address in Scotland • Resolved by matching people within household • Key variables: Name (or soundex), Date of Birth, Sex • If Age <30, name must match exactly • Minimise risk of matching twins • If no people match, two household records created • If any people match, questionnaires merged
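The person-matching step above can be sketched as below. This is an illustration only: the record layout (`name`, `dob`, `sex`, `age`), the use of standard American Soundex on the whole name, and the function names are assumptions, not the production matching rules.

```python
def soundex(name: str) -> str:
    """Standard American Soundex code, e.g. 'Robert' -> 'R163'."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = code
    return (result + "000")[:4]


def same_person(a: dict, b: dict) -> bool:
    """Match on name (or soundex), date of birth and sex."""
    if a["dob"] != b["dob"] or a["sex"] != b["sex"]:
        return False
    exact = a["name"].strip().lower() == b["name"].strip().lower()
    if min(a["age"], b["age"]) < 30:
        return exact  # exact name required, to reduce the risk of matching twins
    return exact or soundex(a["name"]) == soundex(b["name"])


def resolve_households(hh_a: list, hh_b: list) -> str:
    """'merge' if any person matches across the two returns, else 'keep_both'."""
    matched = any(same_person(p, q) for p in hh_a for q in hh_b)
    return "merge" if matched else "keep_both"
```

The age threshold is what separates twins such as "Jon" and "John": they share date of birth, sex and soundex code, so only the exact-name requirement keeps them apart.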
Multiple response resolution • Merging questionnaires • “Most complete” response kept • Missing variables copied from duplicate record(s) • Priority given to individual questionnaires • Process for within postcode multiples • People completing neighbour’s questionnaire • Similar principles for resolution
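A small sketch of the merging principle: keep the preferred response and fill its gaps from the duplicates. How "priority to individual questionnaires" interacts with "most complete" is not stated on the slide; ranking individual questionnaires first and then by completeness is an assumption here, as are the record layout and function names.

```python
def is_blank(value) -> bool:
    return value is None or value == ""


def completeness(record: dict) -> int:
    """Number of answered variables on a response."""
    return sum(1 for v in record["values"].values() if not is_blank(v))


def merge_duplicates(records: list) -> dict:
    """Merge duplicate responses for one person into a single set of values."""
    # Assumed ordering: individual questionnaires first, then most complete.
    ranked = sorted(
        records,
        key=lambda r: (r["form_type"] == "individual", completeness(r)),
        reverse=True,
    )
    merged = dict(ranked[0]["values"])
    for duplicate in ranked[1:]:
        for variable, value in duplicate["values"].items():
            # Copy a value across only where the kept response is missing it.
            if is_blank(merged.get(variable)) and not is_blank(value):
                merged[variable] = value
    return merged
```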
Filter Rules • Based on 2001 Rules • Used to identify incorrect/unnecessary responses • Deterministic – based on other responses • Used to prepare data for main edit & imputation • e.g. Person aged <16, economically inactive (student) • e.g. Person employed, not looking for work
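The two examples above might be expressed as deterministic rules along these lines. This is one interpretation: it assumes that responses to questions a person should have skipped are overwritten with a "not applicable" code, and the field names and code values are hypothetical.

```python
NOT_APPLICABLE = "N/A"
EMPLOYMENT_FIELDS = ("economic_activity", "looking_for_work")


def apply_filter_rules(person: dict) -> dict:
    """Return a copy of the record with unnecessary responses filtered out."""
    cleaned = dict(person)
    age = cleaned.get("age")
    if age is not None and age < 16:
        # Under-16s are not asked the employment questions.
        for field in EMPLOYMENT_FIELDS:
            cleaned[field] = NOT_APPLICABLE
    elif cleaned.get("economic_activity") == "employed":
        # An employed person should not have answered "looking for work".
        cleaned["looking_for_work"] = NOT_APPLICABLE
    return cleaned
```

Because each rule depends only on other responses in the same record, the output is the same every time, which is what distinguishes this step from the probabilistic edit and imputation that follows.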
Edit and Imputation • Will use CANCEIS system • Resolves inconsistent data • Probabilistic • Programmed with all possible inconsistencies • Impute missing data • Based on complete records • Searches for similar donor • Ensures complete and consistent data
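The idea of searching for a similar donor can be illustrated with a simple nearest-neighbour sketch. This is not CANCEIS and does not use its interface: it is a deterministic toy that picks the complete record with the fewest mismatches on the observed variables, and all names are assumptions.

```python
def impute_from_donor(recipient: dict, donors: list, fields: list) -> dict:
    """Fill the recipient's missing fields from the most similar complete donor."""
    missing = [f for f in fields if recipient.get(f) is None]
    observed = [f for f in fields if recipient.get(f) is not None]
    # Only complete records are eligible as donors.
    complete = [d for d in donors if all(d.get(f) is not None for f in fields)]
    if not missing or not complete:
        return dict(recipient)

    def mismatches(donor: dict) -> int:
        return sum(1 for f in observed if donor[f] != recipient[f])

    donor = min(complete, key=mismatches)  # ties go to the first donor listed
    imputed = dict(recipient)
    for f in missing:
        imputed[f] = donor[f]
    return imputed
```

A production system would also check that the imputed values leave the record consistent, and would choose among several near donors at random rather than always taking the first.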
Output flags • Non-standard outputs possible for England & Wales • Use information on Second Residences • Population staying in UK 3-12 months identified • Exclusion from standard outputs • Production of specific outputs • England, Wales and Northern Ireland only • Considering including this population in coverage adjustment • Mark records now to enable easy production of these outputs
Coverage assessment process (diagram) • Inputs: 2011 Census, Census Coverage Survey • Stages shown: Matching, Quality Assurance, Estimation, Adjustment
Disclosure Control - Options • Necessary to protect confidentiality of respondents • Three options were short-listed: • Pre-tabular: • Record swapping (pre-tabular) • Small number of records swapped across areas • Adds uncertainty to “unique” records • Over-imputation (pre-tabular) • Some variables deleted and re-imputed • Post-tabular: • Invariant ABS Cell Perturbation (IACP) • Small counts can be altered • Two stage process to ensure “additivity”
Disclosure Control – Chosen Methodology • Pre-tabular method recommended • User preference for consistency between tables • IACP method rejected • Record swapping chosen instead of over-imputation • No persons or data items removed • Outputs at national level and high geographies unaffected
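A toy sketch of record swapping, to show why no persons or data items are removed and why national totals are unaffected. It pairs households at random and exchanges their area codes; a real scheme would target the pairing (for example, similar households in different areas), so the random pairing, the `area` field, and the `swap_rate` parameter are all assumptions.

```python
import random


def swap_records(households: list, swap_rate: float, seed: int = 0) -> list:
    """Return a copy in which a small share of households exchange area codes."""
    rng = random.Random(seed)
    swapped = [dict(h) for h in households]  # leave the input untouched
    n_pairs = int(len(swapped) * swap_rate / 2)
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    for i, j in zip(chosen[::2], chosen[1::2]):
        # Only the geography moves; every other data item stays on the record.
        swapped[i]["area"], swapped[j]["area"] = swapped[j]["area"], swapped[i]["area"]
    return swapped
```

Each area keeps the same number of households, and any table at national level is identical before and after the swap; only small-area tables involving the swapped records change.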
Outputs • Main base will be Usual Residents • All people living in UK for 12 months or more • Consistent across UK • First outputs – September 2012 • Other standard outputs by Spring 2013 • ONS producing non-standard outputs • e.g. Weekday population, Majority of time • Consultation to decide exactly what • Outputs on short-term migrant population • All people living in UK for 3-12 months • England, Wales and Northern Ireland