
Enhancing Census Accuracy: 2011 Survey Improvements Review

Explore key changes and methodology updates for improving UK census estimates. Learn about coverage assessment strategies, overcount analysis, and bias adjustments for more precise data.


Presentation Transcript


  1. 2011 CENSUS Coverage Assessment – What’s new? OWEN ABBOTT

  2. AGENDA
  • Background
  • Coverage in the 2001 Census
  • 2011 methodology overview
  • Key changes
  • Summary

  3. WHAT IS THE PROBLEM?
  • Despite best efforts, the census will not count every household or person
  • It will also count some people twice
  • Users need robust census estimates – raw counts are not enough
  • In 2001:
    • the One Number Census (ONC) methodology was developed to measure undercount
    • an estimated 1.5 million households were missed
    • around 3 million persons were missed (most from the missing households, but some from counted households)
    • subsequent studies estimated a further 0.3 million missed
  • In 2011 we want to build on the ONC, as it was broadly successful

  4. 2001 CENSUS UNDERCOUNT BY AGE-SEX

  5. RESPONSE RATES BY LOCAL AUTHORITY

  6. COVERAGE ASSESSMENT PROCESS OVERVIEW
  • Census Coverage Survey
  • 2011 Census
  • Matching
  • Quality Assurance
  • Estimation
  • Adjustment

  7. AREAS OF IMPROVEMENT
  • Elements of the CCS design
  • Estimation methodology
  • Measuring overcount
  • Adjustments for bias in the dual system estimator (DSE)
  • Imputation
  • Motivated by:
    • lessons learnt from 2001
    • the 2011 Census design, e.g. use of the internet

  8. THE CCS DESIGN
  • Similar to the 2001 CCS:
    • 300,000 households
    • a sample of small areas (postcodes)
    • carried out 6 weeks after Census Day
    • fieldwork almost identical
  • Improvements:
    • designed at LA level, not for groups of LAs
    • refined hard-to-count (HtC) index (5 levels) using up-to-date data sources
    • use Output Areas (OAs) as primary sampling units (PSUs)
    • select 3 postcodes per OA
    • revised allocation of the sample (using 2001 coverage patterns)

  9. THE CCS DESIGN (2)
  • What does this mean?
    • Each LA will have its own sample – at least 1 OA for each hard-to-count level
    • The sample is more heavily skewed towards LAs with the hardest-to-count populations (with an upper limit of 60 OAs per LA)
    • More LAs will have estimates based on their own data, especially in London and the big cities
    • The HtC index will be up to date
    • Most LAs will have 3 HtC levels (most London areas had only one in 2001)
    • Looking at a 40%, 40%, 10%, 8%, 2% population distribution across the five HtC levels
  • A code sketch of this two-stage selection follows below
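
A minimal sketch of how such a stratified two-stage selection could be expressed, assuming a simple in-memory frame. The field names ("oa", "htc", "postcodes"), the allocation dictionary (which would encode the 40/40/10/8/2 split and the 60-OA cap per LA) and the seed are illustrative placeholders, not the actual ONS sample design.

    import random

    # la_frame: {la_name: [{"oa": id, "htc": level 1-5, "postcodes": [...]}, ...]}
    # allocation: {(la_name, htc_level): target number of OAs} -- assumed inputs
    def select_ccs_sample(la_frame, allocation, postcodes_per_oa=3, seed=2011):
        """Two-stage selection: sample OAs within each LA by hard-to-count
        stratum, then sample 3 postcodes within each selected OA."""
        rng = random.Random(seed)
        sample = []
        for la, oas in la_frame.items():
            for htc_level in range(1, 6):
                stratum = [oa for oa in oas if oa["htc"] == htc_level]
                if not stratum:
                    continue
                # At least one OA per HtC level present in the LA.
                n_oas = max(1, min(allocation.get((la, htc_level), 1), len(stratum)))
                for oa in rng.sample(stratum, n_oas):
                    chosen = rng.sample(oa["postcodes"],
                                        min(postcodes_per_oa, len(oa["postcodes"])))
                    sample.append({"la": la, "oa": oa["oa"], "htc": htc_level,
                                   "postcodes": chosen})
        return sample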

  10. ESTIMATION
  • Obtained a lot of data from 2001 so we could explore whether improvements can be made
  • One key issue was whether we should group LAs by geography or by 'type'
  • Improvements:
    • confirmed that using the DSE at OA level is sensible
    • confirmed that we should group LAs by geography
    • use a simple ratio estimator (a worked example follows below)
    • confirmed that the LA-level estimation method is still best
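
A toy worked example of the two steps named above: a dual system estimate within one sampled area, then a simple ratio estimator to carry the sampled areas up to the LA. All counts are invented for illustration.

    def dual_system_estimate(census_count, ccs_count, matched_count):
        """Dual system (capture-recapture) estimate of the true population in
        one sampled area: N-hat = (census * ccs) / matched."""
        return census_count * ccs_count / matched_count

    # Invented example for one sampled OA:
    # 310 people counted by the census, 290 by the CCS, 270 found in both.
    n_hat = dual_system_estimate(310, 290, 270)      # about 333 people

    # Simple ratio estimator: scale the census total for the whole LA by the
    # ratio of DSE totals to census totals over the sampled OAs.
    sampled_dse_total = 333 + 420 + 510               # DSEs for three sampled OAs
    sampled_census_total = 310 + 400 + 470            # census counts in those OAs
    la_census_total = 98_000                          # census count for the whole LA
    la_estimate = la_census_total * sampled_dse_total / sampled_census_total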

  11. ESTIMATION (2)
  • What does this mean?
    • The estimation methodology is much the same as it was
    • It should be slightly easier to explain
    • We will group LAs that do not have enough sample with their neighbours until that group has enough sample
    • More LAs will have enough sample to produce direct estimates

  12. OVERCOUNT
  • In 2001, overcount (duplication) was estimated at around 0.4%
    • no adjustments were made
    • it was not integrated into the methodology
  • For 2011, overcount is expected to be higher:
    • a more complex population
    • use of the internet in the 2011 Census
  • The strategy is to:
    • a) identify and remove obvious cases (multiple response resolution)
    • b) measure and make net adjustments for the remainder
    • i.e. for the latter we are NOT removing duplicates

  13. OVERCOUNT (2)
  • Methodology:
    • select targeted samples of census records: second residences, students, children
    • a very large sample (~600,000 records)
    • automatic matching algorithm to identify duplicates
    • clerical checking of matches – expect to see ~13,000 duplicates
    • also use the Longitudinal Study (LS) to QA the estimates
    • estimation of duplication rates by Government Office Region (GOR) and characteristics, estimating which is the correct record
  • Why not match the whole database and remove the duplicates?
    • high risk of false positives, and thus of removing too many records!
  • A sketch of the net adjustment arithmetic follows below
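
A rough sketch of what "measure and make net adjustments" could look like once duplication rates have been estimated from the targeted samples. The regions, groups and rates below are placeholders, not measured figures, and the actual estimation by GOR and characteristic is more involved.

    # Placeholder duplication rates by (region, group), e.g. estimated from the
    # clerically checked matching sample; 0.012 means 1.2% of census records in
    # that cell are judged to be duplicates of another record.
    duplication_rates = {
        ("London", "students"): 0.012,
        ("London", "other"): 0.004,
        ("North West", "students"): 0.009,
        ("North West", "other"): 0.003,
    }

    def net_overcount_adjustment(census_counts, rates):
        """Reduce each cell's count by its estimated duplication rate instead of
        deleting individual records from the database."""
        return {cell: count * (1 - rates.get(cell, 0.0))
                for cell, count in census_counts.items()}

    adjusted = net_overcount_adjustment(
        {("London", "students"): 250_000, ("London", "other"): 7_500_000},
        duplication_rates)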

  14. OVERCOUNT (3)
  • What does this mean?
    • Population estimates will be reduced where there is overcount
    • We will be able to say how much adjustment was made due to overcount
    • The duplicates will still be in the data; we just won't impute as much for undercount

  15. DSE BIAS ADJUSTMENTS
  • Assumptions underpinning the DSE:
    • homogeneity
    • independence
    • accurate matching
    • closure
  • DSEs usually carry some bias, mostly due to failure of the homogeneity assumption
  • In the 2001 Census we made a 'dependence' adjustment
  • This showed that we need a strategy for measuring this bias (the estimator and the direction of the bias are shown below)
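
For reference, the standard dual system estimator and the usual direction of its bias when the independence and homogeneity assumptions fail; this is the generic capture-recapture result, not an ONS-specific figure.

    \[
      \hat{N} \;=\; \frac{n_1 \, n_2}{m}
    \]

where n1 is the census count in a post-stratum, n2 the CCS count, and m the number of people matched in both. If the same kinds of people tend to be missed by both the census and the CCS (positive dependence, or heterogeneity in capture probabilities), m is larger than independence would imply and the estimator underestimates the true population. That downward bias is what the post-stratification, mitigation and adjustment strategies on the following slides aim to minimise and measure.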

  16. DSE BIAS ADJUSTMENTS (2)
  • Mitigate as much as possible:
    • post-stratify the DSE so that heterogeneity is minimised
    • independence in the CCS field processes
    • design the matching to achieve accuracy
    • collect the CCS on the same basis as the Census
  • Measure the remaining bias:
    • specific adjustments – e.g. movers, overcount
    • a global adjustment for residual biases
    • improved adjustment using the Census address register
    • looking at improving the age-sex distribution

  17. DSE BIAS ADJUSTMENTS (3)
  • What does this mean?
    • We will be making adjustments to the estimates based on plausible external data, e.g. household counts and sex ratios (an illustrative sketch follows below)
    • This will be part of the methodology
    • It can also be used if QA determines that estimates are implausible
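
Purely as an illustration of using "plausible external data" such as sex ratios, a minimal sketch that scales a male estimate up when the implied sex ratio falls below an externally expected benchmark. The rule and the benchmark value are assumptions for illustration, not the ONS adjustment method.

    def sex_ratio_adjustment(male_est, female_est, expected_ratio):
        """If the estimated males-per-female ratio for an age group falls below
        an externally expected ratio, scale the male estimate up to meet it.
        Illustrative rule only."""
        observed_ratio = male_est / female_est
        if observed_ratio < expected_ratio:
            male_est = female_est * expected_ratio
        return male_est, female_est

    # Invented example: estimates for males aged 25-29 look implausibly low
    # against an assumed benchmark of 1.02 males per female.
    adjusted_males, females = sex_ratio_adjustment(148_000, 150_000, 1.02)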

  18. COVERAGE ADJUSTMENT
  • The imputation methodology had problems converging, which sometimes produced poor-quality results
  • Improvements:
    • model characteristics at higher geographies, which allows more detail to be modelled
    • some additional topics in the CCS are included in the models: a migration variable (internal, international) and country of birth (UK and non-UK)
    • non-controlled variables are imputed by CANCEIS
  • What does this mean?
    • better imputation quality
    • the characteristics of imputed records are improved
    • (a simplified sketch of the adjustment arithmetic follows below)
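
A simplified sketch of the arithmetic behind coverage adjustment: how many extra person records must be imputed in each (area, age-sex) cell so that the adjusted database sums to the coverage-assessed estimates. The real 2011 system imputes whole records with characteristics (non-controlled variables via CANCEIS); this only shows the cell-level targets, and the area codes and counts are invented.

    def imputation_targets(census_counts, coverage_estimates):
        """Number of person records to add per (area, age-sex) cell so the
        adjusted database matches the coverage-assessed estimates."""
        return {cell: max(0, round(coverage_estimates[cell] - census_counts.get(cell, 0)))
                for cell in coverage_estimates}

    targets = imputation_targets(
        {("E06000001", "M25-29"): 2_950, ("E06000001", "F25-29"): 3_100},
        {("E06000001", "M25-29"): 3_120, ("E06000001", "F25-29"): 3_210},
    )
    # {('E06000001', 'M25-29'): 170, ('E06000001', 'F25-29'): 110}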

  19. SUMMARY
  • Coverage assessment is an integral part of the 2011 Census
  • It will again define the key census outputs (estimates at LA level by age and sex) and adjust the database
  • We learnt a lot of lessons in 2001 and have been working to address them

  20. Questions? owen.abbott@ons.gov.uk
