Beyond 2011 Neil Townsend 22 May 2013
Agenda • Context & Background • Why change • Statistical Options • The business case • The need for evidence • Workshop session • Uses and value
Context • Census – every 10 years • Population statistics • Social conditions • Housing • Uses: Resource allocation • Service planning • Policy development, monitoring and review • Social research • EU regulations and duty to report to Parliament small areas + multivariate combinations
The Beyond 2011 Programme • Why change? – Why look beyond 2011? • Rapidly changing society • Evolving user requirements • New opportunities – data sharing • Traditional census – costly and infrequent?? • UK Statistics Authority to Minister for Cabinet Office • “As a Board we have been concerned about the increasing costs and difficulties of traditional Census-taking. We have therefore already instructed the ONS to work urgently on the alternatives, with the intention that the 2011 Census will be the last of its kind.”
Programme Purpose • Identify the best way to provide small area population and socio-demographic statistics in future • Provide a recommendation in September 2014 • - underpinned by full cost-benefit analysis • - & high level design for implementation • (subject to agreement) Implement the recommendation
Where are we now? • Looking at alternative ways of producing population statistics, gathering evidence • There are not many options... • We know we can do a Census • We could do sample surveys (a Census is essentially a 100% survey) • We could use administrative data sources (combined with each other and with a survey)
Beyond 2011 – statistical options • [Diagram, Census option: FRAME, SOURCES, DATA, ESTIMATION, OUTPUTS; all national to small area] • Frame: Address Register (maintained national address gazetteer), provides frame for population data & surveys • Sources: CENSUS • Data: Population Data; Socio demographic Attribute Data (Household, Communal Establishments) • Estimation: Coverage Assessment incl. under & over-coverage, by survey and admin data?; adjusting for missing data and error; adjusting for non response bias in survey (or sources); quality measurement; population distribution provides weighting for attributes • Outputs: Population estimates; Household structure etc; Attribute estimates; Longitudinal data; Interactional Analysis, e.g. TTWA
Beyond 2011 – statistical options • [Diagram, admin data option: FRAME, SOURCES, DATA, ESTIMATION, OUTPUTS; all national to small area] • Frame: Address Register (maintained national address gazetteer), provides frame for population data & surveys • Sources: several Admin Sources; commercial sources? (increasing later?); surveys to fill gaps; Socio demographic Survey(s) • Data: Population Data; Socio demographic Attribute Data (Household, Communal Establishments) • Estimation: Coverage Assessment incl. under & over-coverage, by survey and admin data?; adjusting for missing data and error (??); adjusting for non response bias in survey (or sources); quality measurement; population distribution provides weighting for attributes • Outputs: Population estimates; Household structure etc; Attribute estimates; Longitudinal data; Interactional Analysis, e.g. TTWA
Major statistical challenges • No population register • Data quality • Incomplete • Out of date • Duplicates / erroneous entries • Little reliable population attribute data from admin sources • Coverage survey required • Linkage required • Socio-demographic attribute survey required
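The slide says a coverage survey and linkage are required but does not say how they would be combined. One standard approach, assumed here purely for illustration and not confirmed by the slides, is dual-system (capture-recapture) estimation: link the admin list to an independent coverage survey and use the overlap to estimate how many people both sources missed. The function name and all figures below are hypothetical, and the sketch assumes the two sources are independent and contain no duplicates or other over-coverage.

```python
def dual_system_estimate(admin_count: int, survey_count: int, matched_count: int) -> float:
    """Lincoln-Petersen estimate of the true population size.

    admin_count:   people found on the (hypothetical) linked admin list
    survey_count:  people found by the (hypothetical) coverage survey
    matched_count: people found in both, identified by record linkage
    """
    if matched_count <= 0:
        raise ValueError("need at least one matched record")
    return admin_count * survey_count / matched_count


# Made-up figures for one area, not ONS data.
estimate = dual_system_estimate(admin_count=900, survey_count=500, matched_count=450)
print(estimate)  # 1000.0
```

In this made-up case the survey finds 90% of its people on the admin list, so the admin list is taken to cover about 90% of the population. Duplicates, out-of-date entries and linkage errors, all listed on the slide, would bias the estimate and need separate adjustment.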
Potential data sources • Population data (age and sex) • NHS Patient Register • DWP/HMRC Customer Information System • Electoral roll (18+ yrs) • School Census (5-16 yrs) • Higher Education Statistics Agency data (Students) • Birth and Death registrations • New Population Coverage Survey • Attributes (everything else) • Attribute Survey (large to begin with) • Admin Data (Limited) • Commercial Data?
Coverage of Main Administrative Sources • [Diagram: each source overlaid on the Resident Population, showing residents missing from the source and extras on the source who are not residents] • Sources: Patient Register Data (PRD), Customer Information System (CIS), Electoral Roll (ER), School Census (SC), Higher Education Students (HESA), UK Driving Licence (DVLA) • PRD, missing includes: migrants not (yet) registered; newborn babies; some private only patients • HESA, missing includes: non higher education students; independent university students • DVLA, missing includes: non-drivers; under 17s; some foreign-licence holders • CIS, missing includes: some migrant worker dependants; some international students; undocumented asylum seekers • ER, missing includes: under 17s; ineligible voters; non responders • SC, missing includes: non school aged people; independent school children; home schooled children • HESA, extras include: some duplicates; international students on short-term courses; students ceased studying, not formally deregistered • SC, extras include: short-term migrant children • Extras for the other four sources (PRD, CIS, ER, DVLA), one list each, in the order given on the slide: (i) some duplicates, some ex-pats, some deceased, short-term migrants; (ii) some ex-pats, some deceased; (iii) multiple registrations, some ex-pats, some deceased, short-term migrants; (iv) some ex-pats, some deceased, short-term migrants
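The "missing" and "extras" groups on this slide are two set differences between an administrative source and the resident population. A minimal sketch with made-up person identifiers follows; in practice the resident population is unknown and the comparison would first need record linkage, so every name and value here is hypothetical.

```python
# Hypothetical person identifiers; real sources would need record linkage first.
resident_population = {"p1", "p2", "p3", "p4", "p5"}
admin_source = {"p1", "p2", "p3", "x1", "x2"}  # x1, x2: e.g. ex-pats or deceased

missing = resident_population - admin_source   # residents not on the source (under-coverage)
extras = admin_source - resident_population    # on the source but not resident (over-coverage)
covered = resident_population & admin_source   # residents correctly on the source

under_coverage_rate = len(missing) / len(resident_population)
over_coverage_rate = len(extras) / len(admin_source)

print(sorted(missing))       # ['p4', 'p5']
print(sorted(extras))        # ['x1', 'x2']
print(under_coverage_rate)   # 0.4
print(over_coverage_rate)    # 0.4
```

A raw count from this source would equal the true population size (5), yet two of the five records are wrong and two residents are absent. That is why a matching count alone does not show that a source is accurate.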
2011 DWP CIS population counts compared with 2011 Census population estimates
2010 Electoral Roll population counts (age 18+) compared with 2011 Census population estimates
Criteria • Cost/benefit • Fitness for purpose i.e. ability to meet user requirements • Accuracy of the statistics produced • Frequency of the outputs e.g. updated annually, or every 5 or 10 years • Geographic level at which outputs can be produced e.g. local authorities, output areas • Consistency and comparability of the outputs across geographic areas • Technical & legal feasibility • Risk • Public burden and public acceptability.
Statistical benefit profile • [Chart: benefit over time (2011, 2021, 2031, 2041), Census compared with alternative method]
Cost profile (real terms) • [Chart: cost over time (2011, 2021, 2031, 2041), Census compared with alternative method (???)]
Quality standards • Population estimates (annually at ...) • P1. Maximum quality of Census • P2. Current variable (peak and trough) quality • P3. Current average quality • P4. Minimum quality in current system • Population attributes (of acceptable quality at ...) • A1. MSOA data every 10 years; LA data every 10 years • A2. LA data annually; MSOA data rolling 3 years; LSOA data rolling 5 years • A3. OA data every 10 years (& LSOAs, MSOAs, LAs)
[Options matrix] • Estimates of POPULATION: Admin Aggregate; Admin Part Linkage; Admin Full Linkage; Short form 100% every 10 years; Short form rolling ≈ 4%-10% • Characteristics (ATTRIBUTES): Long form ≈ 10%+ every 10 years; Long form ≈ 4%+ rolling; Long form 40% every 10 years; Long form 100% every 10 years
[Options matrix] • We could do a census • We know it is capable of delivering at least one of our quality standards
[Options matrix, ruled-out combinations marked x] • We wouldn’t do these as well • A short form at the same time is the only sensible combination
[Options matrix, ruled-out combinations marked x] • Another Long form / Short form census • This time an attribute survey of 40% of the population • Looks like it would satisfy our target quality (but not much saving over a full census): a possible option • Long form / Short form census • An attribute survey of only 10% every 10 years • Will deliver our minimum quality standard: an option, but probably a weak one (probably only minimal savings compared to a full census)
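The quality gap between a 10% and a 40% attribute survey comes largely from sampling error in small areas. The sketch below shows this under assumptions that are not in the slides: simple random sampling without replacement, a hypothetical small area of 300 residents, and an attribute held by 20% of them. A real attribute survey would use a more complex design, so the numbers are indicative only.

```python
import math


def proportion_standard_error(population: int, sampling_fraction: float, p: float) -> float:
    """Standard error of an estimated proportion under simple random
    sampling without replacement, with the finite population correction."""
    if population < 2:
        raise ValueError("population must be at least 2")
    n = round(population * sampling_fraction)
    if n < 1 or n > population:
        raise ValueError("sample size must be between 1 and the population size")
    fpc = (population - n) / (population - 1)  # finite population correction
    return math.sqrt(p * (1 - p) / n * fpc)


# Hypothetical small area of 300 residents, attribute held by 20% of them.
for fraction in (0.10, 0.40, 1.00):
    se = proportion_standard_error(population=300, sampling_fraction=fraction, p=0.2)
    print(f"{fraction:.0%} sample: standard error = {se:.3f}")
```

Under these assumptions the 10% sample gives a standard error of about 0.07 on a true proportion of 0.2, the 40% sample about 0.03, and the 100% case has no sampling error. This matches the slide's ordering of the options from weak to strong.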
[Options matrix, ruled-out combinations marked x] • Short form every 10 years, rolling survey • The American model: does look like it would satisfy one of our quality targets, so is an option
[Options matrix, ruled-out combinations marked x] • We could do a rolling census • Evidence is that this would deliver at least one of our quality targets
[Options matrix, ruled-out combinations marked x] • But these options don’t make sense • If we are running a rolling survey we might as well do the attribute survey at the same time
[Options matrix, ruled-out combinations marked x] • Aggregate options don’t deliver accurate enough statistics • They are interesting but fail all of our quality targets (P1, P2, P3)
[Options matrix, ruled-out combinations marked x, open questions marked ?] • Partial linkage is not good enough either • This will be further tested but is currently considered unlikely to be a universal option
[Options matrix, ruled-out combinations marked x, open questions marked ?] • Full linkage of admin data • We believe this can deliver quality for population estimates; the size of the attribute survey depends on proving the value of the small area data
Which option to choose? • [Options matrix, ruled-out combinations marked x, open questions marked ?] • We know that these can produce good estimates • We have yet to prove that these can • No unit level record linkage • Require us to link unit level data
Which option to choose? • [Options matrix, ruled-out combinations marked x, open questions marked ?] • We know that these can produce good estimates • We have yet to prove that these can • No unit level record linkage • Require us to link unit level data • COST / BENEFIT of small area attribute data • If we cannot prove that admin data works & is publicly acceptable, we have to stick with a variation upon the Census • Which option depends on the cost benefit case for the small area data
Which option to choose? • [Options matrix, ruled-out combinations marked x, open questions marked ?] • We know that these can produce good estimates • We have yet to prove that these can • No unit level record linkage • Require us to link unit level data • COST / BENEFIT of small area attribute data • If we can prove that admin data works & is publicly acceptable, an admin data based solution looks very attractive • The size and frequency of the attribute survey will depend on the quantified benefits of small area data
Which option to choose? • [Options matrix, ruled-out combinations marked x, open questions marked ?, annotated "Admin Linkage"]
Which option to choose? • [Options matrix, ruled-out combinations marked x, open questions marked ?] • Can we produce good population estimates from administrative data? • What is the public perception of data linkage vs. census methods? • How strong is the case for detailed population attribute data?
Key risks of non census alternatives • Public opinion • Public acceptability research, open consultation & transparency of approach • Technical challenge – cutting edge & a big change • Built a high quality team, talking to experts and strong external assurance • Changes in administrative datasets • Building resilient solutions – but may need legislation • Devolved issues / harmonisation of UK outputs • Establishment of UK Beyond 2011 Committee + NS / RGs agreement • Ensuring decision making – getting political buy-in • This is a big decision – needs real political commitment
Why do we need a business case? • Beyond 2011 will deliver a recommendation of best future method to deliver population and socio-demographic statistics • This recommendation will need to be underpinned by robust business case to secure funding for implementation. • HM Treasury will want to see evidenced economic case • £ value of benefits > £ costs • Not straightforward for statistics • ONS’s core funding did not cover the 2011 Census • A business case had to be made for 2011
2011 Census business case • Value of more efficient allocation = £446m over ten years • Value of benefit to private sector market researchers (Census data used in their products) and supermarkets due to more profitable store locations = £274m • Total benefit = £720m • Cost = £432m so clear economic case • Quoted other case studies but without quantified benefits
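The arithmetic behind "clear economic case" can be checked directly from the figures on this slide. The sketch below is only a worked sum: the variable names are illustrative, and the slide does not say whether the figures are discounted or over what period the cost falls, so no discounting is applied.

```python
# Figures in GBP millions, taken from the slide.
allocation_benefit = 446       # more efficient allocation, over ten years
private_sector_benefit = 274   # market researchers and supermarkets
cost = 432

total_benefit = allocation_benefit + private_sector_benefit
net_benefit = total_benefit - cost
benefit_cost_ratio = total_benefit / cost

print(total_benefit)                  # 720
print(net_benefit)                    # 288
print(round(benefit_cost_ratio, 2))   # 1.67
```

On these figures the quantified benefits exceed the cost by £288m, roughly £1.67 of benefit per £1 spent, which is the "£ value of benefits > £ costs" test described earlier in the deck.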
Key changes affecting benefit of population statistics (1 of 2) • Changes in government policy in England reduce the scale of benefit to central government departments • Health allocation to Clinical Commissioning Groups is largely according to registrations with GPs, not population characteristics of the area • CLG is allocating less funding (c£27B) than ODPM did for the 2011 business case (c£44B) and proposing to freeze allocations for 7-10 years • Private sector uses • Market research, investment uses probably broadly unchanged • For market researchers, ONS population data is good (it’s free to them) but bad (it’s free to everyone else)
Key changes affecting benefit of population statistics (2 of 2) • Relative costs • With a Census... • there are high fixed costs (for example distributing, collecting and scanning/processing returns) • the marginal cost of extra questions is low: we moved from 3 to 4 pages per person for an extra £30m in 2011 • If we can produce population estimates without a Census... • the marginal cost of attribute questions becomes much higher
What we need to do... • Current economic climate means business case will get more scrutiny than last time. • We will need to demonstrate the benefits clearly outweigh the costs to do more than the minimum necessary to meet legal requirements. • We expect the case for producing population estimates to Local Authority level to be strong • use in economic statistics (denominator for GDP, unemployment) • But the case for producing attribute and lower level data will be harder: • need the help of users to make this case (if there is one)
Three levels of quality standards for population attributes • LA and MSOA data every 10 years • LA data annually, MSOA data rolling 3 years, LSOA data rolling 5 years • LA, MSOA, LSOA and OA data every 10 years • CASE REQUIRED for greater frequency and greater geographic detail • Increasing quality will cost more. If we’re going to recommend a higher cost and quality option, we’ll need evidence from users about the value of the extra benefit. • In particular, the case for OA data.
The case for low level data - questions • Where is data adding real value? • What would go wrong if you didn’t have low level data? • If ONS did not provide it, what would you do instead? • Can you estimate the financial consequences? • How much is the data really worth to your business? • How will 2011 Census data be used to inform spending allocations? • What business decisions will be different? Really? • What difference does it REALLY make? Are you not just USED to having it? • For each of your uses, at what level does benefit really accrue?
Breakout session • Complete templates • Use • Variables • Geography: what level do you REALLY need? • Frequency: would you trade small area detail for greater frequency? • Value, and evidence • Uses of population estimates: WHERE QUANTIFIABLE BENEFIT • Uses of low level attribute data: WITH VALUE IF AT ALL POSSIBLE • MAKE THE CASE FOR WHAT YOU MOST CARE ABOUT
Contacts Neil Townsend neil.townsend@ons.gsi.gov.uk 01329 444252 Carol Harrison carol.harrison@ons.gsi.gov.uk 01329 444219