Reconciling the Census and rolled forward Mid-year estimates at Local Authority Level Population Quality Unit April 2013
Aims of this presentation • To show how the rolled forward MYEs performed over the decade • To show why we think some of the inter-censal difference is due to international migration • To provide new insight into internal migration issues, particularly for Students and School Boarders • To discuss how real methodological improvements can have some unexpected impacts on the mid-year estimates for some local authorities • To demonstrate a joined up way of understanding how issues with the MYEs interact together
Component cohort MYEs • Take population of previous period • Remove special populations (prisoners, armed forces, school boarders) • Age everyone forward (1 year per year!) • Update special populations • Add Births • Take away Deaths • Account for international migration • Account for internal migration
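The steps above can be sketched in code. This is a minimal illustration rather than the ONS implementation: the function name, the argument names and the treatment of the top age as an open-ended group are assumptions, and the figures in the usage note are made up.

```python
def roll_forward(pop, special, new_special, births, deaths,
                 intl_net, internal_net):
    """Roll a single-year-of-age population forward by one year.

    Every argument except `births` is a list indexed by age (0..max_age).
    The last age is treated as an open-ended group (an assumption).
    """
    max_age = len(pop) - 1

    # Remove special populations; they are not aged on with everyone else.
    base = [p - s for p, s in zip(pop, special)]

    # Age everyone forward one year. The top age group keeps its own
    # members and absorbs those ageing into it.
    aged = [0] * (max_age + 1)
    for age in range(max_age):
        aged[age + 1] += base[age]
    aged[max_age] += base[max_age]

    # Add births at age 0.
    aged[0] += births

    # Add back the updated special populations, take away deaths, then
    # account for net international and net internal migration.
    return [
        a + s - d + i + m
        for a, s, d, i, m in zip(aged, new_special, deaths,
                                 intl_net, internal_net)
    ]
```

With a three-age toy population, `roll_forward([10, 20, 30], [0, 5, 0], [0, 5, 0], 12, [1, 1, 2], [0, 2, 0], [1, -1, 0])` returns `[12, 15, 43]`.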
The 2011 Census provides the best estimates because.... • In this presentation the 2011 Census is assumed to provide more accurate estimates of the population than any other source. This is why.... • The 2011 Census was a well-designed Census with successful fieldwork; there is consensus that it was a good Census • Rolled forward estimates for 2011 include sampling and other errors from the 2001 Census plus uncertainty from all other components of change for 10 years • The 2011 Census provides estimates subject to sampling error (typically 95% CIs of +/- 0-2% for most LAs) • Differences between the Census and rolled forward MYEs are more likely to indicate issues with the rolled forward MYEs than with the Census (more potential sources of error) • Some differences will have no specific cause and simply reflect sampling error
Reconciling the MYEs and the Census: England and Wales • Rolled forward MYEs 464,200 (0.8%) lower than Census
Percentage difference between Census based and rolled forward MYEs for LAs • 88% of local authority rolled forward estimates within 5% of Census equivalent, 62% within 2.5% • LAs labelled on the chart: Brent, Newham, Forest Heath, Westminster, City of London • Note. Rolled forward MYEs before indicative international migration
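Headline shares such as "88% within 5%" can be reproduced from pairs of estimates. The sketch below assumes the percentage difference is measured relative to the Census based estimate, which is how the slides phrase it; the function names and example figures are hypothetical.

```python
def pct_difference(rolled_forward, census_based):
    """Percentage by which the rolled forward MYE differs from the
    Census based estimate (positive means the MYE is higher)."""
    return 100.0 * (rolled_forward - census_based) / census_based


def share_within(pairs, threshold_pct):
    """Percentage of LAs whose absolute difference is within the threshold.

    `pairs` is a list of (rolled_forward, census_based) tuples, one per LA.
    """
    within = sum(
        1 for rf, cb in pairs
        if abs(pct_difference(rf, cb)) <= threshold_pct
    )
    return 100.0 * within / len(pairs)


# Four made-up LAs with differences of +2%, +10%, -3% and 0%.
example = [(102, 100), (110, 100), (97, 100), (100, 100)]
print(share_within(example, 5.0))   # share within +/-5%
print(share_within(example, 2.5))   # share within +/-2.5%
```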
Differences between the rolled forward MYEs and Census for LAs • Some of the differences due to issues highlighted in national reconciliation • EU8 migration, armed forces, asylum seekers etc. • Some of the issues are distributional • International migrants • Internal migrants
Distribution of international immigrants • Indicative improvements to the distribution of international immigration in November 2011 moved the rolled forward MYEs nearer to the Census based MYEs • Some of the difference between rolled forward and Census based estimates is due to the distribution of international migrants • Some of the remaining discrepancies would probably be resolved if the new method was available back to 2001
Percentage difference between Census based and rolled forward MYEs for LAs (using indicative MYEs) • 90% of LAs within +/-5%, 70% within +/-2.5%
The importance of age/sex structure • At aggregate level the rolled forward MYEs were generally within +/- 5% of their Census equivalent (88% of LAs) • Differences between Census based and rolled forward estimates vary by age and sex • Age/sex structure is not only important but also provides evidence for how inter-censal discrepancies developed over the decade
Impact of improved distribution of international immigration on inter-censal discrepancies (E&W) Note. For each single year of age this chart shows the average absolute distance between the MYEs and the Census. A smaller distance means the MYEs are closer to the Census
Impact of improved distribution of international immigration on inter-censal discrepancies (Wales) Note. For each single year of age this chart shows the average absolute distance between the MYEs and the Census. A smaller distance means the MYEs are closer to the Census
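The measure described in the chart notes can be sketched as follows. The notes do not say what the average is taken over, so this sketch assumes it is the mean across local authorities of the absolute difference at each single year of age; the function name and the data layout are hypothetical.

```python
def mean_abs_distance_by_age(mye, census):
    """Average absolute distance between MYE and Census at each age.

    `mye` and `census` map an LA name to a list of counts indexed by
    single year of age. The average is taken across LAs (an assumption).
    """
    las = list(census)
    n_ages = len(census[las[0]])
    return [
        sum(abs(mye[la][age] - census[la][age]) for la in las) / len(las)
        for age in range(n_ages)
    ]


# Two made-up LAs and two ages.
mye = {"A": [10, 20], "B": [30, 40]}
census = {"A": [12, 20], "B": [27, 44]}
print(mean_abs_distance_by_age(mye, census))
```

Running the same function on the original and the indicative MYEs gives two series to compare; the one with smaller values is closer to the Census.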
What have we been doing? • Our role has been to try to understand why rolled forward estimates for specific local authorities are different to Census based estimates • Our work has found that internal migration is often a significant driver of inter-censal discrepancies for most local authorities
Quick refresher - Measuring internal migration • Use changes in GP patient registers (PRDS) to drive changes in the MYE • But not all moves are picked up, so use the difference between NHSCR and PRDS moves at health authority level to constrain PRDS • But young people (young men, students) are slow to register at a GP • Adjust inflows for students to HESA data • Use the 2001 Census to adjust for post-study moves • Counter adjustment
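The slides do not spell out how PRDS moves are constrained to NHSCR. One simple possibility, shown purely as an illustration, is proportional scaling: LA-level PRDS moves within a health authority are multiplied by a common factor so that they sum to the NHSCR total. The function name and figures are hypothetical, and the actual method may differ.

```python
def constrain_to_total(prds_moves, nhscr_total):
    """Scale LA-level PRDS moves so they sum to the NHSCR total for the
    health authority. Proportional scaling is an assumption here.

    `prds_moves` maps an LA name to its PRDS move count.
    """
    prds_total = sum(prds_moves.values())
    if prds_total == 0:
        # Nothing to scale; return the moves unchanged.
        return dict(prds_moves)
    factor = nhscr_total / prds_total
    return {la: moves * factor for la, moves in prds_moves.items()}


# PRDS records 200 moves across two LAs, NHSCR records 250 in total.
print(constrain_to_total({"LA1": 80, "LA2": 120}, 250))
```

Scaling adds the missed moves in proportion to the moves already observed, which is one reason it can over-correct in an LA where a separate student adjustment has already added moves.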
In most circumstances internal migration works well • It works well for women • It works well for most age groups for men • It works well in most places • Student adjustment beneficial in many circumstances • Problems measuring moves by young men, long lags, wholly missed moves • Student adjustments not always successful
Internal migration, Lags • If a person moves but doesn’t re-register at a GP, internal migration will not pick up the move • Eventually most people do re-register, but the lag between the actual move and its measurement leads to an overestimate in their origin LA and an underestimate in their destination LA • If people move for a short time (such as sandwich students) we can miss both their outward move and their return move • People are less likely to re-register if they move short distances
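A toy model makes the effect of lags concrete. It assumes a group of movers leaves an origin LA at the start of year 1 and that a fixed fraction of those still unregistered re-register each year. The re-registration rate and the function name are hypothetical and are not taken from the presentation.

```python
def lagged_registration(movers, reregister_rate, years):
    """Return the number of movers still unregistered at the end of each
    year. That backlog is the overestimate in the origin LA and, with the
    opposite sign, the underestimate in the destination LA.
    """
    backlog = float(movers)
    history = []
    for _ in range(years):
        # A fixed share of the remaining backlog re-registers each year.
        backlog -= backlog * reregister_rate
        history.append(backlog)
    return history


# 1,000 movers, half of the remaining backlog re-registering each year.
for year, backlog in enumerate(lagged_registration(1000, 0.5, 3), start=1):
    print(year, "origin error:", backlog, "destination error:", -backlog)
```

The two errors always sum to zero, which is why the problem is invisible at national level but visible for individual LAs.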
Internal migration: countering lags • Student adjustment issues • Generally positive in accounting for lagged moves • Occasionally the student adjustments exacerbate the problem, generally in LAs with high proportions of students with ‘unusual migration habits’ • Duplication, student adjustment and constraining to NHSCR • Both attempting to account for missed moves, occasionally these can over-do things
Not just an issue for student areas... • If there are too many students/graduates in student areas there are too few graduates in other areas (internal migration is zero sum) • Always more visible in student areas (concentration) • Long lags + Census lead to duplicate moves • Difficulties in measuring student migration don’t just affect student LAs
Something to bear in mind.... “the most important fact in demography, we all get one year older every year” Norman B. Ryder
Oadby and Wigston (1) • Home to the halls of residence of Leicester University • Why use it for an example? • It’s a small local authority with lots of students, problems stand out • Most students in Oadby and Wigston are first years who will spend their 2nd & 3rd years in Leicester • Difference between rolled forward and Census based estimates is considerable • It’s an extreme case but one which clearly demonstrates issues with internal migration
Oadby and Wigston (2) • Students arrive aged 18/19 • Many students move less than 3 miles in their second year but cross an LA boundary • Male students don’t re-register in Leicester, leading to overestimation of 20-29 year olds • Student adjustment increases inflow to Oadby, it doesn’t increase the outflow • Not impacting the outflow is probably correct (no graduate outflow from Oadby) • But all those added to Oadby are also largely invisible to internal migration, dead weight
Oadby and Wigston (3) • Aggregate compensation • Overestimate of 20 somethings is ‘balanced’ by an underestimate of 30 somethings • Both the overestimate and underestimate are generated by the same issue, lags in re-registering at a GP • The MYEs have this aggregate compensation because they have the lags associated with the patient register which combine with the rebasing to Census
Male population of Oadby and Wigston, selected years • Chart annotations: peak builds (overestimate); trough builds (underestimate)
Male population of Oadby and Wigston, drivers of discrepancies (chart of number of people by age) • 1. (Net) inflow above and beyond reality (student adjustment, constraining of PRDS to NHSCR) • 2. Lags in moving students out after their first year • 3. Long lags lead to extra moves out in the period after Census • The overestimate caused by lags leads to an equal but opposite underestimate (yellow) for the same cohort 10 years later
What’s causing this? 1. Too much inflow 2. A lag in the outflow 3. Net flow too positive
Oadby and Wigston: What about the student adjustments? (chart of number of people by age)
Oadby and Wigston Summary (real population changes compared with MYE population change) • Missed moves out • Initial inflow too high: overestimate of student population (age 18/19, 1st year of study) • Missed moves between Oadby and Leicester: overestimate of student population (age 19/20, 2nd year of study) • Lags in removing student population at end of studies: overestimate of graduate population, same cohorts (age 22-29, post study) • Rebase to Census: estimate correct • Missed moves prior to Census realised after Census: underestimate in decade following Census (age 30-39)
This matters for all local authorities... • The same mechanisms causing problems in Oadby are at work elsewhere • Oadby and Wigston is an extreme case which shows up these issues clearly • Using ‘simple’ local authorities allows us to really understand issues as there are no/few confounding factors
School boarders (1) • Intention to capture school boarder moves via special population adjustment, based on guidance given to families of boarding school pupils • However, school boarders were also registering at GPs, meaning that most moves in and out were being captured twice • Children in LAs with school boarders will be overestimated • Those in their late 20s will be underestimated (aggregate compensation)
School boarders (2) • The Census fixes the overestimate of school boarders but it doesn’t fix the cause of the problem (duplication of flows) • The Census corrects the population base (removing an overestimate of those aged 18) • A year later, when the 19 year olds leave the area, the methodology removes the school boarder population twice. Underestimate!
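The sequence of errors for a single boarder cohort can be written out as simple arithmetic. The sketch assumes every boarder move is counted exactly twice, which simplifies the slide's statement that "most" moves were captured twice; the function name and figures are hypothetical.

```python
def boarder_discrepancy(n_boarders):
    """MYE minus true population for one boarder cohort at three stages:
    while at school, just after Census rebasing, and after leaving.

    Assumes every move in and out is counted twice (a simplification).
    """
    # Inflow is counted once via special population and once via GP
    # registration, so the MYE adds twice the true inflow.
    at_school = 2 * n_boarders - n_boarders

    # The Census rebases the population, removing the overestimate.
    after_census = 0

    # Outflow is also counted twice, so one extra cohort is removed
    # from a base that the Census had already corrected.
    after_leaving = after_census - (2 * n_boarders - n_boarders)

    return at_school, after_census, after_leaving


print(boarder_discrepancy(100))
```

The rebasing fixes the level but not the duplicated flow, so the same size of error reappears with the opposite sign once the cohort leaves.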
Rutland: An example of the school boarder issue • Why Rutland? • Rolled forward MYE 2.4% higher than Census based • It’s a small local authority (<35,000) with lots of school boarders (>1,000) • Females used, as the male population aged over 18 includes a number of armed forces personnel and prisoners (confounding factors) • School boarder population with pupils aged up to 18 • TFR in Rutland based on rolled forward MYEs was 2.58, in top 3% of LAs. Either very fertile or missing women • TFR from Census was 1.9: extra women (not fewer births). Why had we missed women?
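The link between missing women and a high TFR follows from the conventional definition of the total fertility rate as the sum of age-specific fertility rates (births to women of a given age divided by the number of women of that age). The sketch below uses three made-up ages rather than the full childbearing range, and none of the figures are Rutland data.

```python
def total_fertility_rate(births_by_age, women_by_age):
    """Sum of age-specific fertility rates across single years of age."""
    return sum(b / w for b, w in zip(births_by_age, women_by_age))


births = [25, 50, 25]              # registered births by mother's age
women_true = [100, 100, 100]       # women actually resident
women_undercount = [50, 100, 100]  # youngest age underestimated

print(total_fertility_rate(births, women_true))
print(total_fertility_rate(births, women_undercount))
```

Births are registered independently of the population estimates, so an underestimate of women in the denominator raises the rate even though fertility itself has not changed.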
Females in Rutland, selected years • Chart series: 2001 population aged on 10 years; change indicated by Census; extra change indicated by MYE
School Boarder Summary (real population changes compared with MYE population change) • Inflow of boarders double counted (overestimate of cohorts of school age) • Census rebasing removes double counted moves (estimate correct) • Outflow of boarders double counted (underestimate of former school boarder cohorts, 19-28 year olds)
What if the student and school boarder issues occur together? • In student areas we tend to overestimate 20-29 year olds • In school boarder areas we tend to underestimate 20-29 year olds • In an area with both students and school boarders we could end up with the right estimate because of compensating errors • This raises a tricky philosophical problem: if we fix one problem but not the other we can adversely affect the estimates for some LAs, even if the method itself is better
[Figure: school boarders and students combined, one cohort followed from 1993 (age 10) to 2008 (age 25). Chart labels: base population; via special population; via internal migration; students; graduate overestimate; actual population (what the population would be expected to be if moves were measured correctly); population (boarders and students)] • The 11 year olds starting boarding school get counted in twice, once via special population and once via internal migration, creating an overcount • This overcount remains until the Census in 2001, when it is corrected • The Census rebases the estimates, meaning that the estimate of boarders should now be correct • School boarders actually leave at age 19. Their moves out of the population now take place, and the double count is removed from the population despite having already been corrected by the Census. This causes an undercount • Students enter the area at age 19 to start university. Their moves are recorded and should be fairly accurate. The total population will still be underestimated, however, because school boarders have been removed from the population twice • This underestimate remains fairly constant for the years that this age cohort is at university • A lag often occurs when students graduate, between when they leave and when they re-register at their new place of residence. This can create an overestimate of graduates. Because of the effect of the boarders, however, the estimate is in fact very close to the actual population size, despite the graduate overestimate • Over time graduates gradually re-register at their new places of residence, causing the graduate overestimate to shrink. This moves the estimate further away from the actual population again
Why improving methods doesn’t lead to universal improvement (1) • Lots of local authorities with school boarders • Lots of local authorities with students • Some have both • If we fix school boarders we make school boarder areas better • But, in areas with both school boarders and students we end up uncovering a compensating error with students, this can make the estimates for some Local authorities worse
Why improving methods doesn’t lead to universal improvement (2) • A second implication of fixing school boarders is that we can uncover the national underestimate of children in the MYEs – particularly in areas that attract international migrants.
Why improving methods doesn’t lead to universal improvement (3) • We know that 20-29 year olds in student areas are prone to overestimation • We know that this must lead to underestimation elsewhere (if related to internal migration – zero sum) • We can apply an adjustment for student areas which reduces discrepancies • BUT, we don’t know exactly where the excess students/graduates should be • We fix the issue in student areas but shift/spread the discrepancy elsewhere
‘De-combined’ Error Effects Use simple rules based on relationships between data series to show where the ‘de-combined’ errors will impact and how much impact they have using a basic index.
‘De-combined’ Error Effects Through examining a sample of these dynamic outputs it is possible to see where error effects tend to occur most frequently. By drawing these together a static summary can be produced that gives users a quick guide to where the de-combined error effects are likely to arise in any LA. • For example: • An England & Wales view that illustrates potential errors might look like the figure opposite: • Base errors might occur at any age (after 10yrs) but tend to be rare for female data at any age band • Internal migration errors tend to occur between 0-39yrs but with IPS (in) showing errors at almost every age. • Student errors are 15-49yrs (male) as this reflects both the early over-estimate and the later under-estimate... • ...and so on with much of this still in the very rough stages of development.
Summary (1) • The Census shows that in general the rolled forward MYEs were close to Census estimates for the majority of LAs (88% within +/-5%) • Some of the difference between rolled forward and Census based estimates for LAs is due to the national discrepancy • Some is due to the geographic distribution of international migrants (90% of LAs within +/-5% using indicative MYEs) • Our work shows that for many LAs issues with internal migration can explain some of the remaining difference • Sometimes the causes of discrepancies come from unlikely places (school boarder issues can affect 28 year olds)
Summary (2) • Internal Migration issues often involve an overestimate for one age group (20-29 year olds) and an underestimate for another (30-39 year olds) • this limits the size of the discrepancy at the aggregate level • means that the full impact of these issues can be difficult to disentangle • When we make elements of MYEs better we can uncover issues with other elements of the MYEs; estimates for most LAs will get better but for a few LAs it’s possible the estimates could become less accurate
Some Questions - • Have you carried out similar analysis on population change for your area? • Do you have other evidence of particular population groups that cause similar issues? • Are the messages about the complexity of interaction of the different components useful? • Do compensating errors matter to you? – That is, are the components important as well as the resulting estimate?