Methodology for producing the revised back series of population estimates for 1992 - 2000. Julie Jefferies, Population and Demography Division, Office for National Statistics
Outline of Presentation • Why did the back series need to be revised? • The approach taken in 2001, compared to 1991 • Explaining and quantifying the difference • The remaining difference • Possible methods for apportioning the remaining difference • Development of the final national method • The sub-national back series
1. Why did the back series need to be revised? • Population estimates provide estimates of the population in the years between censuses. • Following each census there is a new base or starting point. • A discontinuity occurs in the time series as a result of changing the base.
Population estimates for 1991-2001 (based on 1991 Census) and 2001 population estimate (based on 2001 Census)
2. Approach taken in 2001 vs. 1991
Following the 1991 Census:
• A method for revising the 1980s back series had already been selected prior to the census.
• Interim revised estimates were produced using the simple period method (easy and quick to calculate).
• Final revised estimates were produced using the more sophisticated linear cohort method.
There was no:
• examination of the reasons for the divergence
• evaluation of different methods
… and the final method used was much simpler!
Approach taken in 2001 vs. 1991
In 2001, a three-stage approach was used:
• Examine the reasons for the difference
• Quantify the impact of these reasons on the estimates over the previous decade and adjust the back series
• Apportion the remaining difference: examine a range of methods, then select the most appropriate method
3. Explaining and quantifying the difference
The difference may be caused by:
• Issues with using the 2001 Census data (in its raw form) as a base for the mid-2001 population estimates.
• Accumulated error in the population estimates over the intercensal period (population drift). Possible causes: shortcomings in methodology or data sources, definitional issues.
2001 Census data • A number of studies examining the reason for the difference were carried out. These included: • Demographic analysis of sex ratios, fertility, mortality and migration • Analysis of the Longitudinal Study • Comparisons with administrative sources • Investigation of census data and processes • Matching studies of address lists collected by local authorities and those held by census • The Local Authority Population Studies
Impact
Conclusion: an adjusted Census base should be used for the mid-2001 population estimates. Hence the final rebased mid-2001 population estimate (September 2004) was 275,000 higher than the original rebased estimate:
• 193,000 due to the Longitudinal Study (LS) adjustment and other adjustments in September 2003.
• 82,000 due to the Local Authority Population Studies and consequential adjustments.
Intercensal population estimates
Two quantifiable sources of error were also identified in the population estimates:
• The mid-1991 population estimates were too high because they included too large an adjustment for undercoverage in the 1991 Census.
• Difficulties in the estimation of international migration during the 1990s resulted in an overestimation of population growth.
Impact
• The mid-1991 population was revised downwards and rolled forward over the decade. The rolled-forward mid-2001 population estimate was reduced by 351,000.
• Following thorough methodological research, international migration estimates for the 1990s were revised. The rolled-forward estimate for mid-2001 was reduced by 305,000.
4. The remaining difference
Remaining difference = 209,000
Possible causes, e.g.:
• issues to do with the concept and measurement of usual residence (including changes in residence status that do not involve a migration)
• remaining differences in estimating international migration
• births to non-resident mothers
It is not possible to separately quantify these causes at present.
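The adjustments quoted so far can be tallied in a few lines of Python. This is only a sketch: the slides do not state the original gap between the rolled-forward and rebased mid-2001 estimates, so the figure computed here is implied under the assumption that the rolled-forward estimate was the higher one and that the adjustments are simply additive. The variable names are illustrative.

```python
# Figures quoted in the slides (persons).
census_base_uplift = 193_000 + 82_000   # raises the rebased (2001 Census based) estimate
mid_1991_revision = 351_000             # lowers the rolled-forward estimate
migration_revision = 305_000            # lowers the rolled-forward estimate
remaining_difference = 209_000          # left to be apportioned

# Assumption (not stated in the slides): every adjustment narrows the same gap,
# so the explained part is a plain sum.
explained = census_base_uplift + mid_1991_revision + migration_revision
implied_original_gap = explained + remaining_difference

print(f"Census base uplift:   {census_base_uplift:,}")
print(f"Explained difference: {explained:,}")
print(f"Implied original gap: {implied_original_gap:,}")
```

The first printed figure reproduces the 275,000 quoted for the adjusted Census base; the other two depend on the additivity assumption.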
5. Apportioning the remaining difference • Two main methods: • period • cohort • Within these methods, choice of: • simple (linear) • weighted (by…..)
Period
Period effect: where the error is related to a particular age, i.e. the estimates for those of that age are drifting further and further away from the truth.
E.g. each year we were underestimating the number of students (age 18) leaving an area to go to university or leaving the UK on a gap year.
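Simple period example 1
For 20 year old males, the difference between the rolled forward 2001 estimates and the rebased 2001 estimates is 6,240.
Actual error each year: 6,240 / 10 = 624 (the simple method divides the difference equally across the 10 years).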
Simple period example 2 Accumulated error: This is what the existing back series estimate needs to be adjusted by.
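The simple period example can be sketched in Python. This is an illustration rather than ONS code: it assumes the difference accrues in ten equal steps between mid-1991 and mid-2001, and the function name and parameters are hypothetical. The sign of the adjustment depends on how the difference is defined, which the slides do not state, so only magnitudes are shown.

```python
def simple_period_adjustments(difference_2001, base_year=1991, end_year=2001):
    """Accumulated error for one age, for each mid-year in the back series.

    Under a period effect the same age is affected every year, so the
    estimate for that age in year t has drifted by (t - base_year) equal steps.
    """
    span = end_year - base_year                 # 10 years between censuses
    per_year = difference_2001 / span           # equal error in each year
    return {
        year: per_year * (year - base_year)
        for year in range(base_year + 1, end_year)   # 1992 .. 2000
    }

# Males aged 20: difference of 6,240 at mid-2001 (figure from the slide).
period_adj = simple_period_adjustments(6240)
for year, value in period_adj.items():
    print(year, value)
```

Each value is the amount by which the existing estimate for 20 year old males in that year would be adjusted.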
Cohort Cohort effect: where the error is related to a particular group of people i.e. the error for this birth cohort built up gradually over the decade as they got older. E.g. in the rolled forward 2001 estimates, we have too many 45 year old males. This could be because over the decade some people born around 1956 spent periods of time abroad and were not identified as out-migrants.
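Linear cohort example 1
For 45 year olds, the difference between the rolled forward 2001 estimates and the rebased 2001 estimates is 2,860.
Actual error each year: 2,860 / 10 = 286 (the linear method divides the difference equally across the 10 years).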
Linear cohort example 2 Accumulated error: This is what the existing back series estimate needs to be adjusted by.
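The linear cohort example can be sketched in the same way. Again this is an illustration under stated assumptions, not the ONS implementation: the error is taken to build up in ten equal steps, the function name is hypothetical, and cohorts born after mid-1991 (which would need a shorter span) are simply skipped.

```python
def linear_cohort_adjustments(difference_2001, age_in_2001,
                              base_year=1991, end_year=2001):
    """Accumulated error for one birth cohort, keyed by (year, age).

    Under a cohort effect the error follows the same group of people,
    so the adjustment in year t applies to the age the cohort had in year t.
    """
    span = end_year - base_year
    per_year = difference_2001 / span
    adjustments = {}
    for year in range(base_year + 1, end_year):      # 1992 .. 2000
        age = age_in_2001 - (end_year - year)        # cohort's age in that year
        if age < 0:
            continue                                 # cohort not yet born: not handled here
        adjustments[(year, age)] = per_year * (year - base_year)
    return adjustments

# Cohort aged 45 at mid-2001: difference of 2,860 (figure from the slide).
cohort_adj = linear_cohort_adjustments(2860, age_in_2001=45)
for (year, age), value in cohort_adj.items():
    print(year, age, value)
```

The contrast with the period sketch is in the key: the adjustment moves diagonally through ages 36 to 44 rather than staying at a single age.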
Period + cohort combination? • Generally we pick whichever effect is likely to be dominant or best approximates the true situation. • A combination of both the period and cohort effects may be closest to reality. • Using a combination method is complex - need to decide for each age group what proportion of the error is due to a period effect and what is due to a cohort effect. Then need to apply constraints to ensure that the final error by age is correct. • For the 1992 to 2000 back series we started out using a cohort method (more later)…..
Simple (linear) vs. weighted
Examples so far have assumed a simple (or linear) effect. A linear method:
• weights each year of the decade equally (divides the difference by 10)
• is easier to calculate and understand than a weighted method
• assumes whatever is causing the difference will have an equal impact in each year, which may not be the case
The weighted method
A weighted method:
• weights each year of the decade by a different amount, i.e. allocates a varying amount of the difference to each year
• may be appropriate if the difference is likely to be driven by or correlated with a quantifiable factor, and this factor varies over time or by age (or both)
• is much more complex than a linear method
6. Developing the final national method
The intercensal drift was thought to be correlated with migration (in particular out-migration from an area). We know that:
• Propensity to migrate varies with age
• Levels of migration change over time
Apportion the difference back over the cohort according to propensity to out-migrate by age (International Passenger Survey, IPS, data). In addition, weight the difference according to the level of migration (all ages over time).
Calculating migration age weights 1 IPS out-migration data for males:
Calculating migration time weights
Calculate grossing factors to show how out-migration for each year compares to the average out-migration for the decade.
Total out-migration for decade (all ages) = 1,035,604
Average migration per year = 103,560
Grossing factor for 1992 = migration in 1992 (106,875) / 103,560
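The grossing factor arithmetic for 1992 can be checked directly. The sketch below uses only the figures quoted on the slide; the variable names are illustrative, and the figures for the other years are not given, so only 1992 is computed.

```python
total_out_migration = 1_035_604      # all ages, whole decade (from the slide)
years_in_decade = 10
out_migration_1992 = 106_875         # from the slide

average_per_year = total_out_migration / years_in_decade
grossing_factor_1992 = out_migration_1992 / average_per_year

print(round(average_per_year))            # average migration per year
print(round(grossing_factor_1992, 3))     # factor above 1: 1992 was above the decade average
```

A factor above 1 means that year is allocated more than an equal tenth of the difference; a factor below 1 means less.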
Age weightings: Time weightings:
Applying the weights 4 This is what the existing back series estimate needs to be adjusted by.
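One way the age and time weights might be combined is sketched below. The slides' own tables are not reproduced here, so this is an assumption rather than the published formula: each year's share of a cohort's difference is taken to be proportional to (age weight at the cohort's age that year) × (time weight for that year), with shares normalised to sum to one. The function name, the keying of time weights by the end year of each mid-year to mid-year step, and the uniform weights in the usage example are all hypothetical.

```python
def weighted_cohort_adjustments(difference_2001, age_in_2001,
                                age_weight, time_weight,
                                base_year=1991, end_year=2001):
    """Accumulated error by year for one cohort, using age and time weights.

    age_weight:  function mapping an age to a relative out-migration propensity
    time_weight: dict mapping each year (base_year+1 .. end_year) to a grossing factor
    """
    years = range(base_year + 1, end_year + 1)        # ten annual steps
    raw = {}
    for year in years:
        age = age_in_2001 - (end_year - year)         # cohort's age in that year
        raw[year] = age_weight(age) * time_weight[year]
    total = sum(raw.values())

    accumulated, running = {}, 0.0
    for year in years:
        running += difference_2001 * raw[year] / total   # this year's share
        accumulated[year] = running
    return accumulated

# Usage check with uniform (placeholder) weights: the result should collapse
# to the linear cohort method, i.e. 286 per year for a difference of 2,860.
uniform_time = {year: 1.0 for year in range(1992, 2002)}
weighted_adj = weighted_cohort_adjustments(2860, 45, lambda age: 1.0, uniform_time)
print(weighted_adj[1992], weighted_adj[2000], weighted_adj[2001])
```

Normalising the shares guarantees that the accumulated error reaches the full 2001 difference at the end of the decade, whatever weights are supplied.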
Story so far….
The 1992-2000 back series was revised using a:
• Cohort method
• Weighted by out-migration
• Migration weights varied by both age and time
QA: the weighted cohort method worked well for nearly all ages, but there were still some issues with teenagers.
Following QA and further research, a period adjustment for teenagers was included, so the final method was a weighted combination method!
Period adjustment 1 • Introduced to address a specific issue for 18 and 19 year olds • Analysis of the results obtained using the weighted cohort method suggested that there was a significant period effect associated with these ages which had not been allowed for • This is possibly due to people taking ‘gap years’ abroad at ages 18 and 19
Period adjustment 2
• A proportion of the difference observed at ages 18 and 19 was allocated using a time-weighted period method.
• This proportion was determined by comparing the relative size of measured migration at ages 18 and 19 with migration levels at younger ages.
• The remainder of the difference was allocated using the cohort method.
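The split between the period and cohort parts can be sketched as follows. The actual proportion used for ages 18 and 19 is not given in the slides, so the share, the difference, and the uniform time weights in the usage lines are placeholders, and both function names are hypothetical.

```python
def split_difference(difference_2001, period_share):
    """Split a 2001 difference into a period part and a cohort part."""
    if not 0.0 <= period_share <= 1.0:
        raise ValueError("period_share must be between 0 and 1")
    period_part = difference_2001 * period_share
    cohort_part = difference_2001 - period_part
    return period_part, cohort_part

def time_weighted_period(period_part, time_weight):
    """Accumulated period error by year, allocated in proportion to time weights."""
    total = sum(time_weight.values())
    accumulated, running = {}, 0.0
    for year in sorted(time_weight):
        running += period_part * time_weight[year] / total
        accumulated[year] = running
    return accumulated

# Placeholder inputs: a difference of 1,000 with a quarter treated as a period effect.
period_part, cohort_part = split_difference(1000.0, 0.25)
period_by_year = time_weighted_period(period_part,
                                      {year: 1.0 for year in range(1992, 2002)})
print(period_part, cohort_part, period_by_year[2001])
```

The cohort part would then be apportioned with the cohort method, so the two parts together still account for the whole difference at that age.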
7. Sub-national back series (published Oct 04) • Each local authority was calculated separately using the same method as for the national estimate. • For the age weights, the national distribution was used. • For the time weights, both international out-migration (IPS) and internal out-migration were used. • Final LA estimates for each year were constrained to the national estimate. • QA: sex ratios and time series.
Contact details: www.statistics.gov.uk/popest • email: pop.info@ons.gsi.gov.uk • tel: 01329 813318
Any questions?