
Methods of interpolating data to create long-run time series

Ian Gregory (University of Portsmouth) & Paul Ell (Queen’s University, Belfast)


Presentation Transcript


  1. Methods of interpolating data to create long-run time series
     Ian Gregory (University of Portsmouth) & Paul Ell (Queen’s University, Belfast)

  2. Administrative Units in England and Wales from 1801
     “Minor” changes:
     • Registration Districts (1840-1910): 400
     • Local Govt. Districts (1890s-1972): 4,000
     • Parishes (1876-1972): 20,000

  3. The Newport area, 1911

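  4. Creating a standard geography
     • Areal weighting:
     • Assumption: variable y is homogeneously distributed across the source zones
     • Using this, y is allocated to each target zone in proportion to the area of the source-zone fragments that fall within it
     • BUT: this is a very unrealistic assumption.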
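
Areal weighting can be sketched in a few lines. This is an illustration only: the function name and the `fragments` layout (a mapping from a (source zone, target zone) pair to the area of their overlap) are assumptions made for the example, not something taken from the slides.

```python
def areal_weighting(source_totals, fragments):
    """Allocate each source zone's y to target zones in proportion to area.

    source_totals: {source_zone: y}
    fragments:     {(source_zone, target_zone): overlap_area}  (assumed layout)
    Assumes the fragments listed for a source zone cover that zone completely.
    """
    # Total area of each source zone, summed over its fragments.
    source_area = {}
    for (source, _target), area in fragments.items():
        source_area[source] = source_area.get(source, 0.0) + area

    # Each fragment receives the share of y matching its share of the area.
    estimates = {}
    for (source, target), area in fragments.items():
        share = source_totals[source] * area / source_area[source]
        estimates[target] = estimates.get(target, 0.0) + share
    return estimates


# Invented figures: district S1 straddles two target zones, S2 lies inside T2.
example = areal_weighting(
    {"S1": 100.0, "S2": 40.0},
    {("S1", "T1"): 3.0, ("S1", "T2"): 1.0, ("S2", "T2"): 5.0},
)
print(example)
```

The homogeneity assumption is visible in the single line that computes `share`: nothing but area decides where y goes.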

  5. Other sources of information (1)
     1. Dasymetric technique:
     • There were 15,000 parishes, as opposed to 600/1,500 districts
     • Total population is available at this scale
     • Assumptions:
       • The distribution of y follows the distribution of the total population
       • Parish-level population is homogeneously distributed
     • Problem: most districts in towns and cities consist of only one parish. In 1911, 30% of the population lived in districts that consisted of only one parish
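
A minimal sketch of the dasymetric idea under the two stated assumptions: y follows total population, and population is spread evenly within each parish. The function name and input layouts are hypothetical, chosen for the example.

```python
def dasymetric(source_totals, parishes, fragments):
    """Allocate district-level y to target zones using parish populations.

    source_totals: {district: y}
    parishes:      {parish: (district, population)}        (assumed layout)
    fragments:     {(parish, target_zone): overlap_area}   (assumed layout)
    Assumes the fragments listed for a parish cover that parish completely.
    """
    # Area of each parish, summed over its fragments.
    parish_area = {}
    for (parish, _target), area in fragments.items():
        parish_area[parish] = parish_area.get(parish, 0.0) + area

    # Total population of each source district, summed over its parishes.
    district_pop = {}
    for _parish, (district, pop) in parishes.items():
        district_pop[district] = district_pop.get(district, 0.0) + pop

    estimates = {}
    for (parish, target), area in fragments.items():
        district, pop = parishes[parish]
        # Population is homogeneous within the parish: split it by area.
        pop_in_fragment = pop * area / parish_area[parish]
        # y follows population: the fragment gets the matching share of y.
        share = source_totals[district] * pop_in_fragment / district_pop[district]
        estimates[target] = estimates.get(target, 0.0) + share
    return estimates


# Invented figures: one district, a large parish wholly inside T1 and a
# small parish split evenly between T1 and T2.
print(dasymetric(
    {"D": 100.0},
    {"P1": ("D", 900.0), "P2": ("D", 100.0)},
    {("P1", "T1"): 2.0, ("P2", "T1"): 1.0, ("P2", "T2"): 1.0},
))
```

When a district consists of a single parish, `pop_in_fragment / district_pop` reduces to an area ratio, so the result is the same as plain areal weighting. That is the problem noted above for towns and cities.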

  6. Other sources of information (2)
     2. Data from target districts as ancillary information:
     • Can provide information on the distribution of source zone data
     • EM algorithm is used
     • E.g.
       1. Sub-divide target zones into rural and urban
       2. Assume that rural and urban targets have the same population densities
       3. Allocate y to targets using this assumption
       4. Find the average population density of rural and urban target districts
       5. Go back to step 3 using the new population densities and repeat until the algorithm converges
     • Can use y for the target districts or total population at parish level as ancillary information
     • Relies on having relevant information for target districts
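
The five steps of the rural/urban example can be sketched as a short loop. This is a simplified illustration, not the authors' implementation: the function name, the `fragments` layout, the class labels and the convergence tolerance are all assumptions made for the example.

```python
def em_interpolate(source_totals, fragments, target_class, max_iter=100, tol=1e-9):
    """Iteratively allocate y from source zones to classified target zones.

    source_totals: {source_zone: y}
    fragments:     {(source_zone, target_zone): overlap_area}  (assumed layout)
    target_class:  {target_zone: "rural" or "urban"}           (step 1)
    """
    classes = {target_class[target] for (_source, target) in fragments}
    density = {c: 1.0 for c in classes}  # step 2: equal starting densities
    allocation = {}

    for _ in range(max_iter):
        # Step 3: split each source's y across its fragments in proportion
        # to (current class density x fragment area).
        weight_sum = {}
        for (source, target), area in fragments.items():
            w = density[target_class[target]] * area
            weight_sum[source] = weight_sum.get(source, 0.0) + w
        allocation = {}
        for (source, target), area in fragments.items():
            w = density[target_class[target]] * area
            total = weight_sum[source]
            allocation[(source, target)] = (
                source_totals[source] * w / total if total > 0 else 0.0
            )

        # Step 4: average density of each class from the allocated values.
        y_sum = {c: 0.0 for c in classes}
        area_sum = {c: 0.0 for c in classes}
        for (source, target), y in allocation.items():
            c = target_class[target]
            y_sum[c] += y
            area_sum[c] += fragments[(source, target)]
        new_density = {c: y_sum[c] / area_sum[c] for c in classes}

        # Step 5: repeat with the new densities until they stop changing.
        change = max(abs(new_density[c] - density[c]) for c in classes)
        density = new_density
        if change < tol:
            break

    target_totals = {}
    for (_source, target), y in allocation.items():
        target_totals[target] = target_totals.get(target, 0.0) + y
    return target_totals, density


# Invented figures: A lies wholly in an urban target, C wholly in a rural
# one, and B straddles both.
totals, densities = em_interpolate(
    {"A": 100.0, "B": 20.0, "C": 10.0},
    {("A", "U"): 1.0, ("B", "U"): 1.0, ("B", "R"): 1.0, ("C", "R2"): 10.0},
    {"U": "urban", "R": "rural", "R2": "rural"},
)
print(totals, densities)
```

In this sketch the source zones that fall wholly inside one class are what move the densities away from their equal starting values. Without them the first allocation is already the fixed point and the result is plain areal weighting.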

  7. Other sources of information (3)
     3. Combined technique
     • Brings together the dasymetric technique and the EM algorithm
     • Makes use of all available information
     • Tests all the assumptions

  8. Choice of technique
     Based on aggregating 1991 EDs to form pseudo-parishes and districts
     • Conclusions:
       • No one technique suits all variables
       • Careful choice of technique reduces error significantly
       • Using regression techniques can help determine which is most appropriate
       • Error will still appear in the interpolated data

  9. Predicting error
     • Possible techniques:
       1. Space: where target zones consist of many large fragments of source zones they are error prone
       2. Attribute: error is most prevalent when data have been allocated from urban zones to rural ones
       3. Time: error will cause “unrealistic” changes in population

  10. Using population change to locate error
     Water Orton: parish on the edge of Birmingham
     • 1901-1951, Water Orton (1951: pop. 1,841, area 2.3 km², pop. den. 796 p/km²)
     • 1861-1891, part of Aston (1891: pop. 250,000, area 57 km², pop. den. 4,300 p/km²)
     • 1851, Water Orton (1851: pop. 190, area 2.6 km², pop. den. 73 p/km²)
     • 1851: est. pop. 182, actual pop. 190
     • Pop. change = (y2 - y1) / (y2 + y1)
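
The population change measure on this slide is easy to compute directly. The sketch below uses the slide's formula; the function name and the zero-population guard are additions for the example.

```python
def population_change(y1, y2):
    """Normalised change between two dates: (y2 - y1) / (y2 + y1).

    The value lies between -1 and +1 for non-negative populations, so
    changes in small and large places are measured on the same scale.
    """
    if y1 + y2 == 0:
        return 0.0  # guard added for this sketch: no population at either date
    return (y2 - y1) / (y2 + y1)


# Water Orton figures from the slide: actual 1851 and 1951 populations,
# then the interpolated 1851 estimate against the actual 1851 figure.
print(population_change(190, 1841))
print(population_change(182, 190))
```

Because the measure is bounded, a single cut-off can be used to flag “unrealistic” changes between consecutive dates; any particular cut-off value would be an analyst's choice rather than something given in the slides.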

  11. Using population change to locate error
     Birmingham
     • 1951: pop. 1,100,000, area 210 km², pop. den. 5,235 p/km²
     • 1931: pop. 1,000,000, area 187 km², pop. den. 5,367 p/km²
     • 1891: pop. 246,000, area 12.2 km², pop. den. 20,123 p/km²
     • 1851: pop. 919, area 0.94 km², pop. den. 977 p/km²

  12. Using population change to locate error
     Castle Bromwich: parish on the edge of Birmingham
     • 1951, Castle Bromwich (1951: pop. 4,356, area 4.7 km², pop. den. 927 p/km²)
     • 1921-1931, part of Birmingham (1931: pop. 1,000,000, area 187 km², pop. den. 5,367 p/km²)
     • 1861-1911, part of Aston (1891: pop. 250,000, area 57 km², pop. den. 4,300 p/km²)
     • 1851, Castle Bromwich (1851: pop. 6,426, area 18.7 km², pop. den. 344 p/km²)

  13. Conclusions
     • Can interpolate data to create long-run time series
     • Choice of best technique will depend on the nature of the variable
     • No “one size fits all” technique
     • All techniques will create some error
     • What to do about error:
       • Attempt to smooth it out
       • Explicitly incorporate it into an analysis
