An Evaluation of the Accuracy of U.S. Bureau of the Census County Population Estimates

by Dean H. Judson, University of Nevada, Reno; Carole Popoff, Decision Analytics, Inc.; Michael J. Batutis, U.S. Bureau of the Census, Population Division.


Presentation Transcript


  1. An Evaluation of the Accuracy of U.S. Bureau of the Census County Population Estimates
     by Dean H. Judson, University of Nevada, Reno; Carole Popoff, Decision Analytics, Inc.; Michael J. Batutis, U.S. Bureau of the Census, Population Division

  2. This study:
     • As primary data collection becomes more expensive, administrative records will become the primary collection point - work needs to be done to ensure data quality and to develop corrections for errors and biases.
     • We examine the sources of data used in the Administrative Records method for intercensal county estimates, and further develop our model of error in the estimates process.
     • We use the results of the model to develop a correction factor usable within the current county estimation system and to propose specific improvements to the current system.
     Contact: Dean H. Judson, Ph.D., e-mail demogecon@aol.com

  3. Sources of Administrative Records Data Used in the County Method
     • Births and deaths:
       • supplied by state vital statistics, or
       • supplied by NCHS in lieu of state reporting.
     • Group Quarters counts:
       • supplied by the state or the entity that collects the counts, or
       • last year’s figures if the current count is not provided by the state.
     • Foreign immigration:
       • INS intended place of residence (for documented immigrants), and
       • a fixed percentage applied to an estimate of undocumented immigrants (for undocumented immigrants).
     • 65 and older:
       • Medicare records.
     • Net migration:
       • IRS tax returns matched year over year.
     • See paper for substantial additional detail.

  4. Our Basic Approach
     • We argue that:
       • U.S. administrative records data can be used as an indicator of the population of interest, but not (currently) as a direct count. (Finland appears to have mostly solved problems with direct use.)
       • Administrative records data collection methods suffer from biases caused by the data collection mechanism itself:
         • Conceptual differences between data and the object being measured;
         • Reporting lags;
         • Tendency to geographically misallocate events; and
         • Inaccurate or incomplete record matching.

  5. Our Basic Approach (cont.)
     • We hypothesize reasons why the data collection mechanism may not represent the underlying population.
     • Using these hypotheses, we identify county-level variables which would be indicative of:
       • The underlying condition (e.g., rapid change, high Medicare non-enrollment);
       • The group unlikely to be covered by the administrative record (e.g., those in poverty, counties without hospitals); or
       • Other “reporting problems” (e.g., lag in GQ reporting).

  6. Example of How a Bias Is Created in the Migration Rate
     • The net migration rate can be positive or negative.
     • If an unemployed non-filer gets a job in another county, moves, and files next year, then the tax return is not matched.
     • This family is not added to the number of movers; thus the net migration rate is too high in the county of origin and too low in the county of destination.
     • See tables 1 and 2 in the paper.
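
The mechanism on this slide can be illustrated with a little arithmetic. The sketch below uses made-up flows for two counties; the function name `net_migration_rate`, the population base, and every number are assumptions for illustration, not the Bureau's actual procedure or data.

```python
def net_migration_rate(in_movers: int, out_movers: int, base: int) -> float:
    """Net migration rate: (in-migrants minus out-migrants) / base population."""
    return (in_movers - out_movers) / base

# Hypothetical true flows between an origin and a destination county.
BASE = 10_000
true_origin = net_migration_rate(in_movers=200, out_movers=300, base=BASE)
true_dest = net_migration_rate(in_movers=300, out_movers=200, base=BASE)

# Assume 20 of the origin-to-destination movers did not file in year 1,
# so their year-2 returns cannot be matched and they never count as movers.
UNMATCHED = 20
obs_origin = net_migration_rate(200, 300 - UNMATCHED, BASE)
obs_dest = net_migration_rate(300 - UNMATCHED, 200, BASE)

print(f"origin:      true {true_origin:+.4f}, observed {obs_origin:+.4f}")
print(f"destination: true {true_dest:+.4f}, observed {obs_dest:+.4f}")
```

In this toy case the observed rate is higher than the true rate in the county of origin and lower in the county of destination, which is the direction of bias the slide describes.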

  7. Hypothesized Sources of Estimation Error for Group Quarters Counts
     • A person files a tax return and is matched, but subsequently enters a group quarter and is also counted there (or the opposite).
       • There will be an over- (under-) estimate for the county of origin.
     • Limited group quarters updating by the FSCPE may not capture change in this population.
       • Using last year’s numbers will create over-estimates under conditions of decline, or under-estimates under conditions of growth.

  8. Hypothesized Sources of Estimation Error for Natural Increase
     • If a birth mother must go to another county because there is no hospital in her county of residence, then the birth may be recorded in the county where the hospital is located.
     • Therefore:
       • there will be an over-estimate in the county where the hospital exists, and
       • an under-estimate in the birth mother’s county of residence.
     • Deaths can experience the same phenomenon; however, since the number of births exceeds the number of deaths, we expect an incorrectly high level of natural increase and an over-estimate.
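
A small worked example may make the netting of births and deaths concrete. The counts below are invented, and the assumption that births and deaths are misallocated in the same proportion is ours, not the authors'.

```python
# County A has no hospital; some of A's events are recorded in hospital county B.
# All counts are hypothetical and by county of residence.
births = {"A": 100, "B": 400}
deaths = {"A": 60, "B": 300}

# Assume 30% of A's births and deaths are recorded in B (30 births, 18 deaths).
moved_births, moved_deaths = 30, 18

true_ni = {c: births[c] - deaths[c] for c in births}
recorded_ni = {
    "A": (births["A"] - moved_births) - (deaths["A"] - moved_deaths),
    "B": (births["B"] + moved_births) - (deaths["B"] + moved_deaths),
}

for county in ("A", "B"):
    print(county, "true:", true_ni[county], "recorded:", recorded_ni[county])
```

Because more births than deaths are shifted, recorded natural increase falls in county A and rises in county B, while the two-county total is unchanged.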

  9. Hypothesized Sources of Estimation Error for Foreign Immigration
     • A county’s proportion of foreign-born persons will change, but undocumented immigrants are allocated using percentages from the 1990 census.
     • This results in an under- or over-estimate, depending on the direction of change in the proportion of foreign born in a particular county.

  10. Hypothesized Sources of Estimation Error for Population 65 Years of Age and Older
      • The propensity to sign up for Medicare will be conditioned on knowledge of the program. For example:
        • The foreign born, less educated, or others with language or cultural barriers are less likely to enroll.
      • Under-enrollment is not uniform across counties; we predict under-estimates in counties with high proportions of persons 65 and older who are not enrolled.

  11. Methodology - Data
      • 1990 Census counts are compared with 1990 Administrative Records Method estimates generated by Davis (1994).
      • We calculate Algebraic Percent Errors (ALPEs) for each county.
        • The key is that this reflects magnitude AND direction of error.
      • Bureau of the Census USA Counties CD-ROM data are extracted.
        • We identify socio-economic variables as indicators of the source of error.
      • The two sources of data are matched by FIPS codes.
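
The per-county error calculation and the FIPS match can be sketched as follows. The slides do not give the ALPE formula, so this assumes the common convention of (estimate - census) / census, in percent; the FIPS codes and counts are invented placeholders.

```python
def alpe(estimate: float, census: float) -> float:
    """Algebraic percent error: signed, so the direction of the error is kept.

    Assumed convention: positive means the estimate exceeds the census count.
    """
    return 100.0 * (estimate - census) / census

# Hypothetical records keyed by 5-digit county FIPS code (made-up values).
estimates = {"99001": 10_500, "99003": 48_000, "99005": 2_000}
census = {"99001": 10_000, "99003": 50_000, "99007": 7_000}

# Match the two sources on FIPS code; unmatched counties are dropped.
matched = sorted(estimates.keys() & census.keys())
alpes = {fips: alpe(estimates[fips], census[fips]) for fips in matched}

for fips, err in alpes.items():
    print(fips, f"{err:+.1f}%")
```

Keeping the sign is what lets a later regression distinguish over-estimates from under-estimates; an absolute percent error would discard that information.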

  12. A Note on Census Undercounts
      • The key error in the Census count is undercount.
      • We obtained the adjusted population counts from Post Enumeration Survey materials.
      • We created a corrected ALPE (CALPE) measure using adjusted population counts. Correlation between ALPE and CALPE = 0.968.
      • We re-ran analyses using CALPE as the error measure, and compared results to ALPE.
      • At this time, we feel confident that our hypotheses for bias are almost entirely independent of the undercount correction.
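
A minimal sketch of the ALPE/CALPE comparison, assuming CALPE uses the same formula as ALPE with the undercount-adjusted count as the benchmark. The county figures are invented, so the correlation printed here is not the 0.968 reported on the slide.

```python
from math import sqrt

def percent_error(estimate: float, benchmark: float) -> float:
    """Signed percent error of an estimate against a benchmark count."""
    return 100.0 * (estimate - benchmark) / benchmark

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical counties: (estimate, census count, undercount-adjusted count).
counties = [
    (10_500, 10_000, 10_150),
    (48_000, 50_000, 50_400),
    (7_100, 7_000, 7_210),
    (20_000, 21_000, 21_100),
]

alpe = [percent_error(e, c) for e, c, _ in counties]
calpe = [percent_error(e, adj) for e, _, adj in counties]
r = pearson(alpe, calpe)
print(f"correlation between ALPE and CALPE: {r:.3f}")
```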

  13. A Description of the Models
      • The final model was developed by adding groups of variables indicative of the hypothesized bias for each type of administrative record.
      • Model 0 contains 4 regional variables, total 1990 population, and the change in population from 1980 to 1990, and represents a “null” model.
      • Model 1 contains the null model and adds:
        • birth/death misallocation indicators
        • group quarters misreporting indicators
        • 65-and-over underreporting indicators
        • immigration indicators
        • net internal migration bias indicators
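
The nested-model comparison can be sketched as two least-squares fits followed by an F statistic. The slides do not say which test the authors used, so the F test here is an assumption, and the data are simulated stand-ins (three "null" columns and two "bias indicator" columns), not the study's variables.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_rss(rows, y):
    """Fit OLS with an intercept via the normal equations; return (RSS, n_params)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(b * xi for b, xi in zip(beta, x))) ** 2 for x, yi in zip(X, y))
    return rss, k

def nested_f(rss0, k0, rss1, k1, n):
    """F statistic for H0: the k1 - k0 coefficients added in Model 1 are all zero."""
    return ((rss0 - rss1) / (k1 - k0)) / (rss1 / (n - k1))

# Simulated data: y plays the role of ALPE (hypothetical, not the study's data).
random.seed(0)
n = 200
null_vars = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
bias_vars = [[random.gauss(0, 1) for _ in range(2)] for _ in range(n)]
y = [0.5 * a[0] + 1.0 * b[0] - 0.8 * b[1] + random.gauss(0, 1)
     for a, b in zip(null_vars, bias_vars)]

rss0, k0 = ols_rss(null_vars, y)                                     # "Model 0"
rss1, k1 = ols_rss([a + b for a, b in zip(null_vars, bias_vars)], y)  # "Model 1"
f_stat = nested_f(rss0, k0, rss1, k1, n)
print(f"RSS0={rss0:.1f}  RSS1={rss1:.1f}  F({k1 - k0}, {n - k1})={f_stat:.1f}")
```

A large F relative to the F(k1 - k0, n - k1) distribution would indicate that the added indicator groups improve on the null model.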

  14. Results Are Consistent with Our Hypotheses
      • Model 1 significantly improves upon Model 0.
      • In the presence of the additional indicator variables, the estimated coefficients on variables in the null model tend toward zero.
      • When we use 1990 population corrected for undercount, little change in coefficients occurs, with one exception:
        • “Indians in poverty” is strongly positively correlated with ALPE, holding other effects constant; its effect drops to near zero in the CALPE model.
        • Other effects remain substantially the same.

  15. Summary of Substantive Results
      • Consistent with other studies, counties with rapid change tend to have estimation errors opposite the direction of change.
      • Counties with a high rate of natural increase tend to be overestimated.
      • Rural counties with no hospital tend to be underestimated.
      • Counties with a large proportion of the population in group quarters tend to be underestimated, and this effect is strongly exacerbated if the county is also growing.
      • A notable exception to the group quarters pattern is counties with a high percentage of their population in prison.

  16. Summary of Substantive Results (cont.)
      • Counties with a high proportion of SSI enrollees tend to be overestimated, but if the county has a high proportion of the 65 and over population that is not enrolled, the county tends to be underestimated.
      • Counties with a high percentage foreign born tend to be underestimated.
      • Counties with a high percentage of the population in poverty tend to be overestimated.
      • Counties with a high percentage of Native American Indian population tend to be underestimated.

  17. Summary of Substantive Results (cont.)
      • Our model implies a correction factor of 1 + XB, where XB is the model's fitted value for a county.
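
The slide gives only the factor 1 + XB, not how it is applied. The sketch below assumes XB is the fitted proportional error under the convention (estimate - census) / census, in which case the estimate is divided by the factor; under the opposite sign convention it would be multiplied instead. The coefficients, covariates, and function names are hypothetical.

```python
def predicted_error(x: list[float], beta: list[float]) -> float:
    """Linear predictor XB: the fitted proportional error for one county."""
    return sum(b * xi for b, xi in zip(beta, x))

def corrected_estimate(estimate: float, x: list[float], beta: list[float]) -> float:
    """Assumes XB approximates (estimate - census) / census,
    so census is approximately estimate / (1 + XB)."""
    return estimate / (1.0 + predicted_error(x, beta))

# Hypothetical coefficients: intercept, share in group quarters, share foreign born.
beta = [0.01, -0.20, -0.10]
# One county's covariates; the leading 1.0 multiplies the intercept.
x = [1.0, 0.10, 0.05]

xb = predicted_error(x, beta)
print(f"XB = {xb:+.4f}, correction factor = {1 + xb:.4f}")
print(f"corrected estimate = {corrected_estimate(10_000, x, beta):.0f}")
```

With a negative fitted error (the county is predicted to be underestimated), the corrected figure comes out above the original estimate.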

  18. Implications for Post-2000
      • General conclusion: Using an administrative records data series for any estimation method requires careful consideration of “program effects” creating systematic biases.
      • Specific proposals for post-2000:
        • Our results provide a correction factor for the estimates series. After further careful study, this correction factor should be incorporated into the estimates process.
        • In particular, we should use the 1990 correction factor to generate a 2000 estimate and determine whether the corrected estimate is better or worse than an uncorrected estimate.
        • Vital statistics geocoding needs to be improved at all levels.
        • FSCPE Group Quarters update and reporting is very important.
        • A Medicare coverage study would be useful for assessing predictors of undercoverage.
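
The proposed check of corrected against uncorrected 2000 estimates could be scored with a summary error measure. The slides do not name one; the sketch below assumes mean absolute percent error (MAPE) and uses invented counts, so it says nothing about which series would actually perform better.

```python
def mape(estimates: list[float], benchmarks: list[float]) -> float:
    """Mean absolute percent error of estimates against benchmark counts."""
    errors = [abs(e - b) / b for e, b in zip(estimates, benchmarks)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical 2000 figures for three counties (illustrative only).
census_2000 = [10_000, 50_000, 20_000]
uncorrected = [10_500, 48_000, 20_400]
corrected = [10_200, 49_000, 20_500]

mape_uncorrected = mape(uncorrected, census_2000)
mape_corrected = mape(corrected, census_2000)
print(f"uncorrected MAPE: {mape_uncorrected:.2f}%")
print(f"corrected MAPE:   {mape_corrected:.2f}%")
```

Note that the third county is worse after correction in this toy data; a per-county comparison alongside the summary measure would show where the correction helps and where it does not.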
