
Improving Imputation: The Plan to Examine Count, Status, Vacancy, and Item Imputation in the Decennial Census

Arthur Cresce, Sally Obenski, and James Farber, U.S. Census Bureau. Work Session on Statistical Data Editing of the United Nations Statistical Commission for Europe.


Presentation Transcript


  1. Improving Imputation: The Plan to Examine Count, Status, Vacancy, and Item Imputation in the Decennial Census • Arthur Cresce, Sally Obenski, and James Farber, U.S. Census Bureau • Work Session on Statistical Data Editing of the United Nations Statistical Commission for Europe • Ottawa, Canada, May 16-18, 2005

  2. Development of Imputation Methods • 1940 census – first use of imputation • 1950 census – more extensive use of imputation; also the first census to use computers • “Hot deck” imputation method first developed in the 1960 census • Hot deck used for both population and housing characteristics

  3. Hot Deck Imputation • Table or “matrix” – stores values of reported “donor” responses • Stratify the hot deck by characteristics of the donors (e.g., age, sex) • Use the stratification characteristics to match a “donee” (person needing a value) with a donor and impute the donor's response

  4. Characteristics of Hot Deck Imputation • Nearest neighbor (spatially) – attempts to assign values from donors who live near the donee • Sequential – values are stored and assigned in the order of the geographic sort of housing units
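
The mechanics described in slides 3 and 4 can be illustrated with a minimal sketch. It is not the Census Bureau's production procedure; the function name, the field names (`age_group`, `sex`, `tenure`), and the use of a "cold deck" of starting values are all assumptions made for illustration. Records are assumed to arrive already sorted geographically, so the most recently stored donor in a stratum is also a nearby one.

```python
def sequential_hot_deck(records, item, strata_keys, cold_deck=None):
    """Illustrative sequential hot deck (hypothetical helper, not Census code).

    records     -- list of dicts, assumed pre-sorted in geographic order
    item        -- name of the field to impute; None marks a missing value
    strata_keys -- fields that define the cells of the hot deck matrix
    cold_deck   -- optional starting values per stratum (an assumption here)
    """
    deck = dict(cold_deck or {})  # stratum -> most recently reported value
    result = []
    for rec in records:
        stratum = tuple(rec[k] for k in strata_keys)
        new = dict(rec)
        if rec[item] is None:
            # Donee: take the value last stored for this stratum, if any.
            if stratum in deck:
                new[item] = deck[stratum]
                new["imputed"] = True
            else:
                new["imputed"] = False  # no donor seen yet; value stays missing
        else:
            # Donor: store the reported value for later donees in this stratum.
            deck[stratum] = rec[item]
            new["imputed"] = False
        result.append(new)
    return result
```

Because the deck is overwritten as each donor is read, a donee always receives the value of the closest preceding donor in the sort order that shares its stratum. This is also the single-donor behavior that the panel review in slide 5 comments on.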

  5. CNSTAT Panel Review of Census 2000 • Concerns raised about the hot deck method: • Relies on a single donor without obtaining more information from the local area • Does not fully incorporate the multivariate nature of imputation • May have difficulty imputing several correlated variables simultaneously • Fails to produce an error estimate

  6. Research Effort Focuses on Two Key Imputation Areas • Count Imputation: Housing Unit Status, Occupancy Status, Household Size • Characteristics Imputation (e.g., age, sex, relationship, housing tenure)

  7. Alternative Imputation Methodologies • Administrative Records – direct assignment • Administrative Records – modeling • Spatial Modeling • Canadian Census Edit and Imputation System (CANCEIS) • Modified Hot Deck

  8. Table Showing Methodologies to Be Applied to Each Imputation Type

  9. Strategy for Technically Evaluating Imputation Methodologies • Create truth deck – take records from 100% files with no imputed values and simulate nonresponse patterns • Run each alternative imputation method against truth deck • Compare resulting distributions – calculate comparison statistics
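
The three steps in slide 9 can be sketched as follows. The slides do not say how nonresponse was simulated or which comparison statistics were calculated, so everything here is an assumption: nonresponse is blanked completely at random, `impute_carry_forward` is a simple stand-in for whichever method is under test, and the two statistics (an index of dissimilarity between category distributions and a record-level agreement rate) are only examples of what "comparison statistics" could mean.

```python
import random
from collections import Counter


def simulate_nonresponse(truth, item, rate, seed=0):
    """Copy the truth deck and blank `item` at random (assumed pattern)."""
    rng = random.Random(seed)
    deck = []
    for rec in truth:
        new = dict(rec)
        if rng.random() < rate:
            new[item] = None
        deck.append(new)
    return deck


def impute_carry_forward(records, item, default):
    """Stand-in method: fill each gap with the last reported value."""
    last = default
    out = []
    for rec in records:
        new = dict(rec)
        if new[item] is None:
            new[item] = last
        else:
            last = new[item]
        out.append(new)
    return out


def distribution(records, item):
    """Share of records in each category of `item`."""
    counts = Counter(rec[item] for rec in records)
    return {value: n / len(records) for value, n in counts.items()}


def dissimilarity(p, q):
    """Half the summed absolute differences between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def agreement_rate(truth, blanked, imputed, item):
    """Share of blanked records whose imputed value equals the truth."""
    idx = [i for i, rec in enumerate(blanked) if rec[item] is None]
    if not idx:
        return None
    return sum(truth[i][item] == imputed[i][item] for i in idx) / len(idx)
```

A run would call `simulate_nonresponse` on the truth deck, apply each candidate method to the result, and then compare `distribution(truth, item)` with `distribution(imputed, item)`. The two statistics answer different questions: a method can reproduce the overall distribution well while still assigning the wrong value to many individual records.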

  10. Evaluation Criteria • Numerical and distributive accuracy (apportionment) • Operational feasibility and cost effectiveness • Public Understanding

  11. Operational Feasibility Issues • Complexity • Impact and interrelationship of external systems and subsystems • Number of operating systems, run times, file formats, programming languages • Security issues • Degree of human intervention

  12. Time Frame for Analysis • Alternative methodologies already run on truth deck files • Analysis of results underway • Goal – select methodology or “hybrid” methodology for use in 2006 Census Test

  13. Limitations of Analysis • Truth deck reflects what respondents reported – status as “truth” is an assumption • Methodology for creating truth deck may itself create an unknown bias that favors a particular method • Multiple evaluation measures – difficult to develop an overall “summary” measure

  14. Summary • Research methodology designed to: • Assess the accuracy of each imputation method • Assess the feasibility of each method • Address the concerns identified by oversight groups • Identify the optimal method or combination for further testing and possible use in 2010
