Explore the development of synthetic households and locations for various applications using the NICS Tool Kit. Learn about methodologies to generate social and economic data for colonias in the Southwest border region. Discover innovative approaches for allocating data and overcoming challenges in sparse rural areas.
Modeling Synthetic Households and Locations for Multi-Use Applications • NICS CoP Research Symposium “The Emerging NICS Tool Kit”, June 30, 2005 • Jon Sperling, Ph.D., U.S. Department of Housing and Urban Development, Office of Policy Development & Research • Demin Xiong, Ph.D., Oak Ridge National Laboratory
NICS Tool Kit • Spatial/statistical tools and approaches for user-defined small-area geographies • [Figure: multiway table built from PUMS samples, with family type, race, and income margins] • Confidentiality was not the focus of this work, but the methodology can be used to approach the confidentiality issue
Spin-off Development From Short-Term Practical Need • Support HUD Colonia Program and Policy Initiatives • Develop and align colonia boundaries with Census TIGER • Develop methodology to allocate SF1 and SF3 census statistics to colonia boundaries • Generate social and economic data for colonias
What are Colonias? • Many definitions (see http://maps.oag.state.tx.us/colgeog/colonias.htm) • HUD’s definition: an “identifiable community” in Arizona, California, New Mexico, or Texas, within 150 miles of the Mexico border (excluding MAs > 1 million inhabitants), that is determined to be a colonia on the basis of objective criteria, including lack of potable water supply, adequate sewage systems, or decent, safe, and sanitary housing (National Affordable Housing Act of 1990)
Challenges • Lack of current or reliable data on colonia boundaries or population • Colonias do not generally follow census boundaries (e.g., census tract, block group, block) • Lack of common definition • Lack of shared knowledge base • Survey costs are high • Predominantly rural areas • County-level data is too coarse
Census SF3 files are available at the block group level; block group boundaries are shown as green lines. The colonia boundaries, shown in red, do not follow block group boundaries. In some cases, a colonia includes one or several block groups. In other cases, a colonia is a small component of only one block group.
Current Work • Create SW Border Colonia Research Database (in cooperation with Texas) • Develop allocation methodologies that can be re-used for other applications using public domain data • Create dynamic allocation methodology not dependent on static boundaries (re-use)
Allocation-Related Methodologies • Polygon Subdividing and Weighting (block level): SF1 variable allocation • Household Synthesis (block group level): model synthetic households for the block group and then redistribute them to blocks; SF3 variable allocation • Synthetic Household Location Generation: assign synthetic household data to actual or modeled point locations (in the absence of known coordinates)
Allocation Approaches • Allocating data from one set of boundaries to another set of boundaries that establishes different subdivisions of the same area • Lam and Goodchild (1980) areal interpolation: the classic GIS function (polygon overlay), which assumes homogeneous distributions within each source zone • Others have adapted other approaches (e.g., dasymetric, continuous surface): Tobler (1979), Bracken and Martin (1989), Langford and Unwin (1994), Holloway (1997), Eicher and Brewer, Mennis (2003), etc. • Considerations: availability of data, computational efficiency (size, raster, vector)
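The homogeneous-distribution assumption behind classic areal interpolation can be illustrated with a minimal sketch. It is not the authors' implementation: zones are simplified to axis-aligned rectangles so the example needs no GIS library, and the names `Rect` and `areal_weighting` as well as all counts are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a real polygon (simplifying assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)

    def intersect(self, other: "Rect") -> "Rect":
        # The result may be "inverted" when the rectangles do not overlap;
        # area() then returns 0.
        return Rect(max(self.xmin, other.xmin), max(self.ymin, other.ymin),
                    min(self.xmax, other.xmax), min(self.ymax, other.ymax))


def areal_weighting(source_zones, target_zone):
    """Allocate each source count in proportion to the share of the
    source zone's area that falls inside the target zone."""
    estimate = 0.0
    for zone, count in source_zones:
        share = zone.intersect(target_zone).area() / zone.area()
        estimate += count * share
    return estimate


# Two hypothetical block groups with made-up household counts.
block_groups = [(Rect(0, 0, 2, 2), 400), (Rect(2, 0, 4, 2), 100)]
# A hypothetical colonia boundary that straddles both block groups.
colonia = Rect(1, 0, 3, 1)

colonia_estimate = areal_weighting(block_groups, colonia)
```

Each block group here contributes one quarter of its count, because one quarter of its area lies inside the colonia. Dasymetric approaches replace the plain area share with a share informed by ancillary data such as land use or roads.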
Polygon Subdividing and Weighting • The SF1 (complete count) allocation is based on a procedure that uses the percentage of area and the differential density inside and outside road buffers. • The methodology - generates 100-meter buffers along identified roads - splits census blocks into inside-buffer and outside-buffer parts - overlays colonia boundaries to form basic polygon allocation units - computes the allocation weight of each basic unit based on its assigned density factor - extracts SF1 tables and aggregates the weighted variables for all allocation units inside a colonia to get the SF1 estimate
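The weighting and aggregation steps above can be sketched as follows, assuming the buffering and overlay have already produced a list of basic allocation units. The density factors (5.0 inside the road buffer, 1.0 outside), the field names, and the counts are illustrative assumptions; the slides do not state the values actually used.

```python
from collections import defaultdict

# Hypothetical density factors: land inside a road buffer is assumed to be
# five times as densely settled as land outside it.
DENSITY_FACTOR = {True: 5.0, False: 1.0}


def allocate_sf1(units, block_counts):
    """Estimate an SF1 count for a colonia from basic allocation units.

    Each unit is a dict with keys: block, area, in_buffer, in_colonia.
    A unit's weight is its density-adjusted area divided by the
    density-adjusted area of its whole census block.
    """
    block_total = defaultdict(float)
    for u in units:
        block_total[u["block"]] += u["area"] * DENSITY_FACTOR[u["in_buffer"]]

    estimate = 0.0
    for u in units:
        if u["in_colonia"]:
            weight = u["area"] * DENSITY_FACTOR[u["in_buffer"]] / block_total[u["block"]]
            estimate += block_counts[u["block"]] * weight
    return estimate


# One hypothetical block split into three allocation units.
units = [
    {"block": "B1", "area": 1.0, "in_buffer": True, "in_colonia": True},
    {"block": "B1", "area": 1.0, "in_buffer": True, "in_colonia": False},
    {"block": "B1", "area": 2.0, "in_buffer": False, "in_colonia": True},
]
block_counts = {"B1": 120}  # made-up SF1 population for the block

colonia_population = allocate_sf1(units, block_counts)
```

The weights of all units in a block sum to 1, so the block total is preserved no matter how the colonia boundary cuts the block.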
Household Synthesis Method • Higher variability in housing unit (HU) and population (POP) densities and in demographic characteristics at the block group (BG) level • Aims to create an explicit representation of the locations and demographic attributes of individual households (or household members) • Major advantage: modifying geographic boundaries has no effect on the results of data aggregation
Household Synthesis Method • Synthesizes locations and attributes of individual households using a combination of micro household sample data and census statistical data at both block group and block levels. • Uses SF3 tables and PUMS to infer cross relationships and correlation patterns between SF3 variables, to answer such questions as: how many households are there by income, by family type, by race, by household size, and by number of people employed in the household?
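One common way to combine a sample-based cross-tabulation with published marginal totals is iterative proportional fitting (IPF). The slides do not name the fitting procedure, so the sketch below is an assumption about how such an adjustment could work, shown for two variables only and with made-up numbers.

```python
def ipf(seed, row_targets, col_targets, iterations=50):
    """Scale a seed cross-tabulation so that its row and column sums match
    the target margins, while keeping the seed's association pattern."""
    table = [row[:] for row in seed]
    for _ in range(iterations):
        # Fit row margins (e.g. family type totals for the block group).
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Fit column margins (e.g. income class totals for the block group).
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
    return table


# Hypothetical PUMS-derived seed: rows = family type, columns = income class.
pums_seed = [[40.0, 10.0],
             [20.0, 30.0]]
# Hypothetical SF3 marginal totals for one block group (both sum to 100).
family_type_margin = [60.0, 40.0]
income_margin = [50.0, 50.0]

fitted = ipf(pums_seed, family_type_margin, income_margin)
```

The fitted table keeps the odds ratio of the seed (here 6), so the correlation pattern observed in the sample carries over while the totals match the block group.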
Cross Relationships between Variables • [Figure: structural (multiway) table with family type, race, and income margins]
Use the structural (multiway) table to synthesize the family household, non-family household, and group quarters populations. Because each synthesized household carries all characteristics (income, race, age, etc.), these variables can be aggregated to reconstruct SF3 tables for any type of geography, provided the location of the household is known. [Figure label: PUMS]
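A minimal sketch of this idea: draw individual households from the cells of a multiway table, then re-aggregate any attribute. The table values, attribute names, and the use of simple weighted random sampling are illustrative assumptions rather than the procedure used in the project.

```python
import random
from collections import Counter

# Hypothetical adapted multiway table for one block group:
# (family type, income class) -> expected number of households.
multiway_table = {
    ("family", "low"): 45.0,
    ("family", "high"): 15.0,
    ("nonfamily", "low"): 5.0,
    ("nonfamily", "high"): 35.0,
}


def synthesize_households(table, n_households, seed=0):
    """Draw households whose attribute combinations follow the table's cell weights."""
    rng = random.Random(seed)
    cells = list(table.keys())
    weights = [table[c] for c in cells]
    draws = rng.choices(cells, weights=weights, k=n_households)
    return [{"family_type": ft, "income": inc} for ft, inc in draws]


def aggregate(households, variable):
    """Rebuild a one-way summary table from household records."""
    return Counter(h[variable] for h in households)


households = synthesize_households(multiway_table, 100)
by_family_type = aggregate(households, "family_type")
by_income = aggregate(households, "income")
```

Because every record carries all attributes, the same list can be filtered by any boundary (once locations are attached) and summarized by any variable.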
Spatial Allocation of Households • Household locations are modeled and represented as points that are randomly generated under the constraints of Census variables at the block or block group level (e.g., total number of households, type, size, and presence of children).
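As a rough sketch of constrained random point generation, the example below places exactly as many points in each block as the block's household count. Blocks are simplified to bounding rectangles, and the block IDs, extents, and counts are made up; with real block polygons, a point-in-polygon rejection test would be needed.

```python
import random


def place_households(blocks, seed=0):
    """Generate one random point per household inside each block's extent.

    `blocks` maps a block ID to (xmin, ymin, xmax, ymax, n_households).
    """
    rng = random.Random(seed)
    points = []
    for block_id, (xmin, ymin, xmax, ymax, n_households) in blocks.items():
        for _ in range(n_households):
            x = rng.uniform(xmin, xmax)
            y = rng.uniform(ymin, ymax)
            points.append((block_id, x, y))
    return points


# Hypothetical blocks: one densely settled, one sparse, one empty.
blocks = {
    "block_a": (0.0, 0.0, 1.0, 1.0, 30),
    "block_b": (1.0, 0.0, 3.0, 1.0, 4),
    "block_c": (0.0, 1.0, 3.0, 2.0, 0),
}

points = place_households(blocks)
```

Using block counts as the constraint keeps empty blocks empty, which an even spread over the whole block group would not.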
Comparison between Even Distribution and Block-Constrained Household Distributions • Distribution patterns differ between block-group-based constraints and block-based constraints (light gray points represent block-group-based constraints, while purple points represent block-based constraints). • The use of block-based constraints allows more accurate household distribution patterns.
Variable Generation and Validation • Variable Generation for Colonias - Variables for colonias are generated from the synthesized individual households - Variables include demographic and economic characteristics (family type, race, income, etc.) • Self-Calibration - Comparison between SF3 variables and the variables estimated from synthesized household characteristics at the block group level - Comparison between SF1 variables and the variables estimated from synthesized household characteristics at the block level
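The self-calibration check can be sketched as a comparison of published totals with totals re-aggregated from the synthetic households. The function name, the category labels, and the numbers below are hypothetical; the slides do not specify which error measures were used.

```python
def calibration_report(published, estimated):
    """Compare published census values with values rebuilt from synthetic households.

    Returns {category: (absolute difference, percent difference or None)}.
    """
    report = {}
    for key, pub in published.items():
        est = estimated.get(key, 0)
        diff = est - pub
        # Percent difference is undefined when the published value is zero.
        pct = diff * 100.0 / pub if pub else None
        report[key] = (diff, pct)
    return report


# Made-up block group values: published SF3 counts vs. synthetic-household counts.
published_sf3 = {"family": 80, "nonfamily": 20, "group_quarters": 0}
synthetic_counts = {"family": 84, "nonfamily": 16}

report = calibration_report(published_sf3, synthetic_counts)
```

The same comparison applies at the block level against SF1 counts; large differences point to blocks or block groups where the synthesis constraints need adjusting.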
Data allocation using synthesized household locations and attributes [flow diagram] • PUMS Household Samples → Aggregation → Multiway Summary Table for PUMA • SF3 Block Group Table → Extraction → Selected SF3 Marginal Totals for Block Group • Adaptation → Adapted Multiway Table for Each Block Group • Household Redistribution → Synthesized Households for Each Block Group • Household Redistribution → Synthesized Households for Each Block • Location Synthesis → Households with Assigned Location • Colonias Boundaries + Overlay and Aggregation → SF3 Statistics for Colonias