Type of Historical Data Collected Contextual Level (Place of Residence)
Childhood Place of Residence • City / County / State of Residence during childhood • Data cleaning / editing • 3% resided outside of the US • 44% resided in the same county as in midlife • Linking with county-level census data • Chose decennial census (1930, 1940, 1950) that corresponded most closely to where participant lived at 10 years of age • Of the 12,314 participants who lived in the US as children, 99% were linked to county-level census data
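The census-selection rule above can be sketched as follows. This is a minimal illustration, not the study's code: the function name, the use of birth year as input, and the tie-breaking rule (the earlier census wins when two are equally close) are all assumptions.

```python
CENSUS_YEARS = (1930, 1940, 1950)

def closest_census(birth_year, target_age=10, census_years=CENSUS_YEARS):
    """Return the census year nearest to the year the participant reached target_age."""
    target_year = birth_year + target_age
    # Assumption: when two censuses are equally close, prefer the earlier one.
    return min(census_years, key=lambda c: (abs(c - target_year), c))

# A participant born in 1931 turned 10 in 1941, so the 1940 census is used.
census_for_participant = closest_census(1931)
```

The same function could be reused for the adult addresses by passing a different `target_age` and `census_years`.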
Challenges of Working with Historical County Level Census Data (1930s-1950s) • Participants born in different birth cohorts are linked to different censuses • Indicators not uniformly available across census years • Exploration of techniques using prediction equations • For variables with monetary values, it is not meaningful to use “raw” data across census years • Standardization using CPI • Ranking • Socioeconomic indicators available are limited, particularly in earlier years
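The two options for monetary variables (CPI standardization and ranking) might look like the sketch below. It assumes the caller supplies a CPI series keyed by year. The function names and every number shown are hypothetical placeholders, not real CPI figures or study data.

```python
from bisect import bisect_right

def to_constant_dollars(value, census_year, cpi, base_year):
    """Express a monetary value from census_year in base_year dollars."""
    return value * cpi[base_year] / cpi[census_year]

def percentile_ranks(values_by_county):
    """Percent of counties with a value at or below each county's value,
    computed within a single census year."""
    ordered = sorted(values_by_county.values())
    n = len(ordered)
    return {county: 100.0 * bisect_right(ordered, v) / n
            for county, v in values_by_county.items()}

# Hypothetical CPI values and county figures, for illustration only.
example_cpi = {1940: 10.0, 1950: 20.0}
adjusted = to_constant_dollars(500, 1940, example_cpi, base_year=1950)
ranks = percentile_ranks({"A": 10, "B": 20, "C": 30, "D": 40})
```

Ranking avoids the need for a price index entirely, at the cost of discarding the absolute size of differences between counties.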
Place of Residence at Ages 30, 40, and 50 • Participants asked to provide their complete street address • Goal: link with census tract data from historical census (1960 – 1980) most closely corresponding to the given age • Only queried about address for a given age if not already in ARIC at this age
Choice of Geocoder • Commercial Vendors • Marketing focus - high match rates • Products/programs are proprietary • Precise details of how spatial coordinates (latitudes and longitudes) are assigned are lacking • Information on accuracy is lacking • How closely do the coordinates / tract assignments correspond with true location? • Misclassification can lead to biased results
Choice of Geocoder (cont.) • Krieger et al (2002) Study • Compared geocodes obtained from 4 companies to “gold standard” obtained addresses • For “good” addresses “accuracy” high for 3 of the 4 vendors (95-100%) • For “problem” addresses “accuracy” ranged from 34% to 80% • Take home message – need to do your homework before choosing a vendor • Evaluation of repeatability and accuracy of geocoding vendor completed as part of our preliminary work [see Whitsel et al (in press) in publications section of this website]
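One simple way to summarize vendor accuracy is the percentage of addresses whose vendor-assigned tract matches the gold standard. The sketch below illustrates that calculation only; it is not the method of Krieger et al or Whitsel et al. The function name, the data layout, and the choice to count unmatched addresses as disagreements are assumptions.

```python
def tract_agreement(vendor_tracts, gold_tracts):
    """Percent of gold-standard addresses for which the vendor returned the same tract.

    Both arguments map an address ID to a tract ID. Addresses the vendor
    could not geocode (missing from vendor_tracts) count as disagreements.
    """
    agree = sum(1 for addr_id, tract in gold_tracts.items()
                if vendor_tracts.get(addr_id) == tract)
    return 100.0 * agree / len(gold_tracts)

# Hypothetical tract assignments for four addresses.
gold = {1: "A", 2: "B", 3: "C", 4: "D"}
vendor = {1: "A", 2: "B", 3: "X"}  # address 4 was not geocoded
accuracy = tract_agreement(vendor, gold)
```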
Overview of Steps: Linking Address to Historical Census Tract Data • Apparently “good” addresses sent to geocoder • Obtained geocode for 1990 census tract boundaries • Latitudes and longitudes obtained from exact address matches • Shape files generated with historical census tract boundaries • Latitudes and longitudes placed within tract boundaries for appropriate census year (1970-80) • Participant assigned appropriate tract number • Linked with tract level socio-environmental indicators
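The step of placing a latitude and longitude within historical tract boundaries is a point-in-polygon test. The sketch below uses a plain ray-casting check on boundaries given as lists of (longitude, latitude) vertices. It is an illustration under assumptions, not the GIS workflow used in the study: real shape files would normally be handled with GIS software, and the tract IDs and coordinates here are made up.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) falls inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def assign_tract(lon, lat, tracts):
    """Return the ID of the first tract containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None

# Two hypothetical square tracts sharing an edge.
example_tracts = {
    "0001": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "0002": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
assigned = assign_tract(1.5, 0.5, example_tracts)
```

Running the same geocoded point against the boundary set for each census year is what lets one modern geocode yield a 1970 or 1980 tract number.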
Geocoded Participant Addresses Placed into 1970 Forsyth County, NC Census Tracts
Oops! It Did Not Work • Major reasons for “failure” • Bad / insufficient address • Electronic files not available for 1960 • Historical census did not define tract
Insufficient Address: Problem → Solution • Misspellings, nonstandard abbreviations → Correction program, address standardization • Partial street information provided → Attempt to hand geocode using street name or cross-streets • Address no longer exists → Identify location of address on historical map (MD) • PO Box → None • Rural Route → None
Geocoding by Hand • Obtained detailed street maps (e.g., Forsyth NC) • Overlaid with census tract boundaries from historical census • Attempted to locate address on map • Assigned a census tract if the address was contained within the boundaries of one tract • Limitations • Labor intensive • Only feasible for areas containing a sizable number of ARIC participants (i.e., in or adjacent to ARIC study sites) • Does not work when: • Street served as a boundary for two or more tracts • Street was long / crossed multiple tracts
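The decision rule for hand geocoding a partial address (assign a tract only when the street lies within exactly one tract) can be written as a small lookup. This sketch assumes a hypothetical table, keyed by hand from the map overlay, listing every tract each street touches. The street names and tract numbers are invented.

```python
def tract_for_street(street, street_tracts):
    """Return the tract for a street if it touches exactly one tract, else None.

    street_tracts maps a street name to the set of tract IDs it runs
    through or borders (hypothetical table built from the map overlay).
    """
    tracts = street_tracts.get(street, set())
    if len(tracts) == 1:
        return next(iter(tracts))
    # Boundary streets and streets crossing several tracts stay unassigned.
    return None

example_streets = {
    "ELM ST": {"0012"},
    "MAIN ST": {"0012", "0013"},
}
elm_tract = tract_for_street("ELM ST", example_streets)
main_tract = tract_for_street("MAIN ST", example_streets)
```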
Hand Geocoding Problems Street crosses tract boundary Street is boundary for tract
Linking with 1960 Census Tracts • Electronic polygon and comparability files not available • Keyed information on correspondence between 1970 and 1960 tracts from print copies • Limitations • Does not work if 1970 tract is made up of more than one 1960 tract • Attempted hand geocoding
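The keyed 1970-to-1960 correspondence can be treated as a crosswalk table. The sketch below builds one from (1970 tract, 1960 tract) pairs and resolves a 1960 tract only when the correspondence is one-to-one, which mirrors the limitation noted above. The function names and tract numbers are hypothetical.

```python
from collections import defaultdict

def build_crosswalk(pairs):
    """Group keyed (tract_1970, tract_1960) pairs by 1970 tract."""
    xwalk = defaultdict(set)
    for t70, t60 in pairs:
        xwalk[t70].add(t60)
    return xwalk

def tract_1960(t70, xwalk):
    """Return the 1960 tract if the 1970 tract maps to exactly one, else None."""
    parts = xwalk.get(t70, set())
    return next(iter(parts)) if len(parts) == 1 else None

# Hypothetical keyed rows: tract 0006 was formed from two 1960 tracts.
keyed_pairs = [("0005", "0003"), ("0006", "0004"), ("0006", "0007")]
crosswalk = build_crosswalk(keyed_pairs)
resolved = tract_1960("0005", crosswalk)
unresolved = tract_1960("0006", crosswalk)
```

Addresses left unresolved here are the ones that would fall through to hand geocoding.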
Historic Census Did Not Define Tract • Tracts not defined for all of U.S. prior to 2000 • In 1960 two major study areas (MS and MD) were missing tracts • Solutions: • Limited data (housing) obtained from census books at the level of city block areas for MS and MD • Aggregated at level of the 1970 tract boundaries • Used tract-level census data from 1970 for untracted areas in 1960 • Included a variable indicating the number of years between the target age and the 1970 census
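Both workarounds reduce to simple bookkeeping: summing block-level counts up to the 1970 tract that contains each block, and recording how far the target age falls from 1970. The sketch below is illustrative only. The field name `housing_units`, the record layout, and the use of a signed offset (negative when the target age was reached before 1970) are assumptions.

```python
from collections import defaultdict

def aggregate_blocks_to_tracts(block_rows):
    """Sum a block-level count within each 1970 tract.

    block_rows is an iterable of (tract_1970, housing_units) records,
    one per city block (hypothetical layout).
    """
    totals = defaultdict(int)
    for tract_1970, housing_units in block_rows:
        totals[tract_1970] += housing_units
    return dict(totals)

def years_from_1970(birth_year, age):
    """Signed gap in years between reaching the given age and the 1970 census."""
    return (birth_year + age) - 1970

# Hypothetical block records and one participant.
tract_totals = aggregate_blocks_to_tracts([("0101", 40), ("0101", 25), ("0102", 60)])
offset = years_from_1970(birth_year=1930, age=30)
```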
Percentage of Addresses Successfully Geocoded, by Census Year
Percentage of Addresses Successfully Geocoded by Method Used, by Census Year
Summary: Feasibility of Retrospective Collection of Information on Early Life Social Context and Place of Residence
Early Life Place of Residence • Almost all LCSES participants recalled a childhood place of residence (county/state) • Limitations / Considerations • Census data prior to 1960 is not aggregated at a level below the county • Limited number of socio-economic indicators • Working with monetary variables across birth cohorts / census years requires careful consideration
Constructing the Historical Context: From Address to Census Tract Data • Limited recall of past addresses, particularly at earlier ages • Partial street address data can be geocoded, but labor intensive • Commercial geocoding relatively new • Historical addresses may be obsolete • Extra step required to place address into historical tract • Limitations of Historical Census • Non-tracted areas lack data • Limited SES indicators, particularly at earlier years