Evaluation of Person-based Migration Methodology
Presented to FSCPE Meeting
Internal Migration Processing Team
Local Area Estimates and Migration Processing Branch
U.S. Census Bureau
September 26, 2006
Contents of Presentation • Description of Return-based and Person-based Methods • Summary of Issues and Recommendations • Evaluations • Future R/Ds
Return-Based Method • Internal Revenue Service sends tax extract file to Census Bureau • Drop names and assign unique Person Identification Keys (PIK) derived from SSNs • Run edit process and assign county code to each return based on ZIP+4 • Two consecutive years of tax data are matched on primary filer’s PIK
Return-Based Method (Cont’d) • Compare county codes on matched returns to define migration • Tally exemptions for in-, out-, and non-migration components • Compute Net Internal Migration Rate (NMR) for Under 65 household population: NMR = (In-migrants – Out-migrants) / (Non-migrants + Out-migrants)
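The tally and the NMR formula above can be sketched in a few lines of Python. This is a minimal illustration, not the Census Bureau's production code: the tuple layout `(county_year1, county_year2, exemptions)`, the function names, and the use of a single exemption count per matched return are all assumptions made for the example, and the Under 65 household filter is taken as already applied.

```python
from collections import Counter

def tally_exemptions(matched_returns, area):
    """Tally exemptions into in-, out-, and non-migrant components for one county.

    matched_returns: iterable of (county_year1, county_year2, exemptions)
    tuples (a hypothetical layout chosen for this sketch).
    """
    tally = Counter()
    for county_y1, county_y2, exemptions in matched_returns:
        if county_y1 == area and county_y2 == area:
            tally["non"] += exemptions   # stayed in the county
        elif county_y2 == area:
            tally["in"] += exemptions    # moved into the county
        elif county_y1 == area:
            tally["out"] += exemptions   # moved out of the county
    return tally

def net_migration_rate(tally):
    """NMR = (in-migrants - out-migrants) / (non-migrants + out-migrants)."""
    denominator = tally["non"] + tally["out"]
    if denominator == 0:
        return None  # no Year-1 population to base the rate on
    return (tally["in"] - tally["out"]) / denominator

# Four matched returns with made-up county codes and exemption counts.
matched = [("A", "A", 3), ("A", "B", 2), ("B", "A", 4), ("B", "B", 1)]
tally_a = tally_exemptions(matched, "A")
rate_a = net_migration_rate(tally_a)  # (4 - 2) / (3 + 2) = 0.4
```

The denominator (non-migrants plus out-migrants) is the set of exemptions located in the county in Year 1, so the rate is expressed relative to the starting population.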
Return-Based Method (Cont’d) • Match Year-1/Year-2 matched IRS file to PCF to obtain demographic characteristics for primary filers • Demographic characteristics for the spouse and dependents are imputed based on the characteristics of the primary filer • Migration status is assigned based on the migration status of the primary filer • Produce state and county migration data by age, race, sex, and Hispanic origin
Limitations of Return-Based • Underestimates moves associated with life events (e.g., divorce, marriage, first job) • Demographic characteristics of spouse and dependents are imputed based on the characteristics of the filer • Migration status of spouse and dependents depends on the filer
Person-Based Method • Start with the return-based edited file • Create records for filer, spouse, and all dependents (up to 4); one record per individual on the tax return • Unduplicate the records by applying selection rules • Assign county code to each record • Match records across two consecutive tax years on PIK
Person-Based Method (Cont’d) • Compare county codes on matched returns to define migration • Tally exemptions for in-, out-, and non-migrants • Compute Net Migration Rate (NMR) for Under 65 household population: NMR = (In-migrants – Out-migrants) / (Non-migrants + Out-migrants)
Person-Based Method (Cont’d) • Match Year-1/Year-2 matched IRS file to PCF to obtain demographic characteristics for filer, spouse, and dependents (No imputation!!) • Migration status is individually assigned to filer, spouse, and dependents based on the assigned county codes (No imputation!!) • Produce state and county migration data by age, race, sex, and Hispanic origin
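The record-creation step of the person-based method (one record per individual on a return) might look like the following sketch. The field names (`filer_pik`, `spouse_pik`, `dependent_piks`, `county`, `tax_year`), the `PersonRecord` type, and the reading of "up to 4" as a cap on dependents per return are assumptions for illustration; the actual layout of the edited tax file is not described in the presentation. For simplicity the sketch copies the return's county onto each record.

```python
from dataclasses import dataclass

MAX_DEPENDENTS = 4  # assumption: "up to 4" is read as a cap per return

@dataclass(frozen=True)
class PersonRecord:
    pik: str       # Person Identification Key
    role: str      # "filer", "spouse", or "dependent"
    county: str    # county code inherited from the return
    tax_year: int

def explode_return(ret):
    """Create one PersonRecord per individual listed on a return (a dict)."""
    county, year = ret["county"], ret["tax_year"]
    records = [PersonRecord(ret["filer_pik"], "filer", county, year)]
    if ret.get("spouse_pik"):
        records.append(PersonRecord(ret["spouse_pik"], "spouse", county, year))
    for pik in ret.get("dependent_piks", [])[:MAX_DEPENDENTS]:
        records.append(PersonRecord(pik, "dependent", county, year))
    return records

# A joint return listing five dependents; only the first four get records.
joint_return = {
    "filer_pik": "P1",
    "spouse_pik": "P2",
    "dependent_piks": ["P3", "P4", "P5", "P6", "P7"],
    "county": "A",
    "tax_year": 2003,
}
person_records = explode_return(joint_return)
```

Because every individual carries their own PIK and county code, the later Year-1/Year-2 match can assign a migration status to each person rather than inheriting it from the primary filer.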
Issues requiring decision-making rules. Issue 1: Duplicate records and zero exemptions • Multiple records are created for one person if the person’s SSN is claimed on more than one tax return, including zero-exemption returns • Need to decide which records to keep
Zero Exemption • A zero-exemption return is filed when a dependent child has enough income to report to the IRS • The parent separately claims the dependent on his or her own tax return • 87 percent of the duplicate records involve zero-exemption returns
Issues requiring decision-making rules. Issue 2: Excess exemptions • The number of SSNs recorded on a tax return does not match the number of exemptions claimed on the same return • Need to decide whether to create a dummy record for each excess exemption
Summary of Issues: Zero Exemptions • Retain the zero-exemption record and drop the dependent record • Addresses on zero-exemption returns are likely to be more accurate
Summary of Issues: Other Duplicate Records • Filer record trumps all! • Retain primary filer records and drop spouse and dependent records
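The two selection rules above can be read as a single priority rule: a record where the person is the filer (which includes a child's own zero-exemption return) beats a record where the same person appears as a spouse or dependent. The sketch below illustrates that reading. The dict layout is hypothetical, and the tie-breaking behavior (keeping the first record seen when two records have the same priority) is an assumption, since the presentation does not say how such ties are resolved.

```python
# Lower number = higher priority. Spouse and dependent records share a rank
# because the presentation only states that the filer record wins.
ROLE_PRIORITY = {"filer": 0, "spouse": 1, "dependent": 1}

def unduplicate(records):
    """Keep one record per PIK, preferring the record where the person is the filer."""
    best = {}
    for rec in records:
        kept = best.get(rec["pik"])
        # Replace only on strictly higher priority; ties keep the first record
        # seen (an assumption, not a documented rule).
        if kept is None or ROLE_PRIORITY[rec["role"]] < ROLE_PRIORITY[kept["role"]]:
            best[rec["pik"]] = rec
    return list(best.values())

records = [
    # P1: claimed as a dependent by a parent in county A ...
    {"pik": "P1", "role": "dependent", "county": "A"},
    # ... and also files a zero-exemption return from county B.
    {"pik": "P1", "role": "filer", "county": "B"},
    # P2: spouse on one return, primary filer on another.
    {"pik": "P2", "role": "spouse", "county": "A"},
    {"pik": "P2", "role": "filer", "county": "C"},
    # P3: appears only once.
    {"pik": "P3", "role": "dependent", "county": "A"},
]
unique_records = unduplicate(records)
```

In this example P1 ends up in county B, which matches the rationale on the slide that the address on the zero-exemption return is likely to be the more accurate one.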
Summary of Issues: Excess Exemptions
1. Fewer SSNs than exemptions claimed • Exclude excess exemptions
2. More SSNs than exemptions claimed (i.e., negative excess exemptions) • Include the provided SSNs and ignore negative excess exemptions
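Both excess-exemption rules lead to the same practical outcome: the persons counted for a return are exactly those with a recorded SSN (PIK), and no dummy records are created. A minimal sketch, with hypothetical function and field names:

```python
def excess_exemptions(exemptions_claimed, piks):
    """Exemptions claimed minus SSNs recorded; negative when SSNs exceed exemptions."""
    return exemptions_claimed - len(piks)

def persons_to_count(exemptions_claimed, piks):
    """Apply the excess-exemption rules to one return.

    Positive excess: the extra exemptions are excluded (no dummy records).
    Negative excess: every provided SSN is kept and the shortfall is ignored.
    Either way, the result is the list of recorded PIKs.
    """
    return list(piks)

# Case 1: five exemptions claimed but only three SSNs recorded.
case1_excess = excess_exemptions(5, ["P1", "P2", "P3"])    # 2
case1_persons = persons_to_count(5, ["P1", "P2", "P3"])

# Case 2: two exemptions claimed but three SSNs recorded.
case2_excess = excess_exemptions(2, ["P4", "P5", "P6"])    # -1
case2_persons = persons_to_count(2, ["P4", "P5", "P6"])
```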
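Divorce Scenario
• Year 1: 1 filer in County A. Year 2: 1 filer in County A. Return-based: 1 non-migrant. Person-based: 1 non-migrant.
• Year 1: 1 spouse and 3 dependents in County A. Year 2: 1 filer and 3 dependents in County B. Return-based: non-match. Person-based: 4 migrants.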
Student Scenario
• Year 1: 1 dependent in County A. Year 2: 1 filer in County B. Return-based: 1 non-match. Person-based: 1 migrant.
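The two scenarios can be reproduced with a small sketch that matches the same returns two ways: on the primary filer's PIK (return-based) and on every individual's PIK (person-based). The PIK labels, the dict layout, and the parent filer in the student scenario are hypothetical and exist only to make the example runnable.

```python
def filer_counties(returns):
    """Return-based view: one entry per return, keyed by the primary filer's PIK."""
    return {r["filer"]: r["county"] for r in returns}

def person_counties(returns):
    """Person-based view: one entry per individual listed on any return."""
    counties = {}
    for r in returns:
        for pik in [r["filer"], *r.get("others", [])]:
            counties[pik] = r["county"]
    return counties

def compare(year1, year2):
    """Status of each Year-2 unit: 'non-migrant', 'migrant', or 'non-match'."""
    status = {}
    for key, county2 in year2.items():
        if key not in year1:
            status[key] = "non-match"
        else:
            status[key] = "non-migrant" if year1[key] == county2 else "migrant"
    return status

# Divorce: one joint return in Year 1, two separate returns in Year 2.
divorce_y1 = [{"filer": "F1", "others": ["S1", "D1", "D2", "D3"], "county": "A"}]
divorce_y2 = [
    {"filer": "F1", "county": "A"},
    {"filer": "S1", "others": ["D1", "D2", "D3"], "county": "B"},
]
divorce_return_based = compare(filer_counties(divorce_y1), filer_counties(divorce_y2))
divorce_person_based = compare(person_counties(divorce_y1), person_counties(divorce_y2))

# Student: a dependent in Year 1 (on a hypothetical parent's return) files alone in Year 2.
student_y1 = [{"filer": "PARENT", "others": ["STUDENT"], "county": "A"}]
student_y2 = [{"filer": "STUDENT", "county": "B"}]
student_return_based = compare(filer_counties(student_y1), filer_counties(student_y2))
student_person_based = compare(person_counties(student_y1), person_counties(student_y2))
```

In the return-based view the former spouse and the student have no Year-1 return on which they were the primary filer, so their moves are lost as non-matches; the person-based view finds them in Year 1 and classifies them as migrants.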
Evaluation: Match Rates - Definition
Year-1/Year-2 Match Rate = (Year-1 and Year-2 Matched Record Count) * 100 / (Total Year-1 Record Count)
PCF Match Rate = (Year-1, Year-2, and PCF Matched Record Count) * 100 / (Year-1 and Year-2 Matched Record Count)
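As a small worked example of the two definitions, the sketch below computes both rates from sets of PIKs. The PIK values are made up, and representing each file as a set of keys is an assumption for illustration.

```python
def match_rate(matched_count, base_count):
    """Matched records as a percentage of the base record count."""
    if base_count == 0:
        raise ValueError("base_count must be positive")
    return matched_count * 100 / base_count

# Hypothetical PIKs present in each file.
year1 = {"P1", "P2", "P3", "P4"}
year2 = {"P2", "P3", "P4", "P5"}
pcf = {"P2", "P3", "P9"}

y1_y2_matched = year1 & year2            # records found in both tax years
pcf_matched = y1_y2_matched & pcf        # of those, records also found in the PCF

y1_y2_rate = match_rate(len(y1_y2_matched), len(year1))      # 3 * 100 / 4 = 75.0
pcf_rate = match_rate(len(pcf_matched), len(y1_y2_matched))  # 2 * 100 / 3
```

Note that the two rates use different denominators: the first is relative to all Year-1 records, the second only to records that already matched across the two years.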
The 10 Lowest Year1-Year2 Match Rates from Return-Based Records from Years 2000 through 2004 (National Average = 90.5%)
The 10 Lowest Year1-Year2 Match Rates from Person-Based Records from Years 2000 through 2004 (National Average = 94%)
PCF Match Rates • The match rates from the person-based records were almost the same as the match rates from the return-based records (> 99%).
Matched Y1-Y2 Under-Age-65 Exemptions: Percent of Exemptions Migrating by Exemption Status (10 Percent Sample)
Coverage Analysis by State • Coverage patterns are consistent across states and years • Person-based coverage was consistently lower than return-based coverage • The states with the most extreme coverage rates under return-based processing maintained the same pattern under person-based processing • The difference in coverage declined for every state between 2000 and 2004. The highest difference was –5.30 in 2000 and –0.48 in 2004
Number of Inter-county Migrants: Person-based vs. Return-based
Inter-county Migration Percent: Person-based vs. Return-based
Race and Hispanic Origin Distribution: Person-based vs. Return-based
Migration Rate Outliers • Definition Outliers • 95% Confidence Interval Outliers
Findings from Outlier Analysis • The person-based method had a significant effect on migration flows from counties with small populations to counties with large populations • The new method had the largest impact on individuals in their early 20s
Summary of Findings • The person-based method will produce more accurate migration estimates • The characteristics from the person-based records will be more accurate than those from the return-based records
Future R/Ds • Integration of Electronic File to enhance the coverage of child dependents • Integration of Medicare data at the micro level to produce migration data for the 65+ population