Missing income data in the millennium cohort study: Evidence from the first two sweeps
Authors: Denise Hawkes and Ian Plewis
Discussant: Nicholas Biddle, nicholas.biddle@anu.edu.au
Introduction and overview
• Data – Millennium Cohort Study
• Research questions – What are the factors associated with non-response? More specifically:
    • Are there within-household and individual correlations for missing income data?
    • Is the sex of the interviewer an important explanatory variable?
    • How is missing data in sweep one related to missing data in sweep two?
    • Is attrition at sweep two related to the level of household income or the failure to provide data in sweep one?
• Method
    • Descriptive analysis
    • Binary and Multinomial Logit models with non-response as dependent variable
    • Binary Logit with attrition between sweep one and sweep two as dependent variable
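As an illustration of the binary logit method listed above, the following is a minimal sketch that fits a logit of income non-response on a single indicator by Newton-Raphson. The counts, the `self_employed` indicator and the function name `fit_binary_logit` are hypothetical and chosen only for illustration; they are not taken from the Millennium Cohort Study or from the paper's models, which include many more covariates.

```python
import math

def fit_binary_logit(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # entries of the negative Hessian
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# HYPOTHETICAL data: (self_employed, income_missing, number of respondents)
cells = [(0, 0, 940), (0, 1, 60), (1, 0, 170), (1, 1, 30)]
x, y = [], []
for self_employed, income_missing, count in cells:
    x.extend([self_employed] * count)
    y.extend([income_missing] * count)

b0, b1 = fit_binary_logit(x, y)
odds_ratio = math.exp(b1)  # odds of missing income, self-employed vs. employees
print(f"intercept={b0:.3f}, coefficient={b1:.3f}, odds ratio={odds_ratio:.3f}")
```

With a single binary covariate the fitted odds ratio coincides with the cross-product ratio of the 2x2 table, which gives a simple check on the fit.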
Data
• Millennium Cohort Study
    • First sweep – 18,819 babies born in the UK from 1st September 2000 (from 18,552 families). Interviewed when baby was 9 months old
    • Second sweep – 14,898 families from original sample and 692 new families. Interviewed when children around 3 years old
    • Information from main respondent (usually mother) and partner of respondent (usually father)
• Incomplete information on income through:
    • Unit non-response (response rate 72% in first sweep)
    • Partner non-response (88% of families with partners responded)
    • Item non-response for income (6% of main respondents and partners did not provide income data)
    • Attrition between sweeps (79% of eligible families responded in sweep two)
• Income information:
    • Collected from those currently doing paid work, those who have a paid job but are on leave, those who have worked in the past but have no current job
    • For employees – total take home pay and gross pay
    • For self employed – ‘amount you personally took out of the business after all taxes and costs’
Patterns of income response
• Original sample (paper has information on new families and proxies)
Other modeling – Multinomial Logit and attrition
• Multinomial Logit – Response vs. don’t know vs. refuse
    • Main respondent:
        • Self employed significantly more likely to answer ‘don’t know’, but not more likely to refuse
        • Same with social class variables
        • Black or Black British respondents, and those in Northern Ireland, more likely to refuse
    • Partner respondent:
        • Self employed significantly more likely both to refuse and to answer ‘don’t know’
        • NVQ levels and ethnicity both associated with refusal
• Attrition at sweep two
    • Higher income in sweep one associated with lower odds of attrition between sweep one and sweep two
    • Main income and partner income non-response in sweep one associated with higher odds of attrition between sweep one and sweep two
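To make the multinomial logit result concrete, the sketch below shows how coefficients for the three outcomes (response, don’t know, refusal) translate into predicted probabilities, with ‘response’ as the base category. All coefficient values and the names `COEFS`, `mnl_probabilities` and `predict` are hypothetical; they are set up only to mimic the qualitative pattern reported for main respondents, where self employment raises ‘don’t know’ but not ‘refusal’.

```python
import math

def mnl_probabilities(linear_predictors):
    """Turn outcome -> x'beta into outcome -> probability (softmax)."""
    top = max(linear_predictors.values())  # subtract max for numerical stability
    expd = {k: math.exp(v - top) for k, v in linear_predictors.items()}
    total = sum(expd.values())
    return {k: v / total for k, v in expd.items()}

# HYPOTHETICAL coefficients; 'response' is the base category (all zeros).
COEFS = {
    "response":  {"const": 0.0,  "self_employed": 0.0},
    "dont_know": {"const": -3.5, "self_employed": 1.2},
    "refusal":   {"const": -3.5, "self_employed": 0.0},
}

def predict(self_employed):
    eta = {
        outcome: c["const"] + c["self_employed"] * self_employed
        for outcome, c in COEFS.items()
    }
    return mnl_probabilities(eta)

p_employee = predict(0)
p_self_employed = predict(1)

# Relative risk ratio of each non-response type against 'response'
rrr_dont_know = (p_self_employed["dont_know"] / p_self_employed["response"]) / (
    p_employee["dont_know"] / p_employee["response"]
)
rrr_refusal = (p_self_employed["refusal"] / p_self_employed["response"]) / (
    p_employee["refusal"] / p_employee["response"]
)
print(p_employee, p_self_employed, rrr_dont_know, rrr_refusal)
```

In this parameterisation the exponentiated coefficient is a relative risk ratio against the base category, which is why a covariate can matter for one form of non-response and not the other.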
Summary
• Household and individual correlations for missing income data
    • Self employment, some ethnic groups (though not consistent), Northern Ireland
• The sex of the interviewer is not an important explanatory variable for income non-response
• Some variables associated with only one of ‘don’t know’ or ‘refusal’
• Missing data in sweep one associated with higher odds of missing data in sweep two
    • Especially amongst partner respondents
• Higher household income in sweep one associated with lower attrition in sweep two
• Missing data in sweep one associated with higher attrition in sweep two
Suggested further work and information
• Models for non-response
    • More diagnostic information (e.g. tests of group significance)
    • Information on the child?
• Interviewer bias
    • Multilevel model?
    • Interactions or other information on the interviewer
• Implications for survey design
    • Difference between don’t know and refusal
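One way to read the suggested ‘tests of group significance’ is as a joint likelihood-ratio test on a set of dummy variables. The sketch below assumes a logit of non-response that contains only three group dummies, so both the restricted and unrestricted log-likelihoods can be computed directly from counts. The group names and counts are hypothetical, and the closed-form p-value relies on the fact that a chi-square distribution with 2 degrees of freedom has survival function exp(-x/2); it does not apply to other degrees of freedom.

```python
import math

def binomial_loglik(missing, total):
    """Maximised Bernoulli log-likelihood for one group (needs 0 < missing < total)."""
    p = missing / total
    return missing * math.log(p) + (total - missing) * math.log(1.0 - p)

# HYPOTHETICAL counts: group -> (respondents with missing income, all respondents)
groups = {
    "group_a": (50, 1000),
    "group_b": (20, 200),
    "group_c": (15, 100),
}

# Unrestricted model: a separate non-response rate per group (group dummies)
ll_full = sum(binomial_loglik(k, n) for k, n in groups.values())

# Restricted model: one common non-response rate (dummies jointly zero)
total_missing = sum(k for k, _ in groups.values())
total_n = sum(n for _, n in groups.values())
ll_null = binomial_loglik(total_missing, total_n)

lr_statistic = 2.0 * (ll_full - ll_null)
df = len(groups) - 1
assert df == 2, "closed-form p-value below is valid only for 2 degrees of freedom"
p_value = math.exp(-lr_statistic / 2.0)

print(f"LR statistic={lr_statistic:.2f}, df={df}, p-value={p_value:.5f}")
```

A small p-value here would indicate that the group dummies jointly improve the fit, even where individual coefficients are not consistently significant.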