Società Italiana di Statistica Convegno Intermedio Venezia, 6-8 Giugno 2007. Evaluation of the impact of substitutions for unit nonresponse in the labour force survey of the municipality of Firenze. Andrea Giommi (1) Emilia Rocco (1)
(1) Dipartimento di Statistica, Università degli Studi di Firenze (Italy) (giommi@ds.unifi.it - rocco@ds.unifi.it)

Introduction
Since 1995, the Municipality of Firenze has run a quarterly labour force (LF) survey, parallel to that of ISTAT, to compensate for the lack of accurate municipality-level estimates for the main survey items. In a first phase, the survey was carried out through an enlargement of the ISTAT sample. Starting in 2002, a new survey was introduced, still based on the ISTAT questionnaire but with a different sampling design. The new survey is based on a stratified proportional random sample of 1200 individual persons for each quarter. Strata are defined on the basis of sex, age and residence area. For unit nonresponse and unit rotation the same strategy as ISTAT is followed, that is, substitution for nonresponse and 2-2-2 rotation. Since 2004, after a first face-to-face interview, sample units have been reinterviewed by telephone, and since the following year all interviews have been conducted by CATI.

Both the use of the telephone file for sample selection and the substitution for unit nonresponse may produce some bias in the estimates. The portion of the bias due to the incompleteness of the telephone file with respect to the registry office list, which must be considered the best frame for a survey on residents, is clearly worth studying. However, in this work we focus on the effect of substitution for nonresponse, which cannot be regarded as the best strategy for coping with unit nonresponse and which, like the frame problems, may be responsible for a large part of the bias.

1. The study variables
In this work, with respect to the LF survey carried out by the Municipality of Firenze, we consider the estimation of employment and unemployment rates at the municipal level.

2. Three forms of survey nonresponse adjustments
Estimates obtained by means of simple direct estimators after substituting for nonresponse are compared with two other types of estimates. Substitutions are performed within the strata, which are defined by sex, age and residence area, variables that seem strongly associated with the main study variables as well as with response behavior. The two other types of estimates are:
• Estimates based only on respondents, after modifying the weights of the responding units through calibration on the stratification variables plus the number of family members.
• Estimates based only on respondents, after adjusting for nonresponse through estimated individual response probabilities p_i.
To estimate the individual response probabilities we use, as a proxy for the response probability, the variable "number of hours that each subject spends daily at home from 8 a.m. to 8 p.m., Monday to Friday", which is available for almost all respondents. The more hours spent at home, the higher the response probability. In more detail, we assume for p_i the expression [formula missing in the source], where x_i is the number of hours spent at home by unit i and 0.6 is chosen in order to roughly reproduce the observed response rate by the summation of the p_i over the respondents.

3. Considerations on the survey nonresponse adjustment methods and results
Even though substitutions are performed within a large number of strata defined by variables that may be assumed to be associated with the study variables as well as with response behavior, we think that the relationship between a higher contact probability and employment status may be a source of bias. Consistent with this, we observe that substitutions are not uniformly distributed by sex and age.
Regarding age, most substitutions fall in the central classes, which are typically those with the highest level of employment. The consequence may be an overestimate of the unemployment rate and an underestimate of the employment rate. Since the substitutes in the central classes show a slightly higher employment rate than the "first call respondents", the estimates based only on respondents, after modifying the sample weights through calibration, tend to give a higher unemployment rate than the estimates after substitution.

When calibration is used to cope with nonresponse, it may sometimes produce nonresponse adjustments that are inconsistent with logical hypotheses about response behavior. In our case, if most nonrespondents are male and fall in the central age classes, and are consequently likely to be employed and harder to contact, the nonresponse adjustment should increase the estimated total of employed people relative to the direct estimates. Unfortunately, calibration seems to leave the total number of employed and active people unchanged and to increase the total number of unemployed people. The consequence is that the unemployment rate estimated by calibration is higher than that estimated by substituting for nonresponse. Some results are shown in Table 1.

An alternative approach to the nonresponse problem consists in adjusting for nonresponse by means of estimated response probabilities before calibrating. We chose to define the response probabilities as a function of a variable that seems directly correlated with the individual contact probability: persons who usually spend 12 hours at home have a contact probability near 1, while persons who are never at home during interviewing hours may have a response probability slightly greater than 0.5, considering that they cannot be reached only if they live alone, but may be contacted if they belong to a family with more than one member. Some results are shown in Table 2.
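The two-step adjustment described above (response-probability weighting followed by calibration) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy respondents and stratum totals, and above all the linear form of `response_prob`, which only respects the two anchor points mentioned in the text (about 0.6 for 0 hours at home, 1 for 12 hours) and is not the authors' expression. Calibration is reduced here to its simplest special case, post-stratification to known stratum totals, and the rates use the usual definitions (employed over population covered, unemployed over active), which the source does not spell out.

```python
from collections import defaultdict


def response_prob(hours_at_home, p_min=0.6, max_hours=12.0):
    """HYPOTHETICAL response model, not the authors' formula.

    Linear in the hours spent at home: p_min at 0 hours, 1.0 at max_hours.
    """
    x = min(max(hours_at_home, 0.0), max_hours)  # clamp to [0, max_hours]
    return p_min + (1.0 - p_min) * x / max_hours


def adjusted_weights(respondents, stratum_totals):
    """Divide design weights by p_i, then post-stratify to known totals."""
    # Step 1: inverse response-probability adjustment.
    raw = [r["design_weight"] / response_prob(r["hours_at_home"])
           for r in respondents]
    # Step 2: rescale within each stratum so the weights sum to the
    # known population total of that stratum (simplest calibration).
    sums = defaultdict(float)
    for r, w in zip(respondents, raw):
        sums[r["stratum"]] += w
    return [w * stratum_totals[r["stratum"]] / sums[r["stratum"]]
            for r, w in zip(respondents, raw)]


def labour_rates(respondents, weights):
    """Return (employment rate, unemployment rate) from weighted totals."""
    total = sum(weights)
    employed = sum(w for r, w in zip(respondents, weights)
                   if r["status"] == "employed")
    unemployed = sum(w for r, w in zip(respondents, weights)
                     if r["status"] == "unemployed")
    active = employed + unemployed
    return employed / total, unemployed / active


# Toy data, invented for illustration only.
respondents = [
    {"stratum": "M_30_49", "design_weight": 100.0, "hours_at_home": 0.0,
     "status": "employed"},
    {"stratum": "M_30_49", "design_weight": 100.0, "hours_at_home": 12.0,
     "status": "unemployed"},
    {"stratum": "F_30_49", "design_weight": 100.0, "hours_at_home": 6.0,
     "status": "employed"},
    {"stratum": "F_30_49", "design_weight": 100.0, "hours_at_home": 12.0,
     "status": "inactive"},
]
stratum_totals = {"M_30_49": 400.0, "F_30_49": 400.0}

weights = adjusted_weights(respondents, stratum_totals)
employment_rate, unemployment_rate = labour_rates(respondents, weights)
print(f"employment rate:   {employment_rate:.3f}")
print(f"unemployment rate: {unemployment_rate:.3f}")
```

In this toy example the respondent who is never at home receives a larger final weight than the one who is always at home within the same stratum, which is the mechanism by which the adjustment can raise the estimated number of employed people when hard-to-contact persons are more often employed.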
4. Final Remarks
As the tables show, when the individual response probabilities are used, the estimate of the employment rate is greater than that obtained by calibration alone, while the estimate of the unemployment rate decreases slightly and is very close to the one obtained with substitutions. This does not mean that substitutions are good practice in general, but simply that in the LF survey of the Municipality of Firenze they tend to offset a certain shortage of employed people among the respondents to the first call of each wave of the survey. The use of estimators adjusted by means of individual response probabilities instead of substitutions can produce both good estimates and savings in data collection. Deeper analyses are certainly needed to identify the best strategy. Concerning the estimation of individual response probabilities, for example, we used a very simple parametric function of a single variable. One may reasonably assume that the response mechanism depends not only on the time spent at home but also on other variables.