
Dealing with Item Non-response in a Catering Survey

Dealing with Item Non-response in a Catering Survey. Pauli Ollila (Statistics Finland), Kaija Saarni (Finnish Game and Fisheries Research Institute), Asmo Honkanen (Finnish Game and Fisheries Research Institute).


Presentation Transcript


  1. Dealing with Item Non-response in a Catering Survey • Pauli Ollila, Statistics Finland • Kaija Saarni, Finnish Game and Fisheries Research Institute • Asmo Honkanen, Finnish Game and Fisheries Research Institute

  2. The Finnish Catering Survey • Studied the use of fish, crawfish, red deer, elk and reindeer in the catering sector during 2005. • Carried out by the Finnish Game and Fisheries Research Institute together with the interview organisation of Statistics Finland. • Computer-assisted telephone interviews at the beginning of 2006. • Population 14740, sample size 2263, stratification by 7 “portion classes”. • Respondents 1741, unit non-response 498, over-coverage 24.
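The sample figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that over-coverage units are treated as ineligible when a response rate is computed; the slide does not say how the authors defined the rate, so the variable names and the rate definition are illustrative only.

```python
# Figures taken from the slide.
population = 14740
sample_size = 2263
respondents = 1741
unit_nonresponse = 498
over_coverage = 24

# The three outcome groups should account for the whole sample.
accounted_for = respondents + unit_nonresponse + over_coverage

# Assumption: over-coverage units are not part of the target population,
# so they are removed from the denominator of the response rate.
eligible = sample_size - over_coverage
response_rate = respondents / eligible
sampling_fraction = sample_size / population

print(f"accounted for: {accounted_for} of {sample_size}")
print(f"response rate among eligible units: {response_rate:.3f}")
print(f"overall sampling fraction: {sampling_fraction:.3f}")
```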

  3. Information on amounts • The questionnaire was divided into three sections: fish, crawfish and game (red deer, elk, reindeer). • Among other questions, every section included questions asking for amounts in kilograms, both as totals and broken down by category (type of product, species) and origin (domestic/imported). • The amounts in categories could also be given as percentages.

  4. EXAMPLE: Question 8a • What was the total amount of fish as raw material you used in 2005? ________________ kg • Furthermore, estimate the form in which the fish as raw material was delivered to you. (If you cannot estimate the distribution in kilograms, estimate the proportions of the total in percent.)

  5. The quality of response • It was obvious that some respondents could not provide full and exact information for these questions, for various reasons. • For example, the amounts given in the classifying questions contradicted the overall totals. Further, the questions on domestic and foreign fish gave results that differed from the overall fish consumption question. • A lot of editing work was carried out at the Finnish Game and Fisheries Research Institute to clean the data (e.g. functional deduction between questions) and to convert the percentage information into kilograms.
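The percentage-to-kilogram conversion mentioned in the editing step can be sketched as follows. This is a minimal illustration, not the institute's actual editing routine; the function name and the category labels are hypothetical.

```python
def percentages_to_kilograms(total_kg, percentages):
    """Convert category percentages of a reported total into kilograms.

    total_kg: overall total reported by the respondent.
    percentages: mapping of category name to its share in percent.
    """
    return {category: total_kg * pct / 100.0
            for category, pct in percentages.items()}


# Hypothetical response: 200 kg in total, split by delivery form in percent.
reported_shares = {"fresh": 50, "frozen": 30, "canned": 20}
kilograms = percentages_to_kilograms(200, reported_shares)
print(kilograms)
```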

  6. Some contradictory and insufficient responses that could not be resolved were still left for statistical processing. • For example, regarding total kilograms and the sum of the category kilograms we had: (table not included in the transcript) NOTE: A difference of less than 10 % between the total kilograms and the sum of the category kilograms was allowed in the interview situation.
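The comparison between the reported total and the sum of the category kilograms can be sketched as a small classification function. It assumes that the 10 % tolerance is measured relative to the reported total and that a missing category is stored as `None`; neither detail is stated on the slide, and the labels returned are illustrative.

```python
def classify_response(total_kg, category_kg, tolerance=0.10):
    """Classify a response by comparing category kilograms with the total.

    category_kg: list of category amounts; None marks a missing value.
    tolerance: allowed relative difference (assumed relative to total_kg).
    """
    reported = [kg for kg in category_kg if kg is not None]
    if not reported:
        return "fully missing"
    category_sum = sum(reported)
    difference = abs(category_sum - total_kg)
    # "Less than 10 %" is read as a strict inequality.
    if difference == 0 or difference < tolerance * total_kg:
        return "consistent"
    return "sum above total" if category_sum > total_kg else "sum below total"


print(classify_response(100.0, [None, None]))   # no category information
print(classify_response(100.0, [50.0, 45.0]))   # within the tolerance
print(classify_response(100.0, [50.0, 70.0]))   # categories exceed the total
print(classify_response(100.0, [30.0, None]))   # categories fall short
```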

  7. Item non-response • The most usual case of item non-response: the category kilograms are totally missing although the overall total exists. • The sum of the existing category kilograms may either exceed or fall below the overall total given in the response. • In principle, the latter case (a sum below the total) can be considered item non-response. • However, it is not clear how many categories are affected by item non-response, or whether the existing category amounts are simply erroneous in part.

  8. How to correct? • How to treat full missingness of the category sums? • How to deal with category sums not matching the overall sum (mismatch sums)? Alternatives for dealing with the problems: • Donor imputation • Mean imputation • Regression imputation • Weight adjustments. The method used in the final statistical processing was chosen from these alternatives, which were considered in the following forms:

  9. Corrections considered: donor imputation • Full missingness of the category sums - Selecting a donor within a stratum (“portion category”) and applying its percentages to the recipient's overall total to create the imputed values. - Nearest-neighbour class criterion by “number of kitchen staff” and “number of days serving fish”. • Mismatch sums - When the category sum is lower than the overall sum, imputation is hard to apply: there is no information on which category or categories should receive the imputed values, and the mismatch may still remain. In the opposite case imputation is not applicable. - In order to retain the distribution information on categories, the category amounts are scaled up or down with a ratio.
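The two donor-imputation steps can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the nearest neighbour is chosen on “number of kitchen staff” alone, the record layout (`stratum`, `staff`, `total_kg`, `category_kg`) is hypothetical, and the scaling ratio is assumed to be the overall total divided by the sum of the category amounts, since the formula itself is not preserved in the transcript.

```python
def nearest_donor(recipient, donors):
    """Pick the donor in the same stratum with the closest kitchen staff count."""
    same_stratum = [d for d in donors if d["stratum"] == recipient["stratum"]]
    return min(same_stratum, key=lambda d: abs(d["staff"] - recipient["staff"]))


def impute_from_donor(recipient, donor):
    """Apply the donor's category shares to the recipient's overall total."""
    donor_sum = sum(donor["category_kg"].values())
    return {category: recipient["total_kg"] * kg / donor_sum
            for category, kg in donor["category_kg"].items()}


def rescale_to_total(total_kg, category_kg):
    """Scale category amounts so they add up to the reported total.

    Assumed ratio: overall total / sum of category amounts.
    """
    ratio = total_kg / sum(category_kg.values())
    return {category: kg * ratio for category, kg in category_kg.items()}


# Hypothetical data: two donors in stratum 1, one in stratum 2.
donors = [
    {"stratum": 1, "staff": 3, "category_kg": {"fresh": 60.0, "frozen": 40.0}},
    {"stratum": 1, "staff": 10, "category_kg": {"fresh": 10.0, "frozen": 90.0}},
    {"stratum": 2, "staff": 4, "category_kg": {"fresh": 50.0, "frozen": 50.0}},
]
recipient = {"stratum": 1, "staff": 4, "total_kg": 200.0}

donor = nearest_donor(recipient, donors)
print(impute_from_donor(recipient, donor))
print(rescale_to_total(100.0, {"fresh": 30.0, "frozen": 20.0}))
```

The rescaling keeps the relative shares of the categories unchanged, which is the stated purpose of proportioning the amounts instead of imputing them.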

  10. Corrections considered: group mean imputation • Full missingness of the category sums - Using group means of percentages for every amount category, with “portion categories” and “number of days serving fish” used as groups. • Mismatch sums (as in donor imputation). Corrections considered: regression imputation • Full missingness of the category sums - Modelling the percentages in categories. Various auxiliary variables were tried, e.g. “number of kitchen staff” and “number of days serving fish”, separately for “portion categories” (only for those kitchens that had served fish). No better explanatory variables were available for all observations. • Mismatch sums (as in donor imputation).
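Group mean imputation of the category percentages can be sketched as follows. The sketch assumes unweighted group means and a single hypothetical `group` key per record; the slide does not state whether the means were weighted or how the two grouping variables were combined.

```python
from collections import defaultdict


def group_mean_percentages(donors):
    """Average the category percentages of complete responses within each group."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for donor in donors:
        counts[donor["group"]] += 1
        for category, pct in donor["pct"].items():
            sums[donor["group"]][category] += pct
    return {group: {category: s / counts[group] for category, s in cats.items()}
            for group, cats in sums.items()}


def impute_group_mean(total_kg, group, means):
    """Split a reported total using the mean percentages of the record's group."""
    return {category: total_kg * pct / 100.0
            for category, pct in means[group].items()}


# Hypothetical complete responses in two groups.
complete = [
    {"group": "A", "pct": {"fresh": 60.0, "frozen": 40.0}},
    {"group": "A", "pct": {"fresh": 40.0, "frozen": 60.0}},
    {"group": "B", "pct": {"fresh": 100.0, "frozen": 0.0}},
]
means = group_mean_percentages(complete)
print(impute_group_mean(300.0, "A", means))
```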

  11. Corrections considered: weight adjustments • Full missingness of the category sums and mismatch sums - Correcting the category results by adjusting the weight separately for each question that includes amounts, using a ratio: the weighted overall total divided by the weighted sum of the category sums. - Separate weights cause inconsistencies when statistics on variables with no item non-response are computed once with the normal weights and once with the adjusted weights. Practical problems in tabulations and analysis may also occur.
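The adjustment ratio described on this slide can be sketched as below. The record layout is hypothetical, and a response with fully missing categories is assumed to contribute zero to the weighted sum of category sums.

```python
def item_nonresponse_factor(records):
    """Weighted overall total divided by the weighted sum of the category sums."""
    weighted_total = sum(r["weight"] * r["total_kg"] for r in records)
    weighted_category_sum = sum(r["weight"] * sum(r["category_kg"].values())
                                for r in records)
    return weighted_total / weighted_category_sum


# Hypothetical responses: the second one has no category information at all.
records = [
    {"weight": 2.0, "total_kg": 100.0, "category_kg": {"fresh": 60.0, "frozen": 40.0}},
    {"weight": 3.0, "total_kg": 50.0, "category_kg": {}},
]
factor = item_nonresponse_factor(records)

# Weighted category total computed with the adjusted weights for this question.
adjusted_category_total = sum(r["weight"] * factor * sum(r["category_kg"].values())
                              for r in records)
weighted_overall_total = sum(r["weight"] * r["total_kg"] for r in records)
print(factor, adjusted_category_total, weighted_overall_total)
```

With the adjusted weights the category amounts add up to the weighted overall total, which is why the adjustment acts like a simple calibration. The factor applies only to the question it was computed for, which is the source of the inconsistency noted on the slide.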

  12. Actions at that time • Due to the lack of time at the estimation phase, the weight adjustments were chosen. ==> A conservative and quick solution. ==> All the information on amounts was in line with each other (a kind of calibration). • The purposes of the catering survey were purely descriptive, and studies were made only at the general level and for some simple classes (e.g. region). • Complex cross-tabulations and analyses were not conducted. WHAT DID THE SUBSEQUENT TESTS WITH THE CORRECTION ALTERNATIVES REVEAL?

  13. Subsequent test experiences • The inflating item non-response factors in the weight adjustments varied from 1.00689 to 1.47618. • Practical choice: mean and regression imputation were conducted for all classes except the biggest one, which was given the value 100 % minus the sum of the other percentage estimates. This ensured that the percentage estimates did not sum to more than 100 %. • The regression imputation performed so poorly (e.g. negative percentage values) that it was not considered further. • Only weight adjustment replicates the original distribution of the classification amounts. • The standard deviations are affected by all methods.
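The “practical choice” of treating the biggest class as a residual can be sketched as follows. The function name and the class labels are hypothetical, and the sketch does not guard against the other classes summing to more than 100 %, in which case the residual would be negative.

```python
def close_with_residual(other_pct, largest_class):
    """Set the largest class to 100 % minus the sum of the other imputed classes.

    other_pct: imputed percentages for every class except the largest one.
    largest_class: label of the class that receives the residual.
    """
    result = dict(other_pct)
    # No guard here: the residual is negative if the others exceed 100 %.
    result[largest_class] = 100.0 - sum(other_pct.values())
    return result


# Hypothetical imputed percentages for the two smaller classes.
imputed = close_with_residual({"class_b": 25.0, "class_c": 15.0}, "class_a")
print(imputed)
```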

  14. The inconsistency problem with weight adjustments (example: proportion classes) totals rounded to integers

  15. The distribution problem (example: species of fish, overall total 14036226)

  16. Weighted standard deviation changes (example: species of fish)
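The effect of mean imputation on a weighted standard deviation can be illustrated with a toy calculation. The formula below (weighted mean, then the weighted average of squared deviations) is one common definition and is an assumption here, since the slide's figures and exact formula are not preserved in the transcript; the data are invented for illustration.

```python
import math


def weighted_std(values, weights):
    """Weighted standard deviation with the sum of weights as denominator."""
    total_weight = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_weight
    variance = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total_weight
    return math.sqrt(variance)


# Two observed amounts, then the same data with two mean-imputed values added.
observed = [1.0, 3.0]
observed_weights = [1.0, 1.0]
with_imputed = observed + [2.0, 2.0]
imputed_weights = observed_weights + [1.0, 1.0]

before = weighted_std(observed, observed_weights)
after = weighted_std(with_imputed, imputed_weights)
print(before, after)
```

Values imputed at the mean add weight without adding spread, so the standard deviation shrinks.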

  17. Conclusions • The level of inconsistency caused by the weight adjustment method was not serious. • Both donor and mean imputation had a slight, but not remarkable, effect on the distribution of amounts. • It is clear that the weighted standard deviations were inflated by the weight adjustments, but donor imputation tended to produce standard deviation figures that varied more between amount categories. As expected, mean imputation reduced the variation. • Current recommendation: the Banff package for statistical editing and imputation (by Statistics Canada, built in the SAS environment).
