Multiple Indicator Cluster Surveys Data Dissemination and Further Analysis Workshop
Data Quality from MICS4
Looking at data quality – Why? • Confidence in survey results • Identify limitations in results • Inform dissemination and policy formulation • All surveys are subject to errors
Data quality • Two types of errors in surveys • Sampling errors • Non-sampling errors: all other types of errors, arising at any stage of the survey process other than the sample design, including • Management decisions • Data processing • Fieldwork performance, etc. • All survey stages are interconnected and contribute to non-sampling errors
Data quality • Sampling errors can be estimated before data collection and measured after data collection • Non-sampling errors are more difficult to control and/or identify
Data quality • MICS incorporates several features to minimize non-sampling errors: A series of recommendations for quality assurance, including • Roles and responsibilities of fieldwork teams • Easy-to-use data processing programs • Training length and content • Editing and supervision guidelines • Survey tools • Failure to comply with principles behind these recommendations leads to problems in data quality
Data quality • Survey tools to monitor, assess, and improve quality, and to identify non-sampling errors • Field check tables to quantitatively identify non-sampling errors during data collection and to improve quality • Possible with simultaneous data entry, when data collection is not too rapid • Data quality tables to be produced at the time of the final report
Data quality • Results from data quality tables used to • Identify departures from expected patterns • Identify departures from recommended procedures • Check internal consistency • Completeness • Produce indicators of performance
Data quality analyses • Various surveys from different regions were used to compile, aggregate, and graphically illustrate the results in data quality tables
Age distribution by sex (DQ1) • Age heaping • Out-transference • Omission • Extent of missing cases • Sex ratios
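The age-distribution checks above (heaping, sex ratios) can be sketched in a few lines. This is an illustrative sketch only, not MICS processing code; the `(age, sex)` records are invented for the example.

```python
# Illustrative sketch of two DQ1-style checks on a household listing.
# Records are hypothetical (age in years, sex "M"/"F").
from collections import Counter

records = [(24, "M"), (30, "F"), (30, "M"), (35, "F"), (42, "M"),
           (45, "F"), (50, "F"), (50, "M"), (31, "F"), (28, "M")]

ages = [age for age, _ in records]
sexes = [sex for _, sex in records]

# Age heaping: share of ages ending in 0 or 5.
# With no heaping, roughly 20% of ages should end in these digits.
heaped = sum(1 for a in ages if a % 5 == 0) / len(ages)

# Sex ratio: males per female, a routine consistency check.
counts = Counter(sexes)
sex_ratio = counts["M"] / counts["F"]

print(f"share of ages ending in 0/5: {heaped:.2f}")
print(f"sex ratio (M/F): {sex_ratio:.2f}")
```

In practice these checks would be run by single year of age and by sex, and compared against the expected smooth decline of a population age distribution.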
Ratio of children age 2 to age 1
Completion rates by age - women and under-5s (DQ2, DQ3, DQ4, DQ5) • Fieldwork performance – re-visits, good planning • Completion rates need to be high, but also uniform by age and background characteristics • Low completion rates among younger women, better-off groups, or less accessible and challenging areas are likely to bias results
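A completion-rate tabulation of the kind described above can be sketched as follows. This is a hypothetical illustration; the age groups and interview outcomes are invented.

```python
# Sketch: completion rates by age group from interview outcomes.
# Data are hypothetical: (age_group, interview_completed).
from collections import defaultdict

interviews = [("15-19", True), ("15-19", False), ("20-24", True),
              ("20-24", True), ("25-29", True), ("25-29", True),
              ("15-19", True)]

totals = defaultdict(int)
done = defaultdict(int)
for group, completed in interviews:
    totals[group] += 1
    done[group] += completed  # True counts as 1

for group in sorted(totals):
    rate = 100 * done[group] / totals[group]
    print(f"{group}: {rate:.1f}% completed")
```

The same tabulation repeated by wealth quintile or by area would reveal the kind of differential non-response the slide warns about.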
Severe heaping on age 5 – most probably out-transference (C3, C4, C8, C9). Completion rates are high for all ages.
High ratios of women age 50-54 to 45-49. Completion rates generally high in all age groups, but very low for young women in one country (C8). Completion rates by quintiles not alarmingly different.
Completeness of reporting – various (DQ6) • Missing information is problematic in surveys • Rule of thumb: keep missing/don't know/other cases below 10 percent – larger percentages may lead to biased results • No tolerance for missing information on some key variables, such as age and date of last birth
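The 10-percent rule of thumb above lends itself to a simple automated check. The sketch below is illustrative only; the variable names, missing-value codes, and data are assumptions, not MICS conventions.

```python
# Sketch: flag variables whose missing/DK share exceeds the 10% rule of thumb.
# MISSING codes, variable names, and values are hypothetical.
MISSING = {None, 98, 99}   # assumed "don't know"/"missing" codes

variables = {
    "age":       [25, 31, 40, 29, 35],   # key variable: no tolerance for missing
    "education": [1, 2, 99, None, 3],    # 2 of 5 values missing
    "salt_test": [1, 1, 2, 1, 1],
}

shares = {name: sum(v in MISSING for v in values) / len(values)
          for name, values in variables.items()}

for name, share in shares.items():
    flag = "PROBLEM" if share > 0.10 else "ok"
    print(f"{name}: {share:.0%} missing -> {flag}")
```

For key variables such as age or date of last birth, the threshold would effectively be zero rather than 10 percent.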
Good results for salt testing. On other key variables, C2 and C5 look very problematic. Poor performance on dates has an impact on (almost) all other indicators in the survey – from eligibility to the calculation of indicators.
Women’s Questionnaire
Under-5 Questionnaire
Completeness of anthropometric data (DQ7) • Assessing the data quality of anthropometric indicators is relatively easy • Many tools have been developed for this purpose • Expected patterns, recommended procedures, completeness • Completeness of anthropometric data is influenced by • Birth date reporting • Children not weighed or measured • Poor-quality measurements
Large percentages of children excluded from analysis in two surveys (C2, C3). Both incomplete dates of birth and poor-quality measurements can be responsible. Low percentages of children not weighed.
Heaping in anthropometric data (DQ8) • “Digit preference” – failure to record decimal points, rounding, or, even worse, truncation • Known to have a significant impact on results • Systematic truncation of measurement results can lead to biases of up to 5-10 percent in anthropometric indicators • May be due to insufficient training or use of non-recommended equipment
Excess ratio: percentage of measurements with a terminal digit of 0 or 5, divided by 20 (the percentage expected in the absence of digit preference) • Should be close to 1.0 • Typically lower for weight than for height
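The excess ratio above can be computed directly from the terminal digits of the measurements. The weights below are invented to show severe digit preference; this is a sketch, not the MICS tabulation program.

```python
# Sketch of the excess ratio: share of measurements whose final recorded
# digit is 0 or 5, divided by the 20% expected when all ten digits are
# equally likely.  Weights (kg, one decimal place) are hypothetical and
# deliberately heaped to illustrate a bad result.
weights = [10.0, 9.5, 11.2, 10.5, 8.0, 12.5, 10.0, 9.8, 11.0, 10.5]

last_digits = [round(w * 10) % 10 for w in weights]
pct_0_or_5 = 100 * sum(d in (0, 5) for d in last_digits) / len(last_digits)
excess_ratio = pct_0_or_5 / 20

print(f"percent ending in 0 or 5: {pct_0_or_5:.0f}")
print(f"excess ratio: {excess_ratio:.2f}")
```

A value near 1.0 indicates no digit preference; values well above 1.0, as in this fabricated example, indicate heaping on 0 and 5.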
Observation of documents (DQ10-DQ12) • The objective is to see as many of the specified documents as possible and to copy information from them onto the questionnaires • Quality is better when the majority of information comes from documents
No significant pattern, but overall, low percentages of documents seen
Respondent for under-5 questionnaire (DQ13) • Random selection for child discipline module (DQ14) • Sex ratios among children ever born and living (DQ16)
Mothers are found and interviewed. Correct selection of children for the child discipline module. Sex ratios among children ever born are expected to be around 1.05, within the range 1.02 to 1.07 – see C2, C8. Expected pattern: higher sex ratios among deceased children – C8?
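The sex-ratio check described above is a one-line comparison against the expected range. The counts of children ever born are hypothetical; only the expected value (about 1.05, range 1.02 to 1.07) comes from the slide.

```python
# Sketch (DQ16-style check): sex ratio among children ever born.
# The expected range 1.02-1.07 is from the slide; counts are invented.
boys_ever_born, girls_ever_born = 1260, 1200

ratio = boys_ever_born / girls_ever_born
in_range = 1.02 <= ratio <= 1.07

print(f"sex ratio among children ever born: {ratio:.3f}")
print(f"within expected range: {in_range}")
```

A ratio outside this range, or a ratio among deceased children that is not higher than among surviving children, would flag possible omission or misreporting of girls or boys.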
Percentage of children incorrectly selected for the child discipline module
Fieldwork completion
Number of years since date of last birth