Assessing data critically Module B1 Session 17
Objectives
At the end of this session the students will be able to:
• Apply basic techniques for error detection
• Ask relevant questions that allow for the explanation or correction of discrepancies
Detecting errors in primary data
Checks to detect errors in primary data should be made at various stages:
• Immediately after data collection (and during data entry)
• After data computerisation
• During exploratory data analysis
Checking for errors after data collection
• Have all questions been answered? If not, are the reasons for non-response clear?
• Are recorded values within their expected range?
• Do all questions or items have meaningful entries? Are they internally consistent?
• Are any zero entries genuinely zeros?
• Are IDs unique?
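Several of the questions above can be turned into automatic checks. The following is a minimal sketch, not a prescribed procedure: the field names (`id`, `age`, `household_size`), the allowed ranges and the sample records are all hypothetical, chosen only to illustrate checks for unanswered items, out-of-range values and repeated IDs.

```python
def check_records(records, ranges, id_field="id"):
    """Return a list of (record id, field, problem) tuples."""
    problems = []
    seen_ids = set()
    for rec in records:
        rec_id = rec.get(id_field)
        # Are IDs unique?
        if rec_id in seen_ids:
            problems.append((rec_id, id_field, "duplicate ID"))
        seen_ids.add(rec_id)
        for field, (low, high) in ranges.items():
            value = rec.get(field)
            if value is None:
                # Has the question been answered?
                problems.append((rec_id, field, "missing"))
            elif not low <= value <= high:
                # Is the recorded value within its expected range?
                problems.append((rec_id, field, "out of range"))
    return problems


# Hypothetical records and ranges, for illustration only.
records = [
    {"id": 1, "age": 34, "household_size": 5},
    {"id": 2, "age": 340, "household_size": 4},    # probable keying error
    {"id": 2, "age": 27, "household_size": None},  # repeated ID, unanswered item
]
ranges = {"age": (0, 110), "household_size": (1, 30)}

problems = check_records(records, ranges)
for rec_id, field, problem in problems:
    print(rec_id, field, problem)
```

A flagged entry is a prompt to go back to the questionnaire or the enumerator, not proof of an error: a zero or an extreme value may be genuine.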
Checking for errors after data entry
Compute new (temporary) variables to check that:
• Rates recorded per 1000 of population are less than 1000
• Percentages expected to be less than 100% are indeed so
• There is internal consistency amongst variables and between tables, for example:
  • the date of interview should be earlier than the date when the supervisor checked the questionnaire
  • totals are consistent across different tables, and sub-totals add to overall totals
• Codes for missing values have been identified correctly according to the reason the value is missing, and have been set as missing in the database to be used for analysis
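The checks listed above could be sketched as temporary flags computed for each record. Everything named here is an assumption made for illustration: the field names, the sample row, and the missing-value codes `-8` and `-9` do not come from any particular survey. The sketch also treats an interview and a supervisor check on the same day as acceptable, and allows a rate of exactly 1000; both boundaries should be set to suit the study.

```python
from datetime import date

# Hypothetical missing-value codes; a real study defines its own.
MISSING_CODES = {-8: "not applicable", -9: "refused"}


def recode_missing(value):
    """Return (value, reason); value is None when a missing code was entered."""
    if value in MISSING_CODES:
        return None, MISSING_CODES[value]
    return value, None


def entry_checks(row):
    """Return a list of flags describing possible data-entry errors in one row."""
    flags = []
    if not 0 <= row["rate_per_1000"] <= 1000:
        flags.append("rate per 1000 outside 0-1000")
    if not 0 <= row["percent_literate"] <= 100:
        flags.append("percentage outside 0-100")
    if row["interview_date"] > row["supervisor_check_date"]:
        flags.append("interview dated after supervisor check")
    if sum(row["subtotals"]) != row["total"]:
        flags.append("sub-totals do not add to total")
    return flags


# Hypothetical row, for illustration only.
row = {
    "rate_per_1000": 1250,
    "percent_literate": 64.0,
    "interview_date": date(2000, 3, 14),
    "supervisor_check_date": date(2000, 3, 10),
    "subtotals": [120, 80, 45],
    "total": 245,
}

flags = entry_checks(row)
print(flags)
print(recode_missing(-9))
```

Keeping the reason alongside the recoded value means "refused" and "not applicable" are still distinguishable after both have been set to missing for analysis.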
Tips for error detection
• Look for counts or categories that do not make sense
• If you have a series of data in chronological order, look for jumps in the data. They may be errors
• Always check your totals
  • Make sure they add to the expected total (e.g. 100%)
• When looking at multiple tables in a single study, the sample size should be consistent in all tables
  • What is expected to tally should tally!
• Don’t just look at the numbers, look at the definitions that the numbers represent
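Two of these tips lend themselves to a quick automated screen. The sketch below is one possible approach under stated assumptions: a "jump" is taken to mean a change by more than a factor of two between consecutive values, and totals are allowed a rounding tolerance of 0.5. Both thresholds and the sample figures are hypothetical.

```python
def find_jumps(series, max_ratio=2.0):
    """Return the positions where a value differs from the previous one
    by more than a factor of max_ratio (positive values only)."""
    jumps = []
    for i in range(1, len(series)):
        prev, cur = series[i - 1], series[i]
        if prev > 0 and cur > 0:
            ratio = max(prev, cur) / min(prev, cur)
            if ratio > max_ratio:
                jumps.append(i)
    return jumps


def adds_to(parts, expected=100.0, tolerance=0.5):
    """Check that parts sum to the expected total, allowing for rounding."""
    return abs(sum(parts) - expected) <= tolerance


# Hypothetical yearly counts; the fourth value looks like an extra digit.
yearly_counts = [1020, 1075, 1110, 11400, 1180]
jump_positions = find_jumps(yearly_counts)
print(jump_positions)

print(adds_to([45.2, 30.1, 24.7]))  # percentages that tally
print(adds_to([45.2, 30.1, 34.7]))  # percentages that do not
```

An isolated spike is flagged twice, once on the way up and once on the way down, which helps distinguish it from a genuine change of level, where only one position is flagged.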
Checks during Exploratory Data Analysis
Simple one-way or two-way tables can help identify errors.
(a) Results are from a socio-economic survey in Uganda. Are these results reasonable?
Checks during Exploratory Data Analysis
(b) A second example from the British Crime Survey, 2000. Can the last figure be correct?
Checks during Exploratory Data Analysis
(c) Detection rate of property crimes in one police force. (Data are fictitious)
Checks during Exploratory Data Analysis
Consistency checks across related variables
The following examples show:
• Current number of cars at household versus whether respondent was worried about having car stolen.
• Current number of cars at household versus whether respondent was worried about having things stolen from car.
• Distance to reach any type of formal court versus distance from nearest Magistrate’s Court.
Use of cross-tabulations
Table 1. Cross-tabulation of current number of cars at household versus extent to which respondent is worried about having car stolen (Source: BCS, 2000)
Use of cross-tabulations
Table 2. Cross-tabulation of current number of cars at household versus extent to which respondent is worried about having things stolen from the car (Source: BCS, 2000)
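Consistency checks across related variables can be sketched in a few lines. The data below are invented and are not taken from the BCS tables. The sketch assumes that a household with no car should be recorded as "not applicable" on the worry question, and that the distance to any formal court cannot exceed the distance to the nearest Magistrate's Court, since a Magistrate's Court is itself a formal court.

```python
from collections import Counter

# Hypothetical (number of cars, worry about car theft) pairs.
responses = [
    (0, "not applicable"),
    (0, "very worried"),      # no car, yet worried about having it stolen
    (1, "fairly worried"),
    (2, "not very worried"),
    (1, "fairly worried"),
]

# Two-way table: each (cars, worry) cell mapped to its count.
crosstab = Counter(responses)


def suspect_cells(table):
    """Return cells where a household with no car gives a substantive answer."""
    return {
        cell: count
        for cell, count in table.items()
        if cell[0] == 0 and cell[1] != "not applicable"
    }


def court_distance_inconsistent(any_court_km, magistrates_km):
    """The nearest formal court cannot be further away than the
    nearest Magistrate's Court."""
    return any_court_km > magistrates_km


suspects = suspect_cells(crosstab)
print(suspects)
print(court_distance_inconsistent(12.0, 8.5))
```

A non-empty cell of this kind is a question to put to the data producers: the respondent may have sold the car recently, or the routing of the questionnaire may have failed.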
Detecting errors in secondary data
Procedures similar to the above can be undertaken, but in addition:
• Ask questions about the source of the data, e.g. to assess competence, adequacy of funding, motivation for the study, etc.
• Ask about the data collection procedure and associated documentation. In particular, seek answers to what, who, why, when, where, and how.
• It is important to follow the whole data chain.