Learn how to preplan, code, and clean data effectively with key tips and tools. Ensure accurate analysis by avoiding common coding mistakes and implementing thorough data cleaning steps.
PA430 - Data coding March 7/8, 2000
Codebook • Purpose • Guide the data coding process • Help locate variables and interpret findings during data analysis • Given the codebook and database, any researcher should be able to understand your data collection and analyze the data
Preplanning • As for all stages of research, pre-planning of data coding saves you many headaches later. • Issues • Type of analysis • Availability of computer applications • Other resources: time, expertise, computer • Record storage
Preplanning • It is advisable to lay out a draft of the codebook, data coding sheet, etc. before collecting data • Surveys • edge coding • consistency • Choice of computer application • spreadsheet vs SPSS
Codebooks • Key elements • introductory information, including researcher’s name and institution, purpose of data collection, name of database, etc. • variable names (3-8 characters) • sometimes a numerical code is used instead • full definition/description of the variable or wording of the question • variable attributes (response set) with codes
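The key elements above can be sketched as a small data structure. This is only an illustration: the survey title, the variable names `gender` and `satisf`, the question wording, the response codes, and the helper functions `decode` and `valid_names` are all hypothetical and are not taken from the course materials.

```python
# A minimal codebook sketch. Every value here is an illustrative assumption.
codebook = {
    # Introductory information (placeholder values)
    "title": "Example client survey",
    "database": "example_survey",
    "variables": {
        # Short variable name -> full description and response set with codes
        "gender": {
            "description": "Respondent's gender",
            "codes": {1: "Female", 2: "Male", 9: "Missing"},
        },
        "satisf": {
            "description": "Overall, how satisfied are you with the service?",
            "codes": {1: "Dissatisfied", 2: "Neutral", 3: "Satisfied", 9: "Missing"},
        },
    },
}


def decode(variable, code):
    """Return the label for a code, or None if the codebook does not define it."""
    return codebook["variables"][variable]["codes"].get(code)


def valid_names(book, min_len=3, max_len=8):
    """Check that every variable name has between min_len and max_len characters."""
    return all(min_len <= len(name) <= max_len for name in book["variables"])


print(decode("satisf", 3))
print(valid_names(codebook))
```

Keeping the response set next to each variable means the same structure can later be used to check coded data against the allowed codes.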
Data Cleaning • All coders will make mistakes • Cleaning is sometimes necessary even for data obtained from other sources - always check • A plan for data cleaning should involve multiple steps/re-checks
Helpful hints • Save the file every 10 minutes or so as you code • Keep more than one copy of the data (e.g., on a floppy and on a hard drive) • Save a copy of the original data. If you recode, save the file under a new name. • Save original surveys, coding sheets, etc. for a reasonable period of time
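The hint about saving recoded data under a new name can be sketched in code. This assumes the data are kept in a CSV file, which the slides do not specify; the function name `save_recoded` and the `_recoded` suffix are hypothetical.

```python
import csv
from pathlib import Path


def save_recoded(original_path, rows, fieldnames, suffix="_recoded"):
    """Write recoded rows to a new file next to the original, never over it."""
    original = Path(original_path)
    # e.g. survey.csv -> survey_recoded.csv (the naming scheme is an assumption)
    new_path = original.with_name(original.stem + suffix + original.suffix)
    if new_path.exists():
        # Refuse to overwrite an earlier recoded file silently
        raise FileExistsError(f"{new_path} already exists; choose another suffix")
    with new_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return new_path
```

Because the function only ever creates a new file, the original data file stays untouched and can be used to re-check any recoding decision.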
Data cleaning • Step 1 - “eyeball” the data • Step 2 - run a frequency table for each variable • look for “impossible” response codes • e.g., possible responses are 1, 2, 3, and you find a 4 • look for unexpected patterns • e.g., all cases coded the same for a variable • Step 3 - run descriptive statistics for each variable • again, looking for inconsistencies
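Steps 2 and 3 can be sketched with the Python standard library. The data set, the variable names `q1` and `q2`, the allowed codes, and the helper function names are all hypothetical; in practice a spreadsheet or SPSS would produce the same tables.

```python
from collections import Counter
from statistics import mean

# Hypothetical coded data: one dict per case. Allowed codes come from the codebook.
valid_codes = {"q1": {1, 2, 3}, "q2": {1, 2, 3}}
cases = [
    {"id": 1, "q1": 1, "q2": 2},
    {"id": 2, "q1": 3, "q2": 2},
    {"id": 3, "q1": 4, "q2": 2},  # 4 is an "impossible" code for q1
    {"id": 4, "q1": 2, "q2": 2},
]


def frequency_table(cases, variable):
    """Step 2: count how often each code occurs for one variable."""
    return Counter(case[variable] for case in cases)


def impossible_codes(cases, variable, allowed):
    """Return (case id, code) pairs whose code is outside the allowed set."""
    return [(case["id"], case[variable]) for case in cases
            if case[variable] not in allowed]


def is_constant(cases, variable):
    """Flag the unexpected pattern where every case has the same code."""
    return len(frequency_table(cases, variable)) == 1


def describe(cases, variable):
    """Step 3: simple descriptive statistics for one variable."""
    values = [case[variable] for case in cases]
    return {"min": min(values), "max": max(values), "mean": mean(values)}


for variable, allowed in valid_codes.items():
    print(variable, dict(frequency_table(cases, variable)))
    print("  impossible:", impossible_codes(cases, variable, allowed))
    print("  all coded the same:", is_constant(cases, variable))
    print("  descriptives:", describe(cases, variable))
```

A flagged case is only a prompt to go back to the original survey or coding sheet; the sketch reports problems and does not correct them.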