190 likes | 200 Views
This workshop will cover the basics of using Excel for data analysis, including form preparation, variable naming, coding data, and analyzing data. Learn how to effectively enter and clean data in Excel spreadsheets. Presented by Jennifer Lowman, Coordinator of Student Persistence Research at the University of Nevada, Reno, on February 7, 2014.
E N D
Forms to Spreadsheets A-Team Spring Brown Bags February 7, 2014 Jennifer Lowman Coordinator, Student Persistence Research University of Nevada, Reno
Outline • Excel Basics • Form Preparation • Naming Variables • Coding Data • Entering Data • Cleaning Data • Analyzing Data
Excel Basics • Workbooks & Worksheets • 3 Parts to a Spreadsheet • Rows (numbered) • Columns (alphabetized) • Cells (combo, H7)
Excel Basics • Row Cases • Columns Variables • Cells Data
Form Preparation • Case IDs • Variable Names • Data Codes Case IDs & Variable Names are Unique Identifiers It is the combination of the two that make your data meaningful.
Case IDs • Best Practices for Case IDs • Unique, Meaningful, Confidential, & Stable • Put the Case ID on every form • Major considerations • How many times are you collecting data from each person? • Do you need institutional data? • Do you need to protect confidentiality? • Employees v. Participants (mandatory v. voluntary) • Are any of the data sensitive?
Rules of Thumb • Once, with no need for institutional data • Sequential numbers, with random start (randomize forms before numbering) • More than once, no need for institutional data? • Use something meaningful to respondent • Sample size may challenge uniqueness • Need institutional data? • Use something meaningful to you
Trade-Offs • Meaningful for participant • Easy to remember (stable) • Might not be confidential • May or may not link to institutional data • Meaningful to you • Not easy to remember (not stable) • Promotes confidentiality • May need a key, risks to confidentiality • Promotes link to institutional data
Put the Case ID on Everything • Every Form, Every page • Double Check • Back Track • Multiple Coders
Naming Variables • Best Practices for Variable Names • Unique & Meaningful Abbreviations • Short Standard 8 characters • Excel can handle more, but your column size will increase • Start with letter, not # • Mnemonic strategy vs. Question Number • workhrs vs. Question1 (q001) • Mnemonic, one-time projects, with one person handling data • Question Numbers, repeated or large projects, multiple people handling the data
I use both • Meaningful Abbreviations • Less meaningful… question1 or q001 • More meaningful… q1reshall • Avoid generic, be specific • What can you expect to find for q1reshall? • Names of residence halls (Nye, Lincoln, White Pine…) • Codes for residence halls (1 = Nye, 2 = Lincoln…) • Lives in a residence hall (0 = no, 1 = yes)
Meaningfulness is tied to your coding!! • 1’s and 0’s 0 = no, does not have characteristic 1 = yes, has the characteristic • sex vs. female • what does a “0” mean? • what does a “1” mean? • race vs. white?
Coding Data • Categorical Data • Two Categories, use 0’s and 1’s, variable name should be your reference group • Three or more categories • Nominal - No meaningful numerical difference between categ. • “dummy code,” instead of one variable “race,” make five variables 0 = not Asian, 1 = Asian 0 = not Black, 1 = Black 0 = not Hispanic, 1 = Hispanic 0 = not Native American, 1 = Native American 0 = not White, 1 = White
Coding Data (cont.) • Categorical Data • Three or more categories with a meaningful, numerical difference between categories • Academic Level • 1 = Freshman • 2 = Sophomore • 3 = Junior • 4 = Senior • 5 = Second Degree • 6 = Masters Student • 7 = PhD or Professional Medical • 10 = 0 = 13 (years) • 20 = 16 = 14 • 30 = 33 = 15 • 40 = 50 = 16.4 • 50 = 66 = 17 • 60 = 83 = 18 • 70 = 100 = 22
Coding Data • Many types of coding you do when you create your survey • How committed are you to Nevada? • 1 = not committed at all … 7 = Extremely committed • Even if it is not perfect, enter that information in your spreadsheet • Then RECODE it into a NEW VARIABLE • Never throw information out • Always have a system to check your codes • Enter “Race” Then Recode (Dummy Code)
Entering Data • Enter it exactly • Recode anything that can be “quantified” into new variables • Missing Data • Leave it Blank • If you must, use an extreme number, something way out of range (-999)
Qualitative Data • Content Analysis (Implicit Quantification) • Identify themes, categories, patterns • Start Broad • Get multiple perspectives • Narrow it down to a manageable number of themes • Count
Enough Talk Let’s Do It