Learn how to easily convert raw research data into a format that can be readily analyzed using SPSS software. This tutorial covers data entry, data cleaning, and data analysis.
Data management Methods & Software PEER Session 02/04/15
Data Management • Multiple good data management software options exist: quantitative (e.g., SPSS), qualitative (e.g., ATLAS.ti), mixed (e.g., Excel) • Goal is to convert research data from its ‘raw’ form to a form that can be readily analyzed
Hard Way -- Easy Way in Excel • You can fight Excel • By Formatting your spreadsheets in a traditional report style • You can work together with Excel • By Formatting your spreadsheets in a way Excel prefers
The Excel Rules • One value per cell • One type of data in each column • No blank rows • One row of column headings • Be consistent • Windhoek, WHK, WNDHK, Whoek, The Hoek… • Avoid separating similar data across tabs
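The "Be consistent" rule can be checked with a short script before the data reach SPSS. The following is a minimal sketch in Python, not part of the original session: the lookup table simply maps the spelling variants listed above to one label, and "Walvis Bay" is a hypothetical extra entry included to show how an unrecognized value is flagged.

```python
# Hypothetical lookup of spelling variants; extend it for your own data.
CANONICAL = {
    "windhoek": "Windhoek",
    "whk": "Windhoek",
    "wndhk": "Windhoek",
    "whoek": "Windhoek",
    "the hoek": "Windhoek",
}


def normalize_city(raw):
    """Return the canonical spelling, or None if the variant is unknown."""
    key = raw.strip().lower()
    return CANONICAL.get(key)


# "Walvis Bay" is an invented entry that is deliberately absent from the lookup.
entries = ["Windhoek", "WHK", " wndhk ", "Whoek", "The Hoek", "Walvis Bay"]
cleaned = [normalize_city(e) for e in entries]

# Values that could not be matched should be reviewed by hand, not guessed.
unknown = [e for e, c in zip(entries, cleaned) if c is None]

print(cleaned)
print(unknown)
```

Returning None for unknown spellings, instead of passing them through, keeps unexpected values visible so they can be corrected at the source.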
Features of SPSS • Originally developed for the people in social science areas; no heavy programming background required • Designed as user-friendly and has pull-down menus to execute statistical commands • Ability to do data management & manipulations • Ability to store programs & produce reports/graphs
SPSS Data Flow
Raw Data -> Direct Entry, or Importing from an Outside Data Source -> SPSS Data File -> Data Modification/Transformation (Data Steps) -> Data Analysis (Analysis Steps)
Each step can be carried out from the Pull-Down Menu OR from Syntax.
Data View Window - Data Entry Site (Columns = Variables, Rows = Cases)
Labeled screen elements: Title bar, Pull-down Menu bar, Help Menu, Tool bar, Information bar, Variable Names, Active cell, Data View window, Action bar
Variable View Window - Data Definition Site
• Name: 64 characters max, no spaces in between, begins with a letter, @, #, or $
• Type: Numeric, String, & others
• Width: length
• Decimals: # of decimals
• Label: variable description
• Values: value code description
• Missing: missing value description
Click the Variable View tab to see this view.
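The naming constraints for the Variable View can be checked in bulk before variables are created. The sketch below is an illustration in Python rather than an SPSS feature: the reserved-word list is the one given in this tutorial's data entry guidelines, and the set of characters allowed after the first character is an assumption based on common SPSS usage.

```python
import re

# Reserved words as listed in this tutorial's data entry guidelines.
RESERVED = {"TEST", "ALL", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}

# First character: a letter, @, # or $. Remaining characters: letters, digits,
# underscore, period, @, # or $ (assumed; check your SPSS version's rules).
NAME_PATTERN = re.compile(r"[A-Za-z@#$][A-Za-z0-9_.@#$]*")


def is_valid_name(name, max_len=64):
    """Check length, reserved words, and allowed characters (no spaces)."""
    if not name or len(name) > max_len:
        return False
    if name.upper() in RESERVED:
        return False
    return NAME_PATTERN.fullmatch(name) is not None


def find_duplicates(names):
    """Return names that repeat an earlier name, ignoring case."""
    seen, dupes = set(), []
    for n in names:
        key = n.upper()
        if key in seen:
            dupes.append(n)
        seen.add(key)
    return dupes


for candidate in ["Reading", "1stScore", "Math Score", "BY", "@temp"]:
    print(candidate, is_valid_name(candidate))

print(find_duplicates(["NEWVAR", "NewVar", "newvar", "ID"]))
```

Duplicates are compared in upper case because variable names are not case sensitive.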
Before we see examples: OK vs. Paste buttons
1. OK - the results/action will be executed and appear in the Output Window <Output File>
2. Paste - the command is written to the Syntax Window <Syntax File>; run the syntax to obtain the results in the Output Window
Example - School Data (Raw Data)
Subject #   Sex          Program         Reading   Math
1           Female (1)   Intensive (1)   90        67
2           Female (1)   Moderate (2)    72        46
3           Male (0)     Basic (3)       41        73
School Data: Variable View (Variable View activated)
Importing an Excel Data File into SPSS
1. Open the SPSS Data file
2. Go to the File menu
3. Click “Read Text Data”
4. Change Files of type to Excel & choose the Excel file
5. Hit Open
6. Check the worksheet #, confirm that variable names are on the 1st row, & hit OK
How to enter data in SPSS 1.1 Introduction of SPSS 1.2 Data Entry 1.3 Data Cleaning using SPSS
1.2 Data Entry into SPSS
There are 2 ways to enter data into SPSS:
1. Enter the data directly into SPSS by typing in Data View
2. Enter the data into other software such as Excel, then import into SPSS
Let’s start with the second option, using data in Excel.
General guidelines for data entry
1. Give each variable a valid name: a short, easy-to-remember word with no spaces or punctuation, beginning with a letter rather than a digit. Older SPSS versions allowed at most 8 characters; current versions allow up to 64. Avoid the following variable names: TEST, ALL, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, WITH. These are used in SPSS syntax, and if they were permitted the software could not distinguish between a command and a variable. Each variable name must be unique; duplication is not allowed. Variable names are not case sensitive: NEWVAR, NewVar, and newvar are all considered identical.
2. Encode categorical variables. Convert letters and words to numbers.
3. Avoid mixing symbols with data. Convert them to numbers.
4. Give each case a unique, sequential case number (ID). Place this ID number in the first column on the left.
5. Each variable should be in its own column. Do not combine variables in one column.
Avoid this:
Animal
Control1
Control2
Experiment1
Experiment2
Change to:
Animal   Group
1        0
2        0
3        1
4        1
It is recommended to use 0/1 for 2 groups, with 0 as the reference group.
6. All data for a project should be in one spreadsheet. Do not include graphs or summary statistics in the spreadsheet.
7. Each case should be entered on a single line or row. Do not copy a patient’s information to another row to perform subgroup analysis.
8. However, when data are collected repeatedly on a patient, it is recommended to put each patient-day observation on a single line to ease data management. SPSS has a feature to convert from the longitudinal format to the horizontal format. When there are only a few repeats (2 or 3), the horizontal format may be preferred for simplicity.
Longitudinal data entry:
Date       ID   SYSBP
1/2/2005   1    130
1/3/2005   1    120
1/4/2005   1    120
3/1/2005   2    110
3/2/2005   2    140
Horizontal data entry:
ID   SYSBP1   SYSBP2   SYSBP3
1    130      120      120
2    110      140
9. For yes/no questions, enter “0” for no and “1” for yes. Do not leave blanks for no. Do not enter “?”, “*”, or “NA” for missing data, because this indicates to the statistical program that the variable is a string variable. String variables cannot be used for any arithmetic computation.
10. Put ordinal variables into one column if they are mutually exclusive.
Preferred:
Pain
1
2
3
Avoid:
Pain
Mild   Moderate   Severe
1      0          0
0      1          0
0      0          1
11. Do not make columns wider than 8 characters unless absolutely essential.
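Guideline 8 describes converting repeated measurements from the longitudinal layout to the horizontal one. SPSS does this itself; the sketch below only illustrates the underlying logic in Python, using the systolic blood pressure values from the example table. The function name and the choice of None for a missing third reading are assumptions made for illustration.

```python
from collections import defaultdict

# (Date, ID, SYSBP) rows in the longitudinal layout from the example above.
long_rows = [
    ("1/2/2005", 1, 130),
    ("1/3/2005", 1, 120),
    ("1/4/2005", 1, 120),
    ("3/1/2005", 2, 110),
    ("3/2/2005", 2, 140),
]


def long_to_wide(rows, n_repeats):
    """Collect each patient's readings into one row: SYSBP1..SYSBPn.

    Rows are assumed to be already sorted by date within each patient.
    Patients with fewer readings than n_repeats get None (missing).
    """
    by_id = defaultdict(list)
    for _date, patient_id, sysbp in rows:
        by_id[patient_id].append(sysbp)

    wide = []
    for patient_id in sorted(by_id):
        values = by_id[patient_id]
        padded = values + [None] * (n_repeats - len(values))
        row = {"ID": patient_id}
        for i, value in enumerate(padded):
            row[f"SYSBP{i + 1}"] = value
        wide.append(row)
    return wide


wide = long_to_wide(long_rows, n_repeats=3)
for row in wide:
    print(row)
```

Patient 2 has only two readings, so SYSBP3 is left missing rather than filled with a placeholder value, in line with guideline 9.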
Importing data from an Excel spreadsheet into SPSS
In SPSS, go to: File > Open > Data
• Select the type of file (for example, Excel) you want to open
• Select the name of the file you want to open
Exporting data from SPSS to Excel
In SPSS, go to: File > Save As
• Select the type of file (for example, Excel) you want to save into
• Give the name of the file you want to save into
1.3 Data Cleaning in SPSS 1. Re-coding existing variables 2. Creating new variables 3. Creating new variable from existing variables 4. Data labeling and formatting
Data cleaning in SPSS (1): Recoding existing variables (1)
We want to use numeric coding for group instead of A and B.
ID   Old Group   New Group
1    A           0
2    A           0
3    B           1
4    B           1
Data cleaning in SPSS (1): Recoding existing variables (2)
From the SPSS menu, go to: Transform > Recode > Into Same Variables
Data cleaning in SPSS (1): Recoding existing variables (3)
1. Move Group from the variable box into the String Variables box
2. Click on Old and New Values to proceed
Data cleaning in SPSS (1): Recoding existing variables (4)
1. Type the old value and the new value you want to convert it into
2. Click on Add (to change or remove an entry, click on Change or Remove)
3. Once all value pairs are listed in the Old --> New box, click Continue
4. Click OK to execute the commands.
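The recode performed through the dialog boxes amounts to a lookup from old values to new values. As a rough illustration outside SPSS, the same mapping for the A/B example could be sketched in Python as follows; treating unrecognized codes as missing (None) is an assumption, not something the dialog does automatically.

```python
# Old value -> new value, as entered in the Old and New Values dialog.
RECODE_MAP = {"A": 0, "B": 1}

rows = [
    {"ID": 1, "Group": "A"},
    {"ID": 2, "Group": "A"},
    {"ID": 3, "Group": "B"},
    {"ID": 4, "Group": "B"},
]


def recode_group(value):
    """Map an old string code to its numeric code; unknown codes become None."""
    return RECODE_MAP.get(value.strip().upper())


# Build new rows so the original data stay untouched for comparison.
recoded = [{"ID": r["ID"], "Group": recode_group(r["Group"])} for r in rows]

for before, after in zip(rows, recoded):
    print(before["ID"], before["Group"], "->", after["Group"])
```

Keeping the original rows alongside the recoded ones makes it easy to cross-check the old and new codes, much like comparing the Old and New columns in the example table.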
Data Cleaning in SPSS (3) Computing patient’s age from birthday and date enrolled into the study.
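The arithmetic behind this step is worth spelling out: age is the difference in calendar years, reduced by one if the birthday has not yet occurred in the enrollment year. A minimal sketch in Python, assuming age is wanted in completed years; the function name and the example dates are hypothetical.

```python
from datetime import date


def age_at_enrollment(birth, enrolled):
    """Completed years between the birth date and the enrollment date."""
    if enrolled < birth:
        raise ValueError("enrollment date precedes birth date")
    years = enrolled.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the enrollment year.
    if (enrolled.month, enrolled.day) < (birth.month, birth.day):
        years -= 1
    return years


# Hypothetical patient: born 15 June 1960, enrolled 2 January 2005.
print(age_at_enrollment(date(1960, 6, 15), date(2005, 1, 2)))
```

Dividing the number of days between the two dates by 365.25 is a common shortcut, but it can be off by one near birthdays, which is why the sketch compares month and day directly.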
Data Cleaning in SPSS (4): Data labeling and formatting (2) Data Labeling
Key Concepts
• Run frequencies and descriptives to get the ‘lay’ of the data
• Ensure all values are in bounds and variables are valid
• Conduct descriptive analyses: univariate, bivariate, multivariate
• Conduct testing for differences (t-test, ANOVA, etc.)
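The first two key concepts (frequencies, and checking that values are in bounds) can be illustrated on the three-case School Data example. This is a sketch in Python, not SPSS output: the variable names are hypothetical, and the 0-100 range for the Reading and Math scores is an assumption that the slides do not state.

```python
from collections import Counter

# The three cases from the School Data example, numerically coded.
cases = [
    {"id": 1, "female": 1, "program": 1, "reading": 90, "math": 67},
    {"id": 2, "female": 1, "program": 2, "reading": 72, "math": 46},
    {"id": 3, "female": 0, "program": 3, "reading": 41, "math": 73},
]

# Assumed valid ranges; the 0-100 score range is a guess, not from the slides.
BOUNDS = {"female": (0, 1), "program": (1, 3), "reading": (0, 100), "math": (0, 100)}


def frequencies(rows, var):
    """Count how often each value of a variable occurs."""
    return Counter(r[var] for r in rows)


def out_of_bounds(rows, bounds):
    """List (case id, variable, value) for every value outside its range."""
    problems = []
    for r in rows:
        for var, (low, high) in bounds.items():
            if not low <= r[var] <= high:
                problems.append((r["id"], var, r[var]))
    return problems


def mean(rows, var):
    """Arithmetic mean of a numeric variable."""
    return sum(r[var] for r in rows) / len(rows)


print(frequencies(cases, "female"))
print(out_of_bounds(cases, BOUNDS))
print(mean(cases, "math"))
```

An empty list from the bounds check means every value fell inside its assumed range; any entry in that list should be traced back to the raw data before analysis continues.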