Using SPSS for Windows Part II Jie Chen Ph.D. Email: jie.chen@umb.edu Phone: 617 287 5241
Table of Contents • Data management • Computing new variables • To sort data • Data selection and split files • Merging files • Statistical procedures • Linear regressions • Regression for aggregated data • Chi-square test for grouped data • Nonparametric tests • Testing Normality
Computing New Variables • Open data sample1.sav • To compute a new variable we can • Use a standard formula • Use a statistical function to compute
Using a Formula To compute the average income for the past three years for each person: • Click Compute in the Transform menu • Enter mean as the target variable • Enter (ptoi92 + ptoi93 + ptotinc)/3 as the numeric expression, so that mean = (ptoi92 + ptoi93 + ptotinc)/3 • Click OK to compute the mean
Using a Statistical Function • Click Compute in the Transform menu • Click the Reset button to clear the old formula • Enter average as the target variable • Locate Mean on the function list and move it to the Numeric Expression area (using the Up arrow) • Enter ptoi92, ptoi93 and ptotinc inside the parentheses • Click OK to compute the average
Log transformation • Click Compute in the Transform menu • Click the Reset button to clear the old formula • Enter lnincome as the target variable • Click Arithmetic in the Function group: list box • Locate Ln in the Functions and Special Variables: list and move it to the Numeric Expression area (using the Up arrow) • Enter ptotinc inside the parentheses • Click OK to compute the natural log of ptotinc
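The two Compute approaches above can be sketched outside SPSS. This is a minimal Python illustration, not SPSS syntax; the variable names ptoi92, ptoi93 and ptotinc come from the slides, while the two rows of income values are made up for the example.

```python
from statistics import mean

# Hypothetical cases; only the variable names come from the slides.
cases = [
    {"ptoi92": 30000, "ptoi93": 33000, "ptotinc": 36000},
    {"ptoi92": 12000, "ptoi93": 15000, "ptotinc": 18000},
]

for case in cases:
    incomes = [case["ptoi92"], case["ptoi93"], case["ptotinc"]]
    # Formula approach: add the three values and divide by 3.
    case["mean"] = sum(incomes) / 3
    # Statistical-function approach: let a mean function do the work.
    case["average"] = mean(incomes)

for case in cases:
    print(case["mean"], case["average"])
```

With complete data the two results agree. In SPSS they can differ when a value is missing: the formula gives a missing result, while the MEAN function averages the values that are present.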
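As a rough illustration of what the log transformation computes, the Python sketch below applies the natural log to a few made-up ptotinc values. The helper name ln_income and the choice to return None for non-positive incomes are assumptions of this sketch, standing in for a missing value.

```python
import math

# Hypothetical income values; the last one has no defined logarithm.
ptotinc_values = [25000.0, 48000.0, 0.0]


def ln_income(value):
    """Return the natural log of value, or None where the log is undefined."""
    return math.log(value) if value > 0 else None


lnincome = [ln_income(v) for v in ptotinc_values]
print(lnincome)
```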
Sorting Data Sorting data means reordering the cases using the values of one or more variables. • Sorting data on one variable • Sorting data on more than one variable
Sorting Data on One Variable • Click Data/Sort Cases in the Data Editor Window • Click age and move it to the “Sort by:” text box • Click the Ascending radio button • Click OK
Sorting Data on Two Variables • Click Data/Sort Cases • Click age and move it to the “Sort by:” text box • Click educ and move it to the “Sort by:” text box • Click the Ascending radio button • Click OK
Three Ways of Data Selection • If condition is satisfied: select cases that meet an If condition • Random sample of cases: randomly choose a specified percentage of cases • Based on time or case range: select cases from a specified range
If Condition Is Satisfied To choose data that meet an If condition: • Click Select Cases in the Data menu • Click the If condition is satisfied radio button • Click the If push button to open the Select Cases: If dialog box
The If condition If we are interested in the personal total income for females, we need to select only the observations whose sex is female. • Type sex = 1 in the Select Cases: If dialog box (1 = “female”) • Click Continue to confirm the rule
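The sorting steps above can be mimicked in a short Python sketch. The variable names age and educ come from the slides; the four cases are hypothetical. Sorting on a tuple key orders by age first and breaks ties with educ, which is the effect of listing two variables in the “Sort by:” box.

```python
# Hypothetical cases; only the variable names come from the slides.
cases = [
    {"age": 40, "educ": 12},
    {"age": 25, "educ": 16},
    {"age": 40, "educ": 10},
    {"age": 25, "educ": 12},
]

# One sort variable: ascending age.
by_age = sorted(cases, key=lambda c: c["age"])

# Two sort variables: ascending age, then ascending educ within each age.
by_age_then_educ = sorted(cases, key=lambda c: (c["age"], c["educ"]))

for case in by_age_then_educ:
    print(case["age"], case["educ"])
```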
Two Choices for Unselected Cases • If one clicks the Filtered radio button, the unselected cases remain in the Data Editor, but are not used in analyses. • If one clicks the Deleted radio button the unselected cases are deleted from the Data Editor Window.
Complex If conditions Suppose we want to select cases meeting two conditions: region = 1 and age >= 30 • Type in “region = 1 & age>=30” in the Select Cases: If window • Click Continue to confirm the rule
The Case Deletion Choice • Switch to the Data Editor Window • Click Select Cases in the Data menu • Click the Deleted radio button in the Unselected Cases Are: area • Click OK to delete the unselected cases from the Data Editor Window
The Data Editor Window Containing Only Selected Observations
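The difference between filtering and deleting unselected cases can be sketched in Python. The condition region = 1 & age >= 30 comes from the slides; the four cases, the id variable and the name filter_flag are assumptions made for this illustration.

```python
# Hypothetical cases; region and age follow the slides, id is illustrative.
cases = [
    {"id": 1, "sex": 1, "region": 1, "age": 34},
    {"id": 2, "sex": 2, "region": 1, "age": 45},
    {"id": 3, "sex": 1, "region": 2, "age": 52},
    {"id": 4, "sex": 1, "region": 1, "age": 22},
]


def is_selected(case):
    """The complex If condition: region = 1 & age >= 30."""
    return case["region"] == 1 and case["age"] >= 30


# "Filtered": every case stays in the data, but is flagged as in or out.
for case in cases:
    case["filter_flag"] = 1 if is_selected(case) else 0

# "Deleted": only the selected cases are kept.
kept = [case for case in cases if is_selected(case)]

print([c["filter_flag"] for c in cases])
print([c["id"] for c in kept])
```

Filtering is reversible because no case is lost, whereas deletion leaves only the selected cases in the working data.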
Split File • The data file is split into separate groups for analysis based on the values of a grouping variable • The same analysis is applied to separate subgroups simultaneously • The results for all the subgroups will be presented together
To Split a Data File • Open sample2.por • Click Split File in the Data menu • Click the Organize output by groups radio button • Move sex to the Groups Based on list box • Click the OK push button to split the file
Descriptive Statistics Based on Split File • Click Statistics/Summarize/Descriptives • Click age in the variable list box • Click OK
Turn Off the Split File Processing • Select Split File in the Data menu • Click Analyze all cases in the Split File dialog box • Click OK to set analyses to all cases (turn off split file)
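Split-file processing amounts to running the same analysis once per group. The Python sketch below groups hypothetical cases by sex and computes a few descriptive statistics of age for each group; the data values are made up, and the particular statistics shown are a choice of this sketch.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical cases; sex and age are the variable names used in the slides.
cases = [
    {"sex": 1, "age": 30},
    {"sex": 2, "age": 40},
    {"sex": 1, "age": 50},
    {"sex": 2, "age": 44},
]

# Split: collect the ages of each group separately.
ages_by_sex = defaultdict(list)
for case in cases:
    ages_by_sex[case["sex"]].append(case["age"])

# Apply the same descriptive analysis to every group.
summary = {
    sex: {
        "n": len(ages),
        "min": min(ages),
        "max": max(ages),
        "mean": mean(ages),
        "sd": stdev(ages),  # sample standard deviation (n - 1 denominator)
    }
    for sex, ages in sorted(ages_by_sex.items())
}

for sex, stats in summary.items():
    print(sex, stats)
```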
Merging Files Data can be combined in two ways • Merging different cases according to the same variables (adding observations) • Merging different variables according to the same cases (adding variables)
Merging Cases In the Data Editor Window • Open the data file row1.sav • Click Data/Merge Files/Add Cases; the Add Cases: Read File dialog box opens, as shown in the note page • Select the file row2.sav and click Open; the Add Cases from... dialog box then opens • Click OK; the observations from row2.sav are placed in the Data Editor Window after those from row1.sav
Merging Variables • Open the file col1.sav • Click Data/Merge Files/Add Variables. The Add Variables: Read File dialog box shown in the note page will be displayed. • Select the file col2.sav and click Open. The Add Variables from... dialog box will then appear. • Click OK.
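The two kinds of merge can be sketched in Python. The lists row1, row2, col1 and col2 stand in for the files of the same names; their contents, and the id variable used to line up cases, are assumptions of this sketch. Matching on a key is only one way to align cases; files can also be matched simply by case order.

```python
# Adding cases: both files hold the same variables for different cases.
row1 = [{"id": 1, "age": 30}, {"id": 2, "age": 41}]
row2 = [{"id": 3, "age": 27}]
stacked = row1 + row2  # cases from row2 follow those from row1

# Adding variables: both files hold different variables for the same cases.
col1 = [{"id": 1, "age": 30}, {"id": 2, "age": 41}]
col2 = [{"id": 1, "educ": 12}, {"id": 2, "educ": 16}]

# Index the second file by the hypothetical key variable, then combine.
col2_by_id = {record["id"]: record for record in col2}
merged = [{**left, **col2_by_id.get(left["id"], {})} for left in col1]

print(stacked)
print(merged)
```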
Introduction to Regression • Simple Regression • Multiple Regression • Regression Plots • Regression for aggregated data
Simple Regression • Click Analyze/Regression/Linear; the Linear Regression dialog box opens • Use ptotinc (personal total income) as the dependent variable • Use educ as the independent variable • Click OK
Examining the Residuals • Click the Dialog Recall tool • Click Linear Regression • Click Plots… in the Linear Regression dialog box • In the Linear Regression: Plots dialog box, choose ZRESID as the Y variable and ZPRED as the X variable • Click Histogram • Click Continue • Click OK
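For readers who want to see what a simple regression estimates, here is a minimal least-squares sketch in Python. It is not the SPSS procedure itself; the function name simple_ols and the four data points are hypothetical, and only the variable names ptotinc and educ come from the slides.

```python
from statistics import mean

# Hypothetical data: years of education and personal total income.
educ = [10, 12, 14, 16]
ptotinc = [20000, 26000, 30000, 38000]


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = simple_ols(educ, ptotinc)
print(intercept, slope)
```

The slope is the estimated change in ptotinc for one additional unit of educ, and the intercept is the predicted income when educ is zero.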
The Fitted Model Y = -13301 + 2672 X1 - 13106 X2 + 145 X3
Residual Plots • Click Plots in the Linear Regression dialog box • Put ZRESID as the Y variable and ZPRED as the X variable in a scatterplot • Choose Histogram and Normal probability plot in the Standardized Residual Plots area
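The fitted equation can be used to compute predicted values. The Python sketch below simply evaluates the equation from the slide; the function name predict and the input values are hypothetical, and the slide does not say which variables X1, X2 and X3 stand for.

```python
def predict(x1, x2, x3):
    """Evaluate the fitted model Y = -13301 + 2672 X1 - 13106 X2 + 145 X3."""
    return -13301 + 2672 * x1 - 13106 * x2 + 145 * x3


# Illustrative inputs only; the meaning of X1..X3 is not given in the slides.
baseline = predict(12, 1, 40)
one_more_x1 = predict(13, 1, 40)

print(baseline)
print(one_more_x1 - baseline)
```

Holding X2 and X3 fixed, raising X1 by one unit changes the prediction by the X1 coefficient, which is how each coefficient in a multiple regression is read.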
To aggregate data • Using Current Population Survey 2006 (CPS2006) data • Click on Data/Aggregate Data • Break Variable(s): • Summaries of Variable(s): • Mean, Median, and Sum • First, Last, Minimum, and Maximum values • To save aggregated variables
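Aggregation collapses the cases within each value of a break variable into a single summary row. The Python sketch below illustrates the summary functions listed above; the break variable state, the summarized variable income, the output variable names and the data values are all hypothetical and are not taken from the CPS2006 file.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical cases; "state" plays the role of the break variable.
cases = [
    {"state": "MA", "income": 30000},
    {"state": "NY", "income": 50000},
    {"state": "MA", "income": 40000},
    {"state": "MA", "income": 80000},
    {"state": "NY", "income": 20000},
]

# Collect the summarized variable within each break group, in file order.
incomes_by_state = defaultdict(list)
for case in cases:
    incomes_by_state[case["state"]].append(case["income"])

# One output row per break group, one new variable per summary function.
aggregated = [
    {
        "state": state,
        "income_mean": mean(values),
        "income_median": median(values),
        "income_sum": sum(values),
        "income_first": values[0],
        "income_last": values[-1],
        "income_min": min(values),
        "income_max": max(values),
        "n_cases": len(values),
    }
    for state, values in incomes_by_state.items()
]

for row in aggregated:
    print(row)
```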