18a. Complex Samples Procedures in SPSS®
Prerequisites • Recommended modules to complete before viewing this module • 1. Introduction to the NLTS2 Training Modules • 2. NLTS2 Study Overview • 3. NLTS2 Study Design and Sampling • NLTS2 Data Sources, either • 4. Parent and Youth Surveys or • 5. School Surveys, Student Assessments, and Transcripts • 9. Weighting and Weighted Standard Errors
Prerequisites • Recommended modules to complete before viewing this module (cont’d) • NLTS2 Documentation • 10. Overview • 11. Data Dictionaries • 12. Quick References • Accessing Data • 14a. Files in SPSS • 15a. Frequencies in SPSS • 16a. Means in SPSS • 17a. Manipulating Variables in SPSS
Overview • Complex samples • Analysis and plan files • Frequencies • Crosstabs • Means • Comparative means • Example • Closing • Important information
NLTS2 restricted-use data NLTS2 data are restricted. Data used in these presentations are from a randomly selected subset of the restricted-use NLTS2 data. Results in these presentations cannot be replicated with the NLTS2 data licensed by NCES.
Complex samples SPSS Complex Samples is a module that accounts for complex (stratified/clustered) sampling designs, correctly calculating standard errors with weighted data. Survey designs that call for complex sampling require different methods to calculate standard errors. Procedures we have used to this point assume a simple random sample.
Complex samples • Weighted standard errors produced in complex samples procedures are very different from those in basic SPSS procedures. • The Complex Samples module includes procedures for • Frequencies • Means • Crosstabs • GLM (general linear model) • Regressions. These results cannot be replicated with the full dataset; all output in these modules was generated with a random subset of the full data.
Complex samples • Variation among methods for calculating standard errors • Different programs that produce weighted standard errors for complex samples will typically generate slightly different estimates. • Estimated standard errors are close in SAS Survey procedures, SUDAAN, and SPSS Complex Samples procedures but are not exactly the same. • There is no uniform direction for these differences; sometimes the standard errors in SPSS Complex Samples are slightly higher, and sometimes they are slightly lower.
Complex samples • Variation among methods for calculating standard errors (cont’d) • Standard errors in our reports and published tables were calculated with formulas for estimation and may be slightly different from those produced by SPSS Complex Samples procedures. • Ours tend to be slightly larger than those from SPSS. • Standard errors produced by the general procedures in SPSS (frequencies, crosstabs, or descriptives) differ greatly from those generated by Complex Samples. Don’t use unweighted standard errors!
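For intuition about why these standard errors differ from simple-random-sample ones, here is a commonly used textbook form of the with-replacement variance estimator for a weighted total under a stratified, clustered design. It is a sketch for orientation only: the module does not state which formula SPSS Complex Samples, SAS, or SUDAAN applies, and the symbols (h, i, j, w, y, z, n_h) are notation introduced here, not taken from the NLTS2 materials.

```latex
% Assumed notation: h = stratum, i = cluster within stratum, j = respondent within cluster,
% w_{hij} = analysis weight, y_{hij} = analysis variable, n_h = number of sampled clusters in stratum h.
\hat{Y} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} z_{hi},
\qquad z_{hi} = \sum_{j} w_{hij}\, y_{hij},
\qquad \bar{z}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} z_{hi}

\widehat{\operatorname{Var}}(\hat{Y}) = \sum_{h=1}^{H} \frac{n_h}{n_h - 1} \sum_{i=1}^{n_h} \left( z_{hi} - \bar{z}_h \right)^2
```

In this form the estimated variance comes from differences among weighted cluster totals within each stratum, which is why a stratum and a cluster identifier are needed in addition to the weight.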
Analysis and plan files • How to prepare the data to use complex samples • The first step is to create an analysis data set. • Combine data or select an existing file from a given source/wave. • Once the analysis file has been created or selected, two variables need to be added to that file. • Add “Stratum” and “Cluster” found in n2sample.sav. • When the analysis and sample data are joined, the next step is to create a plan file. • A plan file is an external file that contains the sample design parameters and the appropriate weight.
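Adding “Stratum” and “Cluster” to the analysis file could be done in syntax roughly as follows. This is a minimal sketch under assumptions: the key variable name ID, the folder C:\Projects\Data, and the intermediate file n2sample_sorted.sav are hypothetical and should be replaced with the identifier and locations used in your own files.

```spss
* Sketch only: ID, the folder, and n2sample_sorted.sav are assumed names.
* Keep the key and the two design variables from the sample file, sorted by key.
GET FILE='C:\Projects\Data\n2sample.sav'
  /KEEP=ID Stratum Cluster.
SORT CASES BY ID.
SAVE OUTFILE='C:\Projects\Data\n2sample_sorted.sav'.

* Open the analysis file, sort by the same key, and add Stratum and Cluster.
GET FILE='C:\Projects\Data\PrScoresEmp.sav'.
SORT CASES BY ID.
MATCH FILES
  /FILE=*
  /TABLE='C:\Projects\Data\n2sample_sorted.sav'
  /BY ID.
EXECUTE.
```

Using /TABLE rather than a second /FILE treats the sample file as a lookup table, so only cases already in the analysis file are kept.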
Analysis and plan files • The plan file is set up through a menu-driven wizard. • Analyze: Complex Samples: Prepare for Analysis • Select “Create a Plan File” and “Browse” to assign a name and location of the plan file in the pop-up window. • Click “Next” to go to the “Stage 1 Design Variables” window. • Select “Stratum” and click the right-facing arrow to move the variable to the “Strata” box. • Select “Cluster” and click the right-facing arrow to move the variable to the “Clusters” box. • Select the appropriate weight and click the right-facing arrow to move the variable to the “Sample Weight” box. • Click “Next” to go to the “Stage 1 Estimation Method” window. • Select “WR” for with replacement. • Click “Finish.”
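The wizard steps above correspond to a CSPLAN ANALYSIS command, which can also be run from the syntax editor. The sketch below assumes the plan file path used in this module's other syntax examples and uses wt_na, the weight named in the example exercise, as a stand-in for “the appropriate weight”; substitute the weight that matches your data source and wave.

```spss
* Sketch of the plan file definition; the weight variable is an assumption.
CSPLAN ANALYSIS
  /PLAN FILE='C:\Projects\Data\MyPlan.csaplan'
  /PLANVARS ANALYSISWEIGHT=wt_na
  /DESIGN STRATA=Stratum CLUSTER=Cluster
  /ESTIMATOR TYPE=WR.
```

ESTIMATOR TYPE=WR is the syntax counterpart of choosing “WR” (with replacement) in the “Stage 1 Estimation Method” window.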
Frequencies • How to run frequencies in Complex Samples • Once the plan file has been created and selected, running a frequency or any other procedure is not much different from the base SPSS procedures. • Syntax for frequencies
    * Complex Samples Frequencies.
    CSTABULATE
      /PLAN FILE = 'C:\Projects\Data\MyPlan.csaplan'
      /TABLES VARIABLES = w2_Age4
      /CELLS POPSIZE TABLEPCT
      /STATISTICS SE
      /MISSING SCOPE = TABLE CLASSMISSING = EXCLUDE.
Frequencies • From menu, select • Analyze: Complex Samples: Frequencies • Select sample plan file. • It may not be necessary to select the file; SPSS will often remember the most recent file used. • If no file is automatically selected, from “Browse” select the sample plan file created for analysis. • Select “Open” and “Continue.” • Select “Statistics” and “Table Percent” in pop-up window. • Select variable(s) and click the right-facing arrow to move to the “Frequency Tables” box. • Click “OK” or “Paste” to run from syntax editor.
Crosstabs • How to run crosstabs • Syntax for crosstabs
    * Complex Samples Crosstabs.
    CSTABULATE
      /PLAN FILE = 'C:\Projects\Data\MyPlan.csaplan'
      /TABLES VARIABLES = w2_Age4 BY w2_incm3
      /CELLS POPSIZE COLPCT
      /STATISTICS SE
      /MISSING SCOPE = TABLE CLASSMISSING = EXCLUDE.
Crosstabs • From menu, select • Analyze: Complex Samples: Crosstabs • Select sample plan file. • Select “Open” and “Continue.” • Select “Statistics” and “Column Percent” in pop-up window. • Select the comparative (by-) variable for “Column” and the analysis variables for “Row” by selecting variables and clicking the appropriate right-facing arrow. • Click “OK” or “Paste” to run from syntax editor.
Means • How to run means • Syntax for means
    * Complex Samples Descriptives.
    CSDESCRIPTIVES
      /PLAN FILE = 'C:\Projects\Data\MyPlan.csaplan'
      /SUMMARY VARIABLES = ndaCalc_pr
      /MEAN
      /STATISTICS SE
      /MISSING SCOPE = ANALYSIS CLASSMISSING = EXCLUDE.
Means • From menu, select • Analyze: Complex Samples: Descriptives • Select sample plan file. • Select “Open” and “Continue.” • Select the variable for “Measures.” • Click “OK” or “Paste” to run from syntax editor.
Comparative means • How to run comparative means • Syntax for comparative means
    * Complex Samples Descriptives.
    CSDESCRIPTIVES
      /PLAN FILE = 'C:\Projects\Data\MyPlan.csaplan'
      /SUMMARY VARIABLES = ndaCalc_pr
      /SUBPOP TABLE = w2_incm3 DISPLAY = LAYERED
      /MEAN
      /STATISTICS SE
      /MISSING SCOPE = ANALYSIS CLASSMISSING = EXCLUDE.
Comparative means • From menu, select • Analyze: Complex Samples: Descriptives • Select sample plan file. • Select “Open” and “Continue.” • Select the analysis variable and click the right-facing arrow to move it to “Measures,” then move the comparative variable to “Subpopulations.” • Click “OK” or “Paste” to run from syntax editor.
Example • Open the file created in Module 14a, Accessing Data Files in SPSS, PrScoresEmp.Sav. • Merge sample data from n2sample.sav file. • Create a plan file called PrScoresPlan. • Weight variable will be wt_na. • Using complex samples, run • Frequency of ndaF1_Friend • Crosstab of ndaF1_Friend by na_Age4 and w2_Dis12 • Are differences significant? • If so, how do perceptions vary based on age? On disability category? • Means of ndaPC_pr • Comparative means of ndaPC_pr by na_Age4 and w2_Dis12.
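One way to work through this exercise in syntax is sketched below. It assumes Stratum and Cluster have already been merged into PrScoresEmp.sav, and the folder C:\Projects\Data is a hypothetical location. The /TEST INDEPENDENCE subcommand is an addition not shown in the module's own crosstab syntax; it is one way to address the question of whether differences are significant.

```spss
* Plan file for the exercise (folder is an assumed location).
CSPLAN ANALYSIS
  /PLAN FILE='C:\Projects\Data\PrScoresPlan.csaplan'
  /PLANVARS ANALYSISWEIGHT=wt_na
  /DESIGN STRATA=Stratum CLUSTER=Cluster
  /ESTIMATOR TYPE=WR.

* Frequency of ndaF1_Friend.
CSTABULATE
  /PLAN FILE='C:\Projects\Data\PrScoresPlan.csaplan'
  /TABLES VARIABLES=ndaF1_Friend
  /CELLS POPSIZE TABLEPCT
  /STATISTICS SE
  /MISSING SCOPE=TABLE CLASSMISSING=EXCLUDE.

* Crosstab by age, with a test of independence.
* Repeat with w2_Dis12 in place of na_Age4 for disability category.
CSTABULATE
  /PLAN FILE='C:\Projects\Data\PrScoresPlan.csaplan'
  /TABLES VARIABLES=ndaF1_Friend BY na_Age4
  /CELLS POPSIZE COLPCT
  /STATISTICS SE
  /TEST INDEPENDENCE
  /MISSING SCOPE=TABLE CLASSMISSING=EXCLUDE.

* Overall mean of ndaPC_pr.
CSDESCRIPTIVES
  /PLAN FILE='C:\Projects\Data\PrScoresPlan.csaplan'
  /SUMMARY VARIABLES=ndaPC_pr
  /MEAN
  /STATISTICS SE
  /MISSING SCOPE=ANALYSIS CLASSMISSING=EXCLUDE.

* Comparative means by age.
* Repeat with w2_Dis12 in place of na_Age4 for disability category.
CSDESCRIPTIVES
  /PLAN FILE='C:\Projects\Data\PrScoresPlan.csaplan'
  /SUMMARY VARIABLES=ndaPC_pr
  /SUBPOP TABLE=na_Age4 DISPLAY=LAYERED
  /MEAN
  /STATISTICS SE
  /MISSING SCOPE=ANALYSIS CLASSMISSING=EXCLUDE.
```

Because the data in these modules are a random subset, output from this syntax will not match results from the full licensed NLTS2 data.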
Closing • Topics discussed in this module • Complex samples • Analysis and plan files • Frequencies • Crosstabs • Means • Comparative means • Example • Next module: • 19. Multivariate Analysis Using NLTS2 Data
Important information • NLTS2 website contains reports, data tables, and other project-related information http://nlts2.org/ • Information about obtaining the NLTS2 database and documentation can be found on the NCES website http://nces.ed.gov/statprog/rudman/ • General information about restricted data licenses can be found on the NCES website http://nces.ed.gov/statprog/instruct.asp • E-mail address: nlts2@sri.com