Understanding Statistics
Reasons for Analyzing Data • Describe data • Determine if two or more groups differ on some variable • Determine if two or more variables are related • Reduce data
Types of Data • Nominal • categories • race, hair color • Ordinal • rank order • baseball standings, waiting list placements • Interval • equality of intervals • performance ratings, temperature • Ratio • true zero • equality of ratios • salary, height
The Concept of Significance • Interocular Significance • Statistical Significance • Practical Significance
Significance Levels • Indicate the probability of obtaining results like these by chance alone • Standard is .05, but others can be used • Type I error: Concluding there is a difference when in fact there is none • Type II error: Concluding there is no difference when in fact there is one
Statistical Significance • When deviating from the .05 level, consider • the common sense of your finding • previous research • the quality of your data • the cost of being wrong • Probability level is influenced by • sample size • differences between groups • within group variability
Significance Levels in Journal Articles • The job satisfaction level of female employees (M = 4.21) was significantly higher than that of male employees (M = 3.50), t(60) = 2.39, p < .02.
                      Academy Score    Commendations
                      _____________    _____________
Cognitive ability         .43**            .03
Education                 .28**            .24*
__________________________________________________
* p < .05, ** p < .01, *** p < .001
Statistics That Describe Data • Sample Size • overall (N) • subgroups (n) • Frequencies • Central Tendency • mean (statistical average) • median (midpoint) • mode (most common) • Dispersion • range • variance & standard deviation
Measures of Central Tendency • Mean • Median • Mode
The Median • Median is the point at which 50% of your data fall above and 50% fall below • With an odd number of scores, the median is the middle score • With an even number of scores, the median is the average of the two middle scores • Example: 93 98 98 100 104 110 114 121 (eight scores, so the median is (100 + 104) / 2 = 102)
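The median rule above can be checked with a short sketch. It uses the eight scores from the slide; the seven-score list is an assumption added only to illustrate the odd-count case.

```python
from statistics import median

# The eight scores shown on the slide (an even number of scores).
scores = [93, 98, 98, 100, 104, 110, 114, 121]
even_median = median(scores)  # average of the two middle scores, 100 and 104

# Hypothetical odd-count case: drop the last score so seven remain.
odd_scores = scores[:-1]
odd_median = median(odd_scores)  # the single middle score

print(even_median, odd_median)
```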
The Mode The Most Frequently Occurring Score
Which Measure of Central Tendency Should I Use? • Mode • nominal data (categories) • Mean • interval data • ratio data • Median • ordinal (ranked) data • interval or ratio data if • outliers • skewed distribution
Measures of Dispersion • Range • Minimum • Maximum • Spread • Variance (s²) • Standard deviation (s) • Square root of the variance • About 68% of scores fall within 1 SD of the mean • About 95% of scores fall within 2 SD of the mean
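As a rough illustration of these dispersion measures, the sketch below computes the range, sample variance, and standard deviation. The scores are reused from the median slide as an assumption; any list of interval or ratio scores would do.

```python
from statistics import mean, stdev, variance

# Scores borrowed from the median example (assumed data for illustration).
scores = [93, 98, 98, 100, 104, 110, 114, 121]

spread = max(scores) - min(scores)  # range: maximum minus minimum
var = variance(scores)              # sample variance, divides by N - 1
sd = stdev(scores)                  # standard deviation, square root of the variance

# Share of scores that actually fall within 1 SD of the mean in this sample.
m = mean(scores)
within_1sd = sum(1 for s in scores if abs(s - m) <= sd) / len(scores)

print(spread, round(var, 2), round(sd, 2), within_1sd)
```

With only eight scores the share within 1 SD need not be close to 68%; that figure describes a normal distribution.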
Performance Ratings Mean
Salary Survey Example • Salary Survey Data • Mean for police officer is $25,000 • SD = $3,000 • Our Department Salary • $24,000
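One way to read the salary example is to express the department salary in standard deviation units. The sketch below does that arithmetic; the percentile step assumes the survey salaries are normally distributed, which the slide does not state.

```python
from statistics import NormalDist

survey_mean = 25_000  # mean police officer salary from the survey
survey_sd = 3_000     # standard deviation from the survey
our_salary = 24_000   # our department's salary

# Distance from the survey mean in SD units.
z = (our_salary - survey_mean) / survey_sd

# Assumption: salaries are roughly normal, so the standard normal CDF
# gives the share of departments paying less than ours.
share_below = NormalDist().cdf(z)

print(round(z, 2), round(share_below, 2))
```

Under that assumption the department sits about a third of a standard deviation below the survey mean.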
We know that a student’s GPA is one standard deviation above the mean
Caution About Inferences From Standard Deviations • Inferences can be made only when • Data are normally distributed • Sample size is large • If conditions are not met, using percentiles based on actual data is best
Number of tickets written at two police departments
Measures of Comparison and Explanation • Percent • Percentile • Q1 • Q2 • Q3 • Standard Score (Z) • mean of zero • standard deviation of 1 • T-Score
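A minimal sketch of these comparison measures follows. The scores are assumed data, the helper names `z_score` and `t_score` are hypothetical, and the T-score uses the usual convention of a mean of 50 and a standard deviation of 10.

```python
from statistics import mean, quantiles, stdev

# Assumed scores for illustration only.
scores = [93, 98, 98, 100, 104, 110, 114, 121]

# Quartiles: Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(scores, n=4)

def z_score(x, data):
    """Standard score: distance from the mean in SD units."""
    return (x - mean(data)) / stdev(data)

def t_score(z):
    """T-score: a z-score rescaled to mean 50 and SD 10."""
    return 50 + 10 * z

z = z_score(121, scores)
print(q1, q2, q3, round(z, 2), round(t_score(z), 1))
```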
Using Descriptive Statistics to Ensure Data Integrity • Reasons for Errors • Inaccurate source data • Copied incorrectly from source data • Input error • misread • keystroke error • conversion error • Input statement error • Methods to Check • Proofread raw data • “Sure thing” analysis that didn’t work • Use descriptive statistics to • check for values outside the possible range • check for values that don’t make sense
What Statistic to Use • Frequencies • Chi Square • Means • two groups: t-test • Analysis of Variance • more than two groups • more than one independent variable • Multivariate Analysis of Variance • more than one dependent variable • Analysis of Covariance • controlling for other variables
Differences in Frequencies: Chi-Square • Goodness of Fit • Does the observed frequency differ from the expected frequency? • Example:
               %     %
Secretary     92    80
Welder        20    25
Supervisor    40    50
• Tests of Independence • Does the distribution for one group differ from that of another? • Example:
           Hired    Not Hired
Male         32        16
Female       10        20
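The test of independence can be worked through by hand for the hiring table. The sketch below computes the Pearson chi-square without a continuity correction; the function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square for a table of observed counts (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

# Rows: male, female. Columns: hired, not hired (counts from the slide).
hiring = [[32, 16], [10, 20]]
chi2, df = chi_square_independence(hiring)
print(f"chi-square({df}) = {chi2:.2f}")
```

The result, about 8.25 with 1 degree of freedom, exceeds 3.84, the .05 critical value for 1 degree of freedom, so the hiring distribution differs by sex at that level.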
The t-test Tests Differences in Means Between Two Groups
Differences Between Two Means: The t-test • Assumptions • Normal distribution • Equal variances in each group • Size and Significance • Differences in means • Amount of within group variance • Sample size • Journal Listing • t(45) = 2.31, p < .05
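The three influences on the size of t (difference in means, within group variance, sample size) are visible in the formula itself. Below is a minimal sketch of an independent-groups t-test with pooled variance, which relies on the equal-variance assumption listed above; the ratings and the function name `pooled_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Independent-groups t with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled within-group variance, weighted by each group's df.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # Standard error of the difference shrinks as sample size grows.
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df

# Hypothetical satisfaction ratings for two groups.
group_a = [4.5, 4.0, 4.2, 3.9, 4.4]
group_b = [3.6, 3.4, 3.8, 3.3, 3.4]
t, df = pooled_t(group_a, group_b)
print(f"t({df}) = {t:.2f}")
```

The printed value would then be compared with a t table for the given degrees of freedom to obtain the probability level.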
Analysis of Variance • Tests differences in means when there • Are more than two groups
White               $23,121
African-American    $20,243
Hispanic            $21,176
West Virginian      $18,543
• Is more than one independent variable
           White      Black      Total
Male      $28,100    $21,900    $25,000
Female    $24,000    $22,000    $23,000
Total     $26,050    $21,950    $24,000
• Is an interaction between the two independent variables
Interpreting the Results of an ANOVA
               DF            SS            MS        F      p <
Sex             1     382106006     382106006    13.16    .0004
Race            1      42857538      42857538     1.48    .2260
Race * Sex      1      14079430      14079430     0.48    .4871
Error         174    5051526673      29031762
Total         177    5490569647

           White      Black      Total
Male      $45,008    $43,349    $44,621
Female    $41,556    $41,330    $41,505
Total     $43,874    $42,708
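The F values in the summary table can be reproduced from the sums of squares and degrees of freedom. The sketch below takes those figures from the slide; the variable names are assumptions made for illustration.

```python
# (degrees of freedom, sum of squares) for each effect, from the slide.
effects = {
    "Sex": (1, 382106006),
    "Race": (1, 42857538),
    "Race * Sex": (1, 14079430),
}
ms_error = 29031762  # mean square for the error term

f_ratios = {}
for name, (df, ss) in effects.items():
    ms = ss / df                    # mean square = SS / DF
    f_ratios[name] = ms / ms_error  # F = effect MS / error MS

for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f}")
```

Only the F for sex is large; race and the race by sex interaction both have F values near or below 1.5 with probability levels well above .05.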
Interpreting an ANOVA • What is an F Ratio? • The between group variance divided by the within group variance • An F of 1.0 indicates that there are equal amounts of within and between groups variance • With only two groups, t is the square root of F • Significance is determined by the size of F and the sample size • Sample Size Cautions • Sample size in each cell should be reasonable (at least 10) • Sample size in each cell should be about equal or at least proportional to the marginal totals
Multiple Comparisons: Example
Employee Education        Performance Rating
_________________         __________________
GED                             3.13
High school diploma             3.41
Associate’s degree              4.26
Bachelor’s degree               4.35
Master’s degree                 4.37
Multiple Comparisons: Considerations • Planned vs. post hoc comparisons • Planned contrasts • Post hoc contrasts • Scheffé • Tukey HSD • Newman-Keuls • Duncan • Fisher’s least significant difference test • Number of comparisons made • Bonferroni Adjustment
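The Bonferroni adjustment divides the significance level by the number of comparisons made. As a sketch, assume every pair of the five education groups in the example is compared at an overall .05 level:

```python
from math import comb

groups = 5    # GED through Master's degree in the example
alpha = 0.05  # overall significance level (assumed)

# Number of distinct pairwise comparisons among the groups.
n_comparisons = comb(groups, 2)

# Each individual comparison is tested at this stricter level.
adjusted_alpha = alpha / n_comparisons

print(n_comparisons, adjusted_alpha)
```

With ten pairwise comparisons, each one would need p < .005 to be called significant.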
Analysis of Covariance
                        DF           SS           MS       F     p <
Covariates
  Education              1      2036063      2036063    0.10    .76
  Years in company       1    132707859    132707859    6.33    .02
  Years in grade         1     83553431     83553431    3.99    .06
  Years experience       1     16708479     16708479    0.80    .38
Sex                      1     12096720     12096720    0.58    .46

              Uncorrected    Corrected
Male            $41,399       $38,236
Female          $37,859       $36,682
Difference      $ 3,540       $ 1,554
Interpreting Correlations • Direction • Positive • Negative • Magnitude • Distance from zero • Comparison to norms • Utility analysis • Type of Relationship • Linear • Curvilinear
Interpreting Correlations • Types of Correlation • Pearson • Spearman rank order • Point biserial
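For reference, a Pearson correlation can be computed directly from its definition. The sketch below uses made-up cognitive ability and academy scores; both the data and the function name `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: covariation divided by the product of spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five cadets.
cognitive = [95, 100, 105, 110, 120]
academy = [70, 78, 75, 85, 90]

r = pearson_r(cognitive, academy)
print(round(r, 2))  # sign gives direction, distance from zero gives magnitude
```

Pearson's r captures only linear relationships; a curvilinear relationship can produce an r near zero.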
Regression • Enables prediction • Allows combinations of small correlations • Accounts for overlap of variables • Two main types • Stepwise • Hierarchical
Regression Formula • Y = a + (b1)(x1) + (b2)(x2) • Y = predicted criterion score • a = constant (intercept) • b = weight (slope) • x = score on the predictor
Regression • Things to watch for • Total number of subjects • Subject-to-variable ratio • Multicollinearity • Inclusion of nonsignificant variables • Missing variables • Types of equations • Raw score • Standard score • Types of regressions • Stepwise • Hierarchical
Interpreting Regression Results Performance = 3.67 + (.10)(IQ) + (.59)(Interview)
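The equation above is used by plugging in an applicant's predictor scores. A minimal sketch, where the function name and the applicant's scores (IQ of 100, interview score of 4) are hypothetical:

```python
def predict_performance(iq, interview):
    """Predicted performance from the raw-score equation on the slide."""
    intercept = 3.67    # a: the constant
    b_iq = 0.10         # weight for IQ
    b_interview = 0.59  # weight for the interview score
    return intercept + b_iq * iq + b_interview * interview

# Hypothetical applicant.
predicted = predict_performance(iq=100, interview=4)
print(round(predicted, 2))
```

Because these are raw-score weights, their sizes depend on each predictor's scale, so .59 versus .10 does not by itself show that the interview is the stronger predictor.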