
Understanding Statistics


kim


Presentation Transcript


  1. Understanding Statistics

  2. Reasons for Analyzing Data • Describe data • Determine if two or more groups differ on some variable • Determine if two or more variables are related • Reduce data

  3. Types of Data • Nominal: categories (e.g., race, hair color) • Ordinal: rank order (e.g., baseball standings, waiting list placements) • Interval: equality of intervals (e.g., performance ratings, temperature) • Ratio: true zero, equality of ratios (e.g., salary, height)

  4. The Concept of Significance • Interocular Significance • Statistical Significance • Practical Significance

  5. Significance Levels • Indicate the probability that results this large would occur by chance alone • Standard is .05, but others can be used • Type I error: Concluding there is a difference when in fact there is none • Type II error: Concluding there is no difference when in fact there is one

  6. Statistical Significance • When deviating from the .05 level, consider • the common sense of your finding • previous research • the quality of your data • the cost of being wrong • Probability level is influenced by • sample size • differences between groups • within-group variability

  7. Significance Levels in Journal Articles • The job satisfaction level of female employees (M = 4.21) was significantly higher than that of male employees (M = 3.50), t(60) = 2.39, p < .02.

                            Academy Score    Commendations
      Cognitive ability         .43**            .03
      Education                 .28**            .24*
      * p < .05, ** p < .01, *** p < .001

  8. Statistics That Describe Data

  9. Raw Data Are Not Usually Meaningful

  10. Statistics That Describe Data • Sample Size • overall (N) • subgroups (n) • Frequencies • Central Tendency • mean (statistical average) • median (midpoint) • mode (most common) • Dispersion • range • variance & standard deviation

  11. Measures of Central Tendency • Mean • Median • Mode

  12. The Mean

  13. The Median • Median is the point at which 50% of your data fall above and 50% fall below • Odd number of scores: the median is the middle score • Even number of scores: the median is the average of the two middle scores • Example: 93, 98, 98, 100, 104, 110, 114, 121 (median = 102)

  14. The Median

  15. The Mode The Most Frequently Occurring Score

  16. Which Measure of Central Tendency Should I Use? • Mode • nominal data (categories) • Mean • interval data • ratio data • Median • ordinal (ranked) data • interval or ratio data if • outliers • skewed distribution
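
The guidance above can be illustrated with a minimal sketch using Python's standard `statistics` module. The eight scores are the ones listed on the median slide; the extra value of 400 is a hypothetical outlier added only to show why the median is preferred when data are skewed.

```python
import statistics

# Scores from the median slide (8 values, so the median averages the two middle scores).
scores = [93, 98, 98, 100, 104, 110, 114, 121]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # midpoint of the sorted scores
mode_score = statistics.mode(scores)      # most frequently occurring score

# Hypothetical outlier (not in the slides) appended to show how each measure reacts.
with_outlier = scores + [400]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print("mean:", mean_score, "median:", median_score, "mode:", mode_score)
print("with outlier -> mean:", mean_with_outlier, "median:", median_with_outlier)
```

In this sketch the single extreme score pulls the mean up by more than 30 points while the median moves by only 2, which is the reason the slide recommends the median for data with outliers.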

  17. Measures of Dispersion • Range • Minimum • Maximum • Spread • Variance (s²) • Standard deviation (s) • Square root of the variance • About 68% of scores fall within 1 SD of the mean • About 95% of scores fall within 2 SD of the mean
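
As a rough sketch of these measures, the snippet below computes the range, sample variance, and standard deviation for the scores from the median slide, then checks the 68% and 95% figures against a standard normal curve. It assumes the sample (n - 1) formulas, which is what the s² notation usually implies; the slides do not say which formula was intended.

```python
import statistics
from statistics import NormalDist

scores = [93, 98, 98, 100, 104, 110, 114, 121]  # scores from the median slide

score_range = max(scores) - min(scores)   # maximum minus minimum
variance = statistics.variance(scores)    # sample variance, n - 1 denominator (assumed)
sd = statistics.stdev(scores)             # square root of the sample variance

# Share of a normal distribution lying within 1 and 2 SDs of the mean.
std_normal = NormalDist()                 # mean 0, SD 1
within_1_sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_2_sd = std_normal.cdf(2) - std_normal.cdf(-2)

print("range:", score_range, "variance:", round(variance, 2), "SD:", round(sd, 2))
print("within 1 SD:", round(within_1_sd, 4), "within 2 SD:", round(within_2_sd, 4))
```

The coverage figures hold only for normally distributed data, so they should not be read off a small or skewed sample.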

  18. Number of Days Absent Mean

  19. Performance Ratings Mean

  20. IQ Scores for Two Training Groups

  21. Salary Survey Example • Salary Survey Data • Mean for police officer is $25,000 • SD = $3,000 • Our Department Salary • $24,000
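
One way to read the salary survey example is to express the department's salary as a standard score. The sketch below uses the three figures from the slide; the percentile step additionally assumes that survey salaries are approximately normally distributed, which the slide does not state.

```python
from statistics import NormalDist

survey_mean = 25_000   # mean police officer salary from the survey
survey_sd = 3_000      # standard deviation from the survey
our_salary = 24_000    # our department's salary

# Standard score: how many SDs our salary sits from the survey mean.
z = (our_salary - survey_mean) / survey_sd

# Assumption (not from the slide): salaries are roughly normal,
# so the normal CDF approximates the share of departments paying less.
percentile = NormalDist(mu=survey_mean, sigma=survey_sd).cdf(our_salary)

print("z =", round(z, 2))
print("approximate share paying less:", round(percentile, 2))
```

Under that assumption the department pays about a third of a standard deviation below the survey mean.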

  22. The Normal Curve

  23. We know that a student’s GPA is one standard deviation above the mean

  24. Caution About Inferences From Standard Deviations • Inferences can be made only when • Data are normally distributed • Sample size is large • If conditions are not met, using percentiles based on actual data is best

  25. Number of tickets written at two police departments

  26. Measures of Comparison and Explanation • Percent • Percentile • Q1 • Q2 • Q3 • Standard Score (Z) • mean of zero • standard deviation of 1 • T-Score
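
The comparison measures listed above can be sketched as follows. The scores are reused from the median slide; the T-score conversion assumes the common convention of a mean of 50 and a standard deviation of 10, which the slide does not spell out, and the quartiles use the default interpolation method of `statistics.quantiles`.

```python
import statistics

scores = [93, 98, 98, 100, 104, 110, 114, 121]  # scores from the median slide

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (assumed)

# Standard scores: mean of zero, standard deviation of 1.
z_scores = [(x - mean) / sd for x in scores]

# T-scores, assuming the usual rescaling to mean 50 and SD 10.
t_scores = [50 + 10 * z for z in z_scores]

# Quartile cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print("z:", [round(z, 2) for z in z_scores])
print("T:", [round(t, 1) for t in t_scores])
print("quartiles:", q1, q2, q3)
```

Different quartile conventions give slightly different Q1 and Q3 values on small samples, so the method should be reported along with the numbers.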

  27. Using Descriptive Statistics to Ensure Data Integrity • Reasons for Errors • Inaccurate source data • Copied incorrectly from source data • Input error • misread • keystroke error • conversion error • Input statement error • Methods to Check • Proofread raw data • “Sure thing” analysis that didn’t work • Use descriptive statistics to • check for values outside the possible range • check for values that don’t make sense

  28. Statistics That Test Differences Between Groups

  29. What Statistic to Use • Frequencies • Chi Square • Means • two groups: t-test • Analysis of Variance • more than two groups • more than one independent variable • Analysis of Covariance • more than one dependent variable • controlling for other variables

  30. Differences in Frequencies: Chi-Square • Goodness of Fit: Does the observed frequency differ from the expected frequency? • Tests of Independence: Does the distribution for one group differ from that of another?

      Goodness of fit example
                       %      %
      Secretary       92     80
      Welder          20     25
      Supervisor      40     50

      Test of independence example
                   Hired    Not
      Male           32     16
      Female         10     20
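
The test of independence can be worked through by hand for the hired / not hired example on this slide. The sketch below computes the Pearson chi-square statistic without a continuity correction; that choice is an assumption, since the slide does not say which variant was used.

```python
# Observed counts from the slide's hiring example.
observed = [
    [32, 16],  # Male:   hired, not hired
    [10, 20],  # Female: hired, not hired
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if hiring were independent of sex.
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("chi-square =", round(chi_square, 2), "df =", df)
```

With 1 degree of freedom the critical value at the .05 level is 3.84, so a statistic larger than that would lead to rejecting independence in this example.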

  31. The t-test Tests Differences in Means Between Two Groups

  32. Differences Between Two Means: The t-test • Assumptions • Normal distribution • Equal variances in each group • Size and Significance • Differences in means • Amount of within-group variance • Sample size • Journal Listing • t(45) = 2.31, p < .01
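
To show how the difference in means, the within-group variance, and the sample size combine, here is a minimal sketch of an independent-samples t statistic with a pooled variance, matching the equal-variances assumption above. The function name `pooled_t` and the two sets of ratings are hypothetical and are not taken from the presentation.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # within-group variance, group A
    var_b = statistics.variance(group_b)  # within-group variance, group B
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


# Hypothetical performance ratings for two groups of six employees.
group_a = [4, 5, 3, 5, 4, 4]
group_b = [3, 4, 2, 3, 3, 4]

t, df = pooled_t(group_a, group_b)
print(f"t({df}) = {t:.2f}")
```

A larger mean difference, a smaller within-group variance, or a larger sample each push t further from zero, which is what the "Size and Significance" bullets describe.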

  33. t-value Needed for Significance

  34. Analysis of Variance • Tests differences in means when there • are more than two groups • is more than one independent variable • is an interaction between the two independent variables

      More than two groups
      White               $23,121
      African-American    $20,243
      Hispanic            $21,176
      West Virginian      $18,543

      More than one independent variable
                 White      Black      Total
      Male      $28,100    $21,900    $25,000
      Female    $24,000    $22,000    $23,000
      Total     $26,050    $21,950    $24,000

  35. Interpreting the Results of an ANOVA

                     DF    SS            MS           F       p <
      Sex             1    382106006     382106006    13.16   .0004
      Race            1    42857538      42857538     1.48    .2260
      Race * Sex      1    14079430      14079430     0.48    .4871
      Error         174    5051526673    29031762
      Total         177    5490569647

                 White      Black      Total
      Male      $45,008    $43,349    $44,621
      Female    $41,556    $41,330    $41,505
      Total     $43,874    $42,708

  36. Interpreting an ANOVA • What is an F Ratio? • The between-group variance divided by the within-group variance • An F of 1.0 indicates that there are equal amounts of within-group and between-group variance • With two groups, t is the square root of F • Significance is determined by the size of F and the sample size • Sample Size Cautions • Sample size in each cell should be reasonable (at least 10) • Sample size in each cell should be about equal or at least proportional to the marginal totals
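
The F ratio definition can be checked against the ANOVA results on slide 35. This sketch recomputes each mean square as SS / DF and divides it by the error mean square; the sums of squares and degrees of freedom are copied from that slide, and the variable names are illustrative only.

```python
# (effect, degrees of freedom, sum of squares) from the ANOVA slide.
effects = [
    ("Sex", 1, 382_106_006),
    ("Race", 1, 42_857_538),
    ("Race * Sex", 1, 14_079_430),
]
error_df, error_ss = 174, 5_051_526_673

# Within-group (error) mean square.
ms_error = error_ss / error_df

# F = between-group mean square / within-group mean square.
f_ratios = {name: (ss / df) / ms_error for name, df, ss in effects}

for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f}")
```

The recomputed values round to the F column reported on the slide (13.16, 1.48, and 0.48).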

  37. Multiple Comparisons: Example

      Employee Education        Performance Rating
      GED                       3.13
      High school diploma       3.41
      Associate’s degree        4.26
      Bachelor’s degree         4.35
      Master’s degree           4.37

  38. Multiple Comparisons: Considerations • Planned vs. post hoc comparisons • Planned contrasts • Post hoc contrasts • Scheffé • Tukey HSD • Newman-Keuls • Duncan • Fisher’s least significant difference test • Number of comparisons made • Bonferroni adjustment
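
The Bonferroni adjustment is simple enough to sketch directly. The example below assumes all pairwise comparisons are made among the five education levels from the multiple comparisons example and that the overall significance level is .05; neither choice is stated in the slides.

```python
from math import comb

n_groups = 5    # education levels in the multiple comparisons example
alpha = 0.05    # overall significance level (assumed)

# Number of distinct pairs of groups that could be compared.
n_comparisons = comb(n_groups, 2)

# Bonferroni: divide the overall alpha by the number of comparisons.
bonferroni_alpha = alpha / n_comparisons

print("comparisons:", n_comparisons)
print("per-comparison alpha:", bonferroni_alpha)
```

Each individual comparison would then need to reach the stricter per-comparison level before being called significant.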

  39. Analysis of Covariance

                             DF    SS           MS           F      p <
      Covariates
        Education             1    2036063      2036063      0.10   .76
        Years in company      1    132707859    132707859    6.33   .02
        Years in grade        1    83553431     83553431     3.99   .06
        Years experience      1    16708479     16708479     0.80   .38
      Sex                     1    12096720     12096720     0.58   .46

                    Uncorrected    Corrected
      Male          $41,399        $38,236
      Female        $37,859        $36,682
      Difference    $3,540         $1,554

  40. Interpreting Correlations • Direction • Positive • Negative • Magnitude • Distance from zero • Comparison to norms • Utility analysis • Type of Relationship • Linear • Curvilinear

  41. Interpreting Correlations • Types of Correlation • Pearson • Spearman rank order • Point biserial
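
For the Pearson correlation, a minimal from-scratch sketch is shown below. The helper `pearson_r` and the two short lists of scores are hypothetical illustrations; they are not data from the presentation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical cognitive ability and academy scores for six cadets.
cognitive = [98, 105, 110, 120, 93, 101]
academy = [78, 82, 88, 91, 75, 85]

r = pearson_r(cognitive, academy)
print("r =", round(r, 2))
```

The sign of r gives the direction of the relationship and its distance from zero gives the magnitude. Pearson's r captures only linear relationships, so a curvilinear pattern can produce an r near zero.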

  42. Regression • Enables prediction • Allows combinations of small correlations • Accounts for overlap of variables • Two main types • Stepwise • Hierarchical

  43. Regression Formula • Y = a + (b1)(x1) + (b2)(x2) • Y = predicted criterion score • a = constant (intercept) • b = weight (slope) • x = score on the predictor

  44. Regression • Things to watch for • Total number of subjects • Subject-to-variable ratio • Multicollinearity • Inclusion of nonsignificant variables • Missing variables • Types of equations • Raw score • Standard score • Types of regressions • Stepwise • Hierarchical

  45. Interpreting Regression Results Performance = 3.67 + (.10)(IQ) + (.59)(Interview)
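
The equation on this slide can be applied to a single applicant as a quick sketch. The coefficients come from the slide; the function name `predict_performance` and the applicant's scores (IQ of 100, interview score of 4) are hypothetical.

```python
def predict_performance(iq, interview):
    """Predicted performance from the slide's raw-score regression equation."""
    constant = 3.67            # a: intercept
    weight_iq = 0.10           # b1: weight for IQ
    weight_interview = 0.59    # b2: weight for the interview score
    return constant + weight_iq * iq + weight_interview * interview


# Hypothetical applicant, used only to illustrate the arithmetic.
predicted = predict_performance(iq=100, interview=4)
print("predicted performance:", round(predicted, 2))
```

Because these are raw-score weights, their sizes depend on the scale of each predictor and cannot be compared directly; standard-score weights would be needed for that.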
