
Statistics Workshop Univariate Descriptive Statistics J-Term 2009 Bert Kritzer


Presentation Transcript


  1. Statistics Workshop: Univariate Descriptive Statistics, J-Term 2009, Bert Kritzer

  2. Description vs. Inference • Summarizing a set of information • Central tendency & Dispersion • Shape of Distribution • Relationships • Form/nature of relationship • Strength of relationship (“correlation”) • Inference from a sample to a “population” • Statistics as estimates of population “parameters” • Inference about processes: random vs. systematic • Inference using population data • Inference as separating what’s observed into “systematic” and “random” components • Observation = Systematic + Random • Random can reflect sampling and/or process

  3. Population Parameters vs. Sample Statistics

  4. Types of Variables • “Nominal” or “Categorical” (qualitative) • Unordered categories • Dichotomies (e.g., gender, win/lose) • Polytomies (e.g., religion, race) • “Interval” (quantitative) • Continuous and discrete • “Ordinal” • Ordered but values indicate only ordering • Grouped (5 point scales) and ungrouped (class rank, seniority)

  5. Describing Categorical Variables: Percentages & Simple Graphics

  6. Central Tendency: The Average or “Arithmetic Mean”

  7. Central Tendency: Other Measures • Median or “middle” case • Median is a “positional measure” • Median is the 50th Percentile • Mean vs. Median • “Skewed” Data • Jury verdicts example • Mode: Most commonly occurring value • “Modal category”
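
To illustrate why the mean and median diverge for skewed data, here is a minimal sketch using Python's standard `statistics` module. The award amounts are made-up values chosen for easy arithmetic; they are not the jury verdict data from the workshop.

```python
from statistics import mean, median, mode

# Hypothetical jury awards in dollars (illustrative values only,
# not the workshop's jury verdict data).
awards = [10_000, 20_000, 20_000, 30_000, 40_000, 50_000, 2_000_000]

award_mean = mean(awards)      # pulled upward by the single very large award
award_median = median(awards)  # the middle case; ignores how large the top award is
award_mode = mode(awards)      # the most commonly occurring value

print(f"mean   = {award_mean:,.0f}")
print(f"median = {award_median:,.0f}")
print(f"mode   = {award_mode:,.0f}")
```

With one very large award in the list, the mean lands far above the median, which is the usual signature of right-skewed data.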

  8. 1st Property of the Mean

  9. 2nd Property of the Mean

  10. Dispersion • Positional measure: Interquartile Range • Difference between the 25th and 75th percentiles (1st and 3rd quartiles) • Midspread: range from 25th to 75th percentiles (contains 50% of the cases) • Variance and Standard Deviation • Variance: Mean squared deviation: SSD/n • Standard Deviation: square root of variance • n vs. n−1 as denominator • Percent of cases within one standard deviation depends on the distribution
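
The dispersion measures above can be sketched as follows. The data are hypothetical, and the variable names are chosen only for this illustration. Quartile conventions differ between textbooks and software, so the interquartile range computed here (using the default method of `statistics.quantiles`) may differ slightly from a hand calculation.

```python
import math
from statistics import pvariance, quantiles, variance

# Hypothetical data, chosen so the arithmetic is easy to follow.
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
mean_value = sum(values) / n

# SSD: sum of squared deviations from the mean.
ssd = sum((x - mean_value) ** 2 for x in values)

var_n = ssd / n                  # variance with n as denominator
var_n_minus_1 = ssd / (n - 1)    # variance with n-1 as denominator
sd_n = math.sqrt(var_n)
sd_n_minus_1 = math.sqrt(var_n_minus_1)

# Cross-check against the standard library's two variance functions.
assert math.isclose(var_n, pvariance(values))
assert math.isclose(var_n_minus_1, variance(values))

# Interquartile range: difference between the 3rd and 1st quartiles.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1

print(f"SSD = {ssd}, variance (n) = {var_n}, variance (n-1) = {var_n_minus_1:.3f}")
print(f"sd (n) = {sd_n}, sd (n-1) = {sd_n_minus_1:.3f}, IQR = {iqr}")
```

The gap between the two denominators shrinks as n grows, so the choice matters most for small data sets.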

  11. Computing the Standard Deviation (Cope, Table 7)

  12. Five Number Summary & Boxplot (for age in a set of data) • Minimum (17) • First quartile (33) (25th Percentile) • Median (45) • Third quartile (57) (75th Percentile) • Maximum (91)
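
A five-number summary can be computed in a few lines. In this sketch the function name `five_number_summary` and the list of ages are hypothetical: the ages were picked so that the result matches the numbers on the slide, and they are not the workshop's actual data. The "inclusive" quartile method is one of several conventions, and other conventions can give slightly different quartiles.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    ordered = sorted(data)
    # n=4 splits the data into quarters; the three cut points are Q1, median, Q3.
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, med, q3, ordered[-1])

# Hypothetical ages, chosen to reproduce the summary shown on the slide.
ages = [17, 25, 33, 40, 45, 50, 57, 70, 91]

summary = five_number_summary(ages)
print(summary)
```

These five values are exactly what a boxplot draws: the box spans the first to third quartile, the line inside marks the median, and the whiskers in the simplest version extend to the minimum and maximum.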

  13. Graphic for a Quantitative Variable: The Histogram

  14. Another Histogram

  15. Mean as Balance Point

  16. Distributions • Histogram Displays the Distribution • Theoretical Distributions: Derived from probability theory • Uniform • 58% within one standard deviation; 100% within 1.73 • Binomial (e.g., series of coin flips) • Concentration depends on probability parameter • Normal (bell-shaped) • 68% within one standard deviation; 95% within two • Empirical Distributions: What is observed in practice • Chebyshev's Theorem: Regardless of the distribution, no more than 25% of the observations can be more than 2 standard deviations from the mean
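
The uniform-distribution figures and the Chebyshev bound quoted above can be checked with a short calculation. This is a sketch only; the function names and the skewed sample are hypothetical and introduced just for this illustration.

```python
import math
from statistics import mean, pstdev

def uniform_fraction_within(k):
    """Fraction of a continuous uniform distribution within k sds of its mean."""
    # A uniform distribution of width w has sd = w / sqrt(12), so the interval
    # mean ± k*sd covers 2*k*sd / w = k / sqrt(3) of the width, capped at 1.
    return min(1.0, k / math.sqrt(3))

def chebyshev_max_fraction_beyond(k):
    """Chebyshev upper bound on the fraction more than k sds from the mean (k > 0)."""
    return min(1.0, 1.0 / k**2)

def observed_fraction_beyond(data, k):
    """Observed fraction of cases more than k (population) sds from the mean."""
    m, s = mean(data), pstdev(data)
    return sum(abs(x - m) > k * s for x in data) / len(data)

print(f"uniform, within 1 sd:     {uniform_fraction_within(1):.3f}")
print(f"uniform, within 1.73 sd:  {uniform_fraction_within(math.sqrt(3)):.3f}")
print(f"Chebyshev bound, k = 2:   {chebyshev_max_fraction_beyond(2):.2f}")

# Hypothetical, heavily skewed sample: the bound still holds.
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]
print(f"observed beyond 2 sd:     {observed_fraction_beyond(skewed, 2):.2f}")
```

The 1.73 on the slide is the square root of 3, the point at which the interval around the mean covers the whole width of a uniform distribution.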

  17. Distributions (continued) • Symmetrical distributions • Mean = Median • Asymmetrical distributions • “Skew”

  18. Uniform Distribution: Theoretical

  19. Uniform Distribution: Empirical

  20. Symmetric Unimodal Distribution: Theoretical

  21. Symmetric Unimodal Distribution: Empirical

  22. Asymmetric Unimodal Distribution: Theoretical

  23. Asymmetric Unimodal Distribution: Empirical

  24. Bimodal Distribution: Theoretical

  25. Bimodal Distribution: Empirical

  26. Bimodal Distribution: General

  27. 2008 Salaries of Major League Baseball Players (in $1,000’s) Mean: $3,112,101 Median: $1,000,000

  28. Normal Distribution

  29. Normal Distribution Family

  30. The 68%−95%−99.7% Rule
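
The percentages in this rule come from the cumulative distribution function of the normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `normal_fraction_within` is made up for this illustration):

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {normal_fraction_within(k):.1%}")
```

Because every member of the normal family can be rescaled to the standard normal, the same fractions apply whatever the mean and standard deviation are.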

  31. Standard Normal (z) Table (from Cope, p. 105)

  32. Time Plots: Showing Change Over Time (Federal Civil Filings, 1975-2007)

  33. Time Plots: Showing Change Over Time (Women Law Graduates & Women SC Clerks)

  34. The Gee-Whiz Graph

  35. Base Year Issue: Percent of Incumbents Facing Competition in State Supreme Court Elections

  36. Centigrade to Fahrenheit: Change of Scale (Linear Transformation) • General linear transformation: new value = a + b × old value • Transforming from inches to centimeters, a = 0 and b = 2.54. $’s to Euro’s? • A mean of 5° in centigrade would be a mean of 41° in Fahrenheit. • A 10° standard deviation in centigrade would be an 18° standard deviation in Fahrenheit.
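
The temperature figures on this slide can be reproduced numerically. In the sketch below, the function name `linear_transform` and the two-value data set are hypothetical: the values were chosen only because they have a mean of 5 and a standard deviation (n denominator) of 10. The conversion uses the standard constants a = 32 and b = 1.8.

```python
from statistics import mean, pstdev

def linear_transform(values, a, b):
    """Apply new = a + b * old to every value."""
    return [a + b * x for x in values]

# Hypothetical centigrade readings with mean 5 and standard deviation 10.
centigrade = [-5, 15]

# Centigrade to Fahrenheit: a = 32, b = 1.8.
fahrenheit = linear_transform(centigrade, a=32, b=1.8)

# The mean is shifted and rescaled: new mean = a + b * old mean.
print(mean(centigrade), "->", mean(fahrenheit))

# The standard deviation is only rescaled: new sd = |b| * old sd.
print(pstdev(centigrade), "->", pstdev(fahrenheit))
```

The additive constant a moves every value by the same amount, so it changes the mean but leaves the spread untouched; only the multiplier b affects the standard deviation.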

  37. Standard Scores (Z-scores) • Mean of 0 and standard deviation of 1
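
A standard score is a linear transformation that subtracts the mean and divides by the standard deviation. The sketch below uses hypothetical data and a made-up function name `z_scores`, and it assumes the standard deviation with n as denominator; using n−1 instead would give slightly different scores.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert each value to (value - mean) / standard deviation."""
    m = mean(values)
    s = pstdev(values)  # assumption: n (not n-1) as denominator
    return [(x - m) / s for x in values]

# Hypothetical data with mean 5 and standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

z = z_scores(scores)
print(z)
print("mean of z:", mean(z), " sd of z:", pstdev(z))
```

Each z-score states how many standard deviations a case lies above or below the mean, which makes values measured on different scales directly comparable.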
