1 / 35

Introduction to Descriptive Statistics

Introduction to Descriptive Statistics. 2/25/03. Population vs. Sample Notation. Types of Variables. Describing data. Mean. Variance, Standard Deviation. Variance, S.D. of a Sample. Coefficient of variation. Skewness Symmetrical distribution. IQ SAT. Skewness Asymmetrical distribution.

johana
Download Presentation

Introduction to Descriptive Statistics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to Descriptive Statistics 2/25/03

  2. Population vs. Sample Notation

  3. Types of Variables

  4. Describing data

  5. Mean

  6. Variance, Standard Deviation

  7. Variance, S.D. of a Sample

  8. Coefficient of variation

  9. SkewnessSymmetrical distribution • IQ • SAT

  10. SkewnessAsymmetrical distribution • GPA of MIT students

  11. Skewness(Asymmetrical distribution) • Income • Contribution to candidates • Populations of countries • “Residual vote” rates

  12. Skewness

  13. Skewness

  14. Kurtosis

  15. A few words about the normal curve • Skewness = 0 • Kurtosis = 3

  16. More words about the normal curve

  17. SEG example

  18. Graph some SEG variables

  19. Binary data

  20. Commands in STAT for getting univariate statistics • summarize • summarize, detail • graph, bin() normal • graph, box • tabulate [NB: compare to table]

  21. Explore Q9: Overall teaching evaluation subject q9 n 3.371 6.4375 16 3.982 6.73333 15 3.14 6.46154 13 14.02D 5.66667 3 21W.803 5.66667 12 21M.480 5.69231 13 17.906 5.28571 14 2.51 5.88235 17

  22. Graph Q9 . graph q9

  23. Divide into 7 “bins” and have them span 1, 1..2, 2..3, … 6..7 . graph q9,bin(7) xscale(0,7)

  24. Add ticks at each integer score . graph q9,bin(7) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)

  25. Add a finer grain to the bars . graph q9,bin(14) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)

  26. Even finer grain • . graph q9,bin(28) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)

  27. Superimpose the normal curve (with the same mean and s.d. as the empirical distribution) . graph q9,bin(28) xscale(0,7) xlabel(0,1,2,3,4,5,6,7) norm

  28. Do the previous graph with only larger classes (n > 20) . graph q9 if n>20,bin(28) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)

  29. Draw the previous graph with a box plot . graph q9 if n>20,box ylabel

  30. Draw the box plots for small (0..20), medium (21..50), and large (50+) classes . gen size = 0 if n <=20 (237 missing values generated) . replace size=1 if n > 20 & n <=100 (196 real changes made) . replace size = 2 if n > 100 (41 real changes made) . sort size . graph q9 ,box ylabel by(size) . graph q9 ,box ylabel by(size)

  31. A note about histograms with unnatural categories From the Current Population Survey (2000), Voter and Registration Survey How long (have you/has name) lived at this address? -9 No Response -3 Refused -2 Don't know -1 Not in universe 1 Less than 1 month 2 1-6 months 3 7-11 months 4 1-2 years 5 3-4 years 6 5 years or longer

  32. Simple graph

  33. Solution, Step 1Map artificial category onto “natural” midpoint -9 No Response  missing -3 Refused  missing -2 Don't know  missing -1 Not in universe  missing 1 Less than 1 month  1/24 = 0.042 2 1-6 months  3.5/12 = 0.29 3 7-11 months  9/12 = 0.75 4 1-2 years  1.5 5 3-4 years  3.5 6 5 years or longer  10 (arbitrary)

  34. Graph of recoded data

  35. Density plot of data

More Related