370 likes | 619 Views
Introduction to Descriptive Statistics. 2/25/03. Population vs. Sample Notation. Types of Variables. Describing data. Mean. Variance, Standard Deviation. Variance, S.D. of a Sample. Coefficient of variation. Skewness Symmetrical distribution. IQ SAT. Skewness Asymmetrical distribution.
E N D
SkewnessSymmetrical distribution • IQ • SAT
SkewnessAsymmetrical distribution • GPA of MIT students
Skewness(Asymmetrical distribution) • Income • Contribution to candidates • Populations of countries • “Residual vote” rates
A few words about the normal curve • Skewness = 0 • Kurtosis = 3
Commands in STAT for getting univariate statistics • summarize • summarize, detail • graph, bin() normal • graph, box • tabulate [NB: compare to table]
Explore Q9: Overall teaching evaluation subject q9 n 3.371 6.4375 16 3.982 6.73333 15 3.14 6.46154 13 14.02D 5.66667 3 21W.803 5.66667 12 21M.480 5.69231 13 17.906 5.28571 14 2.51 5.88235 17
Graph Q9 . graph q9
Divide into 7 “bins” and have them span 1, 1..2, 2..3, … 6..7 . graph q9,bin(7) xscale(0,7)
Add ticks at each integer score . graph q9,bin(7) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)
Add a finer grain to the bars . graph q9,bin(14) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)
Even finer grain • . graph q9,bin(28) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)
Superimpose the normal curve (with the same mean and s.d. as the empirical distribution) . graph q9,bin(28) xscale(0,7) xlabel(0,1,2,3,4,5,6,7) norm
Do the previous graph with only larger classes (n > 20) . graph q9 if n>20,bin(28) xscale(0,7) xlabel(0,1,2,3,4,5,6,7)
Draw the previous graph with a box plot . graph q9 if n>20,box ylabel
Draw the box plots for small (0..20), medium (21..50), and large (50+) classes . gen size = 0 if n <=20 (237 missing values generated) . replace size=1 if n > 20 & n <=100 (196 real changes made) . replace size = 2 if n > 100 (41 real changes made) . sort size . graph q9 ,box ylabel by(size) . graph q9 ,box ylabel by(size)
A note about histograms with unnatural categories From the Current Population Survey (2000), Voter and Registration Survey How long (have you/has name) lived at this address? -9 No Response -3 Refused -2 Don't know -1 Not in universe 1 Less than 1 month 2 1-6 months 3 7-11 months 4 1-2 years 5 3-4 years 6 5 years or longer
Solution, Step 1Map artificial category onto “natural” midpoint -9 No Response missing -3 Refused missing -2 Don't know missing -1 Not in universe missing 1 Less than 1 month 1/24 = 0.042 2 1-6 months 3.5/12 = 0.29 3 7-11 months 9/12 = 0.75 4 1-2 years 1.5 5 3-4 years 3.5 6 5 years or longer 10 (arbitrary)