Introduction to Biostatistics (Pubhlth 540) Lecture 3: Numerical Summary Measures. Acknowledgement: Thanks to Professor Pagano (Harvard School of Public Health) for lecture material.
Reading/Homework • See web site
For after all, what is man in nature? A nothing in relation to the infinite, all in relation to nothing, a central point between nothing and all, and infinitely far from understanding either. Blaise Pascal (1623-1662), Pensées (1660)
Example: Forced expiratory volume in 1 second (FEV1) in 13 adolescents with asthma. Let x represent FEV1 in liters.
Measures of central tendency • Population Parameters • Sample Statistics • Mean • Median • Mode
Measures of central tendency: Mean Example: FEV1 (liters) in 13 adolescents with asthma: 2.30, 2.15, 3.50, 2.60, 2.75, 2.82, 4.05, 2.25, 2.68, 3.00, 4.02, 2.85 (n = 13)
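The sample mean is the sum of the observations divided by their number. A minimal sketch of that calculation follows. Note that only 12 of the 13 values are listed in the example; the thirteenth value used here (3.38) is an assumption, chosen because it reproduces the mean of 2.95 quoted later in the lecture. The names `sample_mean` and `ASSUMED_13TH` are illustrative only.

```python
# FEV1 (liters): the 12 values that are listed in the example.
fev1_listed = [2.30, 2.15, 3.50, 2.60, 2.75, 2.82,
               4.05, 2.25, 2.68, 3.00, 4.02, 2.85]

# ASSUMPTION: the 13th observation is not shown in the example.
# 3.38 is a stand-in that reproduces the quoted mean of 2.95.
ASSUMED_13TH = 3.38
fev1 = fev1_listed + [ASSUMED_13TH]


def sample_mean(xs):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(xs) / len(xs)


mean_fev1 = sample_mean(fev1)
print(f"n = {len(fev1)}, mean = {mean_fev1:.2f} liters")
```

With the assumed thirteenth value the result rounds to 2.95 liters; with a different value it would not.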
If we collect a man's urine during twenty-four hours and mix all this urine to analyze the average, we get an analysis of a urine which simply does not exist; for urine, when fasting, is different from urine during digestion. A startling instance of this kind was invented by a physiologist who took urine from a railroad station urinal where people of all nations passed, and who believed he could thus present an analysis of average European urine! Claude Bernard (1813-1878)
Mean: Examples Approx. 4 million singleton births, 1991: Of 31,417 singleton births resulting in death:
Mean: Properties [Figure: distribution with the mean marked at 26.4 years]
Mean: Properties Note what happens when one number, 4.02 say, becomes large, say 40.2: 2.3, 2.15, 3.50, 2.60, 2.75, 2.82, 4.05, 2.25, 2.68, 3.00, 40.2, 2.85 (versus a mean of 2.95 from before). The mean is sensitive to every observation; it is not robust.
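The effect of a single extreme value on the mean can be checked numerically. The sketch below replaces 4.02 with 40.2 and recomputes the mean. As before, only 12 of the 13 observations are listed, so the thirteenth value (3.38) is an assumption chosen to match the earlier mean of 2.95; the altered mean depends on that assumption.

```python
from statistics import mean

# FEV1 (liters): 12 listed values plus one ASSUMED value (3.38),
# a stand-in for the 13th observation that is not shown.
original = [2.30, 2.15, 3.50, 2.60, 2.75, 2.82,
            4.05, 2.25, 2.68, 3.00, 4.02, 2.85, 3.38]

# Replace the single observation 4.02 with 40.2, as in the example.
altered = [40.2 if x == 4.02 else x for x in original]

mean_before = mean(original)
mean_after = mean(altered)

print(f"mean before: {mean_before:.2f}")
print(f"mean after:  {mean_after:.2f}")
print(f"shift caused by one observation: {mean_after - mean_before:.2f}")
```

One changed observation out of 13 moves the mean by (40.2 - 4.02)/13, roughly 2.8 liters, which places it above every other value in the sample.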
Measures of central tendency: Median More robust, but not sensitive enough. Definition: At least 50% of the observations are greater than or equal to the median, and at least 50% of the observations are less than or equal to the median. • 2.15, 2.25, 2.30: median = 2.25 • 2.15, 2.25, 2.30, 2.60: median = (2.25 + 2.30)/2 = 2.275
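The two small examples above can be reproduced with a short function: sort the data, take the middle value when n is odd, and average the two middle values when n is even. This is a sketch of one common convention; `sample_median` is an illustrative name, and the value 26.0 in the last line is made up to show robustness.

```python
def sample_median(xs):
    """Middle value of the sorted data; for even n, the average
    of the two middle values."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


odd_median = sample_median([2.15, 2.25, 2.30])
even_median = sample_median([2.15, 2.25, 2.30, 2.60])

# Robustness check with a made-up extreme value: inflating the
# largest observation leaves the median where it was.
outlier_median = sample_median([2.15, 2.25, 2.30, 26.0])

print(odd_median, round(even_median, 3), round(outlier_median, 3))
```

The standard library function `statistics.median` follows the same convention and could be used instead.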
Comparing mean and median Singleton births, 1991:
Comparing mean and median When to use mean or median: Use both, by all means. The mean performs best when the distribution is symmetric with thin tails. If the distribution is skewed, use the median. Remember: the mean follows the tail.
Mode • Mode is defined as the observation that occurs most frequently • When the distribution is symmetric and unimodal, all three measures of central tendency are equal
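A small illustration of the mode, using made-up integer data (not from the lecture). `statistics.multimode` returns every most-frequent value, which is convenient when there is more than one peak. The first sample is symmetric with a single peak; the second is symmetric with two peaks.

```python
from statistics import mean, median, multimode

# Made-up data, for illustration only.
single_peak = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # symmetric, one peak at 3
two_peaks = [1, 1, 2, 3, 3]                 # symmetric, peaks at 1 and 3

single_summary = (mean(single_peak), median(single_peak), multimode(single_peak))
double_summary = (mean(two_peaks), median(two_peaks), multimode(two_peaks))

print("one peak:  mean, median, modes =", single_summary)
print("two peaks: mean, median, modes =", double_summary)
```

In the single-peaked sample the mean, median, and mode coincide at 3. In the two-peaked sample the mean and median are both 2, while the modes are 1 and 3.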
Comparing mean, median and mode [Figure: bimodal distribution with the mean, the median, and the two modes marked]
Measures of spread • Range: • Simple to calculate • Very sensitive to extreme observations • Interquartile range (IQR): • More robust than the range • Variance (standard deviation): • Quantifies the amount of variability around the mean
Measures of spread: Range Singleton births, 1991:
Measures of spread: Variance The standard deviation is the square root of the variance, so it is expressed in the same units as the mean.
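The measures of spread can be computed directly. The sketch below uses only the 12 FEV1 values that are listed in the earlier example (so n = 12 here, not 13). It assumes the usual sample variance with divisor n - 1, and it uses the default quartile method of `statistics.quantiles`; other quartile conventions give slightly different IQR values.

```python
from statistics import mean, quantiles, stdev, variance

# The 12 FEV1 values (liters) listed in the earlier example.
fev1 = [2.30, 2.15, 3.50, 2.60, 2.75, 2.82,
        4.05, 2.25, 2.68, 3.00, 4.02, 2.85]

# Range: largest minus smallest observation.
data_range = max(fev1) - min(fev1)

# Interquartile range: third quartile minus first quartile.
q1, q2, q3 = quantiles(fev1, n=4)
iqr = q3 - q1

# Sample variance (divisor n - 1) and standard deviation.
s2 = variance(fev1)
s = stdev(fev1)

# The same variance written out from its definition.
xbar = mean(fev1)
s2_manual = sum((x - xbar) ** 2 for x in fev1) / (len(fev1) - 1)

print(f"range = {data_range:.2f} liters")
print(f"IQR   = {iqr:.3f} liters")
print(f"s^2   = {s2:.4f} liters^2, s = {s:.4f} liters")
```

The variance is in squared liters, while the range, IQR, and standard deviation are all in liters, the same unit as the mean.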
Variance & Standard deviation Empirical Rule: For a unimodal and symmetric distribution: • Mean ± 1 s.d. covers approx. 68% of observations • Mean ± 2 s.d. covers approx. 95% of observations • Mean ± 3 s.d. covers nearly all observations
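The percentages in the empirical rule are those of the normal curve, which is the usual reference shape for a unimodal, symmetric distribution. Under that assumption (the normal model is an assumption here, not something the data must satisfy), the exact coverage within k standard deviations is erf(k/√2). The function name `normal_coverage` is illustrative.

```python
from math import erf, sqrt


def normal_coverage(k):
    """P(|Z| <= k) for a standard normal Z: the share of a normal
    distribution lying within k standard deviations of its mean."""
    return erf(k / sqrt(2))


coverage = {k: normal_coverage(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"mean ± {k} s.d.: {100 * p:.1f}%")
```

Real data that are only roughly symmetric and unimodal will show proportions near these values, not exactly equal to them.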
Variance & Standard deviation Mother's age: mean = 26.4 yrs, s.d. = 5.84 yrs. Table of mean ± k s.d.
Characterizing a symmetric, unimodal distribution – mean, SD Mother's age: mean = 26.4 yrs, s.d. = 5.84 yrs. Table of mean ± k s.d.
Characterizing a symmetric, unimodal distribution – mean, SD [Figure: area = 0.6475 between 20.56 and 32.24 years (mean ± 1 s.d.)]
Characterizing a symmetric, unimodal distribution – mean, SD [Figure: area = 0.963 between 14.72 and 38.08 years (mean ± 2 s.d.)]
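The interval endpoints shown for mother's age follow directly from the reported mean (26.4 years) and standard deviation (5.84 years). A minimal sketch of the arithmetic, with `k_sd_interval` as an illustrative helper name:

```python
MEAN_AGE = 26.4   # years, as reported for mother's age
SD_AGE = 5.84     # years, as reported for mother's age


def k_sd_interval(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)


intervals = {k: k_sd_interval(MEAN_AGE, SD_AGE, k) for k in (1, 2, 3)}

for k, (low, high) in intervals.items():
    print(f"mean ± {k} s.d.: {low:.2f} to {high:.2f} years")
```

For k = 1 and k = 2 this gives 20.56 to 32.24 years and 14.72 to 38.08 years, the ranges over which the areas 0.6475 and 0.963 are reported.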
Characterizing a distribution – Chebychev's inequality Table of mean ± k s.d. The proportion of observations within k s.d. of the mean is at least $1 - 1/k^2$ (true for any distribution).
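Chebychev's bound is easy to tabulate. The sketch below computes 1 - 1/k² for a few values of k and compares it with the two proportions reported earlier for mother's age (0.6475 within 1 s.d. and 0.963 within 2 s.d.). The function name `chebychev_lower_bound` is illustrative; no proportion is given here for k = 3, so none is compared.

```python
def chebychev_lower_bound(k):
    """Minimum proportion of observations within k s.d. of the mean,
    valid for any distribution. The bound is uninformative for k <= 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1 - 1 / k ** 2)


bounds = {k: chebychev_lower_bound(k) for k in (1, 2, 3)}

# Proportions reported earlier for mother's age.
observed = {1: 0.6475, 2: 0.963}

for k, bound in bounds.items():
    line = f"k = {k}: at least {bound:.3f}"
    if k in observed:
        line += f" (observed {observed[k]})"
    print(line)
```

The bound is deliberately conservative: it guarantees at least 75% within 2 s.d. for any distribution, whereas the mother's age data show 96.3%.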