Stat 2411 Statistical Methods Chapter 3 Measures of Location
3.1 Populations and Samples • Population: All conceivably or hypothetically possible observations • Sample: The particular observations actually taken
Population Example: Temperatures of patients with meningitis. The number of potential observations is unlimited (infinite), for example 100.2, 101.5, 100.3, ... [Figure: population of potential measurements]
Sample: n = 10 observations: 104.0, 100.2, 100.8, 108.0, 104.8, 102.4, 104.2, 103.8, 101.6, 101.4. Notation: $x_i$ = value of the $i$-th observation, $i = 1, \dots, n$
3.2 The Mean Sample mean = $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ = average = center of gravity
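As a small illustration of the sample mean, the sketch below averages the n = 10 temperatures from the sample slide. The function name `sample_mean` is only an illustrative choice, not something from the course materials.

```python
# The n = 10 temperatures from the meningitis sample slide.
temps = [104.0, 100.2, 100.8, 108.0, 104.8, 102.4, 104.2, 103.8, 101.6, 101.4]

def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)

x_bar = sample_mean(temps)
print(round(x_bar, 2))  # about 103.12
```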
Population Descriptions The population mean is the average of all values in the population of potential values. Population mean = $\mu$. Population descriptions are denoted by Greek letters like $\mu$. Meningitis example: $\mu$ = average of all potential temperature measurements over all meningitis cases.
Parameter and Statistic • Population descriptions – parameters • Sample descriptions – statistics • Sample statistics are usually used to estimate the corresponding population parameters.
3.3 Weighted mean
            Weight    X
Homework      20     90
Exam 1         8     82
Exam 2        11     87
Exam 3        13     85
Exam 4        13     92
Final         35     83
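The grade table can be turned into a weighted mean with the usual formula, sum of weight times score divided by the sum of the weights. The sketch below assumes that reading of the table; the names `grades` and `weighted_mean` are illustrative.

```python
# (component, weight, score X) rows copied from the grade table.
grades = [
    ("Homework", 20, 90),
    ("Exam 1", 8, 82),
    ("Exam 2", 11, 87),
    ("Exam 3", 13, 85),
    ("Exam 4", 13, 92),
    ("Final", 35, 83),
]

def weighted_mean(weights, values):
    """Return sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total_weight

weights = [w for _, w, _ in grades]
scores = [x for _, _, x in grades]
course_grade = weighted_mean(weights, scores)
print(sum(weights), round(course_grade, 2))  # weights total 100; grade about 86.19
```

Because the weights add up to 100, each weight can be read directly as a percentage of the course grade.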
Geometric Mean (problem 3.15) Sometimes data are analyzed in the log scale (for reasons discussed later). Geometric mean = back-transformed mean of the logs: take $y = \log_{10} x$, average the $y$ values to get $\bar{y}$, then back-transform with $10^{\bar{y}}$.
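Geometric mean Example:
x               1    10   100
y = log10 x     0     1     2
Mean of y: $\bar{y} = 1$, so the geometric mean is $10^{1} = 10$. Algebraically equivalent formula: $(x_1 x_2 \cdots x_n)^{1/n}$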
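The two routes to the geometric mean can be checked numerically on the x = 1, 10, 100 example. This is a minimal sketch; the function names are illustrative, and the comparison uses a tolerance because floating-point roots are not exact.

```python
import math

xs = [1, 10, 100]

def geometric_mean_via_logs(values):
    """Average the base-10 logs, then back-transform with 10 ** mean."""
    logs = [math.log10(x) for x in values]
    return 10 ** (sum(logs) / len(logs))

def geometric_mean_via_product(values):
    """n-th root of the product of the values."""
    return math.prod(values) ** (1 / len(values))

gm_logs = geometric_mean_via_logs(xs)
gm_product = geometric_mean_via_product(xs)
print(round(gm_logs, 6), round(gm_product, 6))  # both about 10
```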
Harmonic Mean
Harmonic mean = back-transformed mean of 1/x
x      y = 1/x
1      1
10     0.1
100    0.01
Example (x = time, y = rate): Current: 1 mph. A 15-mile trip at 3 mph upstream and 5 mph downstream. Harmonic mean = 30 miles / (5 hours up + 3 hours down) = 3.75 mph
3.4 The Median • The median M is the midpoint of a data set. When observations are ordered from smallest to largest, M is in the middle, with half the observations smaller, half larger.
Odd number of observations: 3 5 7 9 38, so M = 7
Even number of observations: 3 5 7 9, so M = (5 + 7)/2 = 6
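A minimal sketch of the median rule, using the two small data sets from this slide. The function name `median` is illustrative; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

m_odd = median([3, 5, 7, 9, 38])   # 7
m_even = median([3, 5, 7, 9])      # 6.0
print(m_odd, m_even)
```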
Means vs. Medians
Data: 3 5 7 9 38
M = 7
$\bar{x} = \frac{3 + 5 + 7 + 9 + 38}{5} = \frac{62}{5} = 12.4$
The two values can behave very differently: the median (M) is resistant to the magnitude of possible outliers, but the mean ($\bar{x}$) is not, so it can be drawn toward them.
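To see the resistance of the median, one can compare the data with and without the value 38. This is only an illustrative comparison using the two small data sets from the median slides and the standard-library `statistics` module.

```python
import statistics

with_outlier = [3, 5, 7, 9, 38]
without_outlier = [3, 5, 7, 9]

mean_with = statistics.mean(with_outlier)          # 12.4
median_with = statistics.median(with_outlier)      # 7
mean_without = statistics.mean(without_outlier)    # 6
median_without = statistics.median(without_outlier)  # 6.0

# Adding the single value 38 moves the mean by 6.4 but the median by only 1.
print(mean_with - mean_without, median_with - median_without)
```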
Mode: the value that occurs most frequently
Stem-and-leaf display (stem = tens, leaf = units):
 9 | 0 6
10 | 0 2 6 8 8 8 8 8
11 | 2 2 2 2 4 4 6 6 6
12 | 0 2 4 4 8
13 | 0 4
Mode = 108
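The mode can be found by counting how often each value occurs. The sketch below uses the 26 values read off the stem-and-leaf display (stem = tens, leaf = units, which is an assumption about how the display is encoded) and the standard-library `collections.Counter`.

```python
from collections import Counter

# Values decoded from the stem-and-leaf display.
values = [
    90, 96,
    100, 102, 106, 108, 108, 108, 108, 108,
    112, 112, 112, 112, 114, 114, 116, 116, 116,
    120, 122, 124, 124, 128,
    130, 134,
]

counts = Counter(values)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)  # 108 occurs 5 times
```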
Fractiles • Quartiles : divide data into 4 parts. • Deciles : divide data into 10 parts. • Percentiles: divide data into a hundred parts • Among the many fractiles, quartiles are used very often in describing data. • Quartiles are the values at which 25% (Q1), 50% (Q2=Median) and 75% (Q3) of the observations fall at or below them, and can be used to describe the internal variability.
Defining the Quartiles To calculate the quartiles: 1. Arrange the observations in increasing order and locate the median M in the ordered list of observations. 2. The first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the location of the overall median. 3. The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.
Calculating (Identifying) the Quartiles
26 systolic blood pressure readings:
90 96 100 102 106 108 108 108 108 108 112 112 112
112 114 114 116 116 116 120 122 124 124 128 130 134
M = (112 + 112)/2 = 112
Q1 = 108
Q3 = 120
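The quartile steps can be sketched in code and checked against the 26 blood pressure readings. This follows the median-of-each-half rule described on the slides; other software often uses interpolation rules that can give slightly different quartiles. The function names are illustrative.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, M, Q3) using the median of the lower and upper halves.

    When n is odd, the overall median is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

bp = [
    90, 96, 100, 102, 106, 108, 108, 108, 108, 108, 112, 112, 112,
    112, 114, 114, 116, 116, 116, 120, 122, 124, 124, 128, 130, 134,
]

q1, m, q3 = quartiles(bp)
print(q1, m, q3)  # 108 112.0 120
```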
The Box Plot: Graphing the Five-Number Summary (Min, Q1, Median, Q3, Max)
[Figure: box plot with vertical axis "Values of the Variable", marking from top to bottom: Maximum (largest observation), Q3 (75th percentile), Median M (50th percentile), Q1 (25th percentile), Minimum (smallest observation)]
• Box plots can show very large datasets and highlight skewness
• Because they show less detail than histograms or stemplots, they are best used for side-by-side comparison of more than one dataset.