Measures of Position. 3.4.
The standard deviation is a measure of dispersion expressed in the same units as the data (remember the empirical rule) • The distance of a data value from the mean, measured as a number of standard deviations, is therefore a useful measurement • This distance is called the z-score Z-Score
If the mean was 20 and the standard deviation was 6 • The value 26 would have a z-score of 1.0 (1.0 standard deviation higher than the mean) • The value 14 would have a z-score of –1.0 (1.0 standard deviation lower than the mean) • The value 17 would have a z-score of –0.5 (0.5 standard deviations lower than the mean) • The value 20 would have a z-score of 0.0 Z-Score
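The arithmetic on this slide can be checked with a minimal sketch. The function name `z_score` and its parameter names are illustrative assumptions, not part of the slides.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Values from the slide: mean 20, standard deviation 6
for value in (26, 14, 17, 20):
    print(value, z_score(value, mean=20, std_dev=6))
```

A positive result means the value is above the mean, a negative result means it is below.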
The population z-score is calculated using the population mean and population standard deviation • The sample z-score is calculated using the sample mean and sample standard deviation Z-Score
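As a rough illustration of the difference, the sketch below computes both kinds of z-score with Python's standard `statistics` module. The data set and the helper name `z_scores` are made up for this example; `pstdev` divides by n (population) and `stdev` divides by n - 1 (sample).

```python
import statistics


def z_scores(data, population=True):
    """Return the z-score of every value in `data`.

    population=True  -> uses the population standard deviation (pstdev)
    population=False -> uses the sample standard deviation (stdev)
    """
    mean = statistics.mean(data)
    sd = statistics.pstdev(data) if population else statistics.stdev(data)
    return [(x - mean) / sd for x in data]


data = [14, 17, 20, 23, 26]  # hypothetical data, mean is 20
print(z_scores(data, population=True))
print(z_scores(data, population=False))
```

Because the sample standard deviation is slightly larger, the sample z-scores come out slightly closer to zero than the population z-scores for the same data.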
z-scores can be used to compare the relative positions of data values in different samples • Pat received a grade of 82 on her statistics exam where the mean grade was 74 and the standard deviation was 12 • Pat received a grade of 72 on her biology exam where the mean grade was 65 and the standard deviation was 10 • Pat received a grade of 91 on her kayaking exam where the mean grade was 88 and the standard deviation was 6 • Calculate each z-score and see which class has the highest RELATIVE grade. Z-Score
Statistics • Grade of 82 • z-score of (82 – 74) / 12 = 0.67 • Biology • Grade of 72 • z-score of (72 – 65) / 10 = 0.70 • Kayaking • Grade of 91 • z-score of (91 – 88) / 6 = 0.50 • Biology was the highest relative grade Z-Score
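The comparison of Pat's three exams can be reproduced with a short sketch. The dictionary layout and variable names are assumptions for illustration; the grades, means, and standard deviations are the ones given for Pat's exams.

```python
# (grade, class mean, class standard deviation) for each exam
exams = {
    "Statistics": (82, 74, 12),
    "Biology": (72, 65, 10),
    "Kayaking": (91, 88, 6),
}

# z-score = (grade - mean) / standard deviation
z = {name: (grade - mean) / sd for name, (grade, mean, sd) in exams.items()}

for name, score in z.items():
    print(f"{name}: z = {score:.2f}")

# The exam with the largest z-score is the best grade relative to its class
best = max(z, key=z.get)
print("Highest relative grade:", best)
```

Note that the highest raw grade (kayaking, 91) has the lowest z-score, which is why the relative comparison matters.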
The median divides the lower 50% of the data from the upper 50% • The median is the 50th percentile • If a number divides the lower 34% of the data from the upper 66%, that number is the 34th percentile Percentile
The quartiles are the 25th, 50th, and 75th percentiles • Q1 = 25th percentile • Q2 = 50th percentile = median • Q3 = 75th percentile • Quartiles are the most commonly used percentiles • The 50th percentile and the second quartile Q2 are both other ways of defining the median Quartiles
Quartiles divide the data set into four equal parts • The top quarter are the values between Q3 and the maximum • The bottom quarter are the values between the minimum and Q1 Quartiles
Quartiles divide the data set into four equal parts • The interquartile range (IQR) is the difference between the third and first quartiles • IQR = Q3 – Q1 • The IQR is a resistant measurement of dispersion IQR
Can we find the Quartiles with a Calculator? • Data • 1,2,3,4,5,6,8,10,15,20 Calculator
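For readers without a calculator, the same quartiles can be sketched in Python. This assumes the "median of each half" convention that many graphing calculators use; the slides do not say which calculator or method is intended, and other tools may interpolate and report slightly different values. The function name `quartiles` is illustrative.

```python
import statistics


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Q2 is the median; Q1 and Q3 are the medians of the lower and upper
    halves. When the count is odd, the middle value is left out of both halves.
    """
    values = sorted(data)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data values")
    lower = values[: n // 2]
    upper = values[n // 2 + n % 2 :]
    return (
        statistics.median(lower),
        statistics.median(values),
        statistics.median(upper),
    )


data = [1, 2, 3, 4, 5, 6, 8, 10, 15, 20]
q1, q2, q3 = quartiles(data)
print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3, "IQR =", q3 - q1)
```

Under this convention the data on the slide give Q1 = 3, median = 5.5, Q3 = 10, and IQR = 7.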
Extreme observations in the data are referred to as outliers • Outliers should be investigated • Outliers could be • Chance occurrences • Measurement errors • Data entry errors • Sampling errors • Outliers are not necessarily invalid data Outliers
One way to check for outliers uses the quartiles • Outliers can be detected as values that are far too high or too low relative to the spread of the data • The fences used to identify outliers are • Lower fence = LF = Q1 – 1.5 × IQR • Upper fence = UF = Q3 + 1.5 × IQR • Values less than the lower fence or more than the upper fence could be considered outliers Finding Outliers
Are there any outliers? 1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54 • Calculations (You can use your Calculator to find these!) • Q1 = 7 • Q3 = 27 • IQR = 20 • Lower Fence = Q1 – 1.5 × IQR = 7 – 30 = –23 • Upper Fence = Q3 + 1.5 × IQR = 27 + 30 = 57 • No value is below –23 or above 57, so there are no outliers Finding Outliers
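The fence check for this data set can be sketched as follows. The helper name `fences` is an assumption, and the quartiles are found with the median-of-halves convention, which reproduces the Q1 = 7 and Q3 = 27 shown on the slide.

```python
import statistics


def fences(data):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    values = sorted(data)
    n = len(values)
    # Median-of-halves quartiles; the middle value is skipped when n is odd
    q1 = statistics.median(values[: n // 2])
    q3 = statistics.median(values[n // 2 + n % 2 :])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]
lower, upper = fences(data)
outliers = [x for x in data if x < lower or x > upper]

print("Lower fence:", lower)
print("Upper fence:", upper)
print("Outliers:", outliers)
```

Here the fences work out to –23 and 57, so even the largest value, 54, is not flagged as an outlier.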
z-scores • Measure the distance from the mean in units of standard deviations • Can compare relative positions in different samples • Percentiles and quartiles • Divide the data so that a certain percent is lower and a certain percent is higher • Outliers • Extreme values of the variable • Can be identified using the upper and lower fences Recap!