Understand quartiles, range, standard deviation, and variance in data analysis. Learn how these concepts help interpret data variation effectively.
SP 225 Lecture 8 Measures of Variation
Challenge Question • A randomized, double-blind study of 50 subjects shows daily administration of Echinacea supplements shortens the average duration of an Upper Respiratory Infection (URI) from 14 to 13 days. • Based on this study, is Echinacea an effective treatment for URIs?
Roll of the Dice • All outcomes are equally likely • The probability of any outcome is 1/6 or 16.7%
Casino Patrons: Risky Fun (Red, White and Blue Slots) • 82% chance of loss on any spin • Prizes for a dollar bet range from $1 to $2,400 • Patrons are expected to lose $0.10 for each dollar bet
Casinos: False Risk (Soaring Eagle) • 4300 slot machines • 25 spins per hour • Open 24/7/365 • 94,170,000 possible spins
Statistics vs. Parameters • Statistic: numerical description of a sample • Parameter: numerical description of a population • Statistics are calculated from randomly selected members of a population
Differences Between Statistics and Parameters • Sample: 3 randomly selected people • Statistic: 0 of 3, or 0%, wear glasses • Population: all people • Parameter: 5 of 15, or 33%, wear glasses
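The glasses example can be sketched in Python. This is only an illustration: the population list is built from the slide's counts (5 of 15), while the seed and the variable names are assumptions, so the sample drawn here will not necessarily match the slide's 0 of 3.

```python
import random

# Population from the slide: 15 people, 5 of whom wear glasses (True = wears glasses).
population = [True] * 5 + [False] * 10

# Parameter: the proportion computed over the whole population.
parameter = sum(population) / len(population)

# Statistic: the same proportion computed over a random sample of 3 people.
rng = random.Random(8)  # arbitrary fixed seed so the run is repeatable
sample = rng.sample(population, 3)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.0%}")
print(f"statistic = {statistic:.0%}")
```

The parameter is fixed at one third, but the statistic can only be 0%, 33%, 67% or 100% for a sample of 3, and it changes from sample to sample.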
Random Sampling Activity • Number of siblings of each student in the freshman class of Powers Catholic High School • Take 3 samples, with replacement, of sizes 1, 5 and 10 • Calculate the sample mean • Record results in class data chart
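A rough simulation of this activity might look like the following. The sibling counts are hypothetical placeholder values, not the real class data, and the seed is arbitrary.

```python
import random
from statistics import mean

# Hypothetical sibling counts for a small class; NOT the real class data.
siblings = [0, 1, 1, 2, 2, 2, 3, 3, 4, 6]

rng = random.Random(225)  # arbitrary seed for repeatability
sample_means = {}
for size in (1, 5, 10):
    # random.choices draws with replacement, so a student can be picked twice.
    sample = rng.choices(siblings, k=size)
    sample_means[size] = mean(sample)

print("population mean:", mean(siblings))
for size, m in sample_means.items():
    print(f"n = {size:2d}: sample mean = {m}")
```

Re-running with different seeds shows how much the sample mean moves around at each sample size.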
Challenge Question • A randomized, double-blind study of 50 subjects shows daily administration of Echinacea supplements shortens the average duration of an Upper Respiratory Infection (URI) from 14 to 13 days. • Based on this study, is Echinacea an effective treatment for URIs?
Why Do We Need Measures of Variation? • What is the average height of a male child? • How many children are that tall? • When is a child unusually tall or short?
Range • Difference between the maximum and minimum value • Quick to compute • Not comprehensive • Range = (maximum value) - (minimum value)
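A small sketch of the range calculation, which also hints at why it is "not comprehensive". The helper name `value_range` and the second data set are made up for illustration.

```python
def value_range(values):
    """Range = (maximum value) - (minimum value)."""
    return max(values) - min(values)

spread_out = [1, 3, 5, 7, 9]  # evenly spread values
clustered = [1, 5, 5, 5, 9]   # hypothetical data with the same extremes

print(value_range(spread_out))  # 8
print(value_range(clustered))   # 8
```

Both lists have the same range even though the second is tightly clustered around 5, because the range looks only at the two extreme values.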
Quartiles • Often used in the education field • Can be used with any data distribution • Measures distance in relation to the MEDIAN not MEAN
Quartiles • Q1 (First Quartile): separates the bottom 25% of sorted values from the top 75%. • Q2 (Second Quartile): same as the median; separates the bottom 50% of sorted values from the top 50%. • Q3 (Third Quartile): separates the bottom 75% of sorted values from the top 25%.
Quartiles (2) • Q1, Q2, Q3 divide ranked scores into four equal parts • [Figure: ranked scores from minimum to maximum, split into four 25% segments by Q1, Q2 (median) and Q3]
Quartile Statistics • Interquartile Range (or IQR): Q3 - Q1
Example • Given the following data, calculate Q1, Q2 and Q3 • 4.2, 4.4, 5.1, 5.6, 6.0, 6.4, 6.8, 7.1, 7.4, 7.4, 7.9, 8.2, 8.2, 8.7, 9.1, 9.6, 9.6, 10.0, 10.5, 11.6
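One way to work this example is sketched below. It assumes the "median of each half" convention for Q1 and Q3; textbooks and software use several slightly different quartile rules, so other tools may report slightly different values. The function name `quartiles` is illustrative.

```python
from statistics import median

data = [4.2, 4.4, 5.1, 5.6, 6.0, 6.4, 6.8, 7.1, 7.4, 7.4,
        7.9, 8.2, 8.2, 8.7, 9.1, 9.6, 9.6, 10.0, 10.5, 11.6]

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                      # bottom half of the ranked values
    upper = ordered[half + len(ordered) % 2:]   # top half; skips the middle value when n is odd
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles(data)
print(f"Q1 = {q1:.2f}, Q2 = {q2:.2f}, Q3 = {q3:.2f}, IQR = {q3 - q1:.2f}")
```

Under this convention the 20 values give Q1 = 6.2 (midway between 6.0 and 6.4), Q2 = 7.65 (midway between 7.4 and 7.9) and Q3 = 9.35 (midway between 9.1 and 9.6), so the IQR is 3.15.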
Example Continued http://www.maths.murdoch.edu.au/units/statsnotes/samplestats/boxplot.html
Standard Deviation for a Population • Calculated by the following formula: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ • Used to show distance from the mean • Tells how usual or unusual a measurement is
Standard Deviation for a Sample • $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Standard Deviation - Important Properties • Standard deviation is never negative (it is zero only when all values are equal) • Increases dramatically with outliers • The units of standard deviation s are the same as the units of the mean
Calculating the Standard Deviation of a SAMPLE • Data points 1, 3, 5, 7, 9
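The calculation for these five data points can be sketched step by step, dividing by n - 1 because this is a sample. The variable names are illustrative; the final line cross-checks against Python's built-in `statistics.stdev`.

```python
import math
import statistics

data = [1, 3, 5, 7, 9]

n = len(data)
x_bar = sum(data) / n                            # sample mean: 25 / 5 = 5
squared_devs = [(x - x_bar) ** 2 for x in data]  # 16, 4, 0, 4, 16
s = math.sqrt(sum(squared_devs) / (n - 1))       # sqrt(40 / 4) = sqrt(10)

print(round(s, 3))  # 3.162
print(math.isclose(s, statistics.stdev(data)))
```

The squared deviations sum to 40, so s = sqrt(40 / 4) = sqrt(10), or about 3.16, in the same units as the data.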
Variance • A measure of variation equal to the square of the standard deviation • Sample Variance = $s^2$ • Population Variance = $\sigma^2$
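As a quick check of the relationship between variance and standard deviation, the sketch below uses Python's standard `statistics` module on the data points 1, 3, 5, 7, 9. Treating the same five values as both a sample and a whole population is an assumption made purely for comparison.

```python
import math
import statistics

data = [1, 3, 5, 7, 9]

sample_variance = statistics.variance(data)       # divides by n - 1: 40 / 4 = 10
population_variance = statistics.pvariance(data)  # divides by N:     40 / 5 = 8

# Variance is the square of the corresponding standard deviation.
print(math.isclose(statistics.stdev(data) ** 2, sample_variance))
print(math.isclose(statistics.pstdev(data) ** 2, population_variance))
```

The sample variance is larger than the population variance for the same values because it divides by n - 1 rather than N.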