Statistical Measures Mrs. Watkins AP Statistics Chapters 5,6
MEASURES OF CENTER Mean: arithmetic average of all data values (population mean: μ; sample mean: x̄). Formula: x̄ = Σx / n. Mode: the most common value in a data set
Median: the middle value in a data set Midrange: average of the extremes
Trimmed Mean: the mean of a data set after a certain percentage of data values has been trimmed off the ends of the distribution Ex:
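The measures of center above can be sketched in Python. This is only an illustration: the data set is made up, `midrange` and `trimmed_mean` are hypothetical helper names, and the trimmed mean here assumes the stated percentage is removed from each end (the slides do not say which convention is intended).

```python
from statistics import mean, median, mode

def midrange(values):
    """Average of the two extremes (min and max)."""
    return (min(values) + max(values)) / 2

def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the values from EACH end (assumed convention)."""
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100)   # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])

# Made-up data with one unusually high value (41)
data = [2, 3, 3, 5, 7, 8, 9, 10, 12, 41]

center = {
    "mean": mean(data),
    "median": median(data),
    "mode": mode(data),
    "midrange": midrange(data),
    "trimmed_mean_10": trimmed_mean(data, 10),
}
print(center)
```

Notice how the high value pulls the mean and midrange upward much more than the median or trimmed mean.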
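5 number summary 5 important numbers in data set: Min: smallest value; Q1: first quartile (25th percentile); Med: median (50th percentile); Q3: third quartile (75th percentile); Max: largest value. Q1, Med, and Q3 may not be actual data values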
BOXPLOT graphical display of data using 5 number summary (if outliers shown, called “modified box plot”)
OUTLIERS Outliers: IQR Test for Outliers: (IQR)(1.5) = multiplier M; Q1 − M = outlier lower bound; Q3 + M = outlier upper bound. If values fall outside these bounds (below the lower bound or above the upper bound), they are outliers
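A minimal sketch of the 5 number summary and the IQR test, using made-up data. The quartiles here are computed as the medians of the lower and upper halves of the sorted data (excluding the middle value when n is odd); that is one common classroom convention and is an assumption, since other software may use a different quartile rule. The function names are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (Min, Q1, Med, Q3, Max) using the median-of-halves quartile rule."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]          # values below the median position
    upper_half = ordered[(n + 1) // 2 :]    # values above the median position
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

def iqr_outliers(values):
    """Apply the 1.5 * IQR test; return the two bounds and any outliers."""
    _, q1, _, q3, _ = five_number_summary(values)
    m = 1.5 * (q3 - q1)                     # multiplier M
    low, high = q1 - m, q3 + m
    return low, high, [v for v in values if v < low or v > high]

data = [2, 3, 3, 5, 7, 8, 9, 10, 12, 41]
summary = five_number_summary(data)
low_bound, high_bound, outliers = iqr_outliers(data)
print(summary, low_bound, high_bound, outliers)
```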
RESISTANCE Resistant Measures (not strongly affected by extreme values): Median, IQR, Trimmed Mean. Non-resistant Measures (strongly affected by extreme values): Mean, Midrange.
MEASURES OF SPREAD Range: the spread between high and low (Max − Min). Resistant? No. IQR (Interquartile Range): Q3 − Q1, the spread of the middle 50% of the data. Resistant? Yes.
STANDARD DEVIATION a measure of the average amount of deviation from the mean among the data values. Population St. Deviation: σ = √( Σ(x − μ)² / N ). Sample St. Deviation: sx = √( Σ(x − x̄)² / (n − 1) ). We generally use sx because we usually do not have the entire population.
VARIANCE the square of the standard deviation: what you get before taking the square root. Population Variance: σ². Sample Variance: sx². This measure is not used much in elementary statistics, but you need to know what it is.
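To see the difference between the population and sample versions, here is a short sketch using Python's standard `statistics` module on made-up data. `pstdev`/`pvariance` divide by N, while `stdev`/`variance` divide by n − 1.

```python
from statistics import pstdev, stdev, pvariance, variance

# Made-up data; the mean is 5 and the squared deviations sum to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = pvariance(data)    # 32 / 8
pop_sd = pstdev(data)        # square root of the population variance
samp_var = variance(data)    # 32 / 7
samp_sd = stdev(data)        # square root of the sample variance

print(pop_var, pop_sd, samp_var, samp_sd)
```

The sample versions come out slightly larger because dividing by n − 1 instead of N makes the result bigger.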
Coefficient of Variation measure of how large a st. dev. is relative to the mean: CV = st. dev. / mean. Ex: St. deviation of IQ = 15, Mean 100 St. deviation of height = 3 in, Mean 69
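Working the slide's example as a quick sketch, taking the measure to be the standard deviation divided by the mean (the function name is hypothetical):

```python
def coefficient_of_variation(st_dev, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return st_dev / mean

cv_iq = coefficient_of_variation(15, 100)     # IQ: st. dev. 15, mean 100
cv_height = coefficient_of_variation(3, 69)   # height: st. dev. 3 in, mean 69 in

print(f"IQ: {cv_iq:.1%}, height: {cv_height:.1%}")
```

IQ comes out at 15% of its mean and height at a little over 4%, so IQ scores vary more in relative terms even though the two are measured in different units.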
“Comment on the distribution” You now have numbers to support your statements, rather than just graphs. SHAPE: OUTLIERS: CENTER: SPREAD: how widely does the data vary? Unusual Features: gaps, clusters
SHAPE If the mean > median, then data distribution is skewed ________. The mean is in the tail. If the mean < median, then data distribution is skewed ________. The mean is in the tail. If the mean ≈ median, then data distribution is approximately ____________.
SHAPE Symmetric if mean = median
SKEWNESS Skewed left if mean < median. Skewed right if mean > median. The mean is in the tail of the data.
OTHER SHAPES Uniform distribution: all values relatively evenly distributed across interval Bimodal distribution: two peaks
TRANSFORMATIONS TO DATA What would happen to the statistical measures if each data value had a constant added to or subtracted from it? Mean: Standard Deviation: Median: IQR:
What would happen to the statistical measures if each data value was multiplied or divided by a constant? Mean: Standard Deviation: Median: IQR:
TRANSFORMATIONS TO DATA SET What would happen to the statistical measures if one very low or very high data value was added to the set? Mean: Standard Deviation: Median: IQR:
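One way to explore the three transformation questions is to try them on a small made-up data set. The sketch below is only an experiment, not a proof; `iqr` and `measures` are hypothetical helpers, and the IQR uses the median-of-halves quartile rule as an assumption.

```python
from statistics import mean, median, stdev

def iqr(values):
    """Q3 - Q1 using the median-of-halves quartile rule (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return median(ordered[(n + 1) // 2 :]) - median(ordered[: n // 2])

def measures(values):
    return {"mean": mean(values), "st_dev": stdev(values),
            "median": median(values), "iqr": iqr(values)}

data = [10, 12, 13, 15, 16, 18, 19, 21, 22, 24]

original = measures(data)
shifted = measures([x + 5 for x in data])   # add a constant to every value
scaled = measures([x * 3 for x in data])    # multiply every value by a constant
with_outlier = measures(data + [200])       # add one very high value

for name, m in [("original", original), ("shifted", shifted),
                ("scaled", scaled), ("with outlier", with_outlier)]:
    print(name, m)
```

In this run, adding 5 moves the mean and median by 5 and leaves the standard deviation and IQR alone; multiplying by 3 multiplies all four by 3; and the single extreme value changes the mean and standard deviation a great deal while the median and IQR barely move.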
MEASURES OF POSITION Give a numerical approximation of where a single data value stands compared to the whole distribution Quartiles: Percentiles: Z Scores:
Z SCORES standardized score: how a single value compares to entire data set in terms of position in distribution, measured in st. deviations from the mean. z = (x − μ) / σ
How unusual are you? Compute your z score for height? Compute your z score for Math SAT? Compute your z score for IQ?
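A minimal sketch of the z score calculation. The means and standard deviations for IQ (100, 15) and height (69 in, 3 in) are the ones quoted earlier in these notes; the individual values 130 and 66 are made up for illustration.

```python
def z_score(x, mean, st_dev):
    """Number of st. deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / st_dev

z_iq = z_score(130, mean=100, st_dev=15)     # an IQ of 130
z_height = z_score(66, mean=69, st_dev=3)    # a height of 66 inches

print(z_iq, z_height)
```

A positive z score means the value is above the mean, and a negative one means it is below.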
NORMAL MODEL shows how data is distributed symmetrically along an interval according to empirical rule. Empirical Rule: about 68% of data within 1 st. deviation of μ; about 95% of data within 2 st. deviations of μ; about 99.7% of data within 3 st. deviations of μ
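The empirical rule percentages can be checked against the standard normal curve. This sketch uses `NormalDist` from Python's standard `statistics` module; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k st. deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} st. deviation(s) of the mean: {share_within(k):.4f}")
```

The results round to roughly 68%, 95%, and 99.7%.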
ANOTHER OUTLIER TEST Using Empirical Rule: Data values more than 2 st. deviations away from the mean (|z| > 2) are mild outliers. Data values more than 3 st. deviations away from the mean (|z| > 3) are extreme outliers.
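A sketch of this z score test, treating values on either side of the mean the same way (that is, comparing the absolute value of z to 2 and 3). The IQ scores are made up, and the mean of 100 and st. deviation of 15 are the figures used earlier in these notes; `classify_by_z` is a hypothetical name.

```python
def classify_by_z(x, mean, st_dev):
    """Label a value by how many st. deviations it lies from the mean."""
    z = (x - mean) / st_dev
    if abs(z) > 3:
        return "extreme outlier"
    if abs(z) > 2:
        return "mild outlier"
    return "not an outlier"

iq_scores = [85, 100, 131, 150, 52]
labels = {x: classify_by_z(x, mean=100, st_dev=15) for x in iq_scores}
print(labels)
```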
NORMAL CURVE a theoretical ideal about how traits/characteristics are distributed Many human traits are approximately normally distributed such as height, body temp, IQ, pulse Avoid using “normal” when describing data—say “approximately normal or symmetric” unless clearly mound-shaped, bell-shaped
NORMAL CURVE Normal curve: symmetric, mound-shaped. Area under curve = 1 (100%). A z score can be used to establish what % of the curve is less or more than the z score, and establish probability of a data value being in that position.
FINDING PERCENTILE/PROBABILITY USING NORMAL CURVE
1. Calculate z score for data value
2. Use calculator: normalcdf under DISTR key
Looking for area > z score: normalcdf(z, ∞)
Looking for area < z score: normalcdf(−∞, z)
Looking for area between z scores: normalcdf(z1, z2)
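For anyone without the calculator handy, the same three areas can be approximated in Python. This is a sketch that mimics the calculator command using the standard library's `NormalDist`; the Python function `normalcdf` below is a hypothetical stand-in written for this example, not a built-in.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)
INF = float("inf")

def normalcdf(lower, upper):
    """Area under the standard normal curve between two z scores."""
    return STANDARD_NORMAL.cdf(upper) - STANDARD_NORMAL.cdf(lower)

area_above = normalcdf(1, INF)      # area to the right of z = 1
area_below = normalcdf(-INF, 1)     # area to the left of z = 1
area_between = normalcdf(-1, 1)     # area between z = -1 and z = 1

print(round(area_above, 4), round(area_below, 4), round(area_between, 4))
```

The areas above and below the same z score add up to 1, the total area under the curve.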
FINDING CUT OFF SCORES If you are given a percentile or probability, and need to determine the “cut off score”:
1. Sketch curve to determine where z score is located.
2. Determine if you want area above or below this percentile.
3. Use INVNORM on calculator: invnorm(percentile, as area to the left) = z score
4. Use z score formula to solve for x.
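The cut off score steps can be sketched in Python as well. `NormalDist.inv_cdf` from the standard library plays the role of invnorm here (it takes the area to the left). The IQ mean of 100 and st. deviation of 15 come from earlier in these notes; the 90th percentile is a made-up example, and `cutoff_score` is a hypothetical name.

```python
from statistics import NormalDist

def cutoff_score(area_to_left, mean, st_dev):
    """Find the data value x with the given proportion of the curve below it."""
    z = NormalDist().inv_cdf(area_to_left)   # like invnorm: area to the left -> z score
    return mean + z * st_dev                 # z = (x - mean) / st_dev solved for x

# IQ score at the 90th percentile (90% of the curve lies below it)
iq_cutoff = cutoff_score(0.90, mean=100, st_dev=15)
print(round(iq_cutoff, 1))
```

If the question gives an area above the cut off (for example, "top 10%"), convert it to the area below (0.90) before looking up the z score.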
Does the data fit a normal model?
1. Check mean and median
2. Make a NORMAL PROBABILITY PLOT
3. Make a BOXPLOT on calculator.
AVOID using histograms on calculator to check.