Measures of central tendency are single values that best represent a distribution. The three common measures are the mean, median, and mode.
• Mean: the average (the sum of the data values divided by the number of values). Don’t round your answer unless directed to do so!
• Median: the middle value, or the mean of the two middle values, when the data are arranged in numerical order. Think of a “median” being in the middle of a highway.
• Mode: the value (number) that appears most often. A data set can have more than one mode, or no mode.
MODE
For a given sample: 33 35 36 37 38 38 38 39 39 39 39 40 40 41 41 45
The mode = 39
• the most frequently occurring score value
• corresponds to the highest point on the frequency distribution
• not sensitive to the size of individual scores, only to how often each one occurs
• extreme values do not affect the mode
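As a quick check of the mode for the sample above, here is a minimal Python sketch using the standard library's `statistics.multimode`. The variable names are illustrative only and are not part of the slides.

```python
from statistics import multimode

# The 16-value sample from the slide
sample = [33, 35, 36, 37, 38, 38, 38, 39, 39, 39, 39, 40, 40, 41, 41, 45]

# multimode returns every value tied for the highest count, as a list
modes = multimode(sample)
print(modes)  # [39]

# A data set with two modes
two_modes = multimode([1, 1, 2, 2, 3])
print(two_modes)  # [1, 2]

# When no value repeats, multimode returns every value (all are tied);
# the slides would describe such a data set as having "no mode".
no_repeat = multimode([1, 2, 3])
print(no_repeat)  # [1, 2, 3]
```

Note that the library's convention differs slightly from the slides: it reports all tied values rather than "no mode", so the no-repeat case has to be interpreted by the reader.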
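MEDIAN
• the score value that cuts the distribution in half (the “middle” score)
• extreme values (outliers) do not affect the median as strongly as they do the mean
• useful when comparing data
• unique: there is only one answer
• not as popular as the mean
• use when the data has an outlier
The median is the eighth score = 37 (in the slide's own example; for the 16-value sample 33 35 36 37 38 38 38 39 39 39 39 40 40 41 41 45, the median is the mean of the eighth and ninth scores, which is 39).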
MEAN
Example means shown on the slide: X̄ = 36.8, X̄ = 36.5, X̄ = 93.2
• sensitive to extreme scores, so it is most appropriate for roughly symmetrical distributions
• most popular measure
• unique: only one answer
• useful when comparing data
• use when the data does not have an outlier
What are the mean, median, and mode of the bowling scores below? Which measure of central tendency best describes the scores?
Bowler 1: 104
Bowler 2: 117
Bowler 3: 104
Bowler 4: 136
Bowler 5: 189
Bowler 6: 109
Bowler 7: 113
Bowler 8: 104
Now consider the scores without the outlier, 189. What are the mean, median, and mode of the scores? Which measure of central tendency best describes the data now?
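The bowling exercise can be worked through with a short Python sketch. It uses the standard library's `statistics` module; the helper name `summarize` is hypothetical and chosen only for this illustration.

```python
from statistics import mean, median, multimode

# Bowling scores for Bowlers 1 through 8, in order
scores = [104, 117, 104, 136, 189, 109, 113, 104]

def summarize(data):
    """Return the mean, median, and list of modes for a data set."""
    return {
        "mean": mean(data),
        "median": median(data),
        "modes": multimode(data),
    }

# All eight scores, then the same scores with the outlier 189 removed
with_outlier = summarize(scores)
without_outlier = summarize([s for s in scores if s != 189])

print(with_outlier)
print(without_outlier)
```

With all eight scores the mean is 122, the median is 111, and the mode is 104. Without 189 the mean falls to about 112.4 and the median to 109, while the mode stays at 104. The single outlier moves the mean by almost 10 points but the median by only 2, which is why the median is usually preferred when an outlier is present.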
BOX AND WHISKER PLOTS
• A box-and-whisker plot displays data graphically, explicitly showing the median of the data as well as the upper and lower quartile values.
• It is a convenient way to summarize large amounts of data.
• A box-and-whisker plot uses five special values to give a graphic picture of a set of data.
• These five values are the median, the upper quartile, the lower quartile, the upper extreme, and the lower extreme.
• The left whisker extends from the minimum to the first quartile. It represents about 25% of the data.
• The box extends from the first quartile to the third quartile and has a vertical line through the median. The length of the box represents the interquartile range. It contains about 50% of the data.
• The right whisker extends from the third quartile to the maximum. It represents about 25% of the data.
A researcher traveled to a variety of stores in a metropolitan area checking on the price of a gallon of milk. The prices, in dollars, are given below. 2.53, 2.45, 2.42, 2.40, 2.50, 2.40, 2.39, 2.46, 2.48, 2.42, 2.42, 2.44, 2.40, 2.49, 2.45, 2.42 Find the lower extreme, lower quartile, median, upper quartile, and the upper extreme values of the data set.
• Write the prices in order from least to greatest: 2.39, 2.40, 2.40, 2.40, 2.42, 2.42, 2.42, 2.42, 2.44, 2.45, 2.45, 2.46, 2.48, 2.49, 2.50, 2.53.
• Read the lower and upper extremes from the list. Lower extreme: 2.39 and upper extreme: 2.53.
• Find the median of the data values. With 16 values, the median is the mean of the eighth and ninth values, 2.42 and 2.44. Median: 2.43.
• Find the upper and lower quartile values. The median separates the data into two halves.
• The lower quartile is the median of the lower half of the data. The lower quartile value is 2.41.
• The upper quartile is the median of the upper half of the data. The upper quartile value is 2.47.
So, the five special values for the milk data are:
• Lower extreme: 2.39
• Lower quartile: 2.41
• Median: 2.43
• Upper quartile: 2.47
• Upper extreme: 2.53
To draw a box-and-whisker plot for the set of data, first draw a number line. Above the number line, plot each of the five special values identified above.
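The five special values for the milk data can be reproduced with a small Python sketch. Two things here are assumptions of the sketch rather than part of the slides: the prices are stored in whole cents to avoid floating-point rounding, and the helper `five_number_summary` (a hypothetical name) uses the median-of-halves rule described above. Other quartile conventions exist and can give slightly different values.

```python
from statistics import median

# Milk prices in cents (2.53 dollars -> 253), in the order given on the slide
prices_cents = [253, 245, 242, 240, 250, 240, 239, 246,
                248, 242, 242, 244, 240, 249, 245, 242]

def five_number_summary(data):
    """Lower extreme, lower quartile, median, upper quartile, upper extreme.

    Quartiles are the medians of the lower and upper halves; for an odd
    number of values the middle value is left out of both halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

summary = five_number_summary(prices_cents)

# Convert back to dollars for display
print([value / 100 for value in summary])  # [2.39, 2.41, 2.43, 2.47, 2.53]
```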
SHAPE OF DATA
• a symmetrical distribution exhibits no skewness
• in a symmetrical distribution with a single peak, Mean = Median = Mode
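[Figure: a skewed distribution with the mean, median, and mode marked]
• Skewed data is slanted to the right or the left.
• If data starts “high” and ends “low” (a long tail on the right), then it is skewed to the right. Typically mean > median > mode.
• If data starts “low” and ends “high” (a long tail on the left), then it is skewed to the left. Typically mode > median > mean.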
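One rough way to guess the direction of skew is to compare the mean with the median. The sketch below applies that rule of thumb to the bowling scores from earlier; the function name `skew_direction` is hypothetical, and the comparison is only a heuristic, not a formal measure of skewness.

```python
from statistics import mean, median

def skew_direction(data):
    """Heuristic: mean above the median suggests right skew, below suggests left."""
    m, med = mean(data), median(data)
    if m > med:
        return "right"
    if m < med:
        return "left"
    return "roughly symmetric"

# Bowling scores from the earlier exercise: mean 122, median 111
bowling = [104, 117, 104, 136, 189, 109, 113, 104]
direction = skew_direction(bowling)
print(direction)  # right
```

The high score of 189 pulls the mean above the median, which matches the "mean > median > mode" pattern for right-skewed data (122 > 111 > 104).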
MEASURES OF SPREAD
• the dispersion of scores from the center
• a distribution of scores is highly variable if the scores differ widely from one another
• Three statistics to measure variability:
• range
• interquartile range
• variance
• Range: the smallest score subtracted from the largest score.
• Interquartile Range: the distance between the first quartile and the third quartile (Q3 – Q1). It ignores the bottom and top quarters, so extreme scores are not influential, but it disregards 50% of the distribution.
• Variance: the average of the squares of the distance each value is from the mean (how much the data set varies from its mean).
• https://www.youtube.com/watch?v=Cx2tGUze60s
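The range and interquartile range can be illustrated with the milk prices from the box-and-whisker example. This is a minimal sketch under two assumptions that are not in the slides: prices are stored in whole cents, and quartiles are taken as the medians of the lower and upper halves of an even-sized data set.

```python
from statistics import median

# Milk prices in cents, from the box-and-whisker example
prices_cents = [253, 245, 242, 240, 250, 240, 239, 246,
                248, 242, 242, 244, 240, 249, 245, 242]

ordered = sorted(prices_cents)
half = len(ordered) // 2  # 16 values, so each half has 8

# Range: largest value minus smallest value
data_range = ordered[-1] - ordered[0]

# Interquartile range: Q3 minus Q1
q1 = median(ordered[:half])
q3 = median(ordered[half:])
iqr = q3 - q1

print(data_range)  # 14 (cents)
print(iqr)         # 6.0 (cents)
```

In dollars, the range is 2.53 - 2.39 = 0.14 and the interquartile range is 2.47 - 2.41 = 0.06.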
VARIANCE
• Find the mean.
• Subtract the mean from every score.
• Square the deviations.
• Sum the squared deviations to get the sum of squares (SS).
• Divide the SS by N, the number of scores.
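The five steps above can be sketched in Python, here applied to the bowling scores from the earlier exercise as an illustrative data set. Dividing by N gives what is usually called the population variance; the standard library's `statistics.pvariance` is used only as a cross-check.

```python
from statistics import pvariance

scores = [104, 117, 104, 136, 189, 109, 113, 104]
n = len(scores)

# Step 1: find the mean
mean_score = sum(scores) / n                     # 122.0

# Step 2: subtract the mean from every score
deviations = [x - mean_score for x in scores]

# Step 3: square the deviations
squared = [d ** 2 for d in deviations]

# Step 4: sum the squared deviations (SS)
ss = sum(squared)                                # 5932.0

# Step 5: divide the SS by N
variance = ss / n                                # 741.5

print(variance)
print(pvariance(scores))  # library cross-check, same value
```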
STANDARD DEVIATION
• The standard deviation is the square root of the variance.
• It tells roughly how far, on average, a data point is from the mean.
• The smaller the standard deviation, the closer the scores are on average to the mean.
• When the standard deviation is large, the scores are more widely spread from the mean.
• https://www.youtube.com/watch?v=4ffBQPCiGms
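As a short illustration, the standard deviation of the bowling scores from the earlier exercise can be computed by taking the square root of the variance. This sketch divides by N (the population form described in the variance steps) and uses `statistics.pstdev` only to confirm the result.

```python
import math
from statistics import pstdev

scores = [104, 117, 104, 136, 189, 109, 113, 104]

# Variance: average of the squared deviations from the mean
mean_score = sum(scores) / len(scores)
variance = sum((x - mean_score) ** 2 for x in scores) / len(scores)

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(round(std_dev, 2))         # 27.23
print(round(pstdev(scores), 2))  # library cross-check
```

A standard deviation of about 27 points is large next to the gaps between most of these scores, which reflects how far the outlier 189 sits from the mean of 122.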