Learn how to compare data sets by examining center and spread measures like mean, median, and deviation in a graphical context. Understand key concepts through examples.
Introduction Data sets can be compared by examining the differences and similarities between measures of center and spread. The mean and median of a data set are measures of center. These measures describe the typical, or expected, value of a data set. The mean absolute deviation is a measure of spread that describes how far the data values lie, on average, from the mean. It is the average of the absolute values of the differences between each data value and the mean. The interquartile range is a measure of spread that describes the range of the middle 50% of a data set. It is the difference between the third and first quartiles.
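The two measures of spread described above can be written as formulas. The symbols here are notation chosen for this sketch, not taken from the slides: $x_1, \dots, x_n$ are the data values, $\bar{x}$ is their mean, and $Q_1$ and $Q_3$ are the first and third quartiles.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\text{MAD} = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \bar{x} \right|
\qquad
\text{IQR} = Q_3 - Q_1
```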
Introduction, continued • The center and spread of a data set can also be seen in the shape of a graphical representation of the data. The range of data values can be seen in the x-axis of a graphical representation. Clusters of data values can be seen in graphs that show frequency, such as dot plots and histograms. The interquartile range and median are shown in box plots.
Key Concepts • Measures of center, such as the mean and median, describe the typical, or expected, value of a data set. The mean is influenced by very small or very large data values, whereas the median is much less affected by them. • Measures of spread describe how much the data values in a set vary. Interquartile range and mean absolute deviation are measures of spread. • The interquartile range shows the range of the middle 50% of a data set. It is the difference between the third and first quartiles.
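As a rough illustration of the interquartile range, the sketch below finds the quartiles as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd). This is one common classroom convention; other quartile methods exist and can give slightly different results. The function names and the sample values are made up for illustration.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so both halves have the same size.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(upper)


def interquartile_range(values):
    """Difference between the third and first quartiles."""
    q1, q3 = quartiles(values)
    return q3 - q1


# Hypothetical sample data, not from the slides.
sample = [2, 4, 4, 5, 7, 9, 10, 12]
print(quartiles(sample))            # Q1 = 4.0, Q3 = 9.5
print(interquartile_range(sample))  # 5.5
```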
Key Concepts, continued •The mean absolute deviation compares data values to the mean of a data set. If the mean absolute deviation is large, this is a sign that the data points are distributed farther from the mean. •Two or more data sets can be compared using measures of center and spread. When choosing a measure of center or spread, identify whether there are very large or very small data values that may influence the mean.
Key Concepts, continued •Data can be compared graphically. The shape of a data set can be seen in a frequency plot, such as a dot plot or histogram. •Data that is symmetric is concentrated toward the middle of the range of data. The data is arranged the same way on both sides.
Key Concepts, continued • Data that is skewed to the right is concentrated toward the lower range of the data; it has a tail to the right. • Data that is skewed to the left is concentrated toward the upper range of the data; it has a tail to the left.
Key Concepts, continued •Data that is widely or evenly distributed has greater variation, and data that clusters around a set of values has less variation. •Data can also be compared using a box plot. The width of the box displays the range of the middle 50% of the data; the width increases as variation increases.
Example 1 • The following data shows the number of points that the RHHS football team has scored in its first 6 games this season: 28, 49, 39, 35, 20, 39 • The following data shows the number of points that the RHHS football team scored in its first 6 games last season: 27, 46, 42, 24, 14, 13
Example 1 (Continued) • Use measures of center and spread to describe the expected number of points for RHHS in its 7th game this season and in its 7th game last season. (In other words, find the mean, median, and mean absolute deviation for both sets of data and compare them.)
Organize your Data! Always remember to organize your data values from smallest to largest. 2012 (this season): {20, 28, 35, 39, 39, 49} 2011 (last season): {13, 14, 24, 27, 42, 46}
Mean • Find the mean for each data set. 2012: {20, 28, 35, 39, 39, 49} 2011: {13, 14, 24, 27, 42, 46}
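One way to check the means is sketched below. The variable names `this_season` and `last_season` are labels chosen for this sketch; the values are the two data sets from Example 1.

```python
# Points scored in the first 6 games, from Example 1.
this_season = [20, 28, 35, 39, 39, 49]
last_season = [13, 14, 24, 27, 42, 46]


def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)


mean_this = mean(this_season)  # 210 / 6
mean_last = mean(last_season)  # 166 / 6
print(mean_this, round(mean_last, 2))  # prints: 35.0 27.67
```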
Median • Find the median for each data set. 2012: {20, 28, 35, 39, 39, 49} 2011: {13, 14, 24, 27, 42, 46}
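Because each data set has an even number of values (6), the median is the average of the two middle values once the data is sorted. A minimal sketch, with variable names chosen for illustration:

```python
# Points scored in the first 6 games, from Example 1.
this_season = [20, 28, 35, 39, 39, 49]
last_season = [13, 14, 24, 27, 42, 46]


def median(values):
    """Middle value of the sorted data; average the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


median_this = median(this_season)  # (35 + 39) / 2
median_last = median(last_season)  # (24 + 27) / 2
print(median_this, median_last)    # prints: 37.0 25.5
```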
Mean Absolute Deviation • Find the mean absolute deviation for each data set. 2012: {20, 28, 35, 39, 39, 49} 2011: {13, 14, 24, 27, 42, 46}
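The mean absolute deviation can be worked out in two steps: find the mean, then average the distances of each value from that mean. The sketch below follows that definition; the function and variable names are chosen for illustration.

```python
# Points scored in the first 6 games, from Example 1.
this_season = [20, 28, 35, 39, 39, 49]
last_season = [13, 14, 24, 27, 42, 46]


def mean_absolute_deviation(values):
    """Average of the absolute differences between each value and the mean."""
    center = sum(values) / len(values)
    distances = [abs(x - center) for x in values]
    return sum(distances) / len(distances)


mad_this = mean_absolute_deviation(this_season)  # (15 + 7 + 0 + 4 + 4 + 14) / 6
mad_last = mean_absolute_deviation(last_season)
print(round(mad_this, 2), round(mad_last, 2))    # prints: 7.33 10.89
```

With these numbers, this season's scores are centered higher (mean 35 versus about 27.67) and vary less around the mean (about 7.33 versus about 10.89).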
Finally… • Let’s make a statement about these two data sets with respect to the center and spread.
Guided Practice Example 2 Each girl in Mr. Sanson’s class and in Mrs. Kwei’s class measured her own height. The heights were plotted on dot plots, one for each class. Use the dot plots to compare the heights of the girls in the two classes. [Dot plots: Mrs. Kwei’s Class; Mr. Sanson’s Class] 4.1.3: Comparing Data Sets
Guided Practice: Example 2, continued Compare the range of recorded values. The overall range of heights of girls in the two classes is similar. The heights in the two classes range from 59 inches to 72 inches, and 60 inches to 72 inches.
Guided Practice: Example 2, continued Compare the middle values of the data sets. The girls in Mr. Sanson’s class appear to be taller than the girls in Mrs. Kwei’s class. By looking at where the dots are clustered, we can estimate that the middle height in Mr. Sanson’s class is around 67 inches and the middle height in Mrs. Kwei’s class is around 65 inches.
Guided Practice: Example 2, continued Compare the variation in the data sets. The variation in the two sets of heights appears to be similar, except Mr. Sanson’s data is skewed to the left and Mrs. Kwei’s data is skewed to the right. The majority of the heights are within approximately 6 inches in both classes. The majority of the girls in Mr. Sanson’s class are between 64 and 70 inches, and the majority of the girls in Mrs. Kwei’s class are between 61 and 67 inches.
Key Concepts, continued 4.1.4: Interpreting Data Sets
Important! • When determining the measure of center, use the mean if there are no outliers. If there are outliers in your data, use the median!
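To see why the median is the safer choice when outliers are present, the sketch below adds one very large value to a small data set and compares how far the mean and the median move. The numbers are hypothetical and chosen only for illustration; `statistics.mean` and `statistics.median` are from Python's standard library.

```python
import statistics

# Hypothetical values, not from the slides.
typical = [10, 12, 13, 15]
with_outlier = typical + [90]  # same data plus one very large value

mean_before = statistics.mean(typical)         # 12.5
median_before = statistics.median(typical)     # 12.5

mean_after = statistics.mean(with_outlier)     # 28
median_after = statistics.median(with_outlier) # 13

# The mean jumps by 15.5 while the median moves by only 0.5.
print(mean_after - mean_before, median_after - median_before)
```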
Example 3 • Brady is keeping track of the calories in each snack he eats. He records the number of calories in each snack he eats in a week in the following table. Use the table to answer the following questions.
2. If Brady wants to estimate the number of calories in each snack he eats, which measure of center should he use? Why?
3. Create a box plot using the given data. Between what 2 values does the middle 50% of the data fall?
4. Describe the shape and spread of the data and how they are influenced by outliers, if at all.
5. Brady’s doctor recommends that his snacks should be around 200 calories. Is Brady following his doctor’s recommendation?