
13E – comparing data sets


michi

Presentation Transcript


  1. 13E – comparing data sets

  2. So far…
     • We have looked at different methods of analysing data:
       • Measures of central tendency (mean, median, mode) to look at the middle/average of the data
       • Measures of spread (range, interquartile range/IQR) to look at how far the data is spread out
     • We have also looked at ways of summarising and representing the key information about the data:
       • Five point summaries (Xmin, Q1, median, Q3, Xmax)
       • Box and whisker plots (visual representation of the five point summary)
     • Today we are looking at how we can use this information to compare two or more sets of data in order to make judgements and draw conclusions about the data that has been gathered

  3. How can we compare data sets?
     • We use:
       • Back-to-back stem and leaf plots to compare the entire data sets (called a whole data comparison)
       • Multiple/parallel box and whisker plots to compare the summarised data (called a summary statistics comparison)
     • We compare:
       • Medians and means (to compare the overall/typical/middle/average score)
       • Ranges and IQRs (to make assessments about the consistency of scores)

  4. Worked example: WHOLE DATA COMPARISON (back to back stem and leaf plot)

  5. a. Median number of customers on weekends/weekdays
     • We need to find where the median sits in each set of data (weekends and weekdays)
     • n is the same for each set of data (20 weekdays; 20 weekend days)
     • n = 20 (the number of scores, in this case the number of days that were observed)
     • Median position = (n + 1) ÷ 2 = (20 + 1) ÷ 2 = 21 ÷ 2 = 10.5
     • This tells us that the median for each set of data lies between the 10th and 11th scores (remember to check that the scores are in order)
     • For weekdays, the 10th score = 24 and the 11th score = 25; therefore the median is (24 + 25) ÷ 2 = 24.5
     • For weekends, the 10th score = 16 and the 11th score = 16; therefore the median is 16
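The position rule above can be sketched in Python. This is only an illustration: the function names `median_position` and `median` are made up for this sketch, and the `illustrative` list is invented data (not the customer counts from the stem and leaf plot), chosen so that its 10th and 11th scores are 24 and 25 like the weekday data.

```python
def median_position(n):
    """Position of the median in an ordered list of n scores (counting from 1)."""
    return (n + 1) / 2


def median(scores):
    """Find the median by locating the middle position in the ordered scores."""
    ordered = sorted(scores)          # the scores must be in order first
    pos = median_position(len(ordered))
    if pos.is_integer():
        # Odd number of scores: the median is a single score.
        return ordered[int(pos) - 1]
    # Even number of scores: average the two scores either side of the position.
    lower = ordered[int(pos) - 1]     # e.g. the 10th score when pos = 10.5
    upper = ordered[int(pos)]         # e.g. the 11th score when pos = 10.5
    return (lower + upper) / 2


# Invented data: 20 scores whose 10th and 11th values are 24 and 25.
illustrative = list(range(15, 35))

print(median_position(20))    # 10.5
print(median(illustrative))   # 24.5
```

Subtracting 1 from the position is needed because Python lists count from 0, while the slide counts scores from 1.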

  6. b. Range of customers on weekends/weekdays
     • Range = Xmax – Xmin
     • For weekdays, Xmax = 45 and Xmin = 7
       Range = 45 – 7
       Range = 38
     • For weekends, Xmax = 57 and Xmin = 7
       Range = 57 – 7
       Range = 50

  7. c. What conclusions can be drawn about the average number of customers on weekends and weekdays?
     • What are our observations based on our analysis of the data? (You can do this part in your head or write some dot points as I have done below.)
     • Reflect on your results relating to the medians and ranges of the data sets, and compare them:
       • Fewer customers on weekends than weekdays (the weekend data has a lower median)
       • There is an outlier in the weekend scores (57), which means the range is much bigger for weekends than for weekdays
       • Weekend scores tend to be clustered towards the lower end
     • Write a sentence or two to summarise your conclusions:
       • There are generally fewer customers on weekends. There is one outlier in the weekend scores, causing the range to be larger. However, apart from this outlier, the weekend scores are less spread out.

  8. Worked example: comparing data sets using lists of scores
     • Below are the scores for two students in eight math tests throughout the year.
       • John: 45, 51, 55, 58, 59, 62, 62, 64
       • Penny: 37, 44, 45, 46, 50, 74, 80, 84
     a. Find the mean and range for each student.
     b. Which student had the better overall performance on the eight tests?
     c. Which student was more consistent over the eight tests?

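  9. a. Find the mean for each student.
     • Mean = sum of the scores ÷ number of scores
     • John: 45, 51, 55, 58, 59, 62, 62, 64
       Mean = (45 + 51 + 55 + 58 + 59 + 62 + 62 + 64) ÷ 8
       Mean = 456 ÷ 8
       Mean = 57
     • Penny: 37, 44, 45, 46, 50, 74, 80, 84
       Mean = (37 + 44 + 45 + 46 + 50 + 74 + 80 + 84) ÷ 8
       Mean = 460 ÷ 8
       Mean = 57.5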
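As a quick check of the two means, here is a minimal Python sketch. The helper name `mean_score` is chosen for this illustration only; the scores are the ones listed for John and Penny.

```python
john = [45, 51, 55, 58, 59, 62, 62, 64]
penny = [37, 44, 45, 46, 50, 74, 80, 84]


def mean_score(scores):
    """Mean = sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


print(sum(john), mean_score(john))     # 456 57.0
print(sum(penny), mean_score(penny))   # 460 57.5
```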

  10. a. (continued) Find the range for each student.
     • John: 45, 51, 55, 58, 59, 62, 62, 64
       Range = Xmax – Xmin
       Range = 64 – 45
       Range = 19
     • Penny: 37, 44, 45, 46, 50, 74, 80, 84
       Range = Xmax – Xmin
       Range = 84 – 37
       Range = 47
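The same range calculation can be written as a short Python sketch. The name `score_range` is an assumption made for this example (it avoids clashing with Python's built-in `range`).

```python
john = [45, 51, 55, 58, 59, 62, 62, 64]
penny = [37, 44, 45, 46, 50, 74, 80, 84]


def score_range(scores):
    """Range = Xmax - Xmin, the gap between the highest and lowest score."""
    return max(scores) - min(scores)


print(score_range(john))    # 19
print(score_range(penny))   # 47
```

Because `max` and `min` search the whole list, the scores do not need to be sorted before finding the range.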

  11. b. and c. Comparing the data
     b. Which student had the better overall performance on the eight tests?
       • In order to look at overall performance, we compare the means
       • John's mean = 57
       • Penny's mean = 57.5
       • Therefore, Penny performed slightly better overall.
     c. Which student was more consistent over the eight tests?
       • In order to look at consistency, we compare the ranges
       • John's range = 19
       • Penny's range = 47
       • Because John's range is lower than Penny's, we can say that John performed more consistently on the tests than Penny.
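The two comparisons can be combined into one rough Python sketch: the higher mean is taken as the better overall performance and the smaller range as the more consistent one, as in the slide. The function name `compare` and the dictionary layout are assumptions made for this illustration, and a tie would simply return whichever name comes first.

```python
def compare(scores_by_name):
    """Return (best overall, most consistent) from a dict of name -> scores."""
    means = {name: sum(s) / len(s) for name, s in scores_by_name.items()}
    ranges = {name: max(s) - min(s) for name, s in scores_by_name.items()}
    best_overall = max(means, key=means.get)        # highest mean
    most_consistent = min(ranges, key=ranges.get)   # smallest range
    return best_overall, most_consistent


students = {
    "John": [45, 51, 55, 58, 59, 62, 62, 64],
    "Penny": [37, 44, 45, 46, 50, 74, 80, 84],
}

print(compare(students))   # ('Penny', 'John')
```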

  12. Questions to do: • Exercise 13E page 455 questions 1 (a and b only), 2, 4, 5, 6(bcde), 8, 12
