The Statistical Enquiry Cycle • Step 4: Analysis • Analysing Features
Lesson Objective
To learn how to analyse box plots and dot plots in a statistical manner.
Lesson Success Criteria
• Can describe, in statistical language, the CUSSSO (centre, unusual values, shape, spread, shift, overlap) of a data set.
Analysis…
• This is where you look at the data to see what it tells you about your problem, i.e. summarising and making sense of the data so that you can answer your question.
• This will involve:
  • Calculating statistical measures which summarise your data
    • Central tendency (averages: mean, median, mode)
    • Spread (range, interquartile range)
  • Displaying the data in relevant tables and graphs
  • Using “I notice…” statements about your observations of any
    • patterns or special features of the data set
    • unusual values (outliers)
    • symmetry of the distribution
• Are your predictions still the same, or would you change them?
• NOTE: it is important that any statements are supported with evidence.
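The statistical measures listed above can be calculated by hand or with software. Below is a minimal sketch in Python, assuming a small made-up list of ages (the numbers and the function name `summarise` are illustrative only, not from the lesson). Quartiles can be calculated in several slightly different ways; this sketch uses the "inclusive" method of Python's `statistics` module, so your calculator or graphing software may give slightly different quartiles.

```python
import statistics

# Hypothetical ages, made up for illustration only (not data from the lesson).
ages = [12, 14, 14, 15, 17, 18, 21, 23, 30]


def summarise(data):
    """Return the measures of centre and spread named in the Analysis step."""
    ordered = sorted(data)
    # Assumption: the "inclusive" quartile method. Other methods exist and
    # can give slightly different values for Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": median,
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
    }


summary = summarise(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Each value printed here is the kind of numerical evidence an “I notice…” statement needs.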
What to Analyse… Start with describing the features you see in the data. These need to be referenced to the graphs and statistics you have calculated. Each statement should start with “I notice that …” Each statement requires numerical evidence. Note: begin each statement on a new line, for easier reading.
What to discuss and compare
These areas require your analysis (all statements start with “I notice…”):
C – central tendency: look at the data’s mean, median and mode; use the visible information from the boxplot to help
U – unusual values: are there any? If so, make statements about them (if not, state this); look at the dot plot to identify and comment on them
S – spread: look at the range and the IQR (the box, i.e. the middle 50%)
S – shape: what is the symmetry like (even or skewed)? Is the data unimodal, bimodal or without a clear mode? What is the distribution of the data (normal, uniform or other)?
What to discuss and compare
S – shift: from the boxplot, has one box (middle 50%) moved up the scale in relation to the other box (middle 50%)?
O – overlap: how much overlap is there between the two boxes (middle 50%)? None, some, a lot, or complete?
These six things are what you need to describe using “I notice…” statements, with numerical evidence to support each one. CUSSSO: remember it!
Example “I wonder if there is a difference in the current age of members of the Timbuktu Badminton Club, and the Timbuktu Chess Club?”
Central Tendency
Comparing measures of central tendency
MEDIAN: read directly from the boxplot
MEAN: read directly from the summary statistics (remember it is a mathematical average, and is therefore affected by outliers)
“…median of those who survived (2.2 kg) is greater than that of those who died (1.6 kg)…”
“…median of guesses for the fall (252) did not differ much from the median of the guesses for the spring (255)…”
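To see why the mean is affected by outliers while the median is far less so, here is a small sketch in Python. The numbers are invented for illustration; only the idea (add one unusually large value and compare) comes from the slide.

```python
import statistics

# Hypothetical counts, invented for this sketch (not data from the lesson).
typical = [2, 3, 3, 4, 4, 5, 5]
with_outlier = typical + [18]  # the same data plus one unusually large value

# How far does each measure of centre move when the outlier is added?
mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"mean moves by {mean_shift:.2f}")
print(f"median moves by {median_shift:.2f}")
```

In this made-up data set the mean is pulled upwards by the single large value, while the median stays where it was.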
Unusual Values
Are there any? If so, decide: are they outliers or extremes?
I notice from the dot plot for males that one student had 18 pairs of shoes. I would classify this as an outlier in comparison to the rest of the males’ data.
I notice that males have an extreme value at 110, and females have two extreme values at 108 and 122.
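The lesson asks you to judge unusual values by looking at the dot plot. If you want a numerical check to back up that judgement, one widely used convention (not stated in this lesson, so treat it as an assumption) flags any value more than 1.5 × IQR below the lower quartile or above the upper quartile. A minimal sketch in Python, using made-up shoe counts apart from the value 18 mentioned above:

```python
import statistics


def flag_unusual(data, k=1.5):
    """Return values outside the fences Q1 - k*IQR and Q3 + k*IQR.

    Assumption: the 1.5 * IQR convention and the "inclusive" quartile
    method; neither is prescribed by the lesson.
    """
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Hypothetical numbers of pairs of shoes; only the 18 echoes the slide.
shoes = [2, 3, 3, 4, 4, 5, 5, 18]
print(flag_unusual(shoes))  # [18]
```

A flagged value still needs a statement in context: say what the value is, which group it belongs to, and why you consider it unusual.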
Spread
What do you notice about the spread of the data?
RANGE: discuss as a simple statement (if outliers are present, the range would be smaller if they were excluded; can you explain why?)
IQR: more important to discuss. Is one clearly larger than the other, or are they similar in size? What does this mean in the context of the data?
I notice that the IQR for boys’ right-foot length (XX cm) is much smaller than that for girls (YY cm). This could be due to …
I notice that the IQR for the boys’ arm span (XX cm) is very similar to that for girls (YY cm). This is because…
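The effect of an outlier on the range, compared with its effect on the IQR, can be checked with a short calculation. This is a sketch in Python using invented measurements (the data and the function name `spread` are illustrative assumptions), with quartiles from the "inclusive" method of the `statistics` module.

```python
import statistics


def spread(data):
    """Return the range and IQR of a data set."""
    ordered = sorted(data)
    # Assumption: "inclusive" quartile method; other methods differ slightly.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"range": ordered[-1] - ordered[0], "iqr": q3 - q1}


# Hypothetical lengths in cm, invented for this sketch; 45 is the outlier.
lengths = [20, 22, 23, 24, 25, 26, 28, 45]

with_outlier = spread(lengths)
without_outlier = spread([x for x in lengths if x != 45])

print(with_outlier)     # {'range': 25, 'iqr': 3.75}
print(without_outlier)  # {'range': 8, 'iqr': 3.0}
```

In this example the range drops from 25 to 8 when the outlier is excluded, while the IQR changes only from 3.75 to 3.0. That is one reason the IQR is the more important measure to discuss.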
Shape
Dot plot graphs: comment on the shape of both data sets. A symmetric distribution is one that shows symmetry about its centre. Both normal and rectangular (uniform) distributions show symmetry, and some bimodal distributions can as well. Remember: shape is related to how spread out the data is.
Shape
Boxplot graphs: what do you notice about
Overall shape:
- are the whiskers the same length?
- if yes, then you can say the distribution shows overall symmetry
- if no, then overall there is some skewness present
Middle 50% shape:
- is the median in the middle of the box?
- if yes, then you can say the middle 50% is symmetrical
- if no, then there is some skewness within the middle 50% of the data
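The two boxplot checks above can be written as a short routine that takes the five-number summary. This is only a sketch in Python: the function name `describe_shape`, the 20% tolerance for "the same length", and the example numbers are all assumptions made for illustration. The direction labels follow the usual convention that a longer upper side means the data is skewed right (towards higher values).

```python
def describe_shape(minimum, q1, median, q3, maximum, tolerance=0.2):
    """Describe overall and middle-50% shape from a five-number summary.

    Assumption: two lengths count as "the same" if they differ by no more
    than `tolerance` (20%) of the longer one. The lesson gives no cut-off.
    """

    def compare(lower_part, upper_part):
        longest = max(lower_part, upper_part)
        if longest == 0 or abs(lower_part - upper_part) <= tolerance * longest:
            return "roughly symmetrical"
        return "skewed right" if upper_part > lower_part else "skewed left"

    return {
        # Whisker lengths: minimum to Q1, and Q3 to maximum.
        "overall": compare(q1 - minimum, maximum - q3),
        # Position of the median inside the box.
        "middle 50%": compare(median - q1, q3 - median),
    }


# Hypothetical five-number summaries, invented for illustration.
print(describe_shape(10, 20, 25, 30, 40))
print(describe_shape(10, 12, 14, 20, 40))
```

The first summary has equal whiskers and a central median, so both parts are reported as roughly symmetrical. The second has a much longer upper whisker and a median near the bottom of the box, so both parts are reported as skewed right.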
Shift
This relates to where the box (middle 50%) of one set of data is in relation to the box (middle 50%) of the comparative set of data.
I notice that the ages of the chess players are shifted towards older ages than the ages of the badminton players. This is shown by the middle 50% of chess players’ ages being higher (or shifted towards higher ages) than the middle 50% of badminton players’ ages.
I also notice that the lower 75% of badminton players’ ages is below the top 50% of the chess players’ ages.
Overlap
What overlap is there between the two boxes (middle 50%) of the comparative data? We also need to consider where the medians are in relation to this overlap. The possible cases are:
• Some, and medians overlap
• Some, but medians do not overlap
• None
• Complete
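Shift and overlap can both be read from the quartiles of the two groups. The sketch below, in Python, is one possible way to turn that reading into a rule. The function name `compare_boxes`, the exact definitions ("complete" meaning one box lies entirely inside the other, and "medians in overlap" meaning both medians fall inside the shared part of the boxes) and the example quartiles are assumptions for illustration, not definitions given in the lesson.

```python
def compare_boxes(first, second):
    """Compare two boxes, each given as a (q1, median, q3) tuple."""
    a_q1, a_med, a_q3 = first
    b_q1, b_med, b_q3 = second

    # The shared part of the two boxes, if there is one.
    low = max(a_q1, b_q1)
    high = min(a_q3, b_q3)

    if low > high:
        overlap = "none"
    elif (a_q1 <= b_q1 and b_q3 <= a_q3) or (b_q1 <= a_q1 and a_q3 <= b_q3):
        overlap = "complete"  # assumption: one box sits inside the other
    else:
        overlap = "some"

    medians_in_overlap = (
        overlap != "none" and low <= a_med <= high and low <= b_med <= high
    )

    # Shift: has the whole box moved up the scale relative to the other?
    if b_q1 > a_q1 and b_q3 > a_q3:
        shift = "second group higher"
    elif b_q1 < a_q1 and b_q3 < a_q3:
        shift = "first group higher"
    else:
        shift = "no clear shift"

    return {"overlap": overlap, "medians_in_overlap": medians_in_overlap, "shift": shift}


# Hypothetical quartiles (years), invented for illustration.
badminton = (18, 24, 30)
chess = (28, 40, 52)
print(compare_boxes(badminton, chess))
```

For these made-up quartiles the result is some overlap, medians not in the overlap, and the second group (chess) shifted higher. You would still write this up as “I notice…” statements that quote the quartiles and medians as evidence.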
Practice Chapter 7: page 240 Exercise G : Matching dot plots with box plots