S1: Chapters 2-3 Data: Location and Spread. Dr J Frost (jfrost@tiffin.kingston.sch.uk). Last modified: 20th September 2013.
Types of variables
In statistics, we can use a variable to represent some quantity, e.g. height, age. This could be qualitative (e.g. favourite colour) or quantitative (i.e. numerical).
Variables are often used differently in statistics than they are in algebra. In statistics, an expression such as Σx would mean: “Sum over the values of the variable we’ve collected (i.e. our data).”
2 types of variable:
Discrete variables: have specific values, e.g. shoe size, colour, website visits in a one-hour period, number of siblings, …
Continuous variables: can have any value in a range, e.g. height, distance, weight, time, wavelength, …
Quartiles for large numbers of items
Which item do we use for each quartile when there are n items in a list?
n = 31: LQ 8th, Median 16th, UQ 24th
n = 19: LQ 5th, Median 10th, UQ 15th
n = 6: LQ 2nd, Median 3rd and 4th, UQ 5th
n = 14: LQ 4th, Median 7th and 8th, UQ 11th
Under what circumstances do we not round? When we have a grouped frequency table involving a continuous variable.
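The item positions on this slide are consistent with one simple rule: work out n/4, n/2 or 3n/4; if the result is a whole number, use that item and the next one; otherwise round up. A minimal Python sketch of that rule follows. The function name `quartile_items` is hypothetical, and the rule is inferred from the worked values rather than stated in the slides.

```python
def quartile_items(n):
    """Return the 1-based item position(s) used for LQ, median and UQ of n listed items.

    Assumed rule (inferred from the slide's examples): if k*n/4 is a whole
    number, use that item and the next one; otherwise round up.
    """
    def position(k):
        # k counts quarters: 1 for LQ, 2 for the median, 3 for UQ.
        if (k * n) % 4 == 0:
            whole = k * n // 4
            return (whole, whole + 1)
        # Integer ceiling of k*n/4, avoiding floating point.
        return ((k * n + 3) // 4,)

    return {"LQ": position(1), "median": position(2), "UQ": position(3)}


for n in (31, 19, 6, 14):
    print(n, quartile_items(n))
```

A one-item tuple means a single item is used; a two-item tuple means the value lies between those two items.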
Notation for quartiles/percentiles
Lower Quartile: Q1
Median: Q2
Upper Quartile: Q3
57th Percentile: P57
Grouped Frequency Data Recap
This type of data is continuous.
Estimate of mean: x̄ = Σfx / Σf
What does the variable x represent? The midpoints of each interval. They’re effectively a sensible single value used to represent each interval.
Why the ‘bar’ (horizontal line) over the x? It’s the sample mean of x. It indicates that our mean is just based on a sample, rather than the whole population.
Why is our mean just an estimate? Because we don’t know the exact heights within each group. Grouping data loses information.
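As a rough illustration of the estimated mean, the sketch below uses the midpoint of each interval as the representative value and divides the sum of frequency × midpoint by the total frequency. The height table is made up for illustration (the slide's own table is not reproduced here), and `estimated_mean` is a hypothetical helper name.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data.

    classes: list of (lower_boundary, upper_boundary, frequency) tuples.
    Each interval is represented by its midpoint.
    """
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f


# Hypothetical grouped heights in cm: (lower boundary, upper boundary, frequency).
heights = [(140, 150, 4), (150, 160, 10), (160, 170, 6)]
print(estimated_mean(heights))
```

The result is only an estimate because every height in an interval is treated as if it sat exactly at the midpoint.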
Grouped Frequency Data Recap
Modal class interval (‘modal’ means ‘most’): the interval with the highest frequency.
Median class interval: there are 40 items, so determine which interval contains the 20th item.
What’s different about the intervals here? What interval does this actually represent?
Find the lower class boundary and the upper class boundary. Class width = 3.
Identify the class width
Lower class boundary = 3.5, class width = 3
Lower class boundary = 200, class width = 10
Lower class boundary = 29, class width = 2
Lower class boundary = 30.5, class width = 10
S1 – Chapters 2/3 Interpolation
Estimating the median
GCSE Question
Answer = 13.5 + 8 = 21.5
Estimating the median
At GCSE, you were only required to give the median class interval when dealing with grouped data. Now, we want to estimate a value within that class interval.
Item number we’re interested in: 11 (Why not the 11.5th item?)
Frequency up until this interval: 9
Frequency at end of this interval: 18
Weight at start of interval: 15.5kg
Weight at end of interval: 18.5kg
Median ≈ 15.5 + (11 − 9)/(18 − 9) × 3 ≈ 16.2kg
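The interpolation on this slide can be sketched as a short function: find how far the item of interest sits through the interval's frequencies, then go the same fraction of the way through the interval's width. The function name `interpolate` and its parameter names are hypothetical; the numbers are the ones on the slide.

```python
def interpolate(item, freq_before, freq_end, lower, upper):
    """Estimate a value inside a class interval by linear interpolation.

    item:        item number we are interested in (e.g. n/2 for the median)
    freq_before: cumulative frequency up until this interval
    freq_end:    cumulative frequency at the end of this interval
    lower/upper: class boundaries of the interval
    """
    fraction = (item - freq_before) / (freq_end - freq_before)
    return lower + fraction * (upper - lower)


# Slide values: 11th item, cumulative frequencies 9 and 18, interval 15.5kg to 18.5kg.
median = interpolate(11, 9, 18, 15.5, 18.5)
print(round(median, 2))
```

The same function works for the lower quartile, upper quartile or any percentile: only the item number and the interval change.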
Estimating other values
LQ, UQ, 34th Percentile
You should have a sheet in front of you.
Question 1 (a to d): answers in years, including the interquartile range.
Question 2 (a to c): answers in cm.
Exercises
Page 34, Exercise 3A: Q4, 5, 6
Page 36, Exercise 3B: Q1, 3, 5
S1 – Chapters 2/3 Variance and Standard Deviation
What is variance?
[Figure: Distribution of IQs in L6Ms5; Distribution of IQs in L6Ms4]
Here are the distributions of IQs in two classes. What’s the same, and what’s different?
Variance
Variance measures how spread out the data is. By definition, it is the average squared distance from the mean.
Distance from mean: x − x̄
Squared distance from mean: (x − x̄)²
Average squared distance from mean: Σ(x − x̄)² / n
Simpler formula for variance
Variance: “The mean of the squares minus the square of the mean (‘msmsm’)”
σ² = Σx²/n − (Σx/n)²
Standard Deviation: σ = √(variance)
The standard deviation can ‘roughly’ be thought of as the average distance from the mean.
Starter
Calculate the variance and standard deviation of the following heights: 2cm, 3cm, 3cm, 5cm, 7cm
Variance = 3.2 cm²
Standard Deviation ≈ 1.79 cm
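The starter can be checked two ways: from the definition (average squared distance from the mean) and from the ‘mean of the squares minus the square of the mean’ shortcut. The sketch below assumes the variance divides by n, as in these slides, not by n − 1; the function names are hypothetical.

```python
import math


def variance_by_definition(data):
    """Average squared distance from the mean (dividing by n)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)


def variance_msmsm(data):
    """Mean of the squares minus the square of the mean."""
    n = len(data)
    mean_of_squares = sum(x * x for x in data) / n
    square_of_mean = (sum(data) / n) ** 2
    return mean_of_squares - square_of_mean


heights = [2, 3, 3, 5, 7]  # cm, from the starter
var = variance_msmsm(heights)
print(round(var, 4), round(math.sqrt(var), 4))
print(round(variance_by_definition(heights), 4))
```

Both functions agree up to floating-point rounding, which is why the shortcut formula is safe to use in place of the definition.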
Practice
Find the variance and standard deviation of the following sets of data.
Extending to frequency/grouped frequency tables
We can just mull over our mnemonic again:
Variance: “The mean of the squares minus the square of the mean (‘msmsm’)”
σ² = Σfx²/Σf − (Σfx/Σf)²
Bro Tip: It’s better to try and memorise the mnemonic than the formula itself. You’ll understand what’s going on better, and the mnemonic will be applicable when we come onto random variables in Chapter 8.
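For a frequency table, the same mnemonic applies once each value is weighted by its frequency. The sketch below uses a made-up table (not the one from the slides) and a hypothetical helper `variance_from_table`; for grouped data, x would be the midpoint of each interval.

```python
def variance_from_table(table):
    """Variance from a frequency table of (x, f) pairs.

    Mean of the squares minus the square of the mean, with each value
    weighted by its frequency. For grouped data, x is the class midpoint.
    """
    total_f = sum(f for _, f in table)
    mean = sum(f * x for x, f in table) / total_f
    mean_of_squares = sum(f * x * x for x, f in table) / total_f
    return mean_of_squares - mean ** 2


# Hypothetical table: value x with frequency f.
siblings = [(1, 2), (2, 5), (3, 3)]
print(round(variance_from_table(siblings), 4))
```

Here Σf = 10, Σfx = 21 and Σfx² = 49, so the variance is 4.9 − 2.1² = 0.49.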
Exercises
Page 40, Exercise 3C: Q1, 2, 4, 6
Page 44, Exercise 3D: Q1, 4, 5
S1 – Chapters 2/3 Coding
Starter
What do you reckon is the mean height of people in this room? Now, stand on your chair, as per the instructions below.
[Instructional video]
Is there an easy way to recalculate the mean based on your new heights? And the variance of your heights?
Starter
Suppose that, after a bout of ‘stretching you to your limits’, you’re all 3 times your original height.
What do you think happens to the standard deviation of your heights? It becomes 3 times larger (i.e. your heights are 3 times as spread out!)
What do you think happens to the variance of your heights? It becomes 9 times larger.
(Can you prove the latter using the formula for variance?)
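Both starters can be checked numerically. The sketch below uses made-up heights and an assumed chair height of 45cm: adding a constant shifts the mean but leaves the variance alone, while multiplying by 3 triples the standard deviation and multiplies the variance by 9. It uses `pvariance` and `pstdev` from Python's `statistics` module, which divide by n.

```python
import statistics

heights = [150.0, 160.0, 175.0, 180.0]  # hypothetical heights in cm
chair = 45.0                             # hypothetical chair height in cm

on_chair = [h + chair for h in heights]  # everyone stands on a chair
stretched = [3 * h for h in heights]     # everyone becomes 3 times as tall

# Adding a constant: the mean shifts by that constant, the variance is unchanged.
mean_shift = statistics.mean(on_chair) - statistics.mean(heights)
var_change = statistics.pvariance(on_chair) - statistics.pvariance(heights)

# Multiplying by 3: the standard deviation triples, the variance is 9 times larger.
sd_ratio = statistics.pstdev(stretched) / statistics.pstdev(heights)
var_ratio = statistics.pvariance(stretched) / statistics.pvariance(heights)

print(mean_shift, var_change, round(sd_ratio, 6), round(var_ratio, 6))
```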
The point of coding
Cost of diamond ring, x (£): £1010, £1020, £1030, £1040, £1050
We ‘code’ our variable using y = (x − 1000)/10.
New values, y: £1, £2, £3, £4, £5
Standard deviation of y: √2 ≈ 1.41, therefore…
Standard deviation of x: 10√2 ≈ 14.1
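The diamond ring example can be sketched in a few lines. The coding y = (x − 1000)/10 is an assumption inferred from the slide's old and new values, and the variable names x and y are also assumed. The idea is to do the arithmetic on the small coded values, then undo the coding: subtracting 1000 does not affect the spread, and dividing by 10 divides the standard deviation by 10.

```python
import math
import statistics

costs = [1010, 1020, 1030, 1040, 1050]    # x, in pounds (from the slide)
coded = [(x - 1000) / 10 for x in costs]  # y, using the assumed coding

# Work with the easy numbers 1 to 5.
mean_coded = statistics.mean(coded)
sd_coded = statistics.pstdev(coded)       # divides by n, as in these slides

# Undo the coding: multiply the spread by 10; for the mean, also add back 1000.
sd_costs = 10 * sd_coded
mean_costs = 10 * mean_coded + 1000

print(coded, round(sd_coded, 4), round(sd_costs, 4), mean_costs)
```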
Finding the new mean/variance
Table columns: Old mean, Old variance, Coding, New mean, New variance
Exercises
Page 26, Exercise 2E: Q3, 4
Page 47, Exercise 3E: Q2, 3, 5, 7
Chapters 2-3 Summary
I have a list of 30 heights in the class. What item do I use for:
• LQ: 8th
• Median: between 15th and 16th
• UQ: 23rd
For the following grouped frequency table, calculate:
a) The estimated mean
b) The estimated median
c) The estimated variance (you’re given )
Chapters 2-3 Summary
What is the standard deviation of the following lengths: 1cm, 2cm, 3cm?
The mean of a variable is 11 and the variance . The variable is coded using . What is: The mean of ? The variance of ?
A variable is coded using . For this new variable , the mean is 15 and the standard deviation 8. What is: The mean of the original data? The standard deviation of the original data?