Business and Academic Skills Describing data
Describing data • Often within a business, data is collected from many and various sources. • We need to describe the data and give it meaning. • If we can give it real meaning then managers may be able to use it in making decisions.
Describing data • 216, 224, 2393, 2, 6, 77, 84, 7, 5, 3, 213, 242, 259, 214, 237, 217, 258, 218, 234, 211, 8, 64, 9, 276, 223, 245, 212, 234, 264, 257, 278, 210, 9, 4, 114, 326, 441, 11, 242, 259, 514, 35, 119, 268, 208, 534, 11, 17, 186, 21, 3245, 49, 4, 2, 110, 236, 364, 160, 375, 210, 476, 98, 12, 134, 908, 765, 456 • Without undertaking any calculations or drawing any pictures: in fewer than ten words, accurately explain this group of data.
Compare the following 2 groups of data • Group 1: 216, 224, 239, 213, 242, 259, 214, 237, 217, 258, 218, 234, 211, 276, 223, 245, 212, 234, 264, 257, 278, 210 • Group 2: 8, 6, 4, 9, 3, 2, 9, 4, 6, 7, 7, 8, 4, 1, 7, 9, 4, 2, 4, 7, 5, 3 • Without undertaking any calculations or drawing any pictures: in fewer than ten words, accurately explain the differences between the 2 groups.
Describing Groups of Data • Possible issues • Often too much data to understand easily. • Using just words can lose accuracy and create generalities. • Open to different interpretations. • Possible actions • Put into tabular form. • Draw appropriate pictures (graphs). • Have some conventions for describing the data.
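As one way of putting data into tabular form, the sketch below counts how often each value occurs in Group 2 from the comparison slide. It uses Python rather than Excel, and the variable names are illustrative assumptions, not part of the original slides.

```python
from collections import Counter

# Group 2 from the comparison slide (22 values).
group_2 = [8, 6, 4, 9, 3, 2, 9, 4, 6, 7, 7, 8, 4, 1, 7, 9, 4, 2, 4, 7, 5, 3]

# Count how many times each value occurs.
frequency = Counter(group_2)

# Print a simple two-column frequency table, smallest value first.
print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```

A table like this is also the starting point for a frequency distribution graph.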
Crisp Problem • You love potato crisps. You eat 1 packet per day with your lunchtime sandwich. You equally like two different brands, which cost the same amount of money per packet. • Each brand is running a loyalty promotion so that if you collect 30 vouchers over the next 30 working days they will give you a £5 note. The voucher is in the form of a stamp given to you by the canteen each time you buy the crisps. • You can only get one voucher stamp per day. • How will you decide which brand to buy? • List the factors which might affect your decision.
Crisp Problem • To help, the following information is made available. • Brand A: number of crisps per packet 60; variation in number of crisps ± 2 per packet. • Brand B: number of crisps per packet 58; variation in number of crisps ± 8 per packet. • Does this information help? If so, how?
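One possible way to use this information is to work out the fewest and the most crisps each brand could give over the 30 days. The sketch below assumes the stated variation means "plus or minus" that many crisps per packet, which is an interpretation rather than something the slide confirms; the names used are illustrative.

```python
DAYS = 30  # one packet per working day during the promotion

# (typical crisps per packet, variation), reading the variation as plus-or-minus.
brands = {"A": (60, 2), "B": (58, 8)}

totals = {}
for name, (typical, variation) in brands.items():
    fewest = (typical - variation) * DAYS   # every packet at the low end
    most = (typical + variation) * DAYS     # every packet at the high end
    totals[name] = (fewest, typical * DAYS, most)
    print(f"Brand {name}: between {fewest} and {most} crisps, "
          f"typically {typical * DAYS}")
```

Under that assumption, Brand A has the higher typical value and the narrower spread, while Brand B offers a chance of more crisps at the risk of far fewer.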
Describing data • Groups of data can be described by 4 main factors. • Shape of the data (shown by the picture of the data) • How big the numbers are: measurement of a typical value (measurement of central tendency) • Measurement of the spread of the data • How many pieces of data there are in the group
Frequency distribution (Example 1) [Figure: frequency distribution graph; axis label: Frequency]
Frequency distribution (Example 2) [Figure: two frequency distribution graphs, one skewed to lower values and one skewed to higher values]
Our interest in graphs • What are we really interested in when we produce a graph? • The shape of the graph: • Tall and thin • Short and fat • Regular shape • Skewness • Interpretation of the shape if we had a large quantity of data
The number scale • The number scale can be represented by a straight line going from –ve to +ve with 0 in the middle. • The number scale helps determine size and whether one value, or set of values, is greater or less than another. • By placing groups of data on the scale we can see if one group has larger values than another. [Figure: Group 1, Group 2 and Group 3 placed on a number line with 0 marked]
The number scale • If the group is diverse then it may cover a large amount of the scale, so we need a single point to establish the position of the group on the scale: some form of typical value. • This is referred to as the Measurement of Central Tendency for that group, of which there are 3 main measurements. • Arithmetic mean (average) • Median • Mode
Measurement of central tendency • Arithmetic mean (average) (Excel function: =AVERAGE(values)) • Defined as: the sum of all the values divided by the number of values. (Note: this will be a derived value determined by the formula for the arithmetic mean.) • Median (Excel function: =MEDIAN(values)) • Defined as: the physical middle value when the values are placed in order. (Note: Excel automatically deals with the sort within the function.)
Measurement of central tendency • Median (Excel function: =MEDIAN(values)) • Defined as: the physical middle value when the values are placed in order. (Note: for an even number of values the median lies midway between the 2 numbers physically at the centre.)
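The measures of central tendency can be tried out on Group 1 from the comparison slide. This is a minimal sketch using Python's standard `statistics` module in place of the Excel functions; the variable names are assumptions made for illustration.

```python
import statistics

# Group 1 from the comparison slide (22 values).
group_1 = [216, 224, 239, 213, 242, 259, 214, 237, 217, 258, 218,
           234, 211, 276, 223, 245, 212, 234, 264, 257, 278, 210]

# Arithmetic mean: sum of all the values divided by the number of values.
mean_value = statistics.mean(group_1)

# Median: middle value once sorted; with an even count it is the
# midpoint of the two central values.
median_value = statistics.median(group_1)

# Mode: the value that occurs most often.
mode_value = statistics.mode(group_1)

print(f"mean={mean_value}, median={median_value}, mode={mode_value}")
```

For this group the mean works out at 235.5 and the median at 234, so either value places the group at roughly the same point on the number scale.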
Spread of data • The measure of central tendency positions the group of data on the number scale. • However differing groups of data can have the same or similar measures. • A measurement of the spread of the data helps us to understand some more about the data. • Three main measures of spread • Range of the data • Inter-quartile range (used with the median) • The standard deviation (used with the mean)
Range • The range of a group of data is the difference between the highest and the lowest value within the group. • There is no direct Excel function to measure the range, so we need to establish the maximum and minimum values and subtract one from the other. • We can use the following formula for the range: =MAX(group reference)-MIN(group reference)
Range • Example: group 1 • This in effect tells us the total width of the data group, that is, whether the group as a whole is close together or far apart.
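The same maximum-minus-minimum calculation can be sketched in Python. The data is assumed to be Group 1 from the comparison slide, and the variable names are illustrative.

```python
# Group 1 from the comparison slide (22 values).
group_1 = [216, 224, 239, 213, 242, 259, 214, 237, 217, 258, 218,
           234, 211, 276, 223, 245, 212, 234, 264, 257, 278, 210]

highest = max(group_1)
lowest = min(group_1)

# Range: difference between the highest and the lowest value.
data_range = highest - lowest

print(f"highest={highest}, lowest={lowest}, range={data_range}")
```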
Inter-quartile range • The median is the physical middle value when the values are placed in order. • If the data is placed in order it is possible to obtain the value at any point in the group range. • A percentile is the mechanism for obtaining this value. • The group data is normally broken into 100 equal intervals, and a percentile returns the value at the required point. • For example: given the numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 in a group • The value of the 10% percentile = 1 • The value of the 25% percentile = 2.5, etc.
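The example values above can be reproduced with a small interpolating function. This is a sketch of one common percentile definition (position = point × (number of values − 1), with linear interpolation between neighbours); it is assumed, not confirmed by the slides, to be the rule behind the figures quoted. The function name is hypothetical.

```python
def percentile(values, point):
    """Return the value at `point` (0 to 1) of the ordered data,
    interpolating linearly between neighbouring values."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    position = point * (len(ordered) - 1)   # fractional index into the data
    lower = int(position)                   # index of the value just below
    fraction = position - lower             # how far towards the next value
    if lower + 1 >= len(ordered):
        return ordered[-1]
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])


numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentile(numbers, 0.10))
print(percentile(numbers, 0.25))
```

The 50% point of the same function gives the median, which is why the median and the quartiles are naturally used together.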
Inter-quartile range • The Excel function is =PERCENTILE(range of data, percentile point) • Thus for group 1 the 25% percentile would be:
Inter-quartile range • The lower quartile is the value of the 25% (0.25) percentile. • The upper quartile is the value of the 75% (0.75) percentile. • The inter-quartile range is defined as the difference between the values of the upper and lower quartiles. • Thus it tells us the range of the middle 50% of the data.
Inter-quartile range • Again there is no direct Excel function, but the value can be calculated using the following formula: =PERCENTILE(range data,0.75)-PERCENTILE(range data,0.25)
Inter-quartile range • This result indicates that the middle 50% of the data lies close together. • It should be considered together with the median.
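The inter-quartile range calculation can also be sketched in Python. The example uses Group 1 from the comparison slide and `statistics.quantiles` with `method="inclusive"`, which is assumed here to interpolate in the same way as the percentile examples in these slides; the variable names are illustrative.

```python
import statistics

# Group 1 from the comparison slide (22 values).
group_1 = [216, 224, 239, 213, 242, 259, 214, 237, 217, 258, 218,
           234, 211, 276, 223, 245, 212, 234, 264, 257, 278, 210]

# n=4 splits the data into quarters and returns the three cut points:
# lower quartile (25%), median (50%) and upper quartile (75%).
lower_quartile, median_value, upper_quartile = statistics.quantiles(
    group_1, n=4, method="inclusive"
)

# Inter-quartile range: width of the middle 50% of the data.
inter_quartile_range = upper_quartile - lower_quartile

print(f"lower={lower_quartile}, upper={upper_quartile}, "
      f"IQR={inter_quartile_range}")
```

For this group the quartiles come out at 216.25 and 254, giving an inter-quartile range of 37.75 around a median of 234.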
Standard deviation • The arithmetic mean is the most commonly used measurement of central tendency. • To measure the spread of the data we use the standard deviation. • Unlike the other two values this is not a measurement of range • It is a measurement of how well the data is clustered around the arithmetic mean value. • The lower the value the closer the clustering
Standard deviation • The standard deviation is calculated from a derived statistical formula. • This formula can vary depending on the way that the data has been collected; however, if you have a reasonable amount of data, all the different formulas give roughly the same answer. • Any error due to using the wrong formula is most likely very small compared with other errors involved in the total process. • The Excel formula we shall use is =STDEVP(range)
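To see how little the choice of formula matters for a reasonable amount of data, the sketch below compares two common versions on Group 1 from the comparison slide: the population form (dividing by the number of values, as Excel's STDEVP does) and the sample form (dividing by one less). It uses Python's `statistics` module, and the variable names are illustrative assumptions.

```python
import statistics

# Group 1 from the comparison slide (22 values).
group_1 = [216, 224, 239, 213, 242, 259, 214, 237, 217, 258, 218,
           234, 211, 276, 223, 245, 212, 234, 264, 257, 278, 210]

# Population standard deviation: divides by the number of values.
population_sd = statistics.pstdev(group_1)

# Sample standard deviation: divides by (number of values - 1).
sample_sd = statistics.stdev(group_1)

print(f"population: {population_sd:.2f}")
print(f"sample:     {sample_sd:.2f}")
print(f"ratio:      {sample_sd / population_sd:.4f}")
```

With 22 values the two results differ by only a few percent, and the gap shrinks further as the amount of data grows.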
Measures of dispersion (Inter-quartile range) [Figure: distribution with the inter-quartile range marked]
Measures of dispersion (Standard deviation) [Figure: distribution with the standard deviation marked; annotation: 33%]