
Measures of Position Where does a certain data value fit in relative to the other data values?

Measures of Position: Where does a certain data value fit in relative to the other data values? To accompany Hawkes lesson 3.3. Original content by D.R.S. Nth Place: the highest and the lowest; 2nd highest, 3rd highest, etc. “If I made $60,000, I would be 6th richest.”

maitland

Presentation Transcript


  1. Measures of Position: Where does a certain data value fit in relative to the other data values? • To accompany Hawkes lesson 3.3 • Original content by D.R.S.

  2. Nth Place • The highest and the lowest • 2nd highest, 3rd highest, etc. • “If I made $60,000, I would be 6th richest.”

  3. Another view: “How does my x compare to the mean?” • “Am I in the middle of the pack?” • “Am I above or below the middle?” • “Am I extremely high or extremely low?” • The z score is the measuring stick

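  4. z score: how many standard deviations is x away from the mean? • If you know the x value: Population: \(z = \frac{x - \mu}{\sigma}\); Sample: \(z = \frac{x - \bar{x}}{s}\) • To work backward from z to x: Population: \(x = \mu + z\sigma\); Sample: \(x = \bar{x} + zs\)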
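
The z score calculation and its inverse can be sketched in a few lines of Python. This is a minimal sketch: the function names and the salary figures are illustrative assumptions, not values from the slides. The same arithmetic applies whether the mean and standard deviation come from a population or a sample.

```python
def z_score(x, mean, std_dev):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


def x_from_z(z, mean, std_dev):
    """Work backward from a z score to the original x value."""
    return mean + z * std_dev


# Hypothetical numbers chosen for illustration only.
z = z_score(60_000, mean=50_000, std_dev=8_000)
print(z)                             # 1.25
print(x_from_z(z, 50_000, 8_000))    # 60000.0
print(z_score(50_000, 50_000, 8_000))  # 0.0, the z score of the mean itself
```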

  5. The z score is also called the “Standard Score” • No matter what units x is measured in or how large or small the values are… • The z score of the mean will be 0 • Because the numerator turns out to be 0. • If x is above the mean, its z is positive. • Because the numerator turns out to be positive. • If x is below the mean, its z is negative. • Because the numerator turns out to be negative.

  6. z score values • Typically round to two decimal places. • Don’t say “0.2589”, say “0.26” • If not two decimal places, pad • Don’t say “2”, say “2.00” • Don’t say “-1.1”, say “-1.10” • z scores are almost always in a small interval around 0. Be very suspicious if you calculate a z score that’s not a small number.
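
The rounding and padding convention above maps directly onto Python's fixed-point format specifier. A small sketch, where the helper name `format_z` is an assumption made for illustration:

```python
def format_z(z):
    """Render a z score with exactly two decimal places, padding with zeros."""
    return f"{z:.2f}"


print(format_z(0.2589))  # 0.26
print(format_z(2))       # 2.00
print(format_z(-1.1))    # -1.10
```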

  7. Practice: Given x, compute z • Find the z scores corresponding to the salary values, given the mean and the standard deviation.

  8. Practice: Given z, compute x • Find the x values (salaries) corresponding to these standard scores, given the mean and the standard deviation.

  9. Two parallel axes (scales), x and z

  10. Example: Using z scores to compare unlike items • The Literature test and the Biology test • One test: The mean score was 47 points • The standard deviation was 6 points • Sue earned 55 points • Find her z score for this test • The other test: The mean score was 77 points • The standard deviation was 11 points • Sue earned 91 points • Find her z score for this test • On which test did she have the “better” performance?
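
The comparison can be worked through with the numbers on the slide. In this sketch the labels "first" and "second" are assumptions, because the transcript does not make clear which set of numbers belongs to the Literature test and which to the Biology test.

```python
def z_score(x, mean, std_dev):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


# Numbers from the slide; which test is which is not clear in the transcript.
z_first = z_score(55, mean=47, std_dev=6)     # 8 / 6
z_second = z_score(91, mean=77, std_dev=11)   # 14 / 11

print(f"{z_first:.2f}")   # 1.33
print(f"{z_second:.2f}")  # 1.27

# The larger z score is the more outstanding performance relative to its group.
better = "first" if z_first > z_second else "second"
print(better)             # first
```

Even though 91 is the larger raw score, the 55 sits further above its own mean when measured in standard deviations.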

  11. z scores: caution with negatives • Example: compare test scores on two different tests to ascertain “Which score was the more outstanding of the two?” • Be careful if the z scores turn out to be negative. Which of two negative z scores is the better performance? • Stop and think back to your basic number line and the meaning of “<” and “>”

  12. Percentiles • “What percent of the values are lower than my value?” • 90th percentile is pretty high • 50th percentile is right in the middle • 10th percentile is pretty low • If you scored in the 99th percentile on your SAT, I hope you got a scholarship.

  13. Salary data for our percentile examples • With these salary values again • What’s the percentile for a salary of $59,000? • You can see it’s going to be higher than the 50th, because it’s in the top half.

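  14. Example: Given x, find the percentile • Count how many values are below $59,000 • Count how many values are in the data set • Formula for percentile: \(\text{percentile} = \frac{\text{number of values below } x}{\text{number of values in the data set}} \cdot 100\) • Here we have 15 values lower than our $59,000 • Here we have 20 values in the data set. • so \(\frac{15}{20} \cdot 100 = 75\), “75th percentile”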

  15. Continued: Given x, find the percentile • Do not say “75%”, but say “the 75th percentile” • Other sources use different formulas, beware! • Some other books use a different numerator. • Excel has two different functions that give different answers, PERCENTILE.EXC and PERCENTILE.INC.
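
The "count below, divide by the total, times 100" calculation can be sketched as follows. The function name and the data set are assumptions for illustration (the slide's salary list is not in the transcript); the hypothetical data has 20 values so that the arithmetic resembles the slide's example.

```python
def percentile_of(data, x):
    """Percent of the data values that are strictly below x."""
    below = sum(1 for value in data if value < x)
    return below / len(data) * 100


# Hypothetical data set: 10, 20, ..., 200 (20 values).
data = list(range(10, 201, 10))

# 15 of the 20 values are below 160.
print(percentile_of(data, 160))  # 75.0, read as "the 75th percentile"
```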

  16. Given Percentile P, find the value • Formula: position from bottom \(L = n \cdot \frac{P}{100}\) • Again, n = how many data values in the set • and P = the percentile rank that’s given. • Is there a decimal remainder in position L? • If so, then BUMP UP to the next highest whole # and take the value in that position. • Or if L is an exact whole number, take the average of the values in positions L and L + 1. • Note: Book uses lowercase l instead of L.

  17. Given Percentile P, find the value • Example: What is the 31st percentile in the salary data? • 31st percentile: plug in n = 20 and P = 31 • Compute position \(L = n \cdot \frac{P}{100} = 20 \cdot \frac{31}{100} = 6.2\). It has a remainder. • Bump it up! 7. • Not rounding, but rather bumpety-upping • So we look 7 positions from the bottom • “The 31st percentile is $44,476”

  18. Given Percentile P, find the value • Example: What is the 40th percentile in the salary data? Plug in n = 20 and P = 40 • Compute position \(L = n \cdot \frac{P}{100} = 20 \cdot \frac{40}{100} = 8\). Exact integer! • So average the 8th and 9th values from the bottom. • “The 40th percentile is $47,367.50, or $47,368.”
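
The bump-up rule and the average-two-positions rule can be sketched in Python. The function name and the data are assumptions for illustration (the salary list is not in the transcript); the hypothetical data has 20 values, so the positions match the two examples above. The sketch assumes P is a whole number from 1 to 99 and uses integer arithmetic to decide whether the position is exact.

```python
def value_at_percentile(data, p):
    """Value at the p-th percentile using the bump-up rule (p is a whole number, 1 to 99)."""
    if not 0 < p < 100:
        raise ValueError("p must be between 1 and 99")
    ordered = sorted(data)
    n = len(ordered)
    # position = n * p / 100, split into its whole part and its remainder
    whole, remainder = divmod(n * p, 100)
    if remainder:
        # Decimal remainder: bump up to position whole + 1 (list index whole).
        return ordered[whole]
    # Exact whole number: average the values in positions whole and whole + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2


# Hypothetical data set: 10, 20, ..., 200 (20 values).
data = list(range(10, 201, 10))

print(value_at_percentile(data, 31))  # position 6.2 bumps up to 7 -> 70
print(value_at_percentile(data, 40))  # position 8 exactly -> (80 + 90) / 2 = 85.0
```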

  19. Excel gives different answers • Excel does some fancy interpolation

  20. Quartiles Q1, Q2, Q3 • Data values are arranged from low to high. • The Quartiles divide the data into four groups. • Q2 is just another name for the Median. • Q1 = the median of the values from the lowest up to Q2 • Q3 = the median of the values from Q2 up to the highest • It gets tricky, depending on how many values there are.

  21. Quartiles example • 10, 20, 30, 40, 50, 60, 70, 80, 90 • The Second Quartile, Q2 = median = 50 • Find the medians of the subsets left and right. • Keep the 50 in each of those subsets. • The First Quartile, Q1= median of { 10, 20, 30, 40, 50 } = 30 • The Third Quartile, Q3= median of { 50, 60, 70, 80, 90 } = 70

  22. Quartiles example • 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 • Q2 = median = (50 + 60) / 2 = 55 (two middle #s) • Leave the 50 and 60 in place; do not reuse 55 • Q1 = median of {10, 20, 30, 40, 50} = 30 • Q3 = median of {60, 70, 80, 90, 100} = 80

  23. Quartiles example • 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 • Q2 = median = (50 + 60) / 2 = 55 (two middle #s). • 55 isn’t really there so you can’t remove it! • Leave the 50 and 60 in place • Q1 = median of {0, 10, 20, 30, 40, 50} = 25 • Q3 = median of {60, 70, 80, 90, 100, 110} = 85 • Two middle numbers happened again!
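
The three quartile examples can be reproduced with a short Python sketch of the by-hand method as these slides describe it: with an odd number of values the median stays in both halves, and with an even number the data simply split down the middle. The function name is an assumption, and this is an illustration of the slides' description, not an official implementation of the textbook's method.

```python
from statistics import median


def quartiles_keep_median(data):
    """Return (Q1, Q2, Q3), keeping the median in both halves when n is odd."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        lower = ordered[:half + 1]  # lowest value up to and including the median
        upper = ordered[half:]      # the median up to the highest value
    else:
        lower = ordered[:half]      # the median is not a data value, so just split
        upper = ordered[half:]
    return median(lower), median(ordered), median(upper)


print(quartiles_keep_median(range(10, 91, 10)))   # (30, 50, 70)
print(quartiles_keep_median(range(10, 101, 10)))  # (30, 55.0, 80)
print(quartiles_keep_median(range(0, 111, 10)))   # (25.0, 55.0, 85.0)
```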

  24. Interquartile Range • Definition: IQR = Q3 – Q1 • In the previous example, 85 – 25 = 60. • Interquartile Range measures how spread out the middle of the data are • The lowest quartile (x < Q1) is not involved • And the highest quartile (x > Q3) is not involved.

  25. Quartiles with TI-84 • 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 • Put values into a TI-84 List • Use STAT, CALC, 1-Var Stats • Scroll down, down, down to get to them.

  26. There is disagreement about Quartiles • The TI-84 sometimes gives different answers than the method we use in the Hawkes materials • Excel might give different answers from Hawkes and TI-84, both. • Use the Hawkes method in this course’s work • Be aware of the others • You should know how to use TI-84 and Excel • You should be aware that differences can occur.

  27. Quartiles with TI-84 vs. Hawkes • 10, 20, 30, 40, 50, 60, 70, 80, 90 • We got Q1 = 30 and Q3 = 70 before. • Hawkes keeps the 50, using 10, 20, 30, 40, 50 to compute Q1. • But the TI-84 throws out 50 and uses 10, 20, 30, 40. • Hawkes says the TI-84 is computing “hinges”.
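
The difference between keeping and throwing out the median can be seen side by side in a small Python sketch. The function and its `keep_median` flag are assumptions for illustration; the "throw out" branch follows the behavior described on the slide and has not been checked against an actual calculator.

```python
from statistics import median


def quartiles(data, keep_median):
    """Return (Q1, Q2, Q3); keep_median only matters when the count is odd."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1 and keep_median:
        lower, upper = ordered[:half + 1], ordered[half:]   # median in both halves
    elif n % 2 == 1:
        lower, upper = ordered[:half], ordered[half + 1:]   # median thrown out
    else:
        lower, upper = ordered[:half], ordered[half:]       # even count: plain split
    return median(lower), median(ordered), median(upper)


data = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(quartiles(data, keep_median=True))   # (30, 50, 70)
print(quartiles(data, keep_median=False))  # (25.0, 50, 75.0)
```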

  28. Quartiles in Excel • =QUARTILE.INC(cells, 1 or 2 or 3) seems to give the same results as the old QUARTILE function • There’s new =QUARTILE.EXC(cells, 1 or 2 or 3) • Excel does fancy interpolation stuff and may give different Q1 and Q3 answers compared to the TI-84 and our by-hand methods.
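
To see how interpolation can shift Q1 and Q3, here is a sketch using Python's standard `statistics.quantiles` function, which offers an "inclusive" and an "exclusive" interpolation method. It is used here only as a stand-in to illustrate interpolation; that its two methods line up with Excel's two functions is an assumption, not something the slides state.

```python
from statistics import quantiles

data = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]

# Both methods interpolate between neighboring data values.
print(quantiles(data, n=4, method="inclusive"))  # [27.5, 55.0, 82.5]
print(quantiles(data, n=4, method="exclusive"))  # [22.5, 55.0, 87.5]
```

The by-hand method on the same twelve values gives Q1 = 25 and Q3 = 85, so all three approaches agree on the median but not on the outer quartiles.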

  29. The Five Number Summary • Again: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 • Q2 = median = 55, Q1 = 25 and Q3 = 85 • “The Five Number Summary” is defined as: the minimum, then Q1, Q2, Q3, then the maximum • For this set of numbers, the Five Number Summary is “0, 25, 55, 85, 110”

  30. The Five Number Summary • Again: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 • Q2 = 55, Q1 = 25, Q3 = 85 • Min is 0, Max is 110 • For this set of numbers, the Five Number Summary is “0, 25, 55, 85, 110” • Box Plot (figure) marking Min = 0, Q1 = 25, Q2 = 55, Q3 = 85, Max = 110 • TI-84 can do Box Plot too, but again its quartiles disagree with the way Hawkes defines quartiles.
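
The five number summary and the interquartile range for this data set can be computed with a short Python sketch. The function name is an assumption, and the quartiles follow the by-hand method as described in these slides (the median stays in both halves when the count is odd).

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using the keep-the-median quartile method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half + n % 2]  # odd n: the median is kept in the lower half
    upper = ordered[half:]          # odd n: the median is kept in the upper half too
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


data = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
low, q1, q2, q3, high = five_number_summary(data)
print(low, q1, q2, q3, high)  # 0 25.0 55.0 85.0 110
print(q3 - q1)                # 60.0, the interquartile range
```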

  31. Why Box Plot? • Don’t lose sight of the big picture here: • We have a data set • It’s a bunch of numbers • We want to summarize the data • Summarize means make it into a sound bite • We must be Concise – don’t say too much • We must be Informative – don’t say too little

  32. We must be Concise • Bad: “Here is a report that tells you the mean and the variance and the standard deviation and the quartiles and the percentiles from 0 to 100… and the marketing survey analyzed by demographic subgroups …” (there is a place for that, but not right now) • Good: “Got fifteen seconds? Here’s what we found.”

  33. Notice the pieces of the boxplot: • Horizontal scale, maybe a little beyond the min and the max. A generic number line. • The five numbers. • The box holds the quartiles • With a line in the middle at the median. • The whiskers extend out to the min and the max.

  34. TI-84 Boxplot • See instructions on separate handout. • Caution again that TI-84 computes quartiles differently from Hawkes and differently from Excel, so the results aren’t always going to agree.

  35. Additional Topics • Might not be needed for Hawkes homework • But you should be aware of them • Quintiles and Deciles • Interquartile Range and Outliers • TI-84 Box Plot

  36. Quintiles and Deciles • You might also encounter • Quintiles, dividing data set into 5 groups. • Deciles, dividing data set into 10 groups. • Reconcile everything back with percentiles: • Quartiles correspond to percentiles 25, 50, 75 • Deciles correspond to percentiles 10, 20, …, 90 • Quintiles correspond to percentiles 20, 40, 60, 80
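
The correspondence with percentiles follows one pattern: dividing the data into k groups takes k - 1 cut points, spaced 100 / k percentile points apart. A tiny sketch, where the function name is an assumption and k is assumed to divide 100 evenly:

```python
def cut_points(groups):
    """Percentiles that split a data set into the given number of equal groups."""
    return [100 * i // groups for i in range(1, groups)]


print(cut_points(4))   # quartiles: [25, 50, 75]
print(cut_points(5))   # quintiles: [20, 40, 60, 80]
print(cut_points(10))  # deciles:   [10, 20, 30, 40, 50, 60, 70, 80, 90]
```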

  37. Interquartile Range and Outliers • Concept: An OUTLIER is a wacky far-out abnormally small or large data value compared to the rest of the data set. • We’d like something more precise. • Define: IQR = Interquartile Range = Q3 – Q1. • Define: If x is below a lower cutoff based on the IQR, x is an Outlier. • Define: If x is above an upper cutoff based on the IQR, x is an Outlier. • (Other books might make different definitions)

  38. Outliers Example • Here’s a quick elementary example: • Data values 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20 • Mean \(\bar{x} \approx 6.8\) and IQR = 6 • Or in Hawkes method, Q1 = 3.5, Q3 = 9.5, and we still get interquartile range = 9.5 – 3.5 = 6 (it won’t always work out the same but in this case the IQR is the same either way)

  39. Outliers Example • Data values 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20 • We found IQR = 6 and the mean is 6.8 • One definition uses 1.5 × IQR to define outliers • Here, 1.5 × 6 = 9 • Anything more than 9 units away from the reference value is then considered to be abnormally small or large. • Low side: nothing is smaller than the cutoff • High side: the 20 is an outlier.

  40. No-Outliers Example • Data values 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10 • Mean \(\bar{x} \approx 5.9\) and IQR = 6 (any coincidence among these values is insignificant) • Anything more than 9 units away from the reference value is abnormal. • This data set has No Outliers.
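
Both examples can be checked with a short Python sketch. Important assumption: the slides' own cutoff formula did not survive in this transcript, so the sketch uses the widely taught fence rule (below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR), which may differ from the definition the slides intended. The function name is also an assumption. The quartiles Q1 = 3 and Q3 = 9 come from leaving the median out of each half, which gives the IQR of 6 for both data sets.

```python
def outliers_by_fences(data, q1, q3, k=1.5):
    """Values below q1 - k*IQR or above q3 + k*IQR (assumed fence rule)."""
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]


data_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20]
data_b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10]

# With Q1 = 3 and Q3 = 9: IQR = 6, 1.5 * IQR = 9, fences at -6 and 18.
print(outliers_by_fences(data_a, q1=3, q3=9))  # [20]
print(outliers_by_fences(data_b, q1=3, q3=9))  # []
```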

  41. Outliers: Good or Bad? • “I have an outlier in my data set. Should I be concerned?” • Could be bad data. A bad measurement. Somebody not being honest with the pollster. • Could be legitimately remarkable data, genuine true data that’s extraordinarily high or low. • “What should I do about it?” • The presence of an outlier is shouting for attention. Evaluate it and make an executive decision.
