
Breakout Session #1 Graphical Statistics

Breakout Session #1 Graphical Statistics. Presented by Dr. Del Ferster. What’s in store for today? We’ll start by doing a needs assessment. Where do you want or need more information regarding the topics for this year’s work? We’ll spend a bit of time looking at some “test-type” problems.


Presentation Transcript


  1. Breakout Session #1: Graphical Statistics. Presented by Dr. Del Ferster

  2. What’s in store for today? • We’ll start by doing a needs assessment. • Where do you want or need more information regarding the topics for this year’s work? • We’ll spend a bit of time looking at some “test-type” problems. • We’ll take another look at graphical statistics. • Different types of plots • 2 column frequency tables • I know, it sounds like a blast!

  3. Let’s do some problems!  Practice Problems GRAPHICAL STATISTICS

  4. Get your popcorn ready! TODAY’S FEATURE PERFORMANCE GRAPHICAL STATISTICS

  5. Descriptive Statistics: Tabular and Graphical Presentations • Summarizing Qualitative Data • Summarizing Quantitative Data • Recall • Qualitative = Essentially just a name. • Quantitative = True numerical data.

  6. We Deal with 2 Types of Data • Numerical/Quantitative Data [Real Numbers]: • Your height • The number of people in your family • Temperature of coffee bought at McDonald’s • The score on your last math test • Qualitative/Categorical Data [Labels rather than numbers]: • Grade of a high school student [Freshman, Sophomore, Junior, Senior] • Favorite color • Political party affiliation • The part of a new automobile that breaks first • The reason you get mad at your spouse

  7. Summarizing Qualitative Data • Frequency Distribution (shows how many) • Relative Frequency Distribution (shows what fraction) • Percent Frequency Distribution (shows what percentage) • Bar Graph • Pie Chart • Both of these are graphical means of displaying any of the above.

  8. Frequency Distribution • A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes. • The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.

  9. Example: Stumble Inn Guests staying at Stumble Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are shown below.

  10. Example: Stumble Inn • Frequency Distribution

      Rating           Frequency
      Poor                 2
      Below Average        3
      Average              5
      Above Average        9
      Excellent            1
      Total               20

  11. Relative Frequency Distribution • The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class. • A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.

  12. Percent Frequency Distribution • The percent frequency of a class is the relative frequency multiplied by 100. • A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.

  13. Example: Stumble Inn • Relative Frequency and Percent Frequency Distributions

      Rating           Relative Frequency   Percent Frequency
      Poor                    .10                  10
      Below Average           .15                  15
      Average                 .25                  25
      Above Average           .45                  45
      Excellent               .05                   5
      Total                  1.00                 100
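
The relative and percent frequencies for Stumble Inn can be reproduced with a few lines of Python. This is only a minimal sketch: the counts are the ones on slide 10, while the variable names (`frequencies`, `relative`, `percent`) are made up for illustration.

```python
# Frequencies from the Stumble Inn frequency distribution (slide 10).
frequencies = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequencies.values())  # number of guests in the sample

# Relative frequency = class frequency / total number of data items.
relative = {rating: count / total for rating, count in frequencies.items()}

# Percent frequency = relative frequency multiplied by 100.
percent = {rating: 100 * rel for rating, rel in relative.items()}

for rating in frequencies:
    print(f"{rating:<14} {frequencies[rating]:>3} "
          f"{relative[rating]:>6.2f} {percent[rating]:>5.0f}")
```

Dividing by the total first and multiplying by 100 afterwards mirrors the two definitions on slides 11 and 12.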

  14. Bar Graph • A bar graph is a graphical device for depicting qualitative data. • On the horizontal axis we specify the labels that are used for each of the classes. • A frequency, relative frequency, or percent frequency scale can be used for the vertical axis. • Using a bar of fixed width drawn above each class label, we extend the height appropriately. • The bars are separated to emphasize the fact that each class is a separate category.

  15. Example: Stumble Inn • Bar Graph [Figure: bar graph with Frequency (1 to 9) on the vertical axis and Rating (Poor, Below Average, Average, Above Average, Excellent) on the horizontal axis.]

  16. Pie Chart • The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data. • First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class. • Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
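
As a quick check of the sector arithmetic, the sketch below converts each Stumble Inn relative frequency (slide 13) into degrees of the circle. The names `relative` and `angles` are illustrative only.

```python
# Relative frequencies from the Stumble Inn table (slide 13).
relative = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class takes (relative frequency) * 360 degrees of the circle.
angles = {rating: rel * 360 for rating, rel in relative.items()}

for rating, angle in angles.items():
    print(f"{rating:<14} {angle:6.1f} degrees")
```

The sectors add up to the full 360 degrees because the relative frequencies add up to 1.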

  17. Example: Stumble Inn • Pie Chart [Figure: pie chart of Quality Ratings with sectors Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]

  18. Example: Stumble Inn • Insights Gained from the Preceding Pie Chart • One-half of the customers surveyed gave Stumble Inn a quality rating of “above average” or “excellent” (looking at the left side of the pie). This might please the manager. • For each customer who gave an “excellent” rating, there were two customers who gave a “poor” rating (looking at the top of the pie). This should displease the manager.

  19. Summarizing Quantitative Data • Frequency Distribution • Relative Frequency and Percent Frequency Distributions • Dot Plot • Histogram • Cumulative Distributions • Ogive

  20. Example: RPM Auto Repair The manager of RPM Auto Repair would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. He examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

  21. Example: RPM Auto Repair • Sample of Parts Cost for 50 Tune-ups • Including a line in the table for every possible cost is not a good idea. • We need to categorize the data.

  22. Frequency Distribution • Guidelines for Selecting Number of Classes • Use between 5 and 20 classes. • Smaller data sets usually require fewer classes. • Data sets with a larger number of elements usually require a larger number of classes. • Note that the upper limit of every class is also the lower limit of the next class. • We treat the upper limit as OPEN (that is, up to but not including that amount).

  23. Frequency Distribution • Guidelines for Selecting Width of Classes • Use classes of equal width. • Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes

  24. Frequency Distribution • For RPM Auto Repair, if we choose 6 classes: Approximate Class Width = (109 - 52)/6 = 9.5, which we round up to 10.

      Parts Cost ($)   Frequency
      50-60                2
      60-70               13
      70-80               16
      80-90                7
      90-100               7
      100-110              5
      Total               50
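
The class-width guideline and the resulting frequency distribution can be sketched in Python. The 50 parts costs below are read off the stem-and-leaf display on slide 36. The starting point of 50 and the rounded width of 10 are assumptions chosen to match the classes on slide 24, and the variable names are illustrative.

```python
# Parts costs ($) for 50 tune-ups, read off the stem-and-leaf display (slide 36).
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]

num_classes = 6

# Approximate class width = (largest value - smallest value) / number of classes.
approx_width = (max(costs) - min(costs)) / num_classes

# Assumption: round the width up to 10 and start the first class at 50,
# which gives the classes 50-60, 60-70, ..., 100-110 used on the slide.
width = 10
lower = 50

# Upper limits are open, so a cost of exactly 60 would fall in the 60-70 class.
counts = [0] * num_classes
for cost in costs:
    counts[(cost - lower) // width] += 1

print("approximate width:", approx_width)
for i, count in enumerate(counts):
    start = lower + i * width
    print(f"{start}-{start + width}: {count}")
```

Integer division by the class width puts each cost in exactly one class, which is what treating the upper limit as open requires.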

  25. Relative Frequency and Percent Frequency Distributions

      Parts Cost ($)   Relative Frequency   Percent Frequency
      50-60                   .04                   4
      60-70                   .26                  26
      70-80                   .32                  32
      80-90                   .14                  14
      90-100                  .14                  14
      100-110                 .10                  10
      Total                  1.00                 100

      For the first class: relative frequency = 2/50 = .04, and percent frequency = .04(100) = 4.

  26. Relative Frequency and Percent Frequency Distributions • For the RPM Auto Repair data, we can make the following observations. • Only 4% of the parts costs are in the $50-60 class. • 30% of the parts costs are under $70. • The greatest percentage (32% or almost one-third) of the parts costs are in the $70-80 class. • 10% of the parts costs are $100 or more.

  27. Dot Plot • One of the simplest graphical summaries of data is a dot plot. • A horizontal axis shows the range of data values. • Then each data value is represented by a dot placed above the axis.

  28. Example: RPM Auto Repair • Dot Plot [Figure: dot plot of the 50 parts costs, with Cost ($) on the horizontal axis from 50 to 110.]

  29. Histogram • Another common graphical presentation of quantitative data is a histogram. • The variable of interest is placed on the horizontal axis. • A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency. • Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.

  30. Example: RPM Auto Repair • Histogram [Figure: histogram with Frequency (2 to 18) on the vertical axis and Parts Cost ($) on the horizontal axis, with class boundaries at 50, 60, 70, 80, 90, 100, 110.]

  31. Cumulative Distributions • Cumulative frequency distribution -- shows the number of items with values less than the upper limit of each class (upper limits are treated as open, as noted on slide 22). • Cumulative relative frequency distribution -- shows the proportion of items with values less than the upper limit of each class. • Cumulative percent frequency distribution -- shows the percentage of items with values less than the upper limit of each class.

  32. Example: RPM Auto Repair • Cumulative Distributions

      Cost ($)   Cumulative    Cumulative Relative   Cumulative Percent
                 Frequency     Frequency             Frequency
      < 60            2              .04                    4
      < 70           15              .30                   30
      < 80           31              .62                   62
      < 90           38              .76                   76
      < 100          45              .90                   90
      < 110          50             1.00                  100
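
The three cumulative columns are running totals of the class frequencies from slide 24. A minimal sketch, with illustrative variable names, could look like this:

```python
from itertools import accumulate

# Class upper limits and frequencies from the parts-cost distribution (slide 24).
upper_limits = [60, 70, 80, 90, 100, 110]
frequencies = [2, 13, 16, 7, 7, 5]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Cumulative relative and percent frequencies follow from the running totals.
cumulative_relative = [c / total for c in cumulative]
cumulative_percent = [100 * r for r in cumulative_relative]

for limit, c, r, p in zip(upper_limits, cumulative,
                          cumulative_relative, cumulative_percent):
    print(f"< {limit:<4} {c:>3} {r:>5.2f} {p:>4.0f}")
```

The last cumulative frequency always equals the sample size, which is a handy sanity check on the table.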

  33. Exploratory Data Analysis • The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly. • One such technique is the stem-and-leaf display.

  34. Stem-and-Leaf Display • A stem-and-leaf display shows both the rank order and shape of the distribution of the data. • It is similar to a histogram on its side, but it has the advantage of showing the actual data values. • The first digits of each data item are arranged to the left of a vertical line. • To the right of the vertical line we record the last digit for each item in rank order. • Each line in the display is referred to as a stem. • Each digit on a stem is a leaf.

       8 | 5 7
       9 | 3 6 7 8

  35. Stem-and-Leaf Display • Leaf Units • A single digit is used to define each leaf. • In the preceding example, the leaf unit was 1. • Leaf units may be 100, 10, 1, 0.1, and so on. • Where the leaf unit is not shown, it is assumed to equal 1.

  36. Example: RPM Auto Repair • Stem-and-Leaf Display

       5 | 2 7
       6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
       7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
       8 | 0 0 2 3 5 8 9
       9 | 1 3 7 7 7 8 9
      10 | 1 4 5 5 9
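
A stem-and-leaf display with a leaf unit of 1 can be built by splitting each value into its leading digits and its last digit. The sketch below assumes whole-dollar values, and `stem_and_leaf` is a hypothetical helper name. The data are the 50 parts costs shown in the display on slide 36.

```python
from collections import defaultdict

# Parts costs ($) as shown in the stem-and-leaf display on slide 36.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]


def stem_and_leaf(values):
    """Group integer values by stem (value // 10); leaves are the last digits."""
    stems = defaultdict(list)
    for value in sorted(values):  # sorting keeps the leaves in rank order
        stems[value // 10].append(value % 10)
    return dict(stems)


display = stem_and_leaf(costs)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

For a different leaf unit (10 or 0.1, say) the values would first need to be rescaled, which this sketch does not attempt.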

  37. SPLIT STEM Stem-and-Leaf Display • If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two or more stems for each leading digit(s). • Whenever a stem value is stated twice, the first value corresponds to leaf values of 0-4, and the second value corresponds to leaf values of 5-9.

  38. Example: RPM Auto Repair • SPLIT STEM Stem-and-Leaf Plot

       5 | 2
       5 | 7
       6 | 2 2 2 2
       6 | 5 6 7 8 8 8 9 9 9
       7 | 1 1 2 2 3 4 4
       7 | 5 5 5 6 7 8 9 9 9
       8 | 0 0 2 3
       8 | 5 8 9
       9 | 1 3
       9 | 7 7 7 8 9
      10 | 1 4
      10 | 5 5 9
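
The split-stem rule (leaves 0-4 on the first copy of a stem, leaves 5-9 on the second) can be sketched as follows. `split_stem_and_leaf` is a hypothetical helper name, and the data are again the 50 whole-dollar parts costs from the stem-and-leaf display.

```python
# Parts costs ($) as shown in the stem-and-leaf display on slide 36.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]


def split_stem_and_leaf(values):
    """Return {(stem, half): leaves}; half 0 holds leaves 0-4, half 1 holds 5-9."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return rows


rows = split_stem_and_leaf(costs)
for (stem, half), leaves in rows.items():
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```

Note that a half-stem with no leaves is simply omitted by this sketch, whereas a hand-drawn display would normally still show the empty stem.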

  39. 2 Way Data Tables • Thus far we have focused on methods that are used to summarize the data for one variable at a time. • Often we are interested in tabular and graphical methods that will help understand the relationship between two variables. • 2 Way Data Tables and scatter diagrams are two methods for summarizing the data for two (or more) variables simultaneously.

  40. 2 Way Data Tables • 2 way data tables are used to summarize the data for two variables simultaneously. • 2 way data tables can be used when: • One variable is qualitative and the other is quantitative • Both variables are qualitative • Both variables are quantitative • The left and top margin labels define the classes for the two variables.

  41. Example: Finger Lakes Homes • 2 Way Data Tables The number of Finger Lakes homes sold for each style and price for the past two years is shown below.

      Price Range    Home Style
                     Colonial   Ranch   Split   A-Frame   Total
      ≤ $99,000         18        6      19       12        55
      > $99,000         12       14      16        3        45
      Total             30       20      35       15       100

  42. Example: Finger Lakes Homes • Insights Gained from the Preceding 2 Way table • The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000. • Only three homes in the sample are an A-Frame style and priced at more than $99,000.

  43. 2 Way Tables: Row or Column Percentages • Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

  44. Example: Finger Lakes Homes • Row Percentages

      Price Range    Home Style
                     Colonial   Ranch   Split   A-Frame   Total
      ≤ $99,000       32.73     10.91   34.55    21.82     100
      > $99,000       26.67     31.11   35.56     6.67     100

      Note: row totals are actually 100.01 due to rounding.

  45. Example: Finger Lakes Homes • Column Percentages

      Price Range    Home Style
                     Colonial   Ranch   Split   A-Frame
      ≤ $99,000       60.00     30.00   54.29    80.00
      > $99,000       40.00     70.00   45.71    20.00
      Total            100       100     100      100
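
Row and column percentages come from dividing each cell by its row total or its column total. The sketch below uses the Finger Lakes counts from slide 41. The dictionary layout and the names `row_pct` and `col_pct` are illustrative choices rather than anything prescribed by the slides.

```python
# Counts from the Finger Lakes Homes 2 way table (slide 41).
styles = ["Colonial", "Ranch", "Split", "A-Frame"]
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentages: each cell divided by its row (price range) total.
row_pct = {
    price: [round(100 * n / sum(row), 2) for n in row]
    for price, row in counts.items()
}

# Column percentages: each cell divided by its column (home style) total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * n / t, 2) for n, t in zip(row, col_totals)]
    for price, row in counts.items()
}

print("Row percentages:", row_pct)
print("Column percentages:", col_pct)
```

Because each percentage is rounded to two decimals, the rounded row entries add up to 100.01 rather than exactly 100, as the note on slide 44 points out.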

  46. A quick 2 way table problem The table above gives the preferences for a variety of people regarding their favorite way to consume potatoes (Yes it’s a carbohydrate extravaganza!!)  • How many boys liked baked? • How many teachers preferred chips? • How many girls were asked? • Out of the people who liked chips, how many were boys?

  47. That was fun, let’s do another one! • This one deals with probabilities. Grab your calculator and let’s rock! A person is picked at random from this sample. • What is the probability that the person picked is a boy? • What is the probability that the person picked likes mashed? • What is the probability the person was a teacher who prefers baked potatoes? • What is the probability that, out of the girls, the person likes chips? • Out of the people who like chips, what is the probability the person is a boy?

  48. Scatter Diagram • A scatter diagram is a graphical presentation of the relationship between two quantitative variables. • One variable is shown on the horizontal axis and the other variable is shown on the vertical axis. • The general pattern of the plotted points suggests the overall relationship between the variables.

  49. Example: Panthers Football Team • Scatter Diagram The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

      x = Number of Interceptions   y = Number of Points Scored
                  1                             14
                  3                             24
                  2                             18
                  1                             17
                  3                             27

  50. Example: Panthers Football Team • Scatter Diagram [Figure: scatter diagram with x = Number of Interceptions (0 to 3) on the horizontal axis and y = Number of Points Scored (0 to 30) on the vertical axis.]
