Breakout Session #1 Graphical Statistics. Presented by Dr. Del Ferster. What’s in store for today?. We’ll start by doing a needs assessment. Where do you want or need more information regarding the topics for this year’s work. We’ll spend a bit of time looking at some “test-type” problems.
Breakout Session #1: Graphical Statistics. Presented by Dr. Del Ferster
What’s in store for today? • We’ll start by doing a needs assessment. • Where do you want or need more information regarding the topics for this year’s work? • We’ll spend a bit of time looking at some “test-type” problems. • We’ll take another look at graphical statistics. • Different types of plots • 2 column frequency tables. • I know, it sounds like a blast!
Let’s do some problems! Practice Problems GRAPHICAL STATISTICS
Get your popcorn ready! TODAY’S FEATURE PERFORMANCE GRAPHICAL STATISTICS
Descriptive Statistics: Tabular and Graphical Presentations • Summarizing Qualitative Data • Summarizing Quantitative Data • Recall • Qualitative = Essentially just a name. • Quantitative = True numerical data.
We Deal with 2 Types of Data • Numerical/Quantitative Data [Real Numbers]: • Your height • The number of people in your family • The temperature of coffee bought at McDonald's • The score on your last math test • Qualitative/Categorical Data [Labels rather than numbers]: • Grade of a high school student [Freshman, Sophomore, Junior, Senior] • Favorite color • Political party affiliation • The part of a new automobile that breaks first • The reason you get mad at your spouse
Summarizing Qualitative Data • Frequency Distribution (shows how many) • Relative Frequency Distribution (shows what fraction) • Percent Frequency Distribution (shows what percentage) • Bar Graph • Pie Chart • Both of these are graphical means of displaying any of the above.
Frequency Distribution • A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes. • The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.
Example: Stumble Inn Guests staying at Stumble Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are shown below.
Example: Stumble Inn • Frequency Distribution
Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20
Relative Frequency Distribution • The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class. • A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency Distribution • The percent frequency of a class is the relative frequency multiplied by 100. • A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
Example: Stumble Inn • Relative Frequency and Percent Frequency Distributions
Rating           Relative Frequency   Percent Frequency
Poor                   .10                  10
Below Average          .15                  15
Average                .25                  25
Above Average          .45                  45
Excellent              .05                   5
Total                 1.00                 100
Bar Graph • A bar graph is a graphical device for depicting qualitative data. • On the horizontal axis we specify the labels that are used for each of the classes. • A frequency, relative frequency, or percent frequency scale can be used for the vertical axis. • Using a bar of fixed width drawn above each class label, we extend the height appropriately. • The bars are separated to emphasize the fact that each class is a separate category.
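The three Stumble Inn distributions can be sketched in a few lines of Python. This is only an illustration built from the frequency counts in the table; the variable names are assumptions and not part of the original session.

```python
# Frequency counts taken from the Stumble Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())  # 20 guests in the sample

# Relative frequency = class frequency / total number of items.
relative = {rating: count / total for rating, count in frequency.items()}

# Percent frequency = relative frequency * 100 (computed from the counts directly).
percent = {rating: 100 * count / total for rating, count in frequency.items()}

for rating in frequency:
    print(f"{rating:<14} {frequency[rating]:>3} {relative[rating]:>6.2f} {percent[rating]:>5.0f}")
```

The relative frequencies should add up to 1 and the percent frequencies to 100, which is a handy check on any table built by hand.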
Example: Stumble Inn • Bar Graph [Figure: bar graph of frequency (vertical axis, 1 to 9) by rating (horizontal axis: Poor, Below Average, Average, Above Average, Excellent)]
Pie Chart • The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data. • First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class. • Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
Example: Stumble Inn • Pie Chart [Figure: pie chart of quality ratings: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]
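As a rough sketch of the sector arithmetic, the Python below turns the Stumble Inn frequency counts into pie-chart angles. The names are illustrative assumptions; the rule used is the one stated on the slide, relative frequency times 360 degrees.

```python
# Frequency counts from the Stumble Inn example.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())

# Sector angle = relative frequency * 360 degrees.
angles = {rating: 360 * count / total for rating, count in frequency.items()}

for rating, angle in angles.items():
    print(f"{rating:<14} {angle:>6.1f} degrees")
```

The "Average" class, with a relative frequency of .25, gets the 90-degree sector described above, and the angles together fill the full circle.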
Example: Stumble Inn • Insights Gained from the Preceding Pie Chart • One-half of the customers surveyed gave Stumble Inn a quality rating of “above average” or “excellent” (looking at the left side of the pie). This might please the manager. • For each customer who gave an “excellent” rating, there were two customers who gave a “poor” rating (looking at the top of the pie). This should displease the manager.
Summarizing Quantitative Data • Frequency Distribution • Relative Frequency and Percent Frequency Distributions • Dot Plot • Histogram • Cumulative Distributions • Ogive
Example: RPM Auto Repair The manager of RPM Auto Repair would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. He examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.
Example: RPM Auto Repair • Sample of Parts Cost for 50 Tune-ups • Including a line in the table for every possible cost is not a good idea. • We need to categorize the data.
Frequency Distribution • Guidelines for Selecting Number of Classes • Use between 5 and 20 classes • Smaller data sets usually require fewer classes • Data sets with a larger number of elements usually require a larger number of classes. • Note that the upper limit of every class is also the lower limit of the next class. • We treat the upper limit as OPEN: a class includes values up to, but not including, that amount.
Frequency Distribution • Guidelines for Selecting Width of Classes • Use classes of equal width. • Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes
Frequency Distribution • For RPM Auto Repair, if we choose 6 classes: Approximate Class Width = (109 - 52) / 6 = 9.5, which we round up to 10.
Parts Cost ($)   Frequency
50-60                2
60-70               13
70-80               16
80-90                7
90-100               7
100-110              5
Total               50
Relative Frequency and Percent Frequency Distributions
Parts Cost ($)   Relative Frequency   Percent Frequency
50-60                  .04                   4
60-70                  .26                  26
70-80                  .32                  32
80-90                  .14                  14
90-100                 .14                  14
100-110                .10                  10
Total                 1.00                 100
For the 50-60 class: relative frequency = 2/50 = .04; percent frequency = .04(100) = 4.
Relative Frequency and Percent Frequency Distributions • For the RPM Auto Repair data, we can make the following observations. • Only 4% of the parts costs are in the $50-60 class. • 30% of the parts costs are under $70. • The greatest percentage (32% or almost one-third) of the parts costs are in the $70-80 class. • 10% of the parts costs are $100 or more.
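The class-width guideline and the resulting counts can be checked with a short Python sketch. The 50 parts costs below are read back from the stem-and-leaf display later in this deck (the original data slide did not survive), and the choice to round the width up and start the first class at 50 is an assumption that matches the classes shown on the slide.

```python
import math

# Parts costs ($), read back from the stem-and-leaf display in this deck.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]

num_classes = 6
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52) / 6
width = math.ceil(approx_width)  # round up to a convenient whole-dollar width
start = 50  # lower limit of the first class, as on the slide

# Class i covers start + i*width up to (but not including) start + (i+1)*width.
frequency = [0] * num_classes
for cost in costs:
    frequency[(cost - start) // width] += 1

for i, count in enumerate(frequency):
    low = start + i * width
    print(f"{low}-{low + width}: {count}")
```

Integer division places a cost of exactly 60 in the 60-70 class, which is the "open upper limit" convention in action.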
Dot Plot • One of the simplest graphical summaries of data is a dot plot. • A horizontal axis shows the range of data values. • Then each data value is represented by a dot placed above the axis.
Example: RPM Auto Repair • Dot Plot [Figure: dot plot of parts cost ($), horizontal axis from 50 to 110]
Histogram • Another common graphical presentation of quantitative data is a histogram. • The variable of interest is placed on the horizontal axis. • A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency. • Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.
Example: RPM Auto Repair • Histogram [Figure: histogram of frequency (vertical axis, 2 to 18) against parts cost in $ (horizontal axis, 50 to 110 in steps of 10)]
Cumulative Distributions • Cumulative frequency distribution: shows the number of items with values less than the upper limit of each class. • Cumulative relative frequency distribution: shows the proportion of items with values less than the upper limit of each class. • Cumulative percent frequency distribution: shows the percentage of items with values less than the upper limit of each class. • We say "less than" because each upper limit is treated as open.
Example: RPM Auto Repair • Cumulative Distributions
Cost ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
< 60                2                       .04                              4
< 70               15                       .30                             30
< 80               31                       .62                             62
< 90               38                       .76                             76
< 100              45                       .90                             90
< 110              50                      1.00                            100
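A dot plot is simple enough to draw as text. The Python sketch below stacks one dot per data value above a cost axis; the 50 parts costs are read back from the stem-and-leaf display later in this deck, and the layout is just one illustrative way to draw it.

```python
from collections import Counter

# Parts costs ($), read back from the stem-and-leaf display in this deck.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]

counts = Counter(costs)          # how many times each cost occurs
tallest = max(counts.values())   # height of the tallest stack of dots

# Build rows from the top down: a dot appears where a cost occurs at least `level` times.
rows = []
for level in range(tallest, 0, -1):
    rows.append("".join("." if counts[v] >= level else " " for v in range(50, 111)))

# Axis labels every $10, each starting in the column of its own value.
axis = "".join(str(v).ljust(10) for v in range(50, 111, 10))

print("\n".join(rows))
print(axis)
```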
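The cumulative columns are running totals of the class frequencies. A minimal Python sketch, assuming the six class frequencies from the parts-cost frequency distribution (names are illustrative):

```python
from itertools import accumulate

upper_limits = [60, 70, 80, 90, 100, 110]  # open upper limit of each class
frequency = [2, 13, 16, 7, 7, 5]           # class frequencies from the table
total = sum(frequency)

# Running total of the frequencies gives the cumulative frequency.
cumulative = list(accumulate(frequency))
cumulative_relative = [c / total for c in cumulative]
cumulative_percent = [100 * c / total for c in cumulative]

for limit, c, rel, pct in zip(upper_limits, cumulative, cumulative_relative, cumulative_percent):
    print(f"< {limit:<4} {c:>3} {rel:>5.2f} {pct:>4.0f}")
```

The last cumulative frequency always equals the sample size, so the final row should read 50, 1.00 and 100.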
Exploratory Data Analysis • The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly. • One such technique is the stem-and-leaf display.
Stem-and-Leaf Display • A stem-and-leaf display shows both the rank order and shape of the distribution of the data. • It is similar to a histogram on its side, but it has the advantage of showing the actual data values. • The first digits of each data item are arranged to the left of a vertical line. • To the right of the vertical line we record the last digit for each item in rank order. • Each line in the display is referred to as a stem. • Each digit on a stem is a leaf.
 8 | 5 7
 9 | 3 6 7 8
Stem-and-Leaf Display • Leaf Units • A single digit is used to define each leaf. • In the preceding example, the leaf unit was 1. • Leaf units may be 100, 10, 1, 0.1, and so on. • Where the leaf unit is not shown, it is assumed to equal 1.
Example: RPM Auto Repair • Stem-and-Leaf Display
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
SPLIT STEM Stem-and-Leaf Display • If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s). • Whenever a stem value is stated twice, the first value corresponds to leaf values of 0-4, and the second value corresponds to leaf values of 5-9.
Example: RPM Auto Repair • SPLIT STEM Stem-and-Leaf Display
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
2 Way Data Tables • Thus far we have focused on methods that are used to summarize the data for one variable at a time. • Often we are interested in tabular and graphical methods that will help understand the relationship between two variables. • 2 Way Data Tables and scatter diagrams are two methods for summarizing the data for two (or more) variables simultaneously.
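Both displays can be generated from the data with one small function. The Python below is a sketch under stated assumptions: the leaf unit is 1, the function name `stem_and_leaf` and its `split` flag are made up for illustration, and the 50 parts costs are the ones shown in the displays above.

```python
from collections import defaultdict

# Parts costs ($) for the 50 tune-ups, as shown in the stem-and-leaf display.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]


def stem_and_leaf(values, split=False):
    """Return the display as a list of text lines (leaf unit = 1)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # last digit is the leaf, the rest is the stem
        # With split stems, the second copy of a stem holds leaves 5-9.
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)
    return [
        f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}"
        for (stem, _), leaves in sorted(rows.items())
    ]


print("\n".join(stem_and_leaf(costs)))
print()
print("\n".join(stem_and_leaf(costs, split=True)))
```

One limitation of this sketch: a stem with no leaves is simply left out, whereas a hand-drawn display would normally keep the empty stem so gaps in the data stay visible.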
2 Way Data Tables • 2 way data tables are used to summarize the data for two variables simultaneously. • 2 way data tables can be used when: • One variable is qualitative and the other is quantitative • Both variables are qualitative • Both variables are quantitative • The left and top margin labels define the classes for the two variables.
Example: Finger Lakes Homes • 2 Way Data Tables The number of Finger Lakes homes sold for each style and price for the past two years is shown below.
                        Home Style
Price Range   Colonial   Ranch   Split   A-Frame   Total
< $99,000        18         6      19       12       55
> $99,000        12        14      16        3       45
Total            30        20      35       15      100
Example: Finger Lakes Homes • Insights Gained from the Preceding 2 Way table • The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000. • Only three homes in the sample are an A-Frame style and priced at more than $99,000.
2 Way Tables: Row or Column Percentages • Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.
Example: Finger Lakes Homes • Row Percentages
                        Home Style
Price Range   Colonial   Ranch   Split   A-Frame   Total
< $99,000       32.73    10.91   34.55    21.82     100
> $99,000       26.67    31.11   35.56     6.67     100
Note: row totals are actually 100.01 due to rounding.
Example: Finger Lakes Homes • Column Percentages
                        Home Style
Price Range   Colonial   Ranch   Split   A-Frame
< $99,000       60.00    30.00   54.29    80.00
> $99,000       40.00    70.00   45.71    20.00
Total            100      100     100      100
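Row and column percentages both come from the same table of counts; only the denominator changes. A minimal Python sketch using the Finger Lakes counts (the dictionary layout and names are assumptions made for illustration):

```python
styles = ["Colonial", "Ranch", "Split", "A-Frame"]

# Counts of homes sold, one row per price range, in the order of `styles`.
counts = {
    "< $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentages: each cell as a share of its price-range (row) total.
row_pct = {
    price: [round(100 * n / sum(row), 2) for n in row]
    for price, row in counts.items()
}

# Column percentages: each cell as a share of its home-style (column) total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * n / t, 2) for n, t in zip(row, col_totals)]
    for price, row in counts.items()
}

for price in counts:
    print(price, "row %:", row_pct[price], "column %:", col_pct[price])
```

Row percentages answer "of the homes in this price range, what share is each style?", while column percentages answer "of the homes of this style, what share falls in each price range?".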
A quick 2 way table problem The table above gives the preferences for a variety of people regarding their favorite way to consume potatoes (Yes it’s a carbohydrate extravaganza!!) • How many boys liked baked? • How many teachers preferred chips? • How many girls were asked? • Out of the people who liked chips, how many were boys?
That was fun, let’s do another one! • This one deals with probabilities. Grab your calculator and let’s rock! A person is picked at random from this sample. • What is the probability that the person picked is a boy? • What is the probability that the person picked likes mashed? • What is the probability that the person is a teacher who prefers baked potatoes? • What is the probability that, out of the girls, the person likes chips? • Out of the people who like chips, what is the probability that the person is a boy?
Scatter Diagram • A scatter diagram is a graphical presentation of the relationship between two quantitative variables. • One variable is shown on the horizontal axis and the other variable is shown on the vertical axis. • The general pattern of the plotted points suggests the overall relationship between the variables.
Example: Panthers Football Team • Scatter Diagram The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.
x = Number of Interceptions   y = Number of Points Scored
            1                             14
            3                             24
            2                             18
            1                             17
            3                             27
Example: Panthers Football Team • Scatter Diagram [Figure: scatter diagram of number of points scored (y, vertical axis, 0 to 30) against number of interceptions (x, horizontal axis, 0 to 3)]