1 / 29

Chapter-2: Descriptive Statistics:

Chapter-2: Descriptive Statistics:. Tabular and Graphical Presentations Part A. Exploratory Data Analysis Cross-tabulations; Scatter Diagrams. Tabular and Graphical Procedures. Data. Qualitative Data. Quantitative Data. Tabular Methods. Graphical Methods. Tabular Methods.

oihane
Download Presentation

Chapter-2: Descriptive Statistics:

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter-2: Descriptive Statistics: Tabular and Graphical PresentationsPart A Exploratory Data Analysis Cross-tabulations; Scatter Diagrams

  2. Tabular and Graphical Procedures Data Qualitative Data Quantitative Data Tabular Methods Graphical Methods Tabular Methods Graphical Methods • Bar Graph • Pie Chart • Frequency • Distribution • Rel. Freq. Dist. • Percent Freq. • Distribution • Crosstabulation • Dot Plot • Histogram • Ogive • Scatter • Diagram • Frequency • Distribution • Rel. Freq. Dist. • Cum. Freq. Dist. • Cum. Rel. Freq. • Distribution • Stem-and-Leaf • Display • Crosstabulation

  3. Stem-and-Leaf Display • A stem-and-leaf display is an exploratory data analysis method that shows both the rank order • and shape of the distribution of the data. • It is similar to a histogram, but it has the • advantage of showing the actual data values.

  4. Example: Hudson Auto Repair • Sample of Parts Cost for 50 Tune-ups

  5. Stem-and-Leaf Display 5 6 7 8 9 10 2 7 2 2 2 2 5 6 7 8 8 8 9 9 9 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9 0 0 2 3 5 8 9 1 3 7 7 7 8 9 1 4 5 5 9

  6. Steps to Construct a Stem-and-Leaf Display • Arrange the first digits of each data to the • left of a vertical line. • To the right of the vertical line, record the last • digit for each item in rank order. • Each line in the display is referred to as a stem. • Each digit on a stem is a leaf.

  7. Stem-and-Leaf Display 5 6 7 8 9 10 2 7 2 2 2 2 5 6 7 8 8 8 9 9 9 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9 0 0 2 3 5 8 9 1 3 7 7 7 8 9 1 4 5 5 9 a stem a leaf

  8. Stretched Stem-and-Leaf Display • If we believe the display has condensed the data too much, we can stretch thedisplay by using two stems for each leading digit(s). • Whenever stretched, the first value of a stem corresponds to leaf values of 0 - 4, and the second • value of a stem corresponds to leaf values of 5 - 9.

  9. Stretched Stem-and-Leaf Display 5 5 6 6 7 7 8 8 9 9 10 10 2 7 2 2 2 2 5 6 7 8 8 8 9 9 9 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9 0 0 2 3 5 8 9 1 3 7 7 7 8 9 1 4 5 5 9

  10. Stem-and-Leaf Display—Some notes • Leaf Units • A single digit is used to define each leaf. • In the preceding example, the leaf unit was 1. • Leaf units may be 100, 10, 1, 0.1, and so on. • Where the leaf unit is not shown, it is assumed • to equal 1.

  11. Example: Leaf Unit = 0.1 If we have data with values such as 8.6 11.7 9.4 9.1 10.2 11.0 8.8 a stem-and-leaf display of these data will be Leaf Unit = 0.1 8 9 10 11 6 8 1 4 2 0 7

  12. Example: Leaf Unit = 10 If we have data with values such as 1806 1717 1974 1791 1682 1910 1838 a stem-and-leaf display of these data will be Leaf Unit = 10 16 17 18 19 8 The 82 in 1682 is rounded down to 80 and is represented as an 8. 1 9 0 3 1 7

  13. Crosstabulations and Scatter Diagrams

  14. Crosstabulations and Scatter Diagrams • Thus far we have focused on methods that are used • to summarize the data on one variable at a time. • Cross tabulationand a scatter diagram are methods for • summarizing the data on two (or more) variables • simultaneously. • Help us understand the relationship between two variables.

  15. Crosstabulation • A crosstabulation is a tabular summary of data for two variables. • Crosstabulation can be used when: • one variable is qualitative and the other is • quantitative, • both variables are qualitative, or • both variables are quantitative. • The left and top margin labels define the classes for • the two variables.

  16. Cross-tabulation • Example: The number of Homes sold by style and price for the past two years is shown below. qualitative variable quantitative variable Home Style Price Range Colonial Log Split A-Frame Total 18 6 19 12 55 45 < $99,000 > $99,000 12 14 16 3 30 20 35 15 Total 100

  17. Cross-tabulation • Insights Gained from Preceding Cross-tabulation • The greatest number of homes in the sample (19) • are a split-level style and priced at less than or • equal to $99,000. • Only three homes in the sample are an A-Frame • style and priced at more than $99,000.

  18. Cross-tabulation: Row or Column Percentages • Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

  19. Cross-tabulation Frequency distribution for the price variable Home Style Price Range Colonial Log Split A-Frame Total 18 6 19 12 55 45 < $99,000 > $99,000 12 14 16 3 30 20 35 15 Total 100 Frequency distribution for the home style variable

  20. Cross-tabulation: Row Percentages Home Style Price Range Colonial Log Split A-Frame Total 32.73 10.91 34.55 21.82 100 100 < $99,000 > $99,000 26.67 31.11 35.56 6.67 (Colonial and > $99K)/(All >$99K) x 100 = (12/45) x 100

  21. Simpson’s Paradox • Often Times data in two or more cross-tabulations are aggregated to produce a summary cross-tabulation. • We must be careful in drawing conclusions about the • relationship between the two variables in the • aggregated cross-tabulation. • Simpson’ Paradox: In some cases the conclusions based upon an aggregated cross-tabulation can be completely different from the un-aggregated data.

  22. Scatter Diagram and Trend line

  23. Scatter Diagram and Trend line • A scatter diagram is a graphical presentation of the • relationship between two quantitative variables. • One variable is shown on the horizontal axis and the • other variable is shown on the vertical axis. • The general pattern of the plotted points suggests the • overall relationship between the variables. • A trend line is an approximation of the relationship.

  24. Scatter Diagram • A Positive Relationship y x

  25. Scatter Diagram • A Negative Relationship y x

  26. Scatter Diagram • No Apparent Relationship y x

  27. Example: Panthers Football Team • The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored. x = Number of Interceptions y = Number of Points Scored 14 24 18 17 30 1 3 2 1 3

  28. 35 30 25 20 15 10 5 0 1 0 2 3 4 Scatter Diagram y Number of Points Scored x Number of Interceptions

  29. Example: Panthers Football Team • Insights Gained from the Preceding Scatter Diagram • The scatter diagram indicates a positive relationship • between the number of interceptions and the • number of points scored. • Higher points scored are associated with a higher • number of interceptions. • The relationship is not perfect; all plotted points in • the scatter diagram are not on a straight line.

More Related