Chapter 2 Graphical summaries of data
Definitions • A frequency distribution is a table that represents the frequency for each category. • The relative frequency of a category is the frequency of the category divided by the sum of all the frequencies. • Difference between frequency and relative frequency: • The frequency of a category is the number of items in the category. • The relative frequency of a category is the proportion of items in the category.
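The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `frequency_table` and the sample data are hypothetical.

```python
from collections import Counter

def frequency_table(items):
    """Return {category: (frequency, relative_frequency)}."""
    counts = Counter(items)          # frequency: number of items per category
    total = sum(counts.values())     # sum of all the frequencies
    return {cat: (n, n / total) for cat, n in counts.items()}

# Hypothetical data, for illustration only
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]
table = frequency_table(colors)
for cat, (freq, rel) in table.items():
    print(f"{cat:<6} {freq:>3} {rel:.3f}")
```

The relative frequencies always sum to 1, which is a quick check that the table was built correctly.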
Bar Graph • A bar graph is a graphical representation of a frequency distribution. A bar graph consists of rectangles of equal width, with one rectangle for each category. The heights of the rectangles represent the frequencies or relative frequencies of the categories.
Pareto Chart • Sometimes it is desirable to construct a bar graph in which the categories are presented in order of frequency or relative frequency, with the largest frequency or relative frequency on the left and the smallest one on the right. Such a graph is called a Pareto chart.
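The ordering that defines a Pareto chart can be sketched as a sort by descending frequency. The helper name `pareto_order` and the data are hypothetical; only the ordering is shown here, not the drawing of the bars.

```python
from collections import Counter

def pareto_order(items):
    """Return (category, frequency) pairs, largest frequency first."""
    # Counter.most_common() sorts by count in descending order
    return Counter(items).most_common()

# Hypothetical data, for illustration only
complaints = ["late", "damaged", "late", "wrong item", "late", "damaged"]
ordered = pareto_order(complaints)
for category, freq in ordered:
    print(f"{category:<12} {'#' * freq}")
```

Reading the printed rows from top to bottom corresponds to reading the bars of the chart from left to right.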
Side-by-side bar graphs • Sometimes we want to compare two bar graphs that have the same categories. The best way to do this is to construct both bar graphs on the same axes, putting bars that correspond to the same category next to each other. The result is called a side-by-side bar graph.
Pie Charts • A pie chart is an alternative to the bar graph for displaying relative frequency information. A pie chart is a circle. The circle is divided into sections, one for each category.
Definitions • The lower class limit (LCL) of a class is the smallest value that can appear in that class. • The upper class limit (UCL) of a class is the largest value that can appear in that class. • The class width is the difference between consecutive lower class limits.
Requirements for choosing classes • Every observation must fall into one of the classes. • The classes must not overlap (they must be mutually exclusive). • The classes must be of equal width. • There must be no gaps between classes. Even if there are no observations within a class, that class must still appear on the frequency distribution.
How to construct a frequency distribution • Decide how many classes are needed. • Compute the class width.
• Choose a starting point – either the minimum data value or some convenient number slightly smaller. • Compute the lower class limits of the remaining classes by adding the class width to the previous LCL. • Determine the upper class limits by looking at the data and remembering that the classes need to be mutually exclusive. • Make sure you have included the largest and the smallest observations in your classes. • Tally the observations into each class.
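The construction steps can be sketched as follows. This is one possible reading under stated assumptions: each class is treated as the half-open interval from its LCL up to, but not including, the next LCL, which keeps the classes mutually exclusive and gap-free. The function name, its parameters, and the data are hypothetical.

```python
def frequency_distribution(data, start, width, num_classes):
    """Tally data into equal-width classes [lcl, lcl + width)."""
    # Lower class limits: add the class width to the previous LCL
    lower_limits = [start + i * width for i in range(num_classes)]
    counts = [0] * num_classes
    for x in data:
        index = int((x - start) // width)
        # Every observation must fall into one of the classes
        if not 0 <= index < num_classes:
            raise ValueError(f"{x} does not fall into any class")
        counts[index] += 1
    # Classes with no observations still appear, with a count of 0
    return list(zip(lower_limits, counts))

# Hypothetical data, for illustration only
scores = [52, 67, 71, 58, 89, 94, 75, 63, 80, 77]
dist = frequency_distribution(scores, start=50, width=10, num_classes=5)
for lcl, count in dist:
    print(f"{lcl}-{lcl + 9}: {count}")
```

For integer data with a class width of 10, the upper class limit printed here is the LCL plus 9; other data types would need a different convention for the UCL.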
Choosing the number of classes: • There is no single right way to choose the number of classes. • Too many classes will produce a frequency distribution and histogram that have too much detail. • Too few classes will produce a frequency distribution and histogram that do not have enough detail. • For most data, the number of classes should be between 5 and 20.
Graphs • Histograms are related to bar graphs and are appropriate for quantitative data.
Other Graphs • Stem-and-leaf plots are a simple way to display small data sets. • Dot plots • Time-series plots
Stemplot (or Stem-and-Leaf Plot) • A stemplot represents data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit).
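A minimal sketch of the stem/leaf split, assuming non-negative integers where the leaf is the last digit and the stem is everything before it. The function name and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all but the last digit) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 38 -> stem 3, leaf 8
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data, for illustration only
ages = [23, 25, 31, 38, 38, 42, 47, 51]
plot = stem_and_leaf(ages)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Because every original value can be read back from its stem and leaf, the plot keeps the raw data while still showing the shape of the distribution.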
How to Lie with Statistics • Check the vertical scale • Pictographs – using pictures to compare amounts • Three-dimensional graphs and perspective
Misuse # 1- Bad Samples • A voluntary response sample (or self-selected sample) is one in which the respondents themselves decide whether to be included. In this case, valid conclusions can be made only about the specific group of people who agree to participate.
Misuse # 2- Small Samples • Conclusions should not be based on samples that are far too small. Example: basing a school suspension rate on a sample of only three students.
Misuse # 3- Graphs To correctly interpret a graph, you must analyze the numerical information given in the graph, so as not to be misled by the graph’s shape.
Misuse # 4- Pictographs Part (b) is designed to exaggerate the difference by increasing each dimension in proportion to the actual amounts of oil consumption.
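Why scaling every dimension exaggerates a difference can be shown with a little arithmetic: if each of d dimensions of a picture is multiplied by the true ratio r, its apparent size (area for d = 2, volume for d = 3) grows by r to the power d. The sketch below uses a hypothetical function name and a hypothetical ratio, not the oil consumption figures from the slide.

```python
def apparent_ratio(true_ratio, dimensions=2):
    """Size ratio the eye sees when every dimension is scaled by true_ratio."""
    return true_ratio ** dimensions

# Hypothetical example: one amount is twice the other
true_ratio = 2
print(apparent_ratio(true_ratio, dimensions=2))  # flat picture: area ratio
print(apparent_ratio(true_ratio, dimensions=3))  # 3-D picture: volume ratio
```

A picture drawn twice as wide and twice as tall covers four times the area, so a doubling looks like a quadrupling; a three-dimensional drawing makes it look like an eightfold increase.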
Misuse # 5- Percentages Misleading or unclear percentages are sometimes used. For example, if you take 100% of a quantity, you take it all. 110% of an effort does not make sense.
Other Misuses of Statistics • Loaded Questions • Order of Questions • Refusals • Correlation & Causality • Self Interest Study • Precise Numbers • Partial Pictures • Deliberate Distortions