Data Analysis and Reporting: It’s an outlier!
Histogram • Similar to a bar graph, but it displays measured (quantitative) data, with each bar covering an interval of values.
Stem-and-Leaf Displays • Show the distribution of a quantitative variable, like histograms do, while preserving the individual values.
Stem-and-Leaf Display • How to Construct: • First, cut each data value into leading digits (“stems”) and trailing digits (“leaves”). • Use only one digit for each leaf. • Either round or truncate the data values to one decimal place after the stem.
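The construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values are non-negative, the tens digits serve as stems and the units digit as the leaf, and any decimals are truncated rather than rounded. The function names and the scores are made up for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values into {stem: [leaves]} (assumes non-negative numbers)."""
    plot = defaultdict(list)
    for v in sorted(values):
        n = int(v)                   # truncate any decimals
        stem, leaf = divmod(n, 10)   # leading digits, one trailing digit
        plot[stem].append(leaf)
    return dict(plot)

def render(plot):
    """Draw one row per stem, including stems that have no leaves."""
    lines = []
    for stem in range(min(plot), max(plot) + 1):
        leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return "\n".join(lines)

scores = [62, 67, 71, 74, 74, 78, 85, 89, 93]  # made-up test scores
plot = stem_and_leaf(scores)
print(render(plot))
# 6 | 27
# 7 | 1448
# 8 | 59
# 9 | 3
```

Each row shows the shape of the distribution, like a histogram bar, while every individual score can still be read back from its stem and leaf.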
Dotplots • A dotplot is a simple display. It just places a dot along an axis for each case in the data. • You might see a dotplot displayed horizontally or vertically.
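A rough text version of a dotplot, as a sketch only: it assumes whole-number data and draws the axis vertically, with one "." per case. The function name and the data are hypothetical.

```python
from collections import Counter

def dotplot(values):
    """Return a text dotplot with one row per value and one dot per case."""
    counts = Counter(values)
    lines = []
    for value in range(min(counts), max(counts) + 1):
        # Counter returns 0 for values that never occur, leaving a visible gap.
        lines.append(f"{value:>3} | " + "." * counts[value])
    return "\n".join(lines)

siblings = [1, 2, 2, 3, 3, 3, 5]  # made-up data
print(dotplot(siblings))
```

Rows with no dots make gaps in the data easy to spot.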
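Shape of Data • Does the distribution have one single central peak or several separated peaks? • The peaks are called modes. • A distribution with one peak is unimodal. • A distribution with two peaks is bimodal. • A distribution with more than two peaks is multimodal. • A distribution with no peaks, where the bars are all about the same height, is uniform.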
Shape of Data • If the histogram can be folded vertically in the middle and have the edges match pretty closely, the histogram is symmetric.
Shape of Data • The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail. • The skew is the direction of the tail. • Skewed left: the longer tail is on the left. • Skewed right: the longer tail is on the right.
Shape of Data • You should always mention any stragglers, or outliers, that stand off away from the body of the distribution. • Are there any gaps in the distribution? If so, we might have data from more than one group.
Spread • Always report a measure of spread along with a measure of center when describing a distribution numerically. • The range of the data is the difference between the maximum and minimum values: Range = max – min
Box and Whiskers • The median is the value with exactly half the data values below it and half above it. • The median divides the data into two equal areas. • Use the median as a measure of center when data is skewed. • We find the mean by adding up all of the data values and dividing by n, the number of data values we have. • Use the mean as a measure of center when the data is symmetric.
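A small illustration of why the median is preferred for skewed data, using Python's standard `statistics` module. Both data sets are made up; they differ only in their largest value.

```python
from statistics import mean, median

symmetric = [2, 4, 6, 8, 10]
skewed = [2, 4, 6, 8, 100]   # same data with one extreme value on the right

print(mean(symmetric), median(symmetric))  # mean and median agree: 6 and 6
print(mean(skewed), median(skewed))        # mean is pulled up to 24, median stays 6
```

The single extreme value drags the mean toward the long tail, while the median does not move.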
Box and Whiskers • The interquartile range (IQR) lets us ignore extreme data values and concentrate on the middle of the data. • The lower and upper quartiles are the 25th and 75th percentiles of the data, so… • The IQR contains the middle 50% of the values of the data.
Box and Whiskers • The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum). • Box and Whiskers is a graphical display of the 5-number summary, and sometimes outliers, on a number line.
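The five-number summary can be computed as sketched below. One assumption to flag: textbooks and software disagree on how to find quartiles. This sketch takes the medians of the lower and upper halves of the sorted data and leaves the overall median out of both halves when the count is odd, so other tools may give slightly different quartiles. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value so it is in neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

data = [3, 5, 7, 8, 9, 11, 13, 15, 40]  # made-up data
minimum, q1, med, q3, maximum = five_number_summary(data)
print(minimum, q1, med, q3, maximum)   # 3 6.0 9 14.0 40
print("IQR =", q3 - q1)                # IQR = 8.0
```

The box of a box-and-whiskers plot runs from Q1 to Q3, with a line at the median.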
Box and Whiskers • Use the following steps to test for outliers: • Step 1: Multiply the IQR by 1.5. • Step 2: Add the value from Step 1 to Q3. • Step 3: Subtract the value from Step 1 from Q1. • Anything above the value in Step 2 is an outlier. • Anything below the value in Step 3 is an outlier. • Display outliers as asterisks, *.
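The outlier test can be written directly from the steps above. In this sketch the quartiles are passed in as inputs; the values Q1 = 6 and Q3 = 14 and the data are made up for illustration, and the function names are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs from the 1.5 x IQR rule."""
    step = 1.5 * (q3 - q1)        # Step 1: multiply the IQR by 1.5
    upper = q3 + step             # Step 2: add it to Q3
    lower = q1 - step             # Step 3: subtract it from Q1
    return lower, upper

def find_outliers(values, q1, q3):
    """Return the values that fall outside the cutoffs."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

data = [3, 5, 7, 8, 9, 11, 13, 15, 40]  # made-up data
print(outlier_fences(6, 14))            # (-6.0, 26.0)
print(find_outliers(data, 6, 14))       # [40]
```

Here the IQR is 8, so the cutoffs sit 12 units beyond each quartile, and 40 is the only value that would be drawn as an asterisk.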