
Exploring Data

reid

Presentation Transcript


  1. Exploring Data: Graphing and Summarizing Univariate Data

  2. Graphing the Data • Graphical displays of quantitative data include: • Dotplot • Stemplot • Histogram • Cumulative Frequency Plots (ogives) • Boxplots

  3. Dotplot • As you might guess, a dotplot is made up of dots plotted on a graph. • Each dot can represent a single observation from a set of data, or a specified number of observations from a set of data. • The dots are stacked in a column over a category or value, so that the height of the column represents the frequency of observations in the category.
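The stacking rule described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `text_dotplot` and the sample data are hypothetical, each dot stands for a single observation, and the layout assumes single-character category labels.

```python
from collections import Counter

def text_dotplot(values):
    """Return a text dotplot: one column of '*' per distinct value."""
    counts = Counter(values)          # frequency of each value
    categories = sorted(counts)
    top = max(counts.values())        # height of the tallest column
    rows = []
    # Build the plot from the top row down so columns stack upward.
    for level in range(top, 0, -1):
        row = " ".join("*" if counts[c] >= level else " " for c in categories)
        rows.append(row.rstrip())
    rows.append(" ".join(str(c) for c in categories))  # axis labels
    return "\n".join(rows)

# Hypothetical data: number of dogs in each of 10 homes.
dogs = [0, 0, 0, 1, 1, 1, 1, 2, 2, 3]
print(text_dotplot(dogs))
```

The height of each column equals the frequency of that value, which is exactly what the slide says a dotplot encodes.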

  4. Dotplot Example: Number of Dogs in Each Home in My Block [Dotplot: ten dots stacked in columns over the values 0, 1, 2, 3; horizontal axis: # of Dogs]

  5. Stemplot
     Stems | Leaves
        15 | 1
        14 |
        13 |
        12 | 2 6
        11 | 4 5 7 9
        10 | 1 2 2 2 5 7 9 9
         9 | 0 2 3 4 4 5 7 8 9 9
         8 | 1 1 4 7 8
     Key: 15 | 1 = 151
     Key: 11 | 7 represents an IQ score of 117
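A stemplot like this one can be built by splitting each value into a stem (all but the last digit) and a leaf (the last digit). The sketch below assumes non-negative whole-number data; the function name `stemplot` and the short list of scores are illustrative, not the full data set from the slide.

```python
from collections import defaultdict

def stemplot(values):
    """Return stem-and-leaf lines, largest stem first."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)   # stem = tens and up, leaf = ones digit
    lines = []
    # Walk every stem in the range so empty stems (gaps) stay visible.
    for stem in range(max(leaves), min(leaves) - 1, -1):
        leaf_text = " ".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {leaf_text}".rstrip())
    return lines

# Illustrative scores only (an assumption, not the slide's complete data).
scores = [151, 122, 126, 114, 117, 101, 102, 102, 90, 99, 81, 84]
print("\n".join(stemplot(scores)))
```

Keeping the empty stems 14 and 13 is a deliberate choice: it makes the gap between 126 and 151 visible, which matters when checking for unusual features.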

  6. Histogram • Note that the bars touch and the variable is quantitative.

  7. Cumulative Frequency Plot: Typical Wait Times • Often used for estimating medians, quartiles, and percentiles [Plot: vertical axis Cum Freq (%); horizontal axis Wait Times (in Hrs.)]
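To see why an ogive is handy for medians and quartiles, here is a small sketch that turns class counts into cumulative percentages and then reads off the first class that reaches a given percentage. The wait-time classes and counts are made up for illustration, and the helper names are hypothetical.

```python
from itertools import accumulate

def cumulative_percent(counts):
    """Cumulative relative frequency (in %) at the end of each class."""
    total = sum(counts)
    return [100 * running / total for running in accumulate(counts)]

def first_class_reaching(percent, upper_bounds, cum_percents):
    """Upper bound of the first class whose cumulative % reaches `percent`."""
    for bound, cum in zip(upper_bounds, cum_percents):
        if cum >= percent:
            return bound
    return None

# Hypothetical wait times: class upper bounds in hours, and counts per class.
upper_bounds = [1, 2, 3, 4, 5]
counts = [4, 10, 16, 6, 4]

cum = cumulative_percent(counts)
print(cum)                                          # cumulative % per class
print(first_class_reaching(50, upper_bounds, cum))  # class holding the median
```

On a drawn ogive the same lookup is done by eye: go across from 50% on the vertical axis to the curve, then down to the horizontal axis.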

  8. Boxplot • Based on the 5-number summary: minimum, Q1, median, Q3, maximum [Figure: boxplot with the median (Med) and maximum (Max) labeled]
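The five numbers behind a boxplot can be computed as sketched below. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" convention (leaving out the middle value when the count is odd), and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # values below the median position
    upper = data[half + n % 2:]      # skip the middle value when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data set of nine observations.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12]
print(five_number_summary(data))
```

Other quartile rules can give slightly different Q1 and Q3 for the same data, so it is worth knowing which rule your calculator or software uses.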

  9. SHAPES of Boxplots • Previous was symmetric • Below is Skewed left • Below is Skewed Right

  10. Checking for outliers An outlier is any value that is either • greater than Q3 + 1.5*IQR OR • less than Q1 – 1.5*IQR Note that whiskers always end at a data value
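The 1.5*IQR rule above translates directly into code. In this sketch the quartiles are passed in rather than computed, the function names are hypothetical, and the data and quartile values are invented for illustration.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences from the 1.5*IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def split_outliers(values, q1, q3):
    """Return the outliers and the whisker ends for a boxplot."""
    low, high = outlier_fences(q1, q3)
    outliers = [v for v in values if v < low or v > high]
    inside = [v for v in values if low <= v <= high]
    # Whiskers stop at the most extreme data values that are not outliers,
    # not at the fences themselves.
    return outliers, (min(inside), max(inside))

# Hypothetical data with Q1 = 4 and Q3 = 9.5 (assumed, not computed here).
data = [2, 4, 4, 5, 7, 8, 9, 10, 30]
print(outlier_fences(4, 9.5))
print(split_outliers(data, 4, 9.5))
```

Here the upper fence is 17.75, so 30 is flagged as an outlier and the upper whisker ends at 10, the largest data value inside the fences.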

  11. What Is Required on ALL Plots? • Title • Labels on the horizontal and vertical axes - be sure if you are using 3 to represent 3,000 that that information is in the label • Scales on both axes (sometimes this is not needed, for example on boxplots) • Labels for each plot if the graph includes multiple data sets (e.g. parallel boxplots)

  12. How to Describe the Graphs Use your SOCS: • Shape • Outliers and/or other unusual features • Center • Spread Discuss all characteristics IN CONTEXT.

  13. Shape • Four Basic Shapes: • Symmetric • Uniform

  14. Skewed left or skewed toward small values • Skewed right or skewed toward large values

  15. Should I Say Normal? Be careful when you describe the shape of a mound-shaped, approximately symmetric distribution. The distribution may or may not be normal. Graders will accept the description "approximately normal," but they will not accept that the distribution is normal based only on a mound-shaped, symmetric graph.

  16. Outliers and other Unusual Features The Usual Unusuals: • Gaps • Clusters • Outliers • Peaks – ex. Bimodal

  17. Center • Mean and median are both measures of center • Median – put the values in order and the median is the middle value (or the mean of the two middle values) – the median divides a histogram into two equal areas • Mean – add the values and divide by the number of values you have – the mean is the balance point for a histogram
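Both measures of center are available in Python's standard `statistics` module. The short sketch below, on made-up data, also checks the "balance point" idea: deviations from the mean always sum to zero.

```python
from statistics import mean, median

# Hypothetical data sets: one with an odd count, one with an even count.
odd_data = [3, 5, 8, 10, 14]
even_data = [3, 5, 8, 10]

print(mean(odd_data), median(odd_data))   # middle value is the median
print(median(even_data))                  # mean of the two middle values

# Balance point: deviations above and below the mean cancel out.
deviations = [x - mean(odd_data) for x in odd_data]
print(sum(deviations))
```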

  18. Spread Several ways to describe: • Range – calculate max - min; the range gives you the total spread in the data. • IQR – calculate Q3 – Q1; IQR gives you the spread of the middle 50% of the data • Standard deviation – roughly the typical distance of data values from the mean (the square root of the average squared deviation, dividing by n – 1 for a sample)
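A short sketch of the range and the sample standard deviation, on hypothetical data. It also computes the mean absolute deviation to show that the standard deviation is not literally the average distance from the mean; the n - 1 divisor is assumed because it matches the sample statistic usually reported as Sx.

```python
from math import sqrt
from statistics import mean, stdev

def sample_sd(values):
    """Sample standard deviation: sqrt of summed squared deviations / (n - 1)."""
    m = mean(values)
    squared = [(x - m) ** 2 for x in values]
    return sqrt(sum(squared) / (len(values) - 1))

# Hypothetical data.
data = [2, 4, 6, 8, 10]

data_range = max(data) - min(data)
mean_abs_dev = sum(abs(x - mean(data)) for x in data) / len(data)

print(data_range)        # total spread
print(sample_sd(data))   # agrees with statistics.stdev(data)
print(mean_abs_dev)      # a different, smaller number than the SD here
```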

  19. How Does the shape impact Mean and Median? • If the shape is approximately symmetric, the mean and median are approximately equal. • If the shape is skewed, the mean is closer to the tail than the median. Ex. Salaries – the mean will be larger than the median because salaries are usually skewed right
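The salary example can be checked numerically. The figures below are invented (in thousands of dollars) purely to show one large value in the right tail pulling the mean above the median.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars; one very large value
# creates a right skew.
salaries = [30, 32, 35, 38, 40, 45, 200]

print(mean(salaries))    # pulled toward the long right tail
print(median(salaries))  # stays near the bulk of the data
```

Removing the 200 would drop the mean sharply but barely move the median, which is what makes the median the more resistant measure.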

  20. The Converse May Not Be True Be careful – If the mean is not equal to the median, you cannot conclude automatically that the shape is skewed.

  21. Comparing Graphs • Comparing means actually comparing, not just listing characteristics • Okay to say • The mean of x = 8 is less than the mean of y = 9. • The medians of x and y are about the same. • The median of x is slightly larger. • The shapes are both skewed left. • Not Okay • The mean of x is 8 and the mean of y is 9. • Median x = 4, median y = 4. • The shapes are similar.

  22. When Do You Use X-Bar/Sx and When Do You Use the 5-Number Summary? • If the distribution is symmetric, use mean and standard deviation. • If the distribution is skewed, use the 5-number summary. • Note that the mean and standard deviation are not resistant to outliers; the median and IQR are resistant.

  23. Other Key Locations on Distributions • Percentile – the pth percentile is the smallest value x for which p percent of the data values are ≤ x. Ex. If the 80th percentile is 28, then 80% of the data equal 28 or less • Quartiles – the 25th, 50th, 75th percentiles. The 25th percentile is the lower or first quartile Q1, the 50th percentile is the median, the 75th percentile is the upper or third quartile Q3. • Z-score – shows how many standard deviations a value is above or below the mean
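The percentile definition on this slide and the z-score can be sketched as follows. The function names and data are hypothetical, and `percentile` follows the slide's "smallest value with at least p percent at or below it" wording, which is only one of several percentile conventions in use.

```python
from math import ceil

def percentile(values, p):
    """Smallest data value with at least p percent of the data <= it."""
    data = sorted(values)
    k = ceil(p * len(data) / 100)   # how many values must be at or below it
    return data[max(k, 1) - 1]

def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical data set of ten values.
data = [12, 15, 18, 20, 21, 24, 26, 28, 31, 35]
print(percentile(data, 80))          # 8 of the 10 values are <= this one
print(z_score(28, mean=20, sd=4))    # assumed mean and SD, for illustration
```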

  24. How do I get the summary values? • You can calculate most of the summary values using 1-Var Stats. • The order on the calculator is: 1-Var Stats L1, or 1-Var Stats L1, L2 when the data values are in L1 and the frequencies are in L2
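For readers without the calculator at hand, the idea of a value list plus a frequency list can be mimicked in Python by repeating each value as many times as its frequency. This is only a rough analogue: the function name `one_var_stats`, the returned keys, and the data are assumptions, and it does not reproduce the calculator's full output.

```python
from statistics import mean, median, stdev

def one_var_stats(values, freqs=None):
    """Summary statistics for values, optionally weighted by whole-number freqs."""
    if freqs is None:
        freqs = [1] * len(values)
    # Repeat each value according to its frequency (like lists L1 and L2).
    expanded = [v for v, f in zip(values, freqs) for _ in range(f)]
    return {
        "n": len(expanded),
        "mean": mean(expanded),
        "Sx": stdev(expanded),      # sample standard deviation
        "median": median(expanded),
    }

# Hypothetical lists: values (like L1) and their frequencies (like L2).
values = [0, 1, 2, 3]
freqs = [3, 4, 2, 1]
print(one_var_stats(values, freqs))
```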

  25. Categorical Data Displays

  26. Frequency Tables: Grades Earned on Test 1
      Grade | Frequency
      A     | 10
      B     | 15
      C     | 5
      D     | 2
      F     | 1
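Using the counts from this slide, the sketch below adds relative frequencies, which is usually what a bar chart or pie chart of categorical data displays. The variable names are illustrative.

```python
# Grade counts taken from the frequency table on this slide.
grades = {"A": 10, "B": 15, "C": 5, "D": 2, "F": 1}

total = sum(grades.values())
# Relative frequency of each grade, as a percentage of all tests.
relative = {grade: 100 * count / total for grade, count in grades.items()}

for grade, pct in relative.items():
    print(f"{grade}: {grades[grade]:>2}  ({pct:.1f}%)")
```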

  27. Bar Chart

  28. Segmented Bar Chart Hobbies By Gender

  29. Two Way Tables: Favorite Leisure Activities
            | Dance | Sports | TV | Total
      Men   |   2   |   10   |  8 |  20
      Women |  16   |    6   |  8 |  30
      Total |  18   |   16   | 16 |  50
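The totals in this table are the marginal distributions, and dividing a cell by its row total gives a conditional proportion. A minimal sketch using the counts from the slide (the variable names are illustrative):

```python
# Cell counts taken from the two-way table on this slide.
table = {
    "Men":   {"Dance": 2,  "Sports": 10, "TV": 8},
    "Women": {"Dance": 16, "Sports": 6,  "TV": 8},
}

# Marginal totals for rows (gender) and columns (activity).
row_totals = {g: sum(cells.values()) for g, cells in table.items()}
col_totals = {a: sum(table[g][a] for g in table) for a in table["Men"]}
grand_total = sum(row_totals.values())

# Conditional proportions: share of each gender choosing each activity.
conditional = {
    g: {a: n / row_totals[g] for a, n in cells.items()}
    for g, cells in table.items()
}

print(row_totals, col_totals, grand_total)
print(conditional["Men"]["Sports"], conditional["Women"]["Sports"])
```

Comparing the conditional proportions (half of the men chose sports versus one fifth of the women) is what a segmented bar chart shows visually.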

  30. One Other Graph – The Pie Chart Sorry – couldn’t resist GOOD LUCK ON THE EXAM!!!
