
Exploring Data

Edward Son and Michael Kim


Presentation Transcript


  1. Exploring Data • Edward Son and Michael Kim

  2. Categorical: Pie Chart, Bar Graph • Quantitative: Stem Plot, Histogram, Time Plot, Ogive

  3. Pie Charts - Must include all the categories that make up a whole - Use when you want to emphasize each category’s relation to the whole.

  4. Bar Graph • Easy to make, easy to read • More flexible than pie charts • A graph drawn using rectangular bars to show how large each value is. 

  5. Stem Plot • Gives a quick picture of the shape of a distribution while including the actual numerical values • Making a stem plot - Separate each observation into a stem, consisting of all but the final digit, and a leaf, the final digit. - Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column. - Write each leaf in the row to the right of its stem, in increasing order out from the stem.
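
The three steps above can be sketched in Python. This is only an illustration: the function name `stem_plot` and the data values are hypothetical, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict

def stem_plot(values):
    """Return the rows of a stem plot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for v in sorted(values):
        # Stem = all but the final digit, leaf = the final digit
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # List every stem from smallest to largest, even stems with no leaves
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row}".rstrip())
    return rows

# Hypothetical data, chosen only for illustration
data = [23, 25, 34, 35, 41, 41, 46, 47, 49, 54, 60]
for row in stem_plot(data):
    print(row)
```

Because the values are sorted before they are split, the leaves in each row come out in increasing order automatically.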

  6. Side-By-Side Stem Plot • Can double the number of stems in a plot by splitting each stem into two • When the values have too many digits, trimming the numbers by removing the last digit or digits before making a stem plot is often best

  7. Histogram • Breaks the range of values of a variable into classes and displays only the count or percent of the observations that fall into each class
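
A minimal sketch of the counting step behind a histogram, assuming equal-width classes. The function name `class_counts`, the class boundaries, and the data are hypothetical.

```python
def class_counts(values, start, width, n_classes):
    """Count the observations falling in each equal-width class.

    Class i covers start + i*width up to (but not including) start + (i+1)*width.
    """
    counts = [0] * n_classes
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_classes:  # ignore values outside the chosen range
            counts[i] += 1
    return counts

# Hypothetical whole-number data, chosen only for illustration
data = [23, 25, 34, 35, 41, 41, 46, 47, 49, 54, 60]
counts = class_counts(data, start=20, width=10, n_classes=5)
for i, c in enumerate(counts):
    low = 20 + 10 * i
    percent = 100 * c / len(data)
    # One row per class: a bar of stars, then the count and the percent
    print(f"{low}-{low + 9}: {'*' * c} ({c}, {percent:.0f}%)")
```

Unlike the stem plot, the individual values are no longer visible; only the count or percent in each class is shown.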

  8. Time Plot • A time plot of a variable plots each observation against the time at which it was measured. Always put time on the horizontal scale of your plot and the variable you are measuring on the vertical scale. • Displays how the values of a variable change over time.

  9. Relative Frequency and Cumulative Frequency • Ogive Graph
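
An ogive plots cumulative relative frequency against the upper end of each class. The sketch below shows one way to get those numbers from class counts; the function name and the counts are hypothetical.

```python
from itertools import accumulate

def relative_and_cumulative(counts):
    """Return (relative frequencies, cumulative relative frequencies)."""
    total = sum(counts)
    relative = [c / total for c in counts]
    # Accumulate the counts first, then divide, so the last value is exactly 1.0
    cumulative = [running / total for running in accumulate(counts)]
    return relative, cumulative

# Hypothetical class counts, chosen only for illustration
counts = [2, 2, 5, 1, 1]
relative, cumulative = relative_and_cumulative(counts)
for r, c in zip(relative, cumulative):
    print(f"relative = {r:.3f}, cumulative = {c:.3f}")
```

The cumulative column never decreases and ends at 1 (100%), which is why an ogive always rises from left to right.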

  10. Histograms vs Bar Graphs • Histogram: shows counts or percents of a quantitative variable; bars are connected • Bar graph: displays a categorical variable; bars are separated

  11. Examining a Distribution • Use SOCS • S – Shape • O – Outlier • C – Center • S - Spread

  12. Mode • S – Shape • Mode = peaks (a distribution with one major peak is called unimodal) • Skewness • [Figure labels: Symmetric, Skewed right, Symmetric]

  13. O – Outliers • An important kind of deviation is an outlier, an individual value that falls outside the overall pattern. • [Figure labels: Outliers]

  14. C – Center • Describe the center of a distribution by its midpoint • Ex) mean, median (more information later)

  15. S – Spread • Can describe by giving the smallest and largest values

  16. Five Number Summary • Minimum • Q1 • Median • Q3 • Maximum

  17. Quartiles • First Quartile Q1 • Third Quartile Q3 • Interquartile Range = Q3 – Q1

  18. 1.5 x IQR Rule • Use the 1.5 x IQR Rule for suspected outliers: an observation is a suspected outlier if it falls more than 1.5 x IQR below Q1 or above Q3
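
The five number summary, IQR, and 1.5 x IQR Rule can be sketched together. Quartile conventions differ between textbooks and software; this sketch assumes Q1 and Q3 are the medians of the observations below and above the position of the overall median. The function names and data are hypothetical.

```python
def median(sorted_values):
    """Midpoint of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    xs = sorted(values)
    n = len(xs)
    lower = xs[: n // 2]         # observations below the median position
    upper = xs[(n + 1) // 2 :]   # observations above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def suspected_outliers(values):
    """Values more than 1.5 x IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_cutoff = q1 - 1.5 * iqr
    high_cutoff = q3 + 1.5 * iqr
    return [v for v in values if v < low_cutoff or v > high_cutoff]

# Hypothetical data, chosen only for illustration
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]
print(five_number_summary(data))
print(suspected_outliers(data))
```

A calculator or statistics package may report slightly different quartiles for the same data because it uses another convention.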

  19. Median • It is the midpoint of a distribution • [Figure: a list of observations arranged in increasing order, with the median marked]

  20. Mean • To find the mean of a set of observations, add their values and divide by the number of observations: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
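
A short sketch comparing the two measures of center from the last two slides. The data are hypothetical; `statistics.mean` and `statistics.median` are from the Python standard library and are used here only to cross-check the hand calculation.

```python
import statistics

# Hypothetical data with one unusually large value
data = [1, 2, 2, 3, 4, 7, 30]

# Mean: add the values and divide by the number of observations
mean_by_hand = sum(data) / len(data)

# Median: the midpoint of the sorted list
median_value = statistics.median(data)

print("mean   =", mean_by_hand)
print("median =", median_value)
print("matches statistics.mean:", mean_by_hand == statistics.mean(data))
```

In this sketch the single large value pulls the mean well above the median, which is one reason to report both when a distribution has outliers.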

  21. Standard Deviation • The variance s^2 of a set of observations is the average of the squares of the deviations of the observations from their mean. • The standard deviation s is the square root of the variance s^2.
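
Written as a formula, the definition above looks roughly like this. The sketch assumes the usual sample convention, in which the "average" of the squared deviations divides by n - 1 rather than n; the slide itself does not say which divisor is meant.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
```

Here $x_1, \dots, x_n$ are the observations and $\bar{x}$ is their mean.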

  22. Shape: Slightly skewed left with one peak (unimodal). • Outliers: IQR = Q3 – Q1 = 54 – 35 = 19, so 1.5 x IQR = 28.5 - Lower cutoff: Q1 – 1.5 x IQR = 35 – 28.5 = 6.5 - Upper cutoff: Q3 + 1.5 x IQR = 54 + 28.5 = 82.5 - Every value lies between the cutoffs, therefore there are no outliers. • Center: Approximately at 43.93 • Spread: Babe Ruth’s number of home runs in a single season varies from a low of 23 to a high of 60.

  23. Enter the data in list L1 (STAT, then EDIT) • Calculate 1-Var Stats, or do it all by hand • Mean = 23.176 • Standard deviation = 18.639 • Median = 22

  24. [Figure labels: Minimum value, Q1, Median, Q3, Maximum value]

  25. 81 – 11 = 70 • (45 + 47)/2 = 46

  26. Calculate the standard deviation of {1, 3, 5, 7}
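
One way to work this exercise step by step, assuming the sample standard deviation (dividing by n - 1) is wanted. If the population version (dividing by n) is intended, `statistics.pstdev` gives that instead.

```python
import math
import statistics

data = [1, 3, 5, 7]
n = len(data)

mean = sum(data) / n                             # 4.0
squared_devs = [(x - mean) ** 2 for x in data]   # [9.0, 1.0, 1.0, 9.0]
variance = sum(squared_devs) / (n - 1)           # 20 / 3, assuming the n - 1 divisor
s = math.sqrt(variance)

print("variance =", round(variance, 3))
print("s        =", round(s, 3))
# Cross-check against the standard library's sample standard deviation
print("agrees with statistics.stdev:", math.isclose(s, statistics.stdev(data)))
```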
