
Announcements

Exams returned at end of class • Average = 78 • Standard Dev = 12 • Key with explanations will be posted • Don’t be discouraged: First test is often hardest • Have been focused on categorical data & proportions • Next segment of course will focus on numerical data & means


Presentation Transcript


  1. Announcements • Exams returned at end of class • Average = 78 • Standard Dev = 12 • Key with explanations will be posted • Don’t be discouraged: First test is often hardest • Have been focused on categorical data & proportions • Next segment of course will focus on numerical data & means • Today we discuss summary stats and graphs for numerical data

  2. Numerical Data • Numerical data can be continuous or discrete • Discrete data is restricted by its nature to certain values, usually counts • Continuous data could conceptually be measured to more and more decimal places

  3. Examples of Discrete Data • Number of people • Litter size for animal births • Number of days with rain

  4. Examples of Continuous Data • Temperature (not just 87, but 87.3 degrees) • Time (not just 10 seconds, but 10.58 sec) • Weight (not just 5 lbs, but 5.3 lbs)

  5. Summaries of Numerical Data • Numerical data is summarized by a measure of “center” and a measure of “spread” • There are two pairs of these measures • Mean (center) and standard deviation (spread) • Median (center) and SIQR (spread) • SIQR = Semi-interquartile range

  6. Mean and Standard Deviation • The mean is the average. To compute the mean, add up all the values and divide by the number of observations, n. • The standard deviation is a measure of spread. To compute it, subtract the mean from each value; these differences are called deviations. Square the deviations, total them, divide by n-1 and take the square root.
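In symbols, the two recipes on this slide can be written as below. The notation is an assumption, since the slides give the steps only in words: x_i stands for the i-th observation, n for the number of observations, x-bar for the mean and s for the standard deviation.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
```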

  7. Example 1 • Observations: 50, 63, 72, 84, 91 • Mean = (50+63+72+84+91)/5 = 72 • Deviations = (50-72) = -22, … , (91-72) = 19 • Deviations Squared = 484, … , 361 • Total of above = 484 + … + 361 = 1070 • Total/(5-1) = Total/4 = 267.5 • Square root of 267.5 ≈ 16.36 • Standard Deviation ≈ 16.36
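The arithmetic in Example 1 can be retraced step by step in a few lines of Python. This is only a sketch; the function name `mean_and_sd` is made up for illustration. The square root of 267.5 works out to about 16.355.

```python
import math

def mean_and_sd(values):
    """Return (mean, sample standard deviation) using the steps from the slide."""
    n = len(values)
    mean = sum(values) / n                     # add up the values, divide by n
    deviations = [x - mean for x in values]    # subtract the mean from each value
    squared = [d ** 2 for d in deviations]     # square the deviations
    variance = sum(squared) / (n - 1)          # total them, divide by n - 1
    return mean, math.sqrt(variance)           # take the square root

mean, sd = mean_and_sd([50, 63, 72, 84, 91])
print("mean:", mean)
print("standard deviation:", sd)
```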

  8. Example 2 • Observations: 69, 71, 72, 72, 76 • Mean = (69+71+72+72+76)/5 = 72 • Standard Deviation ≈ 2.55 • This data set has the same mean as example 1, but less variability. Thus, it has a lower “spread” or standard deviation.
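As a cross-check, Python's standard `statistics` module computes the same two summaries directly; `statistics.stdev` uses the n-1 divisor described earlier. The variable names below are illustrative only.

```python
import statistics

example_1 = [50, 63, 72, 84, 91]
example_2 = [69, 71, 72, 72, 76]

for name, data in [("Example 1", example_1), ("Example 2", example_2)]:
    # Same center for both data sets, but a much smaller spread for Example 2
    print(name, "mean =", statistics.mean(data), "SD =", statistics.stdev(data))
```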

  9. Median and SIQR • The median is the middle of the sorted data. One half of the data is higher than the median and one half is below. • SIQR = (upper quartile - lower quartile)/2 • The lower quartile is the value so that one fourth of the data is below it and three fourths of the data is above it. • The upper quartile is the value so that three fourths of the data is below it and one fourth of the data is above it.

  10. Example 1 revisited • Observations: 50, 63, 72, 84, 91 • Median = 72 • Lower Quartile = 63 • Upper Quartile = 84 • SIQR = (84 - 63)/2 = 10.5

  11. Example 2 revisited • Observations: 69, 71, 72, 72, 76 • Median = 72 • Lower Quartile = 71 • Upper Quartile = 72 • SIQR = (72 - 71)/2 = 0.5 • Again, the two data sets have the same “center” but different “spreads”
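The slides do not say which quartile convention they use, and textbooks differ. The sketch below assumes linear interpolation at position (n - 1) × p in the sorted data, a convention that happens to reproduce the quartiles in both revisited examples. The helper names `quantile` and `median_and_siqr` are hypothetical.

```python
import statistics

def quantile(sorted_values, p):
    """Interpolate linearly at position (n - 1) * p (an assumed convention)."""
    position = (len(sorted_values) - 1) * p
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower]
    gap = sorted_values[lower + 1] - sorted_values[lower]
    return sorted_values[lower] + fraction * gap

def median_and_siqr(values):
    """Return (median, semi-interquartile range)."""
    data = sorted(values)
    lower_q = quantile(data, 0.25)
    upper_q = quantile(data, 0.75)
    return statistics.median(data), (upper_q - lower_q) / 2

print(median_and_siqr([50, 63, 72, 84, 91]))  # Example 1
print(median_and_siqr([69, 71, 72, 72, 76]))  # Example 2
```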

  12. Making the comparison • Mean and SD: Sensitive to outliers; sampling distributions are easily found • Median and SIQR: Robust to outliers; sampling distributions are difficult to find • Therefore, we will use the mean and standard deviation for “well behaved” data and we will use the median and SIQR when we have outliers.

  13. Sensitivity vs. Robustness • Observations: 50, 63, 72, 84, 91 • Mean = 72, SD = 16.36 • Median = 72, SIQR = 10.5 • New Observation = 24 • New Mean = 64, New SD = 24.45 • New Median = 67.5, New SIQR = 13.875 • The outlier shifted the mean by 8 and the SD by about 8.1, but the median by only 4.5 and the SIQR by about 3.4, so the mean and SD were more heavily affected than the median and SIQR.
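The before-and-after comparison can be sketched with the standard library. The quartile method is an assumption: `statistics.quantiles` with `method="inclusive"` is used here because it matches the quartile values on the slides, not because the slides name it. The function name `summarize` is made up.

```python
import statistics

def summarize(values):
    """Return both center/spread pairs for a data set."""
    lower_q, median, upper_q = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": median,
        "siqr": (upper_q - lower_q) / 2,
    }

original = [50, 63, 72, 84, 91]
with_outlier = original + [24]   # the new observation from the slide

before = summarize(original)
after = summarize(with_outlier)
for key in before:
    change = after[key] - before[key]
    print(f"{key}: {before[key]:.3f} -> {after[key]:.3f} (change {change:+.3f})")
```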

  14. Sampling Distributions • As we move forward, we will see that the sample mean is approximately normally distributed when the sample is large enough, and that the t-distribution can help describe the sample mean and sample standard deviation • Finding the sampling distributions for the sample median and SIQR is more involved, and will not be covered in this course.
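A small simulation can make the idea of a sampling distribution concrete. Everything in this sketch is hypothetical and not taken from the slides: the skewed population (exponential with mean 10), the sample size of 30 and the 2000 repetitions. It draws many samples, records each sample mean, and compares the spread of those means with the population SD divided by the square root of the sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POPULATION_MEAN = 10.0   # hypothetical value, not from the slides
SAMPLE_SIZE = 30
REPETITIONS = 2000

def one_sample_mean():
    # expovariate(1 / m) draws from a right-skewed population with mean m and SD m
    sample = [random.expovariate(1 / POPULATION_MEAN) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)

sample_means = [one_sample_mean() for _ in range(REPETITIONS)]
print("mean of sample means:", round(statistics.mean(sample_means), 2))
print("SD of sample means:  ", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(n):     ", round(POPULATION_MEAN / SAMPLE_SIZE ** 0.5, 2))
```

Even though the individual observations are skewed, the sample means cluster around the population mean with a much smaller spread.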

  15. Summary Graphs • Stem-and-leaf chart • Histogram • Box plot

  16. Exam Grades: Stem-and-leaf plot • The stems are the first digit of the grade and placed to the left of the line • The leaves are the second digit of the grade and placed to the right of the line • Each grade is represented • Example: There are three 81’s

      9 | 5677
      9 | 001123444
      8 | 566666778889
      8 | 00001112222233444
      7 | 55668999999
      7 | 0011223344
      6 | 567777777788899
      6 | 000123334
      5 | 6799
      5 | 14
      4 | 8
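A stem-and-leaf plot like this one can be built with a short routine. The sketch below assumes two-digit grades and the split-stem layout seen in the plot (leaves 5 to 9 on one row, leaves 0 to 4 on the next, highest stems first). The function name `stem_and_leaf` is made up, and the input is just the seven lowest grades read off the plot.

```python
from collections import defaultdict

def stem_and_leaf(grades):
    """Return the rows of a split-stem plot, highest grades first."""
    rows = defaultdict(list)
    for grade in sorted(grades):
        stem, leaf = divmod(grade, 10)          # first digit, second digit
        rows[(stem, leaf >= 5)].append(str(leaf))
    lines = []
    for stem, upper_half in sorted(rows, reverse=True):
        lines.append(f"{stem} | {''.join(rows[(stem, upper_half)])}")
    return lines

lowest_grades = [48, 51, 54, 56, 57, 59, 59]   # bottom three rows of the plot
print("\n".join(stem_and_leaf(lowest_grades)))
```

Rows with no grades are simply omitted in this sketch.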

  17. Histogram: Section 506 • Histogram is a bar chart • More aesthetic than a stem-and-leaf • Cannot reconstruct the data set from a histogram
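The last point can be illustrated in code. The sketch below rebuilds the exam grades from the stem-and-leaf rows and then reduces them to histogram bin counts; once only the counts are kept, the individual grades cannot be recovered. The bin width of 10 is an arbitrary choice, the function name `histogram_counts` is made up, and these grades may not be the same data as the Section 506 histogram on the slide.

```python
from collections import Counter

# Stem-and-leaf rows as (stem, leaves); every grade can be read back from them
rows = [
    (9, "5677"), (9, "001123444"),
    (8, "566666778889"), (8, "00001112222233444"),
    (7, "55668999999"), (7, "0011223344"),
    (6, "567777777788899"), (6, "000123334"),
    (5, "6799"), (5, "14"),
    (4, "8"),
]
grades = [10 * stem + int(leaf) for stem, leaves in rows for leaf in leaves]

def histogram_counts(values, width=10):
    """Count how many values fall in each bin [start, start + width)."""
    return Counter((v // width) * width for v in values)

counts = histogram_counts(grades)
for start in sorted(counts):
    print(f"{start}-{start + 9}: {'#' * counts[start]}")
```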

  18. Box-plots • Useful for comparing groups • Center line is median • Top of box is upper quartile • Bottom of box is lower quartile [Figure: two side-by-side box plots, each labeled Max, Upper Q., Median, Lower Q., Min]

  19. More On Boxplots • Same data sets as before, but a zero was added to each • Outliers are represented as points • Definition of outlier is based on the quartiles and the SIQR [Figure: two side-by-side box plots, each labeled Max, Upper Q., Median, Lower Q., Lowest Non-outlier, Min]
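The slide does not state the exact outlier rule. One common convention, assumed here, flags any point more than 3 × SIQR (that is, 1.5 × the interquartile range) below the lower quartile or above the upper quartile. The sketch applies that assumed rule to Example 1 with a zero added, which is an illustrative choice because the data behind the slide's box plots are not given in the transcript. The function name `find_outliers` and the `multiplier` parameter are hypothetical.

```python
import statistics

def find_outliers(values, multiplier=3.0):
    """Return (outliers, lowest non-outlier, highest non-outlier)."""
    lower_q, _, upper_q = statistics.quantiles(values, n=4, method="inclusive")
    siqr = (upper_q - lower_q) / 2
    low_fence = lower_q - multiplier * siqr     # assumed rule, not from the slides
    high_fence = upper_q + multiplier * siqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    inliers = [v for v in values if low_fence <= v <= high_fence]
    return outliers, min(inliers), max(inliers)

outliers, lowest, highest = find_outliers([0, 50, 63, 72, 84, 91])
print("outliers:", outliers)
print("lowest non-outlier:", lowest)
print("highest non-outlier:", highest)
```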

  20. Grades and Curving • Why I don’t curve: Low scores indicate a problem to be addressed: learning is not happening. Curving does not encourage learning, it is a cheap fix for low grades • What I do instead: Sometimes I offer exam corrections. Other times I offer additional bonus assignments • This time: a bonus assignment will be offered

  21. A Tale of Two Aggies • John: In my Fall 99 class; First Exam: D; Good HW & Quiz; Made office visits; Grades improved; Class grade: A; I did not curve • Sarah: In my Fall 99 class; First Exam: D; Skipped HW & Quiz; Never came by office; Class grade: F; Whined; I did not curve

  22. Bonus E: Election Coverage • Give a statistical critique of election coverage of next week’s debate • If you can’t watch debate, you may use a magazine or newspaper (include copy) • Clarity: 2 points • Validity: 2 points • Brevity: 2 points • Typed on paper: due Oct. 24

  23. How to make a stem-and-leaf • Click the Editor button • Enter data in columns • Click Close button • Go to Graphs: One Variable: Stem-and-Leaf • Select the variable of interest • Click OK

  24. How to make a histogram • After entering data, go to Graphs: One Variable: Histogram: Continuous Variable • Select variable of interest • Set desired options • Click OK

  25. How to make box-plots • Go to Graphs: Comparison of Variables: Box Plot Comparison • Select all variables of interest (makes side-by-side box plots) • Click OK.
