
Statistics and Data (Algebraic)



Presentation Transcript


  1. Statistics and Data (Algebraic) Sec. 9.7a

  2. Some Definitions… Statistic – a number associated with a data set (statistics used to describe the individuals in the data set are called descriptive statistics). Parameter – a number associated with an entire population (we gather information from samples of the population, then use inferential statistics to make inferences about parameters).

  3. Some Definitions… A 1996 study reported that 33% of adolescents say there is no adult at home when they return from school. The report was based on a survey of 600 randomly selected people aged 12 to 17 years old and had a margin of error of ±4%. Did the survey measure a parameter or a statistic, and what does that “margin of error” mean? The survey did not measure all adolescents in the population, so it did not measure a parameter. It sampled 600 adolescents and found a statistic… However, note that the first sentence makes an inference about all American adolescents…

  4. Some Definitions… A 1996 study reported that 33% of adolescents say there is no adult at home when they return from school. The report was based on a survey of 600 randomly selected people aged 12 to 17 years old and had a margin of error of ±4%. Did the survey measure a parameter or a statistic, and what does that “margin of error” mean? Interpret the margin of error as meaning “between 29% and 37% of all American adolescents would say that there is no adult home when they return from school.”
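
The interval quoted in slide 4 is the reported percentage plus and minus the margin of error. A minimal Python sketch of that arithmetic (the function name `margin_interval` is an illustrative choice, not something from the presentation):

```python
def margin_interval(estimate, margin):
    """Return (low, high) for a value reported as estimate ± margin."""
    return estimate - margin, estimate + margin


# Percentages from the 1996 survey described in slides 3 and 4.
low, high = margin_interval(33, 4)
print(f"between {low}% and {high}%")
```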

  5. Some Definitions… What is the mathematical meaning of the word “average”? There are three possible meanings, all of them measures of center. The mean of a list of n numbers $x_1, x_2, \ldots, x_n$ is $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The mean is also called the arithmetic mean, arithmetic average, or average value. EX: “The average on last week’s test was 83.4.”

  6. Some Definitions… What is the mathematical meaning of the word “average”? There are three possible meanings, all of them measures of center. The median of a list of n numbers arranged in order (either ascending or descending) is the middle number if n is odd, and the mean of the two middle numbers if n is even. EX: “The average test score puts you right in the middle of the class.”

  7. Some Definitions… What is the mathematical meaning of the word “average”? There are three possible meanings, all of them measures of center. The mode of a list of numbers is the number that appears most frequently in the list. EX: “The average American student starts college at age 18.” Note: A statistic is called resistant if it is not strongly affected by outliers. Which of our three averages would be considered resistant?
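
One way to explore the resistance question is to compute the three averages with and without an extreme value. The sketch below uses Python's standard `statistics` module on the Barry Bonds home run totals listed in slide 23. Treating the 73-homer season as the value to drop is a choice made here for illustration.

```python
from statistics import mean, median, multimode

# Barry Bonds's annual home run totals, as listed in slide 23.
bonds = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

# The same list with the extreme 73-homer season removed.
without_73 = [x for x in bonds if x != 73]

print("mean:  ", round(mean(bonds), 2), "->", round(mean(without_73), 2))
print("median:", median(bonds), "->", median(without_73))
print("modes: ", multimode(bonds), "->", multimode(without_73))
```

Dropping the one extreme season moves the mean from about 35.4 to about 32.9, while the median stays at 34 and the most frequent values are unchanged.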

  8. Guided Practice Find the mean, median, and mode of the annual home run totals for Roger Maris’s major league career: Mean: Is this statistic resistant? Not really…

  9. Guided Practice Find the mean, median, and mode of the annual home run totals for Roger Maris’s major league career: To find the median, first write the data set in order: Because there are 12 numbers, we average the middle two: Median: Is this statistic resistant? Much more so than the mean…

  10. Guided Practice Find the mean, median, and mode of the annual home run totals for Roger Maris’s major league career: How about the mode??? This data set has no mode!!! So what’s the mode for Hank Aaron’s home run totals? (see Table 9.8 on p.764) Mode: 44 The mode is typically the least important measure of center, but it sometimes has statistical significance…

  11. Guided Practice A teacher gives a 10-point quiz and records the scores in the frequency table shown below. Find the mode, median, and mean of the data set.

      Score:     10   9   8   7   6   5   4   3   2   1   0
      Frequency:  2   2   3   8   4   3   3   2   1   1   1

      First, how many total scores are there? Add the frequencies: there are 30 scores. To find the mode, look for the score with the highest frequency. Mode: 7

  12. Guided Practice A teacher gives a 10-point quiz and records the scores in the frequency table shown below. Find the mode, median, and mean of the data set.

      Score:     10   9   8   7   6   5   4   3   2   1   0
      Frequency:  2   2   3   8   4   3   3   2   1   1   1

      The median will be the mean of the 15th and 16th numbers. Count the frequencies from left to right until we come to 15. The 15th number is 7, and the 16th number is 6. Median: 6.5

  13. Guided Practice A teacher gives a 10-point quiz and records the scores in the frequency table shown below. Find the mode, median, and mean of the data set.

      Score:     10   9   8   7   6   5   4   3   2   1   0
      Frequency:  2   2   3   8   4   3   3   2   1   1   1

      To find the mean, multiply each number by its frequency, add the products, and divide the total by 30: $\frac{178}{30} \approx 5.93$. Mean: 5.93
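
The three calculations in slides 11 to 13 can be checked with a short Python sketch. The dictionary layout and variable names are illustrative choices, and the numbers come from the frequency table.

```python
from statistics import median

# Quiz scores mapped to their frequencies, from the table in slides 11-13.
freq = {10: 2, 9: 2, 8: 3, 7: 8, 6: 4, 5: 3, 4: 3, 3: 2, 2: 1, 1: 1, 0: 1}

n = sum(freq.values())                                   # total number of scores
mode = max(freq, key=freq.get)                           # score with the highest frequency
scores = [s for s, f in freq.items() for _ in range(f)]  # expand the table into a raw list
med = median(scores)                                     # averages the 15th and 16th values
avg = sum(s * f for s, f in freq.items()) / n            # each score weighted by its frequency

print(n, mode, med, round(avg, 2))  # prints: 30 7 6.5 5.93
```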

  14. Guided Practice Let’s try a problem that uses the concept of weighted mean: At a certain school, it is a policy that the final exam must count for 25% of the final grade. If Sam has an 88.5 average going into the final exam, what is the minimum exam score needed to earn a 90 for the semester? Assume that an 89.5 will be rounded up to a 90 on the transcript: if $x$ is the exam score, then $0.75(88.5) + 0.25x \ge 89.5$, so $0.25x \ge 23.125$ and $x \ge 92.5$. Sam needs to make at least a 92.5 on the final exam.
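
The weighted-mean condition in slide 14 is (1 − w)·(current average) + w·(exam score) ≥ target, with w = 0.25. A minimal Python sketch that solves it for the exam score (the function and parameter names are hypothetical):

```python
def min_exam_score(current_avg, exam_weight, target):
    """Smallest exam score x with (1 - w) * current_avg + w * x >= target."""
    return (target - (1 - exam_weight) * current_avg) / exam_weight


# Sam's numbers: 88.5 average so far, exam worth 25%, and 89.5 rounds up to 90.
needed = min_exam_score(current_avg=88.5, exam_weight=0.25, target=89.5)
print(needed)  # prints: 92.5
```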

  15. The Five-Number Summary

  16. The Five-Number Summary The measures of center from last class tell part of the story, but we also need measures of spread. Range – the difference between the maximum and minimum values in a data set. Quartiles – separate a data set into fourths (just as the median separates a data set into halves). First Quartile ($Q_1$) – the median of the lower half of the data. Second Quartile – the median. Third Quartile ($Q_3$) – the median of the upper half of the data.

  17. The Five-Number Summary The measures of center from last class tell part of the story, but we also need measures of spread. Interquartile Range (IQR) – measures the spread between the first and third quartiles (comprises the middle half of the data): $\text{IQR} = Q_3 - Q_1$. Definition: Five-Number Summary The five-number summary of a data set is the collection: {minimum, $Q_1$, median, $Q_3$, maximum}

  18. Guided Practice Find the five-number summary for the male and female life expectancies in South American nations (Table 9.12 on p.768) and compare the spreads. Males: {59.0, 60.5, 61.5, 66.7, 67.9, 68.5, 69.0, 70.3, 71.4, 71.9, 72.1, 72.6} Females: {66.2, 66.7, 67.7, 72.8, 74.3, 74.4, 74.6, 76.5, 76.6, 78.8, 79.0, 79.4} Five-Number Summaries: Males: {59.0, 64.1, 68.75, 71.65, 72.6}, Range: 72.6 – 59.0 = 13.6, IQR = 71.65 – 64.1 = 7.55 Females: {66.2, 70.25, 74.5, 77.7, 79.4}, Range: 79.4 – 66.2 = 13.2, IQR = 77.7 – 70.25 = 7.45

  19. Guided Practice Five-Number Summaries: Males: {59.0, 64.1, 68.75, 71.65, 72.6}, Range: 72.6 – 59.0 = 13.6, IQR = 71.65 – 64.1 = 7.55 Females: {66.2, 70.25, 74.5, 77.7, 79.4}, Range: 79.4 – 66.2 = 13.2, IQR = 77.7 – 70.25 = 7.45 Not only do the women live longer, but there is slightly less variability in their life expectancies (as measured by IQR). Male life expectancy is more strongly affected by different political conditions within countries (war, crime, etc.).
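
The quartile definitions from slide 16 translate into a short Python sketch. The helper name `five_number_summary` is hypothetical. The code also assumes that the middle value is left out of both halves when the list has an odd length, a case the slides do not address.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]              # lower half of the ordered data
    upper = xs[len(xs) - half:]    # upper half (middle value skipped if n is odd)
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


# Life expectancies from slide 18.
males = [59.0, 60.5, 61.5, 66.7, 67.9, 68.5, 69.0, 70.3, 71.4, 71.9, 72.1, 72.6]
females = [66.2, 66.7, 67.7, 72.8, 74.3, 74.4, 74.6, 76.5, 76.6, 78.8, 79.0, 79.4]

for label, data in (("Males", males), ("Females", females)):
    lo, q1, med, q3, hi = five_number_summary(data)
    summary = tuple(round(v, 2) for v in (lo, q1, med, q3, hi))
    print(label, summary, "range:", round(hi - lo, 2), "IQR:", round(q3 - q1, 2))
```

Values are rounded only when printed, since floating-point arithmetic can add small errors to sums such as 61.5 + 66.7.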

  20. The shapes of distributions Of the two histograms shown below, which displays a data set with more variability? Explain your answer. [Histograms (a) and (b)] The extreme values in (a) make the range large, but the compact distribution indicates a small IQR. The data in (b) exhibit high variability.

  21. The shapes of distributions Compare the medians and means for the data displayed in the three histograms below. (a) Symmetric distribution: mean = median. (b) Skewed right distribution: mean > median. (c) Skewed left distribution: mean < median.
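
The mean and median comparisons in slide 21 can be illustrated with tiny data sets. The three lists in this Python sketch are hypothetical, chosen only so that each shape is obvious. They show the typical pattern and are not taken from the histograms in the presentation.

```python
from statistics import mean, median

# Hypothetical data sets, one per shape named in slide 21.
shapes = {
    "symmetric": [1, 2, 3, 4, 5],
    "skewed right": [1, 2, 3, 4, 15],   # long tail toward large values
    "skewed left": [-9, 2, 3, 4, 5],    # long tail toward small values
}

for name, data in shapes.items():
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

All three lists share the median 3. Only the mean moves, and it moves in the direction of the long tail.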

  22. Guided Practice Determine the five-number summary, the range, and the IQR for the annual home run production data for Mark McGwire and Barry Bonds (Table 9.6 on p.763). McGwire { 3, 9, 9, 22, 29, 32, 32, 33, 39, 39, 42, 49, 52, 58, 65, 70 } Note: The underlined numbers are those of interest for the five-number summary. Five-Number Summary: { 3, 25.5, 36, 50.5, 70 } Range: 70 – 3 = 67 IQR: 50.5 – 25.5 = 25 No Outliers

  23. Guided Practice Determine the five-number summary, the range, and the IQR for the annual home run production data for Mark McGwire and Barry Bonds (Table 9.6 on p.763). Bonds { 16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73 } Note: The underlined numbers are those of interest for the five-number summary. Five-Number Summary: { 16, 25, 34, 41, 73 } Range: 73 – 16 = 57 IQR: 41 – 25 = 16 Outlier: 73
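
Slides 22 and 23 label outliers without stating the test used. The Python sketch below assumes the common 1.5 × IQR rule, which flags values more than 1.5 × IQR below Q1 or above Q3. That rule is an assumption here, not something the transcript confirms. The helper names are hypothetical.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[:half]), median(xs[len(xs) - half:])


def outliers(data, k=1.5):
    """Values beyond k * IQR from the quartiles (k = 1.5 is an assumed convention)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return [x for x in data if x < q1 - k * iqr or x > q3 + k * iqr]


# Annual home run totals from slides 22 and 23.
mcgwire = [3, 9, 9, 22, 29, 32, 32, 33, 39, 39, 42, 49, 52, 58, 65, 70]
bonds = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

print("McGwire:", quartiles(mcgwire), outliers(mcgwire))
print("Bonds:  ", quartiles(bonds), outliers(bonds))
```

Under this rule the fences for Bonds are 25 − 24 = 1 and 41 + 24 = 65, so 73 is flagged. The fences for McGwire are −12 and 88, so nothing is flagged. Both results match the slides.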
