Chapter 1: Exploring Data. AP Stats, 2009-2010. Questionnaire.
Chapter 1: Exploring Data AP Stats, 2009-2010
Questionnaire “Please take a few minutes to answer the following questions. I am collecting data for my doctoral dissertation, which is on characteristics of American private school students. After you complete the questionnaire, please return it in the enclosed SASE. Thank you for your participation.”
Definitions, pp. 4-6 • Individuals and variables • On questionnaire, which are individuals and variables? • Categorical and quantitative • Which variables are categorical and which are quantitative? • Distribution
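A minimal sketch of these definitions in Python. The responses and the variable names (`gender`, `height_in`) are hypothetical and are not taken from the actual questionnaire. Each record is one individual, each key is a variable, and the distribution of a categorical variable is the count of each value it takes.

```python
from collections import Counter

# Hypothetical responses: each dict is one individual, each key is a variable.
responses = [
    {"gender": "F", "height_in": 64},
    {"gender": "M", "height_in": 70},
    {"gender": "F", "height_in": 66},
    {"gender": "F", "height_in": 62},
    {"gender": "M", "height_in": 68},
]

# "gender" is categorical (it places an individual in a group);
# "height_in" is quantitative (arithmetic such as averaging makes sense).
gender_distribution = Counter(r["gender"] for r in responses)
average_height = sum(r["height_in"] for r in responses) / len(responses)

print(gender_distribution)  # how often each category occurs
print(average_height)
```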
Practice • From Questionnaire: • #12 (dot plot) • #13 (pie chart) • #3 (bar chart)
Practice • Exercises 1.1, 1.2, 1.4, p. 7.
A really nice bar graph (I made this myself, so just nod approvingly):
Interpreting the Dotplot • Shape, center, spread • Look for overall patterns and striking deviations from that pattern. • Outliers • Individual observations that fall outside the overall pattern on a graph of a distribution. • In the next section, we will learn a mathematical rule of thumb for deeming an observation an outlier. For now, we’ll just talk in general terms.
Stemplots • Sometimes called “stem and leaf plots.” • Useful when there are a lot of data points, or the range of values is large.
Dotplot? • What would a dotplot look like for these data?
How to create a stemplot • Example 1.5, p. 13 • Rules of thumb: • Choosing the number of stems: • No magic number, but a minimum of 5 is good. Too few stems will result in a skyscraper effect, too many make a pancake graph. • 10 is a good starting point. • For data points with decimals, round the data so that the final digit after rounding is suitable as a leaf. • Let’s try one: • Exercise 1.8, p. 17
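The steps above can be sketched in Python. This is an illustration under simple assumptions, not the textbook's procedure: the data are non-negative whole numbers, the tens digit is the stem, and the ones digit is the leaf. The function name `stemplot` and the data are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens digit as stem, ones digit as leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)
        leaves[stem].append(str(leaf))
    lines = []
    # List every stem from smallest to largest, even empty ones,
    # so that gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | {''.join(leaves.get(stem, []))}")
    return lines

# Hypothetical data, not from the textbook.
scores = [12, 15, 21, 23, 23, 27, 30, 34, 48]
for line in stemplot(scores):
    print(line)
```

Sorting before splitting keeps the leaves in increasing order on each stem, which makes the center and spread easier to read off.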
Stemplot for 1.8 (StatCrunch)
Stem and Leaf
Variable: MPG
2 : 113444444
2 : 5556678888888888999
3 : 0002
Another Exercise • 1.9, p. 17
Homework • Reading: Section 1.1 through p. 30. • Exercises: 1.10 and 1.11 (pp. 17-18)
Histograms • The most common way to display the distribution of a quantitative variable. • How to make a histogram: • Example 1.6, p. 19 • Read the interpretation of this graph, p. 20. • Choose between 6 and 15 classes (bars on your graph)
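A rough sketch of the counting step behind a histogram. It assumes equal-width classes that include their left endpoint and exclude their right endpoint. The function `class_counts` and the data are hypothetical.

```python
def class_counts(values, start, width, n_classes):
    """Count how many values fall in each class [lo, lo + width)."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_classes:  # ignore values outside the chosen classes
            counts[idx] += 1
    return counts

# Hypothetical data; six classes of width 5 starting at 0.
data = [3, 7, 8, 12, 14, 15, 19, 21, 22, 22, 25, 29]
counts = class_counts(data, start=0, width=5, n_classes=6)

# Text version of the histogram: one row per class, one star per value.
for i, c in enumerate(counts):
    lo = i * 5
    print(f"{lo:>2} to <{lo + 5:<2} | {'*' * c}")
```

Changing `width` and `n_classes` shows how the choice of classes changes the apparent shape of the distribution.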
Notice y-axis: number of values in a particular class. Notice the x-axis: It is the variable of interest. Figure 1.7, p. 20
Practice Problem • 1.14, p. 23
How to make a histogram with your calculator • Technology Toolbox, p. 21 • Just enter raw data in L1, then construct a histogram. • Read this carefully tonight. • Another way: • Summarize the data first: put the midpoint of each class in L1 and the number of values in that class in, say, L2.
Exercise 1.14, p. 23 • Now, make a stemplot for these data. • Which do you prefer?
Homework • Reading: Through p. 34 • Pay careful attention to Example 1.8, pp. 28-30: how to create an ogive. • Exercise: • 1.12 (p. 22)
Practice Problems • Exercises: • 1.13 (p. 23) • 1.17 (p. 27)
Percentile • Would you rather score at the 70th or 95th percentile on the SATs? • If you scored at the 95th percentile, what does that mean?
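One way to make the percentile question concrete, assuming the "at or below" convention (a score's percentile is the percent of all scores at or below it). Other conventions exist. The scores and the function name `percentile_rank` are hypothetical.

```python
def percentile_rank(scores, x):
    """Percent of scores that are at or below x."""
    at_or_below = sum(1 for s in scores if s <= x)
    return 100 * at_or_below / len(scores)

# Hypothetical test scores, not real SAT data.
scores = [410, 450, 480, 500, 520, 540, 560, 590, 620, 700]
print(percentile_rank(scores, 620))  # 9 of the 10 scores are at or below 620
```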
Ogive • Probably my favorite word to say in statistics. Let’s practice saying it … • Used when we would like to see the relative standing of an individual observation. • Does a histogram give us this?
Example 1.8, pp. 28-30 • Look at the table on p. 29. • The two columns on the far left could be used to create a histogram. • The fourth and fifth columns are of particular importance when we want to construct an ogive. • Look over these briefly to see that you know where these data come from. • Steps 2 and 3: • 3: Plot a point corresponding to the relative cumulative frequency in each class interval at the left endpoint of the next class interval.
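The table-to-ogive step can be sketched as follows. The class boundaries and frequencies are hypothetical, not the values from the table on p. 29, and `ogive_points` is an illustrative name. Each class's relative cumulative frequency is paired with the left endpoint of the next class, as in step 3.

```python
from itertools import accumulate

def ogive_points(class_starts, width, freqs):
    """Return (x, relative cumulative frequency in percent) points for an ogive."""
    total = sum(freqs)
    cumulative = list(accumulate(freqs))
    # The ogive starts at 0% at the left endpoint of the first class.
    points = [(class_starts[0], 0.0)]
    for start, cum in zip(class_starts, cumulative):
        # Plot this class's cumulative percent at the left endpoint of the NEXT class.
        points.append((start + width, 100 * cum / total))
    return points

# Hypothetical classes 40-44, 45-49, 50-54, 55-59 and their frequencies.
starts = [40, 45, 50, 55]
freqs = [2, 6, 8, 4]
points = ogive_points(starts, width=5, freqs=freqs)
for x, pct in points:
    print(x, pct)
```

Connecting these points with line segments gives the ogive. Reading across from a percent on the vertical axis gives the corresponding percentile.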
Practice problem • Exercise 1.19, p. 31 • Create a frequency table. • Then, create cumulative frequency and relative cumulative frequency columns.
Time plots • Used to plot the value of a variable vs. the time it was measured. • Can detect seasonal variation, for instance (See Figure 1.15, p. 32) • Used effectively in designed experiments. • Practice problem: Exercise 1.21, p. 33 • Use your calculator's line graph function.
Homework • Reading: pp. 37-47 • Exercise: • 1.29, p. 36 • 1.1 Quiz on Friday • Probably 20-30 minutes
Section 1.1 Review Problems: Displaying Distributions with Graphs • Exercises, pp. 34-36: • 1.23, 1.24, 1.27, 1.28, 1.30
1.2 Describing Distributions with Numbers • Measuring center • Mean, median • Measuring spread • With quartiles: Inter-quartile range • Standard deviation • Range • Statistical summaries • Boxplots
Measuring the Center of a Distribution • Mean • Numerical average • Median • Middle value of the ordered data set if there is an odd number of values, or the average of the middle two values if there is an even number of values • Splits the distribution exactly in half
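The two definitions can be written out directly. This is a small sketch with hypothetical data. The median function sorts first, then handles the odd and even cases separately.

```python
def mean(values):
    """Numerical average: sum of the values divided by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered data, or the average of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(mean([1, 2, 3, 4]))    # 2.5
print(median([5, 1, 3]))     # 3 (odd count: the middle value)
print(median([4, 1, 3, 2]))  # 2.5 (even count: average of 2 and 3)
```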
Practice • Exercise 1.14, p. 23 • Create a histogram using your calculator • Discuss shape, center, and spread of the distribution. • Calculate 1-variable statistics using your calculator. • Discuss difference between the mean and median.
Resistant Measure • The mean cannot resist the influence of extreme observations. The median can, however. • Mean: not resistant • Median: Resistant • We generally prefer to use the median when dealing with skewed distributions.
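A quick illustration of resistance using Python's standard `statistics` module. The salary figures are hypothetical. Adding one extreme value moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars.
salaries = [30, 32, 35, 36, 40]
with_outlier = salaries + [400]  # add one extreme observation

print(mean(salaries), median(salaries))          # mean 34.6, median 35
print(mean(with_outlier), median(with_outlier))  # mean 95.5, median 35.5
```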
Powerful Numerical Summary • The 5-number summary … • min • Q1 • Median (Q2) • Q3 • Max • … Plus mean and standard deviation
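A sketch of the full summary, assuming one common quartile convention: Q1 is the median of the observations below the median's position, and Q3 is the median of those above it. Calculators and software may use slightly different quartile rules. The data and the name `five_number_summary` are hypothetical.

```python
from statistics import mean, median, stdev

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # observations below the median's position
    upper = ordered[half + n % 2:]  # observations above the median's position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical
print(five_number_summary(data))    # (1, 2.5, 5, 7.5, 9)
print(mean(data), stdev(data))      # stdev is the sample standard deviation (divides by n - 1)
```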
Review HW problem • Exercise 1.36, p. 47
The Boxplot • Let’s use the data from Exercise 1.36 to create a boxplot: • Now, use your calculator to create a modified boxplot (box on page 46). • Outliers
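A modified boxplot plots outliers as separate points. The sketch below assumes the widely used 1.5 × IQR rule of thumb: an observation is flagged if it lies more than 1.5 × IQR below Q1 or above Q3. That this is the rule given in the box on page 46 is an assumption. The quartiles use the median-of-halves convention, and the data are hypothetical, not those of Exercise 1.36.

```python
from statistics import median

def outliers_iqr(values):
    """Return the observations flagged by the 1.5 x IQR rule (assumed rule)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + n % 2:])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in ordered if v < low or v > high]

data = [10, 12, 13, 14, 15, 16, 18, 40]  # hypothetical
print(outliers_iqr(data))  # Q1 = 12.5, Q3 = 17, fences at 5.75 and 23.75
```

In a modified boxplot, the whiskers would then extend only to the smallest and largest observations that are not flagged.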
Comparing Distributions • Draw side-by-side boxplots to compare distributions. • Exercise 1.36, p. 47