AP Exam Prep: Essential Notes
Chapter 11: Inference for Distributions 11.1 Inference for Means of a Population 11.2 Comparing Two Means
Moving away from z … • In chapter 10, when we knew σ, we calculated a z-score for a particular mean as follows: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ • Now we do not know σ and must estimate it from the sample with s, so we calculate a t-score instead, which provides something of a “fudge factor” for the extra uncertainty: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ • The denominator $s/\sqrt{n}$ is the standard error of the mean.
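One-sample t-procedures (p. 622) • Confidence interval: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ • Hypothesis test of H0: µ=µ0: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ • In both cases, σ is unknown, and t* and the P-value come from the t distribution with n-1 degrees of freedom.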
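As a rough illustration of the one-sample t procedures, the sketch below computes the t statistic and a confidence interval with Python's standard library. The data values and function names are hypothetical, and t* is assumed to come from a t table (2.365 is the 95% value for 7 degrees of freedom).

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a test of H0: mu = mu0 when sigma is unknown."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)       # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)              # standard error of the mean
    return (xbar - mu0) / se, n - 1

def one_sample_t_interval(sample, t_star):
    """Return (low, high) for xbar +/- t* * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    margin = t_star * statistics.stdev(sample) / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical data, not taken from the textbook.
data = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = one_sample_t(data, mu0=4)
low, high = one_sample_t_interval(data, t_star=2.365)  # t* for df = 7, 95% confidence
print(f"t = {t:.3f}, df = {df}, 95% CI = ({low:.3f}, {high:.3f})")
```

The P-value still has to be read from a t table or a calculator, because the standard library has no t distribution.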
Matched Pairs t Procedures • Matched pairs designs: subjects are matched in pairs and each treatment is given to one subject in the pair (randomly). • One type of matched pairs design is to have a group of subjects serve as their own pair-mate. Each subject then gets both treatments (randomize the order). • Apply one-sample t-procedures to the observed differences. • Example 11.4, p. 629 • Note H0 • Look at Figure 11.7, p. 631
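A minimal sketch of the matched pairs idea: take the difference within each pair, then run a one-sample t test on those differences with H0: µ_diff = 0. The before/after values and the function name are hypothetical.

```python
import math
import statistics

def matched_pairs_t(first, second):
    """Return (t, df) for H0: mean difference = 0, using second - first per pair."""
    if len(first) != len(second):
        raise ValueError("matched pairs need one value per treatment for each pair")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return mean_diff / se, n - 1

# Hypothetical scores for four subjects who each received both treatments.
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]
t, df = matched_pairs_t(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The two lists are never treated as independent samples; only the list of differences enters the calculation.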
Conditions for Inference about a Mean (p. 617) • SRS • Observations from the population have a normal distribution with mean µ and standard deviation σ. • In practice, it is essential that the distribution be symmetric and single-peaked.
Using t-procedures • See Box, p. 636 • SRS very important! • n<15: do not use t-procedures if the data are clearly non-normal or if outliers are present. • n at least 15: t-procedures can be used except in the presence of outliers or strong skewness. • n at least 40: t-procedures can be used for even clearly skewed distributions. • By CLT
11.2 Comparing Two Means • The goal of two-sample inference problems is to compare the responses of two treatments or to compare the characteristics of two populations. • We must have a separate sample from each treatment or each population. • Unlike the matched-pairs designs. • A two-sample problem can arise from a randomized comparative experiment that randomly divides subjects into two groups and exposes each group to a different treatment.
Conditions for Significance Tests Comparing Two Means (p. 650) • Two SRSs from distinct populations. • Samples are independent (matching violates this assumption). • We measure the same variable for each sample. • Both populations are normally distributed. • Means and standard deviations of both are unknown.
Two-sample t-test • The appropriate t-statistic is: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$, where µ1-µ2=0 under H0: µ1=µ2. • The degrees of freedom calculation is complex; we will use our calculators to provide this for us (the df are usually not whole numbers for two-sample tests).
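The sketch below shows one way to compute the two-sample t statistic by hand. For the degrees of freedom it uses the Welch-Satterthwaite approximation, which is assumed here to be what a calculator reports (the notes do not name the formula). The data and the function name are hypothetical.

```python
import math
import statistics

def two_sample_t(x1, x2):
    """Return (t, df) for H0: mu1 = mu2 without pooling the variances."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)   # s1^2 and s2^2
    a, b = v1 / n1, v2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical independent samples.
group1 = [1, 2, 3, 4, 5]
group2 = [2, 4, 6, 8, 10]
t, df = two_sample_t(group1, group2)
print(f"t = {t:.3f}, df = {df:.2f}")
```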
Two-sample confidence interval for µ1-µ2 • Draw an SRS of size n1 from a normal population with unknown mean µ1, and draw an independent SRS of size n2 from a normal population with unknown mean µ2. The confidence interval for µ1-µ2 is given by the following: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ • Again, we need the df for t*, but we will let the calculator do that for us.
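A hedged sketch of the two-sample interval follows. Because the standard library cannot look up t*, the example assumes the conservative choice df = min(n1, n2) - 1 and takes t* from a table (2.776 for df = 4 at 95% confidence); a calculator's df would give a somewhat narrower interval. Data and names are hypothetical.

```python
import math
import statistics

def two_sample_interval(x1, x2, t_star):
    """Return (low, high) for (xbar1 - xbar2) +/- t* * sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(statistics.variance(x1) / len(x1)
                   + statistics.variance(x2) / len(x2))
    diff = statistics.mean(x1) - statistics.mean(x2)
    return diff - t_star * se, diff + t_star * se

# Hypothetical independent samples of size 5 each.
group1 = [1, 2, 3, 4, 5]
group2 = [2, 4, 6, 8, 10]
low, high = two_sample_interval(group1, group2, t_star=2.776)  # df = min(5, 5) - 1 = 4
print(f"95% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

An interval that contains 0, as this one does, is consistent with no difference between the two population means.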
Using t-procedures for two-sample analyses • See Box, p. 636 • SRS very important! • n1+n2<15: do not use t-procedures if the data are clearly non-normal or if outliers are present. • n1+n2 at least 15: t-procedures can be used except in the presence of outliers or strong skewness. • n1+n2 at least 40: t-procedures can be used for even clearly skewed distributions. • By CLT
Chapter 12: Inference for Proportions 12.1 Inference for a Population Proportion 12.2 Comparing Two Proportions
Conditions for Inference about a Proportion (p. 687) • SRS • The population size N is at least 10n. • For a significance test of H0:p=p0: • The sample size n is so large that both np0 and n(1-p0) are at least 10. • For a confidence interval: • n is so large that both the count of successes, n*p-hat, and the count of failures, n(1 - p-hat), are at least 10.
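The numeric conditions above can be sketched as two small checks. The function names are hypothetical, and the SRS condition cannot be verified by code; it depends on how the data were collected.

```python
def ok_for_test(n, p0, population_size):
    """Numeric conditions for a significance test of H0: p = p0."""
    return (population_size >= 10 * n      # N at least 10n
            and n * p0 >= 10               # expected successes under H0
            and n * (1 - p0) >= 10)        # expected failures under H0

def ok_for_interval(successes, n, population_size):
    """Numeric conditions for a confidence interval, using observed counts."""
    failures = n - successes
    return population_size >= 10 * n and successes >= 10 and failures >= 10

# Hypothetical situations.
print(ok_for_test(n=200, p0=0.4, population_size=50_000))        # True
print(ok_for_test(n=50, p0=0.1, population_size=50_000))         # False: n * p0 is only 5
print(ok_for_interval(successes=8, n=50, population_size=400))   # False on both counts
```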
Normal Sampling Distribution • If these conditions are met, the distribution of p-hat is approximately normal, and we can use the z-statistic: $z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
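Inference for a Population Proportion • Confidence Interval: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ • Significance test of H0: p=p0: $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$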
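Both one-proportion procedures can be sketched with `statistics.NormalDist` from the standard library. The counts and function names below are hypothetical, and the test shown is two-sided.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (low, high) for p-hat +/- z* * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def proportion_z_test(successes, n, p0):
    """Return (z, two-sided P-value) for H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)   # standard error uses p0, not p-hat
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 60 successes in 100 trials.
low, high = proportion_interval(60, 100)
z, p_value = proportion_z_test(60, 100, p0=0.5)
print(f"95% CI = ({low:.3f}, {high:.3f}), z = {z:.2f}, P = {p_value:.4f}")
```

Note the different standard errors: the interval uses p-hat, while the test uses the hypothesized p0.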
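Choosing a Sample Size (p. 695) • To get a margin of error of at most m, require $z^* \sqrt{\frac{p^*(1-p^*)}{n}} \le m$, where p* is a guessed value for the sample proportion. • Our guess p* can be from a pilot study, or we could use the most conservative guess of p*=0.5. • Solve for n. • Example 12.9, p. 696.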
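Solving the margin-of-error inequality for n gives n ≥ (z*/m)² p*(1-p*), rounded up to the next whole number. A minimal sketch, with a hypothetical function name and hypothetical margins:

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p*(1 - p*)/n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

n_conservative = sample_size_for_proportion(margin=0.03)               # p* = 0.5
n_pilot = sample_size_for_proportion(margin=0.03, p_guess=0.2)         # p* from a pilot study
print(n_conservative, n_pilot)
```

The guess p* = 0.5 maximizes p*(1-p*), so it never underestimates the required n; that is why it is the conservative choice.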
Conditions: Confidence Intervals for Comparing Two Proportions • SRS from each population • N>10n • All of these are at least 5: $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, $n_2(1-\hat{p}_2)$
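Calculating a Confidence Interval for Comparing Two Proportions (p. 704) • $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$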
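A sketch of the two-proportion interval, in which each sample keeps its own p-hat in the standard error (no pooling). The counts and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (low, high) for a confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled standard error
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Hypothetical counts: 45 of 100 successes versus 30 of 100.
low, high = two_proportion_interval(45, 100, 30, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```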
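Significance Tests for Comparing Two Proportions • The test statistic is: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$ • Where $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$ is the pooled sample proportion, X1 and X2 being the counts of successes in the two samples.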
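For the test of H0: p1 = p2, the two samples are pooled to estimate the single common proportion. A minimal sketch with hypothetical counts and a hypothetical function name; the P-value shown is two-sided.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided P-value) for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # combined successes / combined n
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 45 of 100 successes versus 30 of 100.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.3f}, P = {p_value:.4f}")
```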
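Conditions: Significance Test for Comparing Two Proportions • SRS from each population • N>10n • All of these are at least 5, with p-hat the pooled sample proportion: $n_1\hat{p}$, $n_1(1-\hat{p})$, $n_2\hat{p}$, $n_2(1-\hat{p})$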
Chapter 13: Chi-Square Procedures 13.1 Test for Goodness of Fit 13.2 Inference for Two-Way Tables
M&Ms Example • Sometimes we want to examine the distribution of proportions in a single population. • As opposed to comparing distributions from two populations, as in Chapter 12. • Does the distribution of colors in your bags match up with expected values? • We can use a chi-square (χ²) goodness of fit test. • We would not want to do multiple one-proportion z-tests. • Why?
Performing a χ² Test • H0: the color distribution of our M&Ms is as advertised: Pbrown=0.30, Pyellow=Pred=0.20, and Porange=Pgreen=Pblue=0.10 • Ha: the color distribution of our M&Ms is not as advertised. • Conditions: • All individual expected counts are at least 1. • No more than 20% of expected counts are less than 5. • Chi-square statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is an observed count and E the matching expected count; with k categories the statistic has k-1 degrees of freedom.
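A short sketch of the goodness of fit calculation using the advertised proportions above. The observed counts are hypothetical (not data from the class or the textbook), and the function names are illustrative. The critical value 11.07 is the table value for 5 degrees of freedom at the 0.05 level.

```python
def chi_square_gof(observed, proportions):
    """Return (chi-square statistic, df, expected counts) for a goodness of fit test."""
    n = sum(observed.values())
    expected = {color: n * p for color, p in proportions.items()}
    chi2 = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in proportions)
    return chi2, len(proportions) - 1, expected

def conditions_met(expected_counts):
    """All expected counts at least 1, and no more than 20% of them below 5."""
    counts = list(expected_counts)
    small = sum(1 for e in counts if e < 5)
    return min(counts) >= 1 and small <= 0.20 * len(counts)

advertised = {"brown": 0.30, "yellow": 0.20, "red": 0.20,
              "orange": 0.10, "green": 0.10, "blue": 0.10}
# Hypothetical counts from a bag of 100 candies.
observed = {"brown": 28, "yellow": 22, "red": 18,
            "orange": 12, "green": 9, "blue": 11}

chi2, df, expected = chi_square_gof(observed, advertised)
print(f"chi-square = {chi2:.3f}, df = {df}, conditions ok: {conditions_met(expected.values())}")
print("reject H0 at the 0.05 level" if chi2 > 11.07 else "fail to reject H0 at the 0.05 level")
```

A single χ² test looks at all six colors at once, instead of running six separate one-proportion z-tests.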
Example 13.4, pp. 744-748 • Is there a difference between proportion of successes? • At left is a two-way table for use in studying this question. • Explanatory Variable: • Type of Treatment • Response Variable: • Proportion of no relapses
Expected Counts and Conditions • Expected count = (row total × column total) / table total. • All expected counts are at least 1, and no more than 20% are less than 5.
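For a two-way table, each expected count is (row total × column total) / table total, and the χ² statistic has (rows - 1)(columns - 1) degrees of freedom. The sketch below uses a hypothetical 2×2 table, not the table from Example 13.4; the function names are illustrative.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square_two_way(table):
    """Return (chi-square statistic, df) for a two-way table of observed counts."""
    expected = expected_counts(table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are treatments, columns are outcomes.
table = [[10, 20],
         [30, 40]]
expected = expected_counts(table)
chi2, df = chi_square_two_way(table)
print(expected, f"chi-square = {chi2:.3f}, df = {df}")
```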
Confidence Intervals for the Regression Slope (p. 788) • If we repeated our sampling and computed another model, would we expect a and b to be exactly the same? • Of course not, given what we’ve learned about random variation and sampling error! • We are interested in the true slope (β), which is unknowable, but we are able to estimate it. • Confidence interval for the slope β of the true regression line: $b \pm t^* SE_b$, where t* has n-2 degrees of freedom and the standard error $SE_b$ is given in output from a stats package.
Is β=0? • H0: β=0 vs. Ha: β ≠0 or β>0 or β<0 • Perform a t-test: $t = \frac{b}{SE_b}$, with n-2 degrees of freedom.
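A stats package normally reports b and SE_b directly. The sketch below shows, under the usual least-squares formulas, roughly where those numbers come from: SE_b = s / sqrt(Σ(x - x̄)²), with s the residual standard deviation. The data, the function name, and the table value t* = 3.182 (df = 3, 95% confidence) are illustrative assumptions.

```python
import math

def slope_inference(x, y):
    """Return (b, SE_b, t, df) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = ybar - b * xbar                # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))       # standard deviation of the residuals
    se_b = s / math.sqrt(sxx)          # standard error of the slope
    return b, se_b, b / se_b, n - 2

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, se_b, t, df = slope_inference(x, y)
t_star = 3.182                         # table value for df = 3, 95% confidence
print(f"b = {b:.2f}, SE_b = {se_b:.4f}, t = {t:.3f}, df = {df}")
print(f"95% CI for the slope: ({b - t_star * se_b:.3f}, {b + t_star * se_b:.3f})")
```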