Chapter 9: Introduction to Inference
Thumbtack Activity • Toss your thumbtack in the air and record whether it lands point up (U) or point down (D). Do this 25 times (n=25). • Calculate p-hat, the proportion of tosses that landed point up. • Repeat the above process two more times, for a total of three estimates. • Record each p-hat on a separate Post-it note.
We’ve just begun a sampling distribution. • Strictly speaking, a sampling distribution is: • A theoretical distribution of the values of a statistic (in our case, the proportion) in all possible samples of the same size (n=25 here) from the same population. • Sampling Variability: • The value of a statistic varies from sample to sample in repeated random sampling. • We do not expect to get exactly the same value of the statistic from each sample!
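A quick way to see sampling variability is to let a computer repeat the thumbtack activity many times. The sketch below is only an illustration: the value p_true = 0.6 is a made-up chance of landing point up (the activity does not give one), and the function name simulate_p_hats is hypothetical.

```python
import random
import statistics

def simulate_p_hats(p_true, n=25, num_samples=1000, seed=0):
    """Return num_samples values of p-hat, each from n simulated tosses."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Count tosses that land point up in one sample of size n.
        ups = sum(1 for _ in range(n) if rng.random() < p_true)
        p_hats.append(ups / n)
    return p_hats

# ASSUMPTION: 0.6 is an invented probability of "point up", not a measured one.
p_hats = simulate_p_hats(p_true=0.6)
print("first three estimates:", p_hats[:3])
print("mean of the p-hats:", round(statistics.mean(p_hats), 3))
print("std. dev. of the p-hats:", round(statistics.stdev(p_hats), 3))
```

The first three values play the role of the three Post-it notes; the full list approximates the sampling distribution of p-hat for n=25.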
Definitions • Parameter: • A number that describes the population of interest. • We rarely know its value, because we typically do not have data on every individual in the population. • We use µ and σ for the mean and standard deviation of a population. • We use p and σp for proportions. • Statistic: • A number that describes a sample. We often use a statistic to estimate an unknown parameter. • We use x-bar and s for the mean and standard deviation of a sample. • We use p-hat and σp-hat for proportions.
Sampling Distribution • The sampling distribution answers the question, “What would happen if we repeated the sample or experiment many times?” • Formal statistical inference is based on the sampling distribution of statistics.
Inference • Inference is the statistical process by which we use information collected from a sample to infer something about the population of interest. • Two main types of inference: • Interval estimation (Section 9.1) • Tests of significance (Section 9.2)
Constructing Confidence Intervals • Back to the thumbtack activity … • Interpretation of 95% C.I.: • If the sampling distribution of p-hat is approximately normal, then the 68-95-99.7 rule tells us that, in repeated sampling, about 95% of all p-hat values will be within two standard deviations of p. If p-hat is within two standard deviations of p, then p is within two standard deviations of p-hat. So about 95% of the time, the confidence interval will contain the true population parameter p.
Internet Demonstration, C.I. • http://bcs.whfreeman.com/yates/pages/bcs-main.asp?s=00020&n=99000&i=99020.01&v=category&o=&ns=0&uid=0&rau=0
Interpretation of 95% CI (Commit to memory!) • 95% of all confidence intervals constructed in the same manner will contain the true population parameter. • 5% of the time they will not.
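The "95% of all intervals" statement can be checked by simulation. This is a rough sketch under stated assumptions: p_true = 0.6 and n = 100 are invented values, the function name coverage is hypothetical, and the interval uses the true standard deviation of p-hat (as in the 68-95-99.7 argument), which we would not know in practice.

```python
import math
import random

def coverage(p_true, n, num_intervals=2000, seed=1):
    """Fraction of intervals p-hat +/- 2 sd that contain p_true."""
    rng = random.Random(seed)
    # Standard deviation of p-hat; uses the (normally unknown) true p.
    sd = math.sqrt(p_true * (1 - p_true) / n)
    hits = 0
    for _ in range(num_intervals):
        p_hat = sum(rng.random() < p_true for _ in range(n)) / n
        if p_hat - 2 * sd <= p_true <= p_hat + 2 * sd:
            hits += 1
    return hits / num_intervals

# ASSUMPTION: p_true and n are illustrative values only.
rate = coverage(p_true=0.6, n=100)
print("fraction of intervals containing p:", rate)
```

The printed fraction should land near 0.95, not exactly on it, because the normal approximation and the simulation are both approximate.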
Practice • See example 9.3, p. 495 • Exercises 9.1-9.4, p. 495
Creating the C.I. • Estimate +/- Margin of error
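As a worked instance of estimate +/- margin of error, the sketch below uses the usual large-sample interval for a proportion, p-hat +/- z* sqrt(p-hat(1 - p-hat)/n). That formula and the critical value z* = 1.96 are assumptions here, since the slide does not spell them out, and the counts (15 point-up landings in 25 tosses) are invented for illustration.

```python
import math

def proportion_ci(successes, n, z_star=1.96):
    """Return (estimate, margin_of_error, low, high) for a proportion."""
    p_hat = successes / n                      # the estimate
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of p-hat
    margin = z_star * se                       # margin of error
    return p_hat, margin, p_hat - margin, p_hat + margin

# ASSUMPTION: 15 "point up" results in 25 tosses is made-up data.
p_hat, margin, low, high = proportion_ci(successes=15, n=25)
print(f"estimate = {p_hat:.3f}, margin of error = {margin:.3f}")
print(f"95% C.I. = ({low:.3f}, {high:.3f})")
```

With only 25 tosses the margin of error is wide; a larger n shrinks it.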
Another practice problem • 9.5, p. 496
Practice • 9.9 and 9.10, p. 505
9.2 Significance Testing • An evolutionary psychologist at Harvard University claims that 80% (p=0.80) of American adults believe in the theory of evolution. To test his claim, he takes an SRS of 1,120 adults. Here are the results: • 851 said “Yes” when asked, “Do you believe in the theory of evolution?” • What is the proportion who said yes? • Is this enough evidence to say that the proportion of adults who believe in the theory of evolution is different from 0.80?
Example, cont. • This requires a significance test: • Hypotheses: • H0: p=0.80 • Ha: p≠0.80 • Let’s use our calculators to conduct the appropriate test: • 5: 1-PropZTest
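For anyone without a calculator handy, the same one-proportion z test can be sketched in Python. The counts (851 of 1,120) and the null value 0.80 come from the example; the function name one_prop_ztest is hypothetical, and the calculation assumes the usual normal approximation with the standard error computed from the null value.

```python
import math

def one_prop_ztest(successes, n, p0):
    """Two-sided one-proportion z test of p = p0. Returns (p_hat, z, p_value)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)   # standard error under the null hypothesis
    z = (p_hat - p0) / se
    # Two-sided P-value from the standard normal distribution:
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_hat, z, p_value

p_hat, z, p_value = one_prop_ztest(successes=851, n=1120, p0=0.80)
print(f"p-hat = {p_hat:.4f}")
print(f"z = {z:.2f}")
print(f"two-sided P-value = {p_value:.5f}")
```

Here p-hat is about 0.76 and z is about -3.36, so the two-sided P-value is below 0.001: the sample proportion is far enough from 0.80 to count as strong evidence against the claim.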
P-value Example Results
Hypotheses • Alternative hypothesis Ha: • Can be one-sided (Ha: p > some number, or Ha: p < some number) or two-sided (Ha: p ≠ some number)
HW • 9.24-9.26, p. 521 • Reading: pp. 509-525
Sampling Applet • http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/