Inferential statistics by example Maarten Buis Monday 2 January 2005
Two statistics courses • Descriptive Statistics (McCall, part 1) • Inferential Statistics (McCall, part 2 and 3)
Course Material • McCall: Fundamental Statistics for Behavioral Sciences. • SPSS (available from Surfspot.nl) • Lectures: 2 x a week • Computer labs: 1 x a week • Course website
Setup of lectures • Recap of material assumed to be known • New Material • Student Recap
How to pass this course • Read assigned portions of McCall before each lecture • Do the exercises • Do the computer lab assignments, and hand them in before Tuesday 17:00! • Come to the computer lab • Come to the lectures • Ask questions: during class or on the course mailing list
What is inference? • Drawing general conclusions from partial information • Based on your observations some conclusions are more plausible than others. • Compare with logic
Sources of uncertainty in inference • Sample • Measurement • Model • Typos when typing the data into SPSS • Inference, as discussed here, assumes that random sampling error is by far the most dominant source of uncertainty.
How is inference done? • Suppose the null hypothesis is true. If the probability of observing our data would then be very small, either we have drawn a very weird sample or the null hypothesis is false. (Ronald Fisher) • We use a “good” procedure to choose between two hypotheses, where “good” means that the procedure leads to the right conclusion 95% of the times you use it. (Jerzy Neyman and Egon Pearson)
PrdV • New populist party, wanted to participate in the next election if 41% of the Dutch population thought that “the PrdV would be an asset to Dutch politics”. • This was asked to a sample of 2,598 people between, and on 16 December only 31% agreed. • Peter R. de Vries decided not to participate in the next election.
The Inference Problem • The 31% who approve are 31% of the people in the sample. • Peter R. de Vries doesn’t care about what the people in the sample think; he cares about what all the people in the Netherlands think. • Could it be that he has drawn a “weird” sample, and that in the Netherlands 41% or more really think he would be an asset to Dutch politics?
Two hypotheses • H0: 41% or more support PrdV • HA: less than 41% support PrdV
A thought experiment (1) • If support for PrdV in the Netherlands is 41% and we draw 100 random samples of 2598 persons, then we get 100 estimates of the support for PrdV, some of them a bit too high, some of them a bit too low. • We would expect about 5 of these samples to show a support for PrdV of 39% or less. • If we reject H0 whenever we find a support for PrdV of 39% or less, then we follow a procedure that, when H0 is true, leads to the right decision in 95% of the times we use it.
What does that 39% mean? • We propose the following procedure: if we find a support for PrdV of less than x%, then we reject H0 • We choose x in such a way that the probability of rejecting H0 when we shouldn’t is only 5% • The reason for mistakenly rejecting H0 is drawing a ‘weird’ sample.
Where does that 39% come from? • If H0 is true, then we draw a sample from a population in which the support for PrdV is 41% • We can let the computer draw many (100,000) samples and calculate the mean in each sample. • 5,000 or 5% of these samples have a mean of 39% or less. • So if we reject H0 when we find a support of 39% or less, then the probability of making a mistake is 5%
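The simulation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the slides: it uses the standard library, draws 2,000 samples instead of 100,000 to keep the run time short, and the names `simulate_cutoff`, `DRAWS` and the seed are made up for this sketch.

```python
import random

P0 = 0.41      # support for PrdV if H0 is true (from the slides)
N = 2598       # sample size (from the slides)
DRAWS = 2_000  # assumption: fewer than the 100,000 on the slide, for speed


def simulate_cutoff(p0=P0, n=N, draws=DRAWS, alpha=0.05, seed=1):
    """Simulate `draws` samples under H0 and return the sample proportion
    found at the `alpha` position of the sorted results."""
    rng = random.Random(seed)
    proportions = sorted(
        # one sample: n people, each agreeing with probability p0
        sum(rng.random() < p0 for _ in range(n)) / n
        for _ in range(draws)
    )
    return proportions[int(alpha * draws)]


cutoff = simulate_cutoff()
print(f"About 5% of the simulated samples fall below {cutoff:.3f}")
```

With more draws the result settles down; the slides round it to 39%.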
Where did that 39% come from? • If we draw many random samples, and compute the mean in each sample, then these means will be approximately normally distributed with a mean of .41 and a standard deviation of .0096 • Remember that the sample size is 2598, and the SD of a proportion is $\sqrt{p(1-p)} = \sqrt{.41 \times .59} \approx .4918$, so the standard deviation of the distribution of means is $\sqrt{p(1-p)}/\sqrt{n} = .4918/\sqrt{2598} \approx .0096$ • 5% of the samples have a support for PrdV of less than 39%
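The same cutoff can be computed without simulation from the normal approximation described on this slide. The sketch below assumes only Python's standard library; the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.41  # support for PrdV if H0 is true
n = 2598   # sample size

sd_single = sqrt(p0 * (1 - p0))  # SD of a single yes/no answer
se = sd_single / sqrt(n)         # SD of the distribution of sample means

# 5th percentile of the sampling distribution under H0
cutoff = NormalDist(mu=p0, sigma=se).inv_cdf(0.05)

print(f"SD = {sd_single:.4f}, SE = {se:.4f}, cutoff = {cutoff:.4f}")
```

The cutoff lands a little above .39, which is the 39% used in the thought experiment.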
Neyman Pearson hypothesis testing • This procedure is the Neyman Pearson hypothesis testing approach • Note that it tells us something about the quality of the procedure we use to make a decision, not about the strength of evidence against H0
Thought experiment (2) • If H0 is true, then the probability of drawing a sample of size 2598 with a support for PrdV of 31% or less is $1.041 \times 10^{-25}$. • This is so small that we think it is safe to reject H0.
Where did that $1.041 \times 10^{-25}$ come from? • In the 100,000 samples that were drawn from the population as it would be if H0 were true, none had a mean of less than .31 • So the probability of drawing this or a more extreme sample when H0 is true is less than 1/100,000. • Remember that if H0 is true, the distribution of means obtained from many samples is normal with a mean of .41 and a standard deviation of .0096 • The proportion of samples with a mean less than .31 is $1.041 \times 10^{-25}$
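The tail probability on this slide can be checked with a short Python sketch. It assumes the rounded standard deviation of .0096 given on the slide; `lower_tail` is a helper name introduced here. The complementary error function `erfc` is used because a probability this small would round to zero if it were computed as 1 minus a number very close to 1.

```python
from math import erfc, sqrt


def lower_tail(x, mean, sd):
    """P(X <= x) for a normal distribution, accurate far out in the tail."""
    z = (x - mean) / sd
    return 0.5 * erfc(-z / sqrt(2))


# mean and (rounded) standard deviation of the sampling distribution under H0
p_value = lower_tail(0.31, mean=0.41, sd=0.0096)
print(f"P(sample mean <= .31 | H0) = {p_value:.3e}")
```

Using the unrounded standard deviation gives a somewhat different number of the same order of magnitude, so the conclusion does not change.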
Fisher hypothesis testing • This procedure is Fisher hypothesis testing. • Note that it gives us a measure of evidence against H0, but it does not give us an indication of how likely we are to make the wrong decision.
Fisher vs. Neyman Pearson • You will draw the same conclusion whichever method you use. • However, it really helps to choose one approach when writing your results down.
Limits to inference • More importantly, both assume random sampling, and we almost never have that. • Testing is more helpful to determine whether the data is ‘screaming’ or ‘whispering’ at us. • Knowing the reasoning behind statistical inference will help you determine the weight you should assign to conclusions derived from statistical tests.
Terminology (1) • The distribution of means obtained from different samples is the sampling distribution of the mean. • The standard deviation of the sampling distribution is the standard error. • The proportion of samples that wrongly reject H0 is the significance level, or α, or Type I error rate. • The proportion of samples that wrongly fail to reject H0 is the Type II error rate, or β. • The proportion of samples that rightly reject H0 is the power.
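These terms can be made concrete with the PrdV numbers. The sketch below is an illustration only: β and power depend on what the true support is, so it assumes a hypothetical true support of 38% (`p_alt`), a value that does not come from the slides. The helper name `sampling_dist` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist

n = 2598
p0 = 0.41     # support if H0 is true (from the slides)
p_alt = 0.38  # hypothetical true support, chosen only for illustration


def sampling_dist(p, n):
    """Normal approximation to the sampling distribution of the mean."""
    standard_error = sqrt(p * (1 - p) / n)
    return NormalDist(mu=p, sigma=standard_error)


cutoff = sampling_dist(p0, n).inv_cdf(0.05)  # reject H0 below this value
alpha = sampling_dist(p0, n).cdf(cutoff)     # Type I error rate
power = sampling_dist(p_alt, n).cdf(cutoff)  # P(reject H0) if p_alt is true
beta = 1 - power                             # Type II error rate

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Moving `p_alt` closer to .41 lowers the power: a small difference from H0 is harder to detect with the same sample size.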
Terminology (2) • The probability of observing this or more extreme data, given that H0 is true, is the p-value. • The maximum p-value that will cause you to reject H0 is also the level of significance.
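The link between the p-value and the level of significance can be illustrated as follows: rejecting H0 when the p-value is below 5% gives the same decision as rejecting H0 when the sample proportion is below the cutoff. This is a sketch under the normal approximation used in this lecture; apart from the observed .31, the sample proportions tried below are hypothetical.

```python
from math import erfc, sqrt
from statistics import NormalDist

P0, N, ALPHA = 0.41, 2598, 0.05
SE = sqrt(P0 * (1 - P0) / N)                 # standard error under H0
CUTOFF = NormalDist(P0, SE).inv_cdf(ALPHA)   # reject H0 below this value


def p_value(p_hat):
    """P(sample mean <= p_hat) if H0 is true (lower tail of the normal)."""
    return 0.5 * erfc((P0 - p_hat) / (SE * sqrt(2)))


decisions = []
for p_hat in (0.31, 0.385, 0.40, 0.41):      # only 0.31 is from the slides
    reject_by_p_value = p_value(p_hat) < ALPHA
    reject_by_cutoff = p_hat < CUTOFF
    decisions.append((p_hat, reject_by_p_value, reject_by_cutoff))
    print(f"{p_hat:.3f}: p-value rule {reject_by_p_value}, "
          f"cutoff rule {reject_by_cutoff}")
```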
What to do before Wednesday? • Read Chapter 8 • Do exercises of chapter 8