
Inferential Statistics 2

Inferential Statistics 2. Maarten Buis, January 11, 2006. Outline: student recap, sampling distribution, hypotheses, Type I and II errors and power, testing means, testing correlations.


  1. Inferential Statistics 2 Maarten Buis January 11, 2006

  2. outline • Student Recap • Sampling distribution • Hypotheses • Type I and II errors and power • testing means • testing correlations

  3. Sampling distribution • PrdV example from last lecture. • If H0 is true, then the population consists of 16 million persons, of whom 41% (= 6.56 million persons) support the PrdV. • I have drawn 100,000 random samples of 2,598 persons each and computed the average support in each sample.

  4. Sampling distribution • 5%, or 5,000 samples, have a mean of 39% or less. • So if we reject H0 when we find a support of 39% or less, then we will have a 5% chance of making an error when H0 is true. • Notice: We assume that the only reason we would make an error is random sampling error.
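
The simulation described on slides 3 and 4 can be sketched roughly as follows. This is only an illustration, not the code used for the lecture: it assumes NumPy is available, draws each sample from a binomial distribution (which treats the 16 million persons as an effectively infinite population), and uses an arbitrary seed.

```python
import numpy as np

P_H0 = 0.41          # support for the PrdV if H0 is true (from the slides)
N = 2598             # persons per sample (from the slides)
N_SAMPLES = 100_000  # number of random samples (from the slides)

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for reproducibility

# Number of supporters in each sample, converted to a proportion.
# Assumption: binomial draws stand in for sampling from 16 million persons.
support = rng.binomial(N, P_H0, size=N_SAMPLES) / N

# The level of support below which 5% of the simulated samples lie.
cutoff = np.quantile(support, 0.05)
print(f"5% of the samples have a support of {cutoff:.3f} or less")
```

The cutoff found this way should be close to 39%, the rejection threshold used on slide 4.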

  5. More precise approach • We want to know the score below which only 5% of the samples lie. • Drawing lots of random samples is a rather rough approach. An alternative is to use the theoretical sampling distribution. • The proportion is a mean, and the sampling distribution of a mean is the normal distribution with a mean equal to the value under H0 and a standard deviation (called the standard error) of $se = \sigma / \sqrt{N}$

  6. More precise approach • For a standard normal distribution we know the z-score below which 5% of the samples lie (Appendix 2, table A): -1.645 • So if we compute a z-score for the observed value (.31) and it is below -1.645, we can reject H0, and we will do so wrongly in only 5% of the cases in which H0 is true

  7. More precise approach • $\mu$ is the mean of the sampling distribution, so .41 (H0) • se is $\sigma / \sqrt{N}$, and $\sigma$ of a proportion is $\sqrt{p(1-p)}$ • so the se is $\sqrt{.41 \times .59} / \sqrt{2598} \approx .0096$ • so the z-score is $z = (.31 - .41) / .0096 \approx -10.4$ • -10.4 is less than -1.645, so we reject H0
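
The calculation on slide 7 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `z_test_proportion` is made up for this illustration.

```python
import math
from statistics import NormalDist


def z_test_proportion(p_observed, p_h0, n):
    """z-score of an observed proportion under H0 (hypothetical helper)."""
    sd = math.sqrt(p_h0 * (1 - p_h0))  # standard deviation of a proportion
    se = sd / math.sqrt(n)             # standard error of the sample mean
    return (p_observed - p_h0) / se


z = z_test_proportion(p_observed=0.31, p_h0=0.41, n=2598)

# z-score below which 5% of a standard normal distribution lies
critical = NormalDist().inv_cdf(0.05)

print(f"z = {z:.1f}, critical value = {critical:.3f}")
print("reject H0" if z < critical else "do not reject H0")
```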

  8. Null Hypothesis • A sampling distribution requires you to imagine what the population would look like if H0 were true. • This is possible if H0 is one value (41%) • This is impossible if H0 is a range (<41%) • So H0 should always contain an equal sign (either = or ≤ or ≥)

  9. Null hypothesis • In practice the H0 is almost always 0, e.g.: • difference between two means is 0 • correlation between two variables is 0 • regression coefficient is 0 • This is so common that SPSS always assumes that this is the H0.

  10. Undirected Alternative Hypotheses • Often we have an undirected alternative hypothesis, e.g.: • the difference between two means is not zero (could be either positive or negative) • the correlation between two variables is not zero (could be either positive or negative) • the regression coefficient is not zero (could be either positive or negative)

  11. Directed alternative hypothesis • In the PrdV example we had a directed alternative hypothesis: Support for PrdV is less than 41%, since PrdV would have still participated if his support were more than 41%.

  12. Type I and Type II errors

  13. Type I error rate • You choose the type I error rate (α) • It is independent of sample size, type of alternative hypothesis, or model assumptions.

  14. Type I versus type II error rate • A low probability of rejecting H0 when H0 is true (type I error) is obtained by rejecting H0 less often, • which also means a higher probability of not rejecting H0 when H0 is false (type II error). • In other words: a lower probability of finding a significant result when you should (power).

  15. How to increase your power: • Accept a higher type I error rate (α) • Larger sample size • Use a directed instead of an undirected alternative hypothesis • Use more assumptions in your model (non-parametric tests make fewer assumptions, but also have less power)
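
The trade-off between the type I error rate, sample size, the direction of the alternative hypothesis, and power can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions: it uses the normal approximation for a proportion, H0 is 41% as in the PrdV example, and the "true" support of 39% and the sample sizes are hypothetical values chosen only to show the pattern. It does not cover the effect of model assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power(p_h0, p_true, n, alpha, directed=True):
    """Probability of rejecting H0 when the true proportion is p_true.

    Hypothetical helper: assumes p_true < p_h0 for the directed test
    (HA: support is less than p_h0).
    """
    se_h0 = math.sqrt(p_h0 * (1 - p_h0) / n)        # se if H0 is true
    se_true = math.sqrt(p_true * (1 - p_true) / n)  # se in reality
    sampling = NormalDist(p_true, se_true)          # actual sampling distribution
    if directed:
        cutoff = p_h0 + STD_NORMAL.inv_cdf(alpha) * se_h0
        return sampling.cdf(cutoff)
    half_width = STD_NORMAL.inv_cdf(1 - alpha / 2) * se_h0
    return sampling.cdf(p_h0 - half_width) + 1 - sampling.cdf(p_h0 + half_width)


base = power(0.41, 0.39, n=1000, alpha=0.05)
strict_alpha = power(0.41, 0.39, n=1000, alpha=0.01)
larger_n = power(0.41, 0.39, n=2000, alpha=0.05)
undirected = power(0.41, 0.39, n=1000, alpha=0.05, directed=False)

print(f"baseline (alpha=5%, n=1000, directed): {base:.2f}")
print(f"stricter alpha (1%):                   {strict_alpha:.2f}")
print(f"larger sample (n=2000):                {larger_n:.2f}")
print(f"undirected alternative:                {undirected:.2f}")
```

In this setting a stricter α and an undirected alternative both give less power than the baseline, while the larger sample gives more.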

  16. Testing means • What kind of hypotheses might we want to test: • Average rent of a room in Amsterdam is 300 euros • Average income of males is equal to the average income of females

  17. Z versus t • In the PrdV example we knew everything about the sampling distribution with only a hypothesis about the mean. • In the rent example we don’t: we have to estimate the standard deviation. • This adds uncertainty, which is why we use the t distribution instead of the normal. • Uncertainty declines when sample size becomes larger. • In large samples (N>30) we can use the normal.

  18. t-distribution • It has a mean and standard error like the normal distribution. • It also has degrees of freedom, which depend on the sample size. • The larger the degrees of freedom, the closer the t-distribution is to the normal distribution.
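
Why estimating the standard deviation widens the rejection region can be seen in a small simulation. The sketch below is an illustration, not part of the lecture: it draws samples from a standard normal population (an assumption), computes a t-value for each sample using the estimated standard deviation, and looks up the value that |t| exceeds in only 5% of the samples. The sample sizes, number of repetitions, and seed are arbitrary choices.

```python
import math
import random


def simulated_t_cutoff(n, reps=20_000, seed=1):
    """Simulated two-sided 5% cutoff for t with n - 1 degrees of freedom."""
    rng = random.Random(seed)
    t_values = []
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]  # H0 (mean 0) is true
        mean = sum(sample) / n
        s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # estimated variance
        se = math.sqrt(s2 / n)
        t_values.append(abs(mean) / se)
    t_values.sort()
    return t_values[int(0.95 * reps)]  # |t| is larger in only 5% of samples


cutoffs = {n: simulated_t_cutoff(n) for n in (5, 19, 100)}
for n, cutoff in cutoffs.items():
    print(f"N = {n:3d} (df = {n - 1:2d}): reject H0 if |t| > {cutoff:.2f}")
print("normal distribution:  reject H0 if |z| > 1.96")
```

With small samples the cutoff lies well above 1.96, and it moves toward 1.96 as the sample size grows.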

  19. Data: rents of rooms

  20. Rent example • H0: μ = 300, HA: μ ≠ 300 • We choose α to be 5% • N = 19, so df = 18 • We reject H0 if we find a t less than -2.101 or more than 2.101 (appendix B, table 2) • We do not reject H0 if we find a t between -2.101 and 2.101.

  21. Rent example • We use $s^2$ as an estimate of $\sigma^2$ • So $t = \frac{\bar{x} - \mu}{s / \sqrt{N}} = -1.85$ • -1.85 is between -2.101 and 2.101, so we do not reject H0
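
The steps of the rent example can be sketched as a one-sample t-test in plain Python. The rents on slide 19 are not reproduced in this transcript, so the 19 values below are hypothetical and will not give the lecture's t of -1.85; the function name `one_sample_t` is also made up for this illustration. The critical value 2.101 is the one given on slide 20 for df = 18.

```python
import math


def one_sample_t(values, mu_h0):
    """Return (t, df) for H0: population mean equals mu_h0."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)  # estimate of the variance
    se = math.sqrt(s2 / n)                               # standard error of the mean
    return (mean - mu_h0) / se, n - 1


# Hypothetical monthly rents in euros (NOT the data from the lecture).
rents = [210, 340, 275, 300, 250, 320, 180, 290, 260, 310,
         230, 350, 270, 240, 295, 285, 200, 330, 265]

t, df = one_sample_t(rents, mu_h0=300)
CRITICAL_T = 2.101  # two-sided, alpha = 5%, df = 18 (from slide 20)

print(f"t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > CRITICAL_T else "do not reject H0")
```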

  22. Compare means in SPSS

  23. Do before Monday • Read Chapters 9 and 10 • Do the “For solving Problems”
