
Sociology 601 Class 9: September 29, 2009


karim


  1. Sociology 601 Class 9: September 29, 2009
     • 7.2: Difference between two large sample proportions.
     • 7.3: Small sample comparisons for two independent groups.
     • Difference between two small sample means
     • Difference between two small sample proportions
     • Stata practice

  2. 7.3: Small sample comparison: treatments for depression
     A clinical psychologist wants to choose between two therapies for treating severe cases of mental depression.
     • Therapy A: existing therapy, 6 subjects
       • improvement scores: +10 +10 +20 +20 +30 +30
       • Ybar = 20
       • standard deviation = √(Σ(Yi – Ybar)² / (n – 1)) = √80 = 8.944
     • Therapy B: new therapy, 3 subjects
       • improvement scores: +30 +30 +45
       • Ybar = 35
       • standard deviation = √(Σ(Yi – Ybar)² / (n – 1)) = √75 = 8.660
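The summary statistics on this slide can be checked outside Stata. The following is a minimal sketch in Python (not part of the original lecture, which uses Stata); the variable names are illustrative only.

```python
import statistics

# Improvement scores as listed on the slide
therapy_a = [10, 10, 20, 20, 30, 30]   # existing therapy, 6 subjects
therapy_b = [30, 30, 45]               # new therapy, 3 subjects

mean_a = statistics.mean(therapy_a)
mean_b = statistics.mean(therapy_b)

# statistics.stdev divides by n - 1, matching the slide's formula
sd_a = statistics.stdev(therapy_a)
sd_b = statistics.stdev(therapy_b)

print(f"Therapy A: mean = {mean_a}, s.d. = {sd_a:.3f}")
print(f"Therapy B: mean = {mean_b}, s.d. = {sd_b:.3f}")
```

The two standard deviations are close (about 8.9 and 8.7), which is relevant to the equal-variance assumption discussed later in the lecture.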

  3. 7.3 Immediate STATA command: ttesti

     * an immediate test for the depression exercise in lecture 7.3
     * n1, mean1, s.d.1, n2, mean2, s.d.2
     ttesti 6 20 8.944 3 35 8.660

     Two-sample t test with equal variances
     ------------------------------------------------------------------------------
              |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
            x |       6          20    3.651373       8.944    10.61385    29.38615
            y |       3          35    4.999853        8.66    13.48737    56.51263
     ---------+--------------------------------------------------------------------
     combined |       9          25    3.726718    11.18015    16.40617    33.59383
     ---------+--------------------------------------------------------------------
         diff |                 -15    6.267643               -29.82062   -.1793794
     ------------------------------------------------------------------------------
     Degrees of freedom: 7

                       Ho: mean(x) - mean(y) = diff = 0

          Ha: diff < 0               Ha: diff != 0               Ha: diff > 0
            t =  -2.3932               t =  -2.3932                t =  -2.3932
        P < t =   0.0240         P > |t| =   0.0479            P > t =   0.9760

  4. 7.3 STATA command for ttest, using a data set

     . * next, perform the t-test
     . * ttest for two means using a stored data set
     . ttest impscore, by(treat)

     Two-sample t test with equal variances
     ------------------------------------------------------------------------------
        Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
            1 |       6          20    3.651484    8.944272    10.61356    29.38644
            2 |       3          35           5    8.660254    13.48674    56.51326
     ---------+--------------------------------------------------------------------
     combined |       9          25     3.72678    11.18034    16.40603    33.59397
     ---------+--------------------------------------------------------------------
         diff |                 -15    6.267832               -29.82107   -.1789331
     ------------------------------------------------------------------------------
     Degrees of freedom: 7

                       Ho: mean(1) - mean(2) = diff = 0

          Ha: diff < 0               Ha: diff != 0               Ha: diff > 0
            t =  -2.3932               t =  -2.3932                t =  -2.3932
        P < t =   0.0240         P > |t| =   0.0479            P > t =   0.9760

  5. Significance test for treatment of depression: 1. Assumptions
     • The samples are chosen at random from their respective populations.
     • The variables have an interval scale (the difference between 10 and 20 is the same as the difference between 20 and 30).
     • The underlying populations are normally distributed.
       • (How bad is it if these populations are not normally distributed?)
     • The standard deviations of the two populations are the same (!!)

  6. Significance test for treatment of depression: 2. Hypothesis
     • The null hypothesis is that there is no difference between the treatment effects across the population, so any differences between samples are due to random error.
     • H0: μ2 – μ1 = 0

  7. Significance test for treatment of depression: 3. Test statistic
     • Test statistic: t = (Ybar2 – Ybar1) / (standard error of Ybar2 – Ybar1)
     • To solve this equation, we need an equation for the standard error of the difference between two sample means. There are two alternatives:
       • Do not assume that the populations have equal variances. (Consistent with a strict interpretation of Ho, but with small samples one of the sample standard deviations can turn out to be tiny or huge, which makes the result unstable.)
       • Assume that the populations have equal variances. (Not really part of Ho, but produces more stable results.)

  8. Alternate forms of the standard error
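The formulas from this slide did not survive in the transcript. As a hedged reconstruction, the standard textbook forms of the two alternatives are given below; the notation ($s_1$, $s_2$ for the sample standard deviations, $n_1$, $n_2$ for the sample sizes, $s_p$ for the pooled standard deviation) is an assumption and may differ from the original slide.

```latex
% Without assuming equal population variances:
\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}

% Assuming equal population variances (pooled estimate, df = n_1 + n_2 - 2):
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
```

The pooled form is the one that corresponds to Stata's "Two-sample t test with equal variances" output.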

  9. Back to the test statistic • This is how you would do a t-test for comparing small sample means, assuming equal variance.
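The worked calculation on this slide is missing from the transcript. A minimal sketch of the arithmetic in Python follows, assuming the usual pooled-variance standard error for the equal-variance test; the unequal-variance standard error is computed alongside it for comparison. Variable names are illustrative, and the sample variances (80 and 75) come from the depression example.

```python
import math

# Summary statistics from the depression example
n1, mean1, var1 = 6, 20.0, 80.0   # therapy A (variance = s.d. squared)
n2, mean2, var2 = 3, 35.0, 75.0   # therapy B

diff = mean2 - mean1               # estimated difference, 15 points
df = n1 + n2 - 2                   # degrees of freedom for the pooled test

# Equal-variance (pooled) standard error and t statistic
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled

# Unequal-variance standard error and t statistic, for comparison only
se_unequal = math.sqrt(var1 / n1 + var2 / n2)
t_unequal = diff / se_unequal

print(f"pooled:  se = {se_pooled:.4f}, t = {t_pooled:.4f}, df = {df}")
print(f"unequal: se = {se_unequal:.4f}, t = {t_unequal:.4f}")
```

The pooled result agrees in magnitude with the Stata output shown earlier (standard error about 6.268, t about 2.393). Stata reports a negative t only because it subtracts in the opposite order, mean(1) – mean(2).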

  10. Significance test for treatment of depression: 4. P-value
     • t = 2.393, df = 7.
     • Table on p. 669, use the row for df = 7.
     • Move across columns to the t-scores just below and just above 2.393. These are the third and fourth columns.
     • Find the one-sided p-values (in subscripts) for the bracketing t-scores. I determine .025 > p > .01 for a one-sided test.
     • In this case, we want a two-sided test, so we double the p-value: .05 > p > .02 for a two-sided test.
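The table lookup only brackets the p-value. As a rough check, the tail area can be computed numerically. The sketch below is not from the lecture: it integrates the Student's t density with Simpson's rule using only the Python standard library, and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area above 0 is 0.5
    return 0.5 - total * h / 3

one_sided = upper_tail(2.3932, 7)
two_sided = 2 * one_sided
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

Both values fall inside the brackets read from the table and should match the P-values in the Stata output (0.0240 and 0.0479) to rounding.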

  11. Significance test for treatment of depression: 5. Conclusion
     • p < .05 for a two-sided test.
     • Because a difference this great would occur less than 5% of the time purely by chance if the null hypothesis were true, we reject the null hypothesis and conclude that the populations do not have the same mean.
     • In practical terms, the improvement scores average 15 points higher under treatment B. We don’t know the metric of the improvement scores, but a 15 point difference is almost as large as the average improvement score for treatment A (20), and is larger than the standard deviation of the improvement scores (about 9).

  12. Confidence interval for a small sample comparison • As expected, the 95% confidence interval does not include 0
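The interval formula itself is missing from the transcript. A hedged reconstruction of the usual equal-variance form is shown below, using the pooled standard error (about 6.268) and the tabled value $t_{.025} = 2.365$ for df = 7; the notation is an assumption and may differ from the original slide.

```latex
(\bar{Y}_2 - \bar{Y}_1) \pm t_{.025,\,df=7}\;\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1}
  = 15 \pm 2.365 \times 6.268
  = 15 \pm 14.82
  = (0.18,\ 29.82)
```

This matches the Stata interval for diff (–29.82 to –0.18) with the sign reversed, because Stata subtracts in the opposite order.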

  13. Small-sample inference: comparison of population proportions
     With a small-sample comparison of population proportions, we are in the same fix that we were in for a test of a single population proportion.
     • With a categorical outcome, we cannot assume that the population has a normal distribution.
     • With a small sample size, the central limit theorem cannot assure us that the sampling distribution is normal.
     • Our only option with small sample proportions is to painstakingly calculate the probability of each outcome by hand (or have STATA do it).

  14. Small sample inference: Comparing population proportions
     • A recent study compared adults who had been raised as children in lesbian families with adults who had been raised by heterosexual mothers.
     • In a sample from 20 heterosexual mothers, 4 adult children reported ever having a same-sex sexual attraction, and 16 did not.
     • In a sample from 25 lesbian mothers, 9 adult children reported ever having a same-sex sexual attraction, and 16 did not.
     • Are children of lesbian mothers more or less likely than children of other mothers to report a same-sex sexual attraction?

  15. STATA command for Fisher’s Exact Test using TABI

     . * immediate Fisher’s exact test for population proportions
     . tabi 4 16 \ 9 16

                |          col
            row |         1          2 |     Total
     -----------+----------------------+----------
              1 |         4         16 |        20
              2 |         9         16 |        25
     -----------+----------------------+----------
          Total |        13         32 |        45

                Fisher's exact =                 0.327
        1-sided Fisher's exact =                 0.200

  16. STATA command for Fisher’s Exact Test using an existing data set

     . * Fisher's exact test using a data set
     . tabulate lbimom attract, exact

                |        attract
         lbimom |         0          1 |     Total
     -----------+----------------------+----------
              0 |        16          4 |        20
              1 |        16          9 |        25
     -----------+----------------------+----------
          Total |        32         13 |        45

                Fisher's exact =                 0.327
        1-sided Fisher's exact =                 0.200

  17. Hypothesis test using Fisher’s exact test:
     • Assumptions: We assume that the observations are taken from random samples, and that the mothers and their adult children fall into one category or the other, but not both.
     • Hypothesis: There is no difference in the proportions of adult children who report a same-sex attraction, based on the sexual orientation of the mother. H0: π2 – π1 = 0
     • Test statistic: none
     • P-value: The null as stated produces a 2-tailed p-value of 0.327. (If we had stated a one-sided hypothesis, “Children of lesbian mothers are no more likely to have a same-sex attraction than children of other mothers”, then p = .200.)
     • Conclusion: Do not reject the null hypothesis that the two populations have the same proportion reporting a same-sex attraction.

  18. Formula for Fisher’s exact test:
     • This formula is not for any homework or test, but it may help you understand what is happening.
     • Given the following table:

       Mother        attract     no attract    TOTAL
       B/L mom          9           16         R1 = 25
       No B/L mom       4           16         R2 = 20
       TOTAL         C1 = 13     C2 = 32       n = 45

     • The “null” expected count for the cell in row 1, column 1 (observed value 9) is R1C1 / n = 25 × 13 / 45 ≈ 7.2.
     • With plenty of algebra, the binomial expansion solves to…
     • Pr(O11 = 9, O12 = 16, O21 = 4, O22 = 16) = (R1! R2! C1! C2!) / (n! 9! 16! 4! 16!) = .1356
     • Calculate this equation for all possible 2×2 tables based on the observed row and column totals. (In this case, there are 14 possible tables, with O11 ranging from 0 to 13.)
     • For a two-tailed test, sum the probabilities of all tables that are no more likely than the observed table.
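The enumeration described on this slide can be sketched in a few lines of Python. This is an illustration, not the lecture's method of computation (the lecture uses Stata's tabi and tabulate, exact). It uses the hypergeometric form C(R1, k) · C(R2, C1 – k) / C(n, C1), which is algebraically the same as the factorial formula on the slide; the function and variable names are hypothetical.

```python
from math import comb

def table_prob(k, r1, r2, c1):
    """Probability that the row-1, column-1 cell equals k, with all margins fixed."""
    return comb(r1, k) * comb(r2, c1 - k) / comb(r1 + r2, c1)

r1, r2, c1 = 25, 20, 13   # row totals (B/L mom, no B/L mom) and first column total
observed = 9              # observed count in row 1, column 1

# Every value the row-1, column-1 cell can take given the margins
lo, hi = max(0, c1 - r2), min(r1, c1)
probs = {k: table_prob(k, r1, r2, c1) for k in range(lo, hi + 1)}

p_obs = probs[observed]
# Two-sided: sum the tables that are no more likely than the observed one
# (the small tolerance guards against floating-point ties)
p_two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
# One-sided: tables with at least as many "attract" cases in row 1
p_one_sided = sum(p for k, p in probs.items() if k >= observed)

print(f"tables enumerated: {len(probs)}")
print(f"Pr(observed table) = {p_obs:.4f}")
print(f"two-sided p = {p_two_sided:.3f}, one-sided p = {p_one_sided:.3f}")
```

The probability of the observed table (.1356) and the two p-values (0.327 and 0.200) should agree with the Stata output on the earlier slides.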
