Sociology 601 Class 9: September 29, 2009

Sociology 601 Class 9: September 29, 2009 • 7.2: Difference between two large sample proportions. • 7.3: Small sample comparisons for two independent groups. • Difference between two small sample means • Difference between two small sample proportions • Stata practice

7.3: small sample comparison - Treatments for depression A clinical psychologist wants to choose between two therapies for treating severe cases of mental depression. • Therapy A: existing therapy, 6 subjects • improvement scores: +10 +10 +20 +20 +30 +30 • Ybar = 20 • standard deviation = √(Σ(Yi – Ybar)^2 / (n-1)) = √80 = 8.944 • Therapy B: new therapy, 3 subjects • improvement scores: +30 +30 +45 • Ybar = 35 • standard deviation = √(Σ(Yi – Ybar)^2 / (n-1)) = √75 = 8.660
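The summary statistics on this slide can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the lecture.

```python
from statistics import mean, stdev

# Improvement scores copied from the slide.
therapy_a = [10, 10, 20, 20, 30, 30]   # existing therapy, 6 subjects
therapy_b = [30, 30, 45]               # new therapy, 3 subjects

# statistics.stdev uses the n-1 denominator, matching the slide's formula.
mean_a, sd_a = mean(therapy_a), stdev(therapy_a)
mean_b, sd_b = mean(therapy_b), stdev(therapy_b)

print(f"Therapy A: mean = {mean_a}, s.d. = {sd_a:.3f}")
print(f"Therapy B: mean = {mean_b}, s.d. = {sd_b:.3f}")
```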

7.3 Interactive STATA command for ttesti
*an immediate test for the depression exercise in lecture 7.3
*n1, mean1, s.d.1, n2, mean2, s.d.2
ttesti 6 20 8.944 3 35 8.660
Two-sample t test with equal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |       6          20    3.651373       8.944    10.61385    29.38615
       y |       3          35    4.999853        8.66    13.48737    56.51263
---------+--------------------------------------------------------------------
combined |       9          25    3.726718    11.18015    16.40617    33.59383
---------+--------------------------------------------------------------------
    diff |                 -15    6.267643               -29.82062   -.1793794
------------------------------------------------------------------------------
Degrees of freedom: 7
                  Ho: mean(x) - mean(y) = diff = 0
     Ha: diff < 0           Ha: diff != 0            Ha: diff > 0
      t = -2.3932            t = -2.3932              t = -2.3932
  P < t =  0.0240        P > |t| = 0.0479         P > t =  0.9760

7.3 STATA command for ttest, using a data set
. * next, perform the t-test
. * ttest for two means using a stored data set
. ttest impscore, by(treat)
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       1 |       6          20    3.651484    8.944272    10.61356    29.38644
       2 |       3          35           5    8.660254    13.48674    56.51326
---------+--------------------------------------------------------------------
combined |       9          25     3.72678    11.18034    16.40603    33.59397
---------+--------------------------------------------------------------------
    diff |                 -15    6.267832               -29.82107   -.1789331
------------------------------------------------------------------------------
Degrees of freedom: 7
                  Ho: mean(1) - mean(2) = diff = 0
     Ha: diff < 0           Ha: diff != 0            Ha: diff > 0
      t = -2.3932            t = -2.3932              t = -2.3932
  P < t =  0.0240        P > |t| = 0.0479         P > t =  0.9760

Significance test for treatment of depression: 1. Assumptions • samples chosen at random from their respective populations • variables have an interval scale (the difference between 10 and 20 is the same as the difference between 20 and 30) • the underlying populations are normally distributed • (How bad is it if these populations are not normally distributed?) • The standard deviations of the two populations are the same (!!)

Significance test for treatment of depression: 2. Hypothesis • The null hypothesis is that there is no difference between the treatment effects across the population, so any differences between samples are due to random error. • H0: μ2 – μ1 = 0

Significance test for treatment of depression: 3. Test Statistic • Test statistic: t = ((Ybar2 – Ybar1) – 0) / s.e.(Ybar2 – Ybar1), where s.e. is the estimated standard error of the difference. • To solve this equation, we need an equation for the standard error of the difference between two sample means. There are two alternatives: • Do not assume that the populations have equal variances. (Consistent with a strict interpretation of Ho, but sometimes one of the small samples turns out to have a tiny or huge sample standard deviation.) • Assume that the populations have equal variances. (Not really part of Ho, but produces more stable results.)

Alternate forms of the standard error
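The equations for this slide did not survive in the transcript. As a hedged reconstruction from the standard textbook definitions (not from the original slide), the two forms of the standard error can be written as follows, where the subscripted symbols s and n for the sample standard deviations and sample sizes are notational assumptions:

```latex
% Form 1: no equal-variance assumption
\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1}
  = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
  = \sqrt{\frac{80}{6} + \frac{75}{3}} \approx 6.191

% Form 2: equal variances assumed (pooled estimate), df = n_1 + n_2 - 2
\hat{\sigma} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
  = \sqrt{\frac{5(80) + 2(75)}{7}} \approx 8.864

\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1}
  = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
  = 8.864\sqrt{\frac{1}{6} + \frac{1}{3}} \approx 6.268
```

The pooled value of about 6.268 agrees with the standard error of the difference in the Stata output for the depression example.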

Back to the test statistic • This is how you would do a t-test for comparing small sample means, assuming equal variance.
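The worked equation for this slide is missing from the transcript. A minimal sketch of the equal-variance calculation, assuming the usual pooled-variance formula and using illustrative variable names, is:

```python
import math

# Summary statistics from the depression example.
n1, mean1, sd1 = 6, 20.0, math.sqrt(80)   # Therapy A
n2, mean2, sd2 = 3, 35.0, math.sqrt(75)   # Therapy B

# Pooled variance: weighted average of the two sample variances.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Standard error of the difference between the two sample means.
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Test statistic for H0: mu2 - mu1 = 0.
t_stat = (mean2 - mean1) / se_diff

print(f"df = {df}, s.e. = {se_diff:.4f}, t = {t_stat:.4f}")
```

Stata reports the same magnitude with a negative sign because it subtracts the second group's mean from the first.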

Significance test for treatment of depression: 4. P value • t = 2.393, df = 7. (Stata reports t = -2.3932 because it computes mean(1) - mean(2).) • Table on p. 669, use the row for df = 7. • Move across columns to the t-scores just below and just above 2.393. These are in the third and fourth columns. • Find the one-sided p-value (in subscripts) for the bracketing t-scores. I determine .025 > p > .01 for a one-sided test. • In this case, we want a two-sided test, so we double the p-value: .05 > p > .02 for a two-sided test.
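The table lookup described above can be sketched in Python. The critical values below are the standard t-table entries for df = 7; that they match the layout of the textbook table on p. 669 is an assumption, and the function name is hypothetical.

```python
# One-sided tail probabilities and critical t-scores for df = 7
# (standard t-table values; assumed to match the textbook row).
DF7_ROW = [(0.100, 1.415), (0.050, 1.895), (0.025, 2.365),
           (0.010, 2.998), (0.005, 3.499)]

def bracket_one_sided_p(t_abs, row):
    """Return (lower, upper) bounds on the one-sided p-value."""
    lower, upper = 0.0, 0.5
    for tail, crit in row:
        if t_abs >= crit:
            upper = tail      # t is at least this extreme, so p <= tail
        else:
            lower = tail      # t falls short of this column, so p > tail
            break
    return lower, upper

one_sided = bracket_one_sided_p(2.393, DF7_ROW)
two_sided = (2 * one_sided[0], 2 * one_sided[1])

print(f"one-sided: {one_sided[0]} < p < {one_sided[1]}")
print(f"two-sided: {two_sided[0]} < p < {two_sided[1]}")
```

The bounds bracket the exact values in the Stata output (0.0240 one-sided, 0.0479 two-sided).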

Significance test for treatment of depression: 5. Conclusion • p < .05 for a two-sided test. • Because a difference this great would occur less than 5% of the time purely by chance if the null hypothesis were true, we reject the null hypothesis and conclude that the populations do not have the same mean. • In practical terms, the improvement scores average 15 points higher under treatment B. We don’t know the metric of the improvement scores, but a 15 point difference is almost as large as the mean improvement score for treatment A (20), and is larger than the standard deviation of the improvement scores.

Confidence interval for a small sample comparison • As expected, the 95% confidence interval does not include 0
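The interval itself is missing from the transcript. As a sketch, assuming the usual form (difference in means) ± t × (standard error) with the tabled critical value t = 2.365 for df = 7 and a 95% interval, it can be reproduced as:

```python
import math

# Difference in sample means (Therapy B minus Therapy A).
diff = 35 - 20

# Pooled standard error of the difference, as in the equal-variance t-test.
pooled_var = (5 * 80 + 2 * 75) / 7
se_diff = math.sqrt(pooled_var * (1 / 6 + 1 / 3))

# Critical t-score for a 95% interval with df = 7 (rounded table value).
t_crit = 2.365

ci_lower = diff - t_crit * se_diff
ci_upper = diff + t_crit * se_diff

print(f"95% CI for mu2 - mu1: ({ci_lower:.2f}, {ci_upper:.2f})")
```

Apart from rounding in the critical value and the reversed sign convention, this matches the interval Stata prints for diff (-29.82 to -0.18).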

Small-sample inference: comparison of population proportions With a small-sample comparison of population proportions, we are in the same fix that we were in for a test of a single population proportion. • With a categorical outcome, we cannot assume that the population has a normal distribution. • With a small sample size, the central limit theorem cannot assure us that the sampling distribution is normal. • Our only option with small sample proportions is to painstakingly calculate the probability of each outcome by hand (or have STATA do it).

Small sample inference: Comparing population proportions • A recent study compared adults who had been raised as children in lesbian families with adults who had been raised by heterosexual mothers. • In a sample from 20 heterosexual mothers, 4 adult children reported ever having a same-sex sexual attraction, and 16 did not. • In a sample from 25 lesbian mothers, 9 adult children reported ever having a same-sex sexual attraction, and 16 did not. • Are children of lesbian mothers more or less likely than children of other mothers to report a same-sex sexual attraction?

STATA command for Fisher’s Exact Test using TABI
. * immediate Fisher’s exact test for population proportions
. tabi 4 16 \ 9 16
           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |         4         16 |        20
         2 |         9         16 |        25
-----------+----------------------+----------
     Total |        13         32 |        45
           Fisher's exact =                 0.327
   1-sided Fisher's exact =                 0.200

STATA command for Fisher’s Exact Test using an existing data set
* Fisher's exact test using a data set
. tabulate lbimom attract, exact
           |        attract
    lbimom |         0          1 |     Total
-----------+----------------------+----------
         0 |        16          4 |        20
         1 |        16          9 |        25
-----------+----------------------+----------
     Total |        32         13 |        45
           Fisher's exact =                 0.327
   1-sided Fisher's exact =                 0.200

Hypothesis test using Fisher’s exact test: • Assumptions: We assume that the observations are taken from random samples, and that the mothers and their adult children fall into one category or the other, but not both. • Hypothesis: There is no difference in the proportions of adult children who report a same-sex attraction, based on the sexual orientation of the mother. H0: π2 – π1 = 0 • Test statistic: none • P-value: The null as stated produces a 2-tailed p-value of 0.327. (If we had stated a one-sided hypothesis: “Children of lesbian mothers are no more likely to have a same-sex attraction than children of other mothers”, then p = .200.) • Conclusion: do not reject the null hypothesis that the two populations have the same proportion reporting a same-sex attraction.

Formula for Fisher’s exact test:
• This formula is not for any homework or test, but it may help you understand what is happening.
• Given the following table:
Mother        attract    no attract    TOTAL
B/L mom       9          16            R1 = 25
No B/L mom    4          16            R2 = 20
TOTAL         C1 = 13    C2 = 32       n = 45
• The expected (“null”) count for the top-left cell (B/L mom, attract) is R1C1 / n = 25 × 13 / 45 ≈ 7.2
• With plenty of algebra, the binomial expansion solves to…
• Pr(O11=9, O12=16, O21=4, O22=16)
• = (R1! R2! C1! C2!) / (n! 9! 16! 4! 16!) = .1356
• Calculate this equation for all possible 2X2 distributions based on the observed row and column totals. (In this case, there are 14 possible: the top-left cell can take any value from 0 to 13.)
• For a two-tailed test, sum all probabilities at least as unlikely as the observed probability.
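The procedure described on this slide can be sketched in Python with exact integer binomial coefficients. The function and variable names are hypothetical. The formula is written in its equivalent hypergeometric form, C(R1, a) × C(R2, C1 − a) / C(n, C1), which equals the factorial expression above. With these margins the top-left cell can run from 0 to 13, giving 14 tables.

```python
from math import comb

def table_prob(a, row1, row2, col1):
    """Probability of a 2x2 table with top-left count a, given fixed margins."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

row1, row2, col1 = 25, 20, 13   # R1, R2, C1 from the slide
observed = 9                    # B/L mom and reports attraction

# Enumerate every table consistent with the observed margins.
lowest = max(0, col1 - row2)
highest = min(row1, col1)
probs = {a: table_prob(a, row1, row2, col1) for a in range(lowest, highest + 1)}

p_observed = probs[observed]
# Two-tailed: sum the tables at least as unlikely as the observed one
# (a tiny tolerance guards against floating-point ties).
p_two_sided = sum(p for p in probs.values() if p <= p_observed * (1 + 1e-9))
# One-tailed: tables with at least as many attractions among B/L-mom children.
p_one_sided = sum(p for a, p in probs.items() if a >= observed)

print(f"Pr(observed table) = {p_observed:.4f}")
print(f"two-sided p = {p_two_sided:.3f}, one-sided p = {p_one_sided:.3f}")
```

The results agree with the slide (.1356) and with the Stata output (0.327 two-sided, 0.200 one-sided).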
