1. G89.2229 Lect 9M Example
Nominal (Categorical) Variables as Explanatory Factors
Coding of Nominal Explanatory Variables G89.2229 Multiple Regression Week 9 (Monday)
2. Example: Survey of Puerto Rican Adolescents
Parents of 1314 adolescents between the ages of 11 and 18 were asked about their child's depression symptoms.
About half the adolescents were sampled from the community.
About half were sampled from state-sponsored managed care for mental health.
In this analysis, we ignore the sampling frame.
Descriptive question: Do symptoms differ across gender and four age groups?
11-12; 13-14; 15-16; 17-18
3. A graphic review of the t test
Y = B0 + B1 D + e, where D = 0 or 1
E(Y|D=0) = B0 (The group 1 mean)
E(Y|D=1) = B0 + B1 (Group 2 mean)
Since B0 is the group 1 mean, B1 must be the difference between the means of groups 1 & 2.
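The identity above can be checked numerically. The following is a minimal sketch, assuming made-up scores (not the survey data) and a hypothetical helper `simple_ols` that fits a one-predictor least-squares line; it shows that the intercept equals the group 1 mean and the slope equals the difference between the two group means.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Illustrative scores only; these are not taken from the survey.
group1 = [4.0, 6.0, 5.0, 7.0]    # coded D = 0
group2 = [8.0, 9.0, 7.0, 10.0]   # coded D = 1
d = [0] * len(group1) + [1] * len(group2)
y = group1 + group2

b0, b1 = simple_ols(d, y)
print("B0:", b0, "group 1 mean:", mean(group1))
print("B1:", b1, "mean difference:", mean(group2) - mean(group1))
```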
4. Setting up Dummy Variables
With two groups, it makes little difference which group is assigned the value of zero.
As a rule, it is useful to assign zero to the natural reference group (control group, normal group, ideal state, and so on).
If the group assignment is switched, then the sign of the coefficient simply changes.
The regression-based mean difference can be tested as a Wald test (the usual t test).
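To illustrate the sign change and the t test on the slope, here is a small sketch with made-up scores. The helper `slope_and_t` is a hypothetical name. It computes the slope and its t statistic using the usual simple-regression standard error, sqrt(MSE / Sxx) with N-2 residual df. Switching the 0/1 assignment flips the sign of both the coefficient and the t statistic and leaves their magnitudes unchanged.

```python
from math import sqrt
from statistics import mean

def slope_and_t(x, y):
    """Return (slope, t statistic for the slope) from a one-predictor regression."""
    n = len(y)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    return b1, b1 / se_b1

# Illustrative scores only; the first four cases form one group, the last four the other.
y = [4.0, 6.0, 5.0, 7.0, 8.0, 9.0, 7.0, 10.0]
d = [0, 0, 0, 0, 1, 1, 1, 1]
d_switched = [1 - v for v in d]   # swap which group is coded zero

b1, t = slope_and_t(d, y)
b1_sw, t_sw = slope_and_t(d_switched, y)
print("original coding:", b1, t)
print("switched coding:", b1_sw, t_sw)
```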
5. Dummy Variables with k groups
Suppose we have three or more groups. How does dummy logic extend?
For k groups, construct (k-1) dummy variables.
Choose one group to be the reference group.
The largest group is a good choice.
For all nonreference groups, define
Di = 1 if the subject is in group i
Di = 0 otherwise
The members of the reference group will have 0 on all k-1 dummy variables.
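The construction above can be sketched in a few lines. This is an illustrative sketch, not course code: the function name `dummy_code` and the six example cases are assumptions, and the default of picking the largest group as reference simply follows the suggestion in the list.

```python
from collections import Counter

def dummy_code(groups, reference=None):
    """Return (reference, {level: 0/1 column}) with k-1 columns for k levels."""
    if reference is None:
        # Default to the largest group as the reference.
        reference = Counter(groups).most_common(1)[0][0]
    levels = sorted(set(groups) - {reference})
    columns = {lvl: [int(g == lvl) for g in groups] for lvl in levels}
    return reference, columns

# Made-up group labels for six cases, using the four age bands from the survey.
ages = ["11-12", "13-14", "15-16", "17-18", "13-14", "11-12"]
ref, codes = dummy_code(ages, reference="11-12")

for lvl, col in codes.items():
    print(lvl, col)
```

Cases in the reference group (the first and last case here) are zero in every column.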
6. Algebraic interpretation of general case
The multiple regression equation:
Y = B0 + B1 D1 + ... + B_{k-1} D_{k-1} + e
Suppose we call the reference group Group k. Persons in that group have all Ds zero.
E(Y|Grp=k) = B0
For persons in Group i,
E(Y|Grp=i) = B0 + Bi
Bi is the difference between that group's mean and the reference group's mean.
These interpretations are possible only if the whole BLOCK of dummy variables is in the equation.
The test of the R2 associated with that block of variables gives the usual k-group ANOVA F on (k-1, N-k) df.
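A numerical sketch of these claims for k = 3, using made-up scores and a small hand-written normal-equations solver (the helpers `solve` and `ols` are hypothetical names, not library calls). It checks that B0 is the reference-group mean, that each Bi is a difference from that mean, and that the F computed from R2 on (k-1, N-k) df matches the ratio of between-group to within-group mean squares.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Illustrative scores only. Group 3 plays the role of the reference group (Group k).
data = {1: [8.0, 9.0, 10.0], 2: [5.0, 6.0, 7.0], 3: [2.0, 3.0, 4.0, 3.0]}
X, y = [], []
for g, scores in data.items():
    for s in scores:
        X.append([1.0, float(g == 1), float(g == 2)])   # intercept, D1, D2
        y.append(s)

b0, b1, b2 = ols(X, y)
n, k = len(y), len(data)
ybar = sum(y) / n
sse = sum((yi - (b0 + b1 * r[1] + b2 * r[2])) ** 2 for r, yi in zip(X, y))
sst = sum((yi - ybar) ** 2 for yi in y)
r2 = 1 - sse / sst
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))   # F on (k-1, N-k) df

print("B0, B1, B2:", b0, b1, b2)
print("R2:", r2, "F:", f_stat)
```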
7. Setting up dummy codes for age groups
There are no missing data on age. SPSS syntax:
* Assumes AGE holds the age-group code (12, 14, 16 or 18).
COMPUTE AGE18=0.
COMPUTE AGE16=0.
COMPUTE AGE14=0.
IF (AGE EQ 18) AGE18=1.
IF (AGE EQ 16) AGE16=1.
IF (AGE EQ 14) AGE14=1.
EXECUTE.
Age group 12 (ages 11-12) is used as the reference category.
8. An alternative coding scheme: Unweighted Effect-codes
When there is no natural reference category, ANOVA lovers would rather compare each group mean to a grand mean.
For two groups:
Create a single effect code, C.
If Group 1, C = 1
If Group 2, C = -1
Y = B0 + B1 C + e (for both groups)
E(Y|C=1) = B0 + B1
E(Y|C=-1) = B0 - B1
For k groups, the reference category is always scored -1 on every effect code.
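A short sketch of the two-group case with unequal group sizes, using made-up scores and a hypothetical helper `simple_ols`. With unweighted effect codes the intercept comes out as the unweighted mean of the two group means, which differs from the mean of the raw observations when the groups differ in size, and the slope is the deviation of the group 1 mean from that intercept.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Illustrative scores only, with unequal group sizes (n1 = 4, n2 = 2).
group1 = [4.0, 6.0, 5.0, 7.0]   # C = 1
group2 = [8.0, 9.0]             # C = -1
c = [1] * len(group1) + [-1] * len(group2)
y = group1 + group2

b0, b1 = simple_ols(c, y)
mean_of_means = (mean(group1) + mean(group2)) / 2

print("B0:", b0, "mean of group means:", mean_of_means, "raw mean:", mean(y))
print("B1:", b1, "group 1 mean minus B0:", mean(group1) - b0)
```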
9. Another coding scheme: Weighted Effect-codes
Like unweighted effect codes, but compares groups to the mean of the raw observations rather than the unweighted mean of the group means.
For two groups of size n1, n2:
Create a single effect code, C.
If Group 1, C = 1
If Group 2, C = -n1/n2
Y = B0 + B1 C + e (for both groups)
E(Y|C=1) = B0 + B1
E(Y|C=-n1/n2) = B0 - (n1/n2)B1
For k groups, the reference category (group k) is always scored -(ni/nk) for the ith variable.
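The weighted version can be checked the same way. This sketch again uses made-up scores and a hypothetical helper `simple_ols`. Group 2 is scored -n1/n2, and the intercept then equals the mean of the raw observations, while the slope is the deviation of the group 1 mean from that mean.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Illustrative scores only, with unequal group sizes (n1 = 4, n2 = 2).
group1 = [4.0, 6.0, 5.0, 7.0]
group2 = [8.0, 9.0]
n1, n2 = len(group1), len(group2)
c = [1.0] * n1 + [-n1 / n2] * n2   # weighted effect code: 1 and -n1/n2
y = group1 + group2

b0, b1 = simple_ols(c, y)

print("B0:", b0, "raw grand mean:", mean(y))
print("B1:", b1, "group 1 mean minus grand mean:", mean(group1) - mean(y))
print("group 2 prediction:", b0 - (n1 / n2) * b1, "group 2 mean:", mean(group2))
```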