AAEC 4302 ADVANCED STATISTICAL METHODS IN AGRICULTURAL RESEARCH Chapter 7.3 Dummy Variables
Use of Dummy Variables (7.3) • In many models, one or more of the independent variables is qualitative or categorical in nature • This means that they can take on only a few values, whose magnitudes have little or no meaning
Use of Dummy Variables • Independent variables of this type have to be modeled through dummy variables • A set of dummy variables is created for each categorical independent variable X in the model, where the number of dummy variables in the set equals the number of categories into which that independent variable is classified
Use of Dummy Variables • In our biological example, $Y_i$ is the skull length (mm) of the ith mouse: • $X_{1i}$ sex: male or female (two categories), • $X_{2i}$ species (three categories), and • $X_{3i}$ age. • Two dummy variables will be created for $X_1$ ($D_{11}$ and $D_{12}$) and three for $X_2$ ($D_{21}$, $D_{22}$, and $D_{23}$)
Use of Dummy Variables • In the ith observation (mouse): • For $X_1$: $D_{11i} = 1$ if sex is male, 0 otherwise; • $D_{12i} = 1$ if sex is female, 0 otherwise; • For $X_2$: $D_{21i} = 1$ if species 1, 0 otherwise; • $D_{22i} = 1$ if species 2, 0 otherwise; • and $D_{23i} = 1$ if species 3, 0 otherwise.
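The 0/1 definitions above can be sketched in a few lines of Python. This is only an illustration: the label strings ("male", "female") and the helper names `make_dummies` and `encode_mouse` are assumptions, not part of the course material.

```python
# Hypothetical category labels; the slides only name male/female and species 1-3.
SEX_LEVELS = ["male", "female"]
SPECIES_LEVELS = [1, 2, 3]


def make_dummies(value, levels):
    """Return one 0/1 indicator per level: 1 for the matching level, 0 otherwise."""
    if value not in levels:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == level else 0 for level in levels]


def encode_mouse(sex, species):
    """Build the full dummy set (D11, D12, D21, D22, D23) for one mouse."""
    d11, d12 = make_dummies(sex, SEX_LEVELS)
    d21, d22, d23 = make_dummies(species, SPECIES_LEVELS)
    return {"D11": d11, "D12": d12, "D21": d21, "D22": d22, "D23": d23}


print(encode_mouse("male", 2))
# {'D11': 1, 'D12': 0, 'D21': 0, 'D22': 1, 'D23': 0}
```

Exactly one dummy in each group equals 1 for any given mouse.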
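Use of Dummy Variables • The estimated model would be: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_{11} D_{11i} + \hat{\beta}_{21} D_{21i} + \hat{\beta}_{22} D_{22i} + \hat{\beta}_3 X_{3i}$ • Notice that the dummy variables corresponding to the last categories of $X_1$ and $X_2$ ($D_{12}$ and $D_{23}$) have been excluded from the estimated model (any one dummy per categorical variable can be excluded; the choice makes no difference to the fitted values) • If you don’t exclude a dummy variable from a group, the group will contain redundant information: its dummies always sum to 1, which duplicates the intercept.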
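The redundancy can be checked directly. The sketch below uses a small made-up sample of mice (hypothetical, for illustration only) and shows that the full set of dummies in each group adds up to a column of ones, which is exactly the column the intercept already represents.

```python
# Hypothetical (sex, species) pairs; not data from the course example.
mice = [("male", 1), ("female", 2), ("male", 3), ("female", 1)]


def sex_dummies(sex):
    """Return (D11, D12): male indicator and female indicator."""
    return (1 if sex == "male" else 0, 1 if sex == "female" else 0)


def species_dummies(species):
    """Return (D21, D22, D23): one indicator per species."""
    return tuple(1 if species == k else 0 for k in (1, 2, 3))


# The intercept corresponds to a regressor equal to 1 for every observation.
intercept_column = [1 for _ in mice]

# Row by row, the dummies of each group sum to 1.
sex_sum = [sum(sex_dummies(sex)) for sex, _ in mice]
species_sum = [sum(species_dummies(species)) for _, species in mice]

print(sex_sum == intercept_column)      # True
print(species_sum == intercept_column)  # True
```

Because D12 = 1 - D11 for every mouse, including both adds no information once the intercept is in the model; the same holds for D23 = 1 - D21 - D22.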
Use of Dummy Variables • Notice that this model actually estimates a different intercept for each observed sex/species combination, while maintaining the same slope parameter for each of the other independent variables in the model (only one, the slope $\hat{\beta}_3$ on age $X_3$, in our example)
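Use of Dummy Variables • Model to estimate: $Y_i = \beta_0 + \beta_{11} D_{11i} + \beta_{21} D_{21i} + \beta_{22} D_{22i} + \beta_3 X_{3i} + u_i$ • Estimated model: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_{11} D_{11i} + \hat{\beta}_{21} D_{21i} + \hat{\beta}_{22} D_{22i} + \hat{\beta}_3 X_{3i}$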
Use of Dummy Variables • For a male mouse of the first species ($D_{11} = 1$, $D_{21} = 1$, $D_{22} = 0$): $\hat{Y}_i = (\hat{\beta}_0 + \hat{\beta}_{11} + \hat{\beta}_{21}) + \hat{\beta}_3 X_{3i}$ • $D_{11}$ (coefficient $\hat{\beta}_{11}$): 1 if sex = male, 0 otherwise • $D_{21}$ (coefficient $\hat{\beta}_{21}$): 1 if species = 1, 0 otherwise • $D_{22}$ (coefficient $\hat{\beta}_{22}$): 1 if species = 2, 0 otherwise
Use of Dummy Variables • For a male mouse of the second species ($D_{11} = 1$, $D_{21} = 0$, $D_{22} = 1$): $\hat{Y}_i = (\hat{\beta}_0 + \hat{\beta}_{11} + \hat{\beta}_{22}) + \hat{\beta}_3 X_{3i}$
Use of Dummy Variables • For a male mouse of the third species ($D_{11} = 1$, $D_{21} = 0$, $D_{22} = 0$): $\hat{Y}_i = (\hat{\beta}_0 + \hat{\beta}_{11}) + \hat{\beta}_3 X_{3i}$
Use of Dummy Variables • For a female mouse of the first species ($D_{11} = 0$, $D_{21} = 1$, $D_{22} = 0$): $\hat{Y}_i = (\hat{\beta}_0 + \hat{\beta}_{21}) + \hat{\beta}_3 X_{3i}$
Use of Dummy Variables • For a female mouse of the second species ($D_{11} = 0$, $D_{21} = 0$, $D_{22} = 1$): $\hat{Y}_i = (\hat{\beta}_0 + \hat{\beta}_{22}) + \hat{\beta}_3 X_{3i}$
Use of Dummy Variables • For a female mouse of the third species ($D_{11} = 0$, $D_{21} = 0$, $D_{22} = 0$): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_3 X_{3i}$
Use of Dummy Variables • Notice, therefore, that: • $\hat{\beta}_{11}$ (the coefficient on $D_{11}$) measures the difference in skull length (for any species and age) between male and female mice • $\hat{\beta}_{21}$ (the coefficient on $D_{21}$) measures the difference in skull length (for male or female mice of any age) between species one and three
Use of Dummy Variables • $\hat{\beta}_{22}$ (the coefficient on $D_{22}$) measures the difference in skull length (for male or female mice of any age) between species two and three • $\hat{\beta}_{21} - \hat{\beta}_{22}$ measures the difference in skull length (for male or female mice of any age) between species one and two
Use of Dummy Variables • $\hat{\beta}_0$ (the intercept) measures the predicted skull length for a female mouse of species 3 at birth, i.e. age = 0 • $\hat{\beta}_0 = 33$: the skull length for a female mouse of species 3 at birth is 33 mm.
Use of Dummy Variables • $\hat{\beta}_{11}$ measures the difference in skull length (for any age) between male and female mice of any species • $\hat{\beta}_{11} = 3.05$: regardless of age and species, a male mouse will have a skull length 3.05 mm larger than a female mouse
Use of Dummy Variables • $\hat{\beta}_{21}$ measures the difference in skull length (for male or female mice of any age) between species one and three • $\hat{\beta}_{21} = -4.9$: a mouse of species 1 will have a skull length 4.9 mm smaller than a mouse of species 3, regardless of sex and age.
Use of Dummy Variables • $\hat{\beta}_{22}$ measures the difference in skull length (for male or female mice of any age) between species two and three • $\hat{\beta}_{22} = -0.22$: a mouse of species 2 will have a skull length 0.22 mm smaller than a mouse of species 3, regardless of sex and age.
Use of Dummy Variables • $\hat{\beta}_{21} - \hat{\beta}_{22}$ measures the difference in skull length (for male or female mice of any age) between species one and two • $\hat{\beta}_{21} - \hat{\beta}_{22} = (-4.9) - (-0.22) = -4.68$: the skull length for species 1 is 4.68 mm shorter than for species 2, regardless of age and sex.
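The species comparisons above all follow one rule: the difference between two species is the difference of their dummy coefficients, with the excluded category (species 3) contributing zero. A minimal sketch, using the estimates quoted on the slides (-4.9 and -0.22); the dictionary and function names are illustrative assumptions.

```python
# Species coefficients quoted on the slides; species 3 is the excluded
# (baseline) category, so its effect relative to itself is 0.
SPECIES_EFFECT = {1: -4.9, 2: -0.22, 3: 0.0}


def species_difference(a, b):
    """Difference in skull length (mm) of species a relative to species b,
    holding sex and age fixed. Rounded to 2 decimals to hide float noise."""
    return round(SPECIES_EFFECT[a] - SPECIES_EFFECT[b], 2)


print(species_difference(1, 3))  # -4.9
print(species_difference(2, 3))  # -0.22
print(species_difference(1, 2))  # -4.68
```

Comparisons against the baseline read straight off a single coefficient; any other pair needs a subtraction.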
Use of Dummy Variables • A model like the one above assumes that sex or species shifts the skull length regression function at the origin, in a parallel fashion, for example: [Figure: skull length Y (mm) plotted against age as two parallel lines; the line for a male of species 3 lies 3.05 mm above the line for a female of species 3]
Use of Dummy Variables • For that reason, the dummy variables in models like this one are often called “intercept shifters”
Use of Dummy Variables • In the previous example, six different intercepts are actually being estimated, one for each of: • Male of species 1 • Male of species 2 • Male of species 3 • Female of species 1 • Female of species 2 • Female of species 3
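The six intercepts can be tabulated from the four estimates quoted on the slides (33, 3.05, -4.9 and -0.22). The sketch below is illustrative only; the constant and function names are assumptions, and results are rounded to two decimals to hide floating-point noise.

```python
# Estimates quoted on the slides: intercept, male shifter, species 1 and 2 shifters.
B0, B11, B21, B22 = 33.0, 3.05, -4.9, -0.22


def intercept(sex, species):
    """Intercept (skull length in mm at age 0) for one sex/species combination."""
    d11 = 1 if sex == "male" else 0
    d21 = 1 if species == 1 else 0
    d22 = 1 if species == 2 else 0
    return round(B0 + B11 * d11 + B21 * d21 + B22 * d22, 2)


for sex in ("male", "female"):
    for species in (1, 2, 3):
        print(f"{sex:6s} species {species}: {intercept(sex, species)} mm")
```

Only four parameters are estimated, yet they generate six distinct intercepts, because every combination is a sum of the baseline intercept and the relevant shifters.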
Use of Dummy Variables • However, only a single slope parameter (the coefficient $\hat{\beta}_3$ on age) is estimated, which assumes and measures the same skull length-age relation for all six sex-species combinations
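A quick numerical check of the common-slope assumption: with one shared age coefficient, the gap between any two groups is the same at every age. The slides do not report the age slope, so the value `B3_HYPOTHETICAL` below is a made-up placeholder (as are the ages tried); the other coefficients are the ones quoted on the slides.

```python
# Estimates quoted on the slides.
B0, B11, B21, B22 = 33.0, 3.05, -4.9, -0.22
# NOT from the slides: a placeholder slope, chosen only to make the sketch runnable.
B3_HYPOTHETICAL = 0.5


def predict(sex, species, age, slope=B3_HYPOTHETICAL):
    """Predicted skull length (mm) under the intercept-shifter model."""
    d11 = 1 if sex == "male" else 0
    d21 = 1 if species == 1 else 0
    d22 = 1 if species == 2 else 0
    return B0 + B11 * d11 + B21 * d21 + B22 * d22 + slope * age


# Male-minus-female gap for species 3 at several (arbitrary) ages.
gaps = [round(predict("male", 3, age) - predict("female", 3, age), 2)
        for age in (0, 5, 10)]
print(gaps)  # [3.05, 3.05, 3.05]
```

The gap equals the male shifter (3.05 mm) whatever slope is plugged in, which is what the parallel regression lines express.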