I. Qualitative (or Dummy) Independent Variables
“Binary” vs. “Dummy” • I was taught to call these variables “dummy variables” • I now believe that “binary variables” is a better (more descriptive) name • So, let’s call them “binary variables” from now on
III. Introduction • A. New type of variable • 1. Past: used quantitative variables (numerically measurable); continuous • 2. Now: variables that take small number of values; discrete • a) Gender • b) Market size • c) Region of country • d) Marital status (married vs. not), etc
Introduction (cont.) • B. Used as IV in this section • C. Used as DV later in course
Introduction (cont.) • Institute of Management Accountants (IMA) publishes an annual Salary Guide • In Strategic Finance magazine • sfmag@imanet.org • Annual survey of members • “…based on a regression equation derived from survey results.”
IMA Salary Guide (cont.) SALARY = 35,491 + 18,393 TOP + 8,392 SENIOR - 10,615 ENTRY + 914 YEARS + 10,975 ADVDEGREE - 8,684 NODEGREE + 9,195 PROFCERT + 8,417 MALE • TOP = 1 if top level mgmt, 0 if not • SENIOR = 1 if senior level mgmt, 0 if not • ENTRY = 1 if entry level, 0 if not • ADVDEGREE = 1 if advanced degree, 0 if not • NODEGREE = 1 if no degree, 0 if not • PROFCERT = 1 if hold professional certification, 0 if not • MALE = 1 if male, 0 if not • YEARS = years of experience
IMA Salary Guide (cont.) • Average IMA member (1999) • Male • 14.5 years experience • Professional certification • Salary = $66,356 • Figure obtained from substituting values into regression equation
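The substitution can be checked with a minimal Python sketch. It assumes that every binary variable not listed for the average member (TOP, SENIOR, ENTRY, ADVDEGREE, NODEGREE) equals 0; the slide does not state those values. The names `COEFS` and `predicted_salary` are illustrative only.

```python
# Coefficients copied from the IMA Salary Guide regression above.
INTERCEPT = 35_491
COEFS = {
    "TOP": 18_393, "SENIOR": 8_392, "ENTRY": -10_615, "YEARS": 914,
    "ADVDEGREE": 10_975, "NODEGREE": -8_684, "PROFCERT": 9_195, "MALE": 8_417,
}

def predicted_salary(**values):
    """Predicted salary in dollars; any variable not passed is treated as 0."""
    return INTERCEPT + sum(COEFS[name] * value for name, value in values.items())

# Average member: male, 14.5 years of experience, professional certification.
# Assumption: all other binary variables are 0.
average_member = predicted_salary(MALE=1, YEARS=14.5, PROFCERT=1)
print(f"${average_member:,.0f}")  # $66,356
```

Under that assumption the result matches the $66,356 figure on the slide.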
Are Wins Worth More in a Large Market? See regression output for binary variables as IVs. (note)
Introduction (cont.) • D. Example #1 • 1. Y = α + βX₂ + ε • 2. Y: social program expenditures per state • 3. X₂: state’s total revenue • 4. Suppose states’ legislatures controlled by Democrats spend more from same revenue than those controlled by Republicans • 5. How to account for this in the model? • 6. What’s the categorical variable?
Introduction (cont.) • E. Example #2 • 1. Y = α + βX₂ + ε • 2. Y: coach’s earnings • 3. X₂: coach’s experience • 4. Suppose women earn less than men with equal experience (& other characteristics) • 5. How to account for this in the model? • 6. What’s the categorical variable?
Introduction (cont.) • F. Example #3 • 1. Y = α + βX₂ + ε • 2. Y: sales of swimsuits in Minnesota • 3. X₂: Minnesota’s population • 4. Suppose sales peak in warm months • 5. How to account for this in the model? • 6. What’s the categorical variable?
Introduction (cont.) • G. Example #4 • 1. Y = α + βX₂ + ε • 2. Y: profits of NBA teams • 3. X₂: wins • 4. Suppose teams in large markets make more profit on their wins than teams in other markets • 5. How to account for this in the model? • 6. What’s the categorical variable?
Introduction (cont.) • H. Will use Binary (or Dummy) Independent Variables • 1. Create a special variable that takes a value of • a) 1 if the unit of observation falls into one category • b) 0 if the unit falls into the other category
Why Called “Dummy” Variables? (Multiple Choice Question) • A. A MAN NAMED “ALFRED DUMMY” INVENTED THEM • B. ANYONE WHO USES THEM IS . . . A DUMMY • C. THEY REPRESENT CATEGORICAL VARIABLES
Introduction (cont.) • 2. Example • a) GENDER = 1 for all females in the sample • b) GENDER = 0 for all males
Introduction (cont.) • c. you pick which category gives a value of 1 and which category gives a value of 0 • EXAMPLE: the variable GENDER = 1 for all females in the sample and GENDER = 0 for all males
OBS #   GENDER   Value
1       male
2       male
3       female
4       male
5       female
6       female
Introduction (cont.) • c. you pick which category gives a value of 1 and which category gives a value of 0 • EXAMPLE: the variable GENDER = 1 for all females in the sample and GENDER = 0 for all males
OBS #   GENDER   Value
1       male     0
2       male     0
3       female   1
4       male     0
5       female   1
6       female   1
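The coding in the table can be reproduced with a short Python sketch. The list name `genders` is illustrative; the six observations are the ones from the slide.

```python
# Observations 1-6 from the table above.
genders = ["male", "male", "female", "male", "female", "female"]

# GENDER = 1 for females, 0 for males.
# Which category gets the 1 is the analyst's choice.
GENDER = [1 if g == "female" else 0 for g in genders]
print(GENDER)  # [0, 0, 1, 0, 1, 1]
```

Swapping the condition to `g == "male"` would make females the group coded 0 instead.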
Only Use 0 & 1 Values • Only use 0 & 1 values • Never use 1, 2, 3,… (for example) • Why not? • 2 is how many times bigger than 1? (2/1) • 3 is how many times bigger than 2? (3/2) • 1 is how many times bigger than 0? (1/0)
IV. Binary Variables Change Intercept-Two Categories • A. Intercept term changes according to the two values of the one binary variable • intercept is one value when D = 0 • intercept is different value when D = 1 • B. Use only ONE binary variable per variable with two categories
Binary Variables Change Intercept-Two Categories (cont.) • C. In model Y = α + βX₂ + ε • 1. for same value of X, Y (in group #1) ≠ Y (in group #2) • Male: when X₂ = 16, Y = 34 • Female: when X₂ = 16, Y = 29 • 2. Since X is the same value for both groups, • a) either α or β must be different to cause Y (in group #1) ≠ Y (in group #2)
Binary Variables Change Intercept-Two Categories (cont.) • D. Different cases in Y = α + βX₂ + ε • 1. α differs between groups OR • 2. β differs between groups OR • 3. both α and β differ between groups
Binary Variables Change Intercept-Two Categories (cont.) • E. In the model Y = α + βX₂ + ε • 1. Y: profits per NBA team ($1,000,000s) • 2. X₂: wins per season • 3. Suppose teams in large markets make more profit on same number of wins than teams in other markets
Binary Variables Change Intercept-Two Categories (cont.) • 4. How account for this in model? • 5. D is the binary variable • a) D = 1 if the team is in a large market • b) D = 0 if the team is in a mid-sized or small market
Binary Variables Change Intercept-Two Categories (cont.) • D = 1 if the team is in a large market • D = 0 if the team is not • α = α₀ + α₁D • Y = α + βX₂ + ε • Y = α₀ + α₁D + βX₂ + ε
Students • Write Y = α₀ + α₁D + βX₂ + ε for two cases: • mid or small market (D = 0) • large market (D = 1)
Binary Variables Change Intercept-Two Categories (cont.) • 7. Y = α₀ + α₁D + βX₂ + ε • 8. mid/small: Y = α₀ + βX₂ + ε (D = 0) • 9. large: Y = (α₀ + α₁) + βX₂ + ε (D = 1) • 10. What differs between 2 models?
Changing Intercept [Figure: profits per team (vertical axis) vs. wins per team (horizontal axis). Two parallel lines: large market, Y = (α₀ + α₁) + βX₂ + ε, and mid/small market, Y = α₀ + βX₂ + ε. The mid/small intercept is α₀ and the vertical gap between the lines is α₁ (drawn assuming α₁ > 0).]
α₁: 3 Equivalent Meanings • 13. α₁ shows change in intercept relative to control group • 14. α₁ shows change in intercept due to change in market size • 15. α₁ measures difference in profits for same number of wins between teams in large markets vs. those in other markets
Changing Intercept [Figure: same axes, with the large-market line labeled LA Clippers and the mid/small-market line labeled SD Clippers. Values marked on the figure: $16.5M, $0.5M, and 50 wins.] • What’s the value of α₁? • α₁ measures difference in profits for same number of wins between teams in large markets vs. those in other markets
Binary Variables Change Intercept-Two Categories (cont.) • 16. Comparison group or control group • a) Group for which binary variable = 0 • 17. Who decides which group is the control group? • a) You do • b) It doesn’t matter statistically • c) Remember which group is the control when you interpret results
Binary Variables Change Intercept-Two Categories (cont.) • 18. Hypothesis Test (Y = α₀ + α₁D + βX₂ + ε) • a) H0: no difference in Y (for same X) between markets OR • b) H0: α₁ = 0 (Both: Y = α₀ + βX₂ + ε) • c) HA: there is a difference in Y (for same X) between markets OR • d) HA: α₁ ≠ 0 (Large: Y = (α₀ + α₁) + βX₂ + ε; Mid/small: Y = α₀ + βX₂ + ε) • e) What test statistic to use?
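One common way to test H0: α₁ = 0 is the t statistic on the binary variable's coefficient (estimate divided by its standard error). The sketch below is an illustration under stated assumptions, not the course's own output: the data are synthetic, the "true" values (-8.0, 16.0, 0.3) and noise level are made up, and `solve` and `ols` are hypothetical helper names written here so that only the Python standard library is needed.

```python
import math
import random

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Return OLS coefficients, standard errors, and t statistics."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yi - sum(c * x for c, x in zip(coef, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        inv_ii = solve(xtx, unit)[i]  # i-th diagonal entry of (X'X)^-1
        se.append(math.sqrt(sigma2 * inv_ii))
    return coef, se, [c / s for c, s in zip(coef, se)]

# Synthetic data (made up for illustration): 60 teams, half in large markets.
random.seed(0)
D = [i % 2 for i in range(60)]                    # 1 = large market
wins = [random.uniform(20, 60) for _ in D]
profit = [-8.0 + 16.0 * d + 0.3 * w + random.gauss(0, 2.0) for d, w in zip(D, wins)]

rows = [[1.0, d, w] for d, w in zip(D, wins)]     # columns: constant, D, wins
coef, se, t_stats = ols(rows, profit)
print(f"alpha1 estimate = {coef[1]:.3f}, se = {se[1]:.3f}, t = {t_stats[1]:.2f}")
```

A large |t| relative to the critical value of the t distribution with n - 3 degrees of freedom leads to rejecting H0: α₁ = 0.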
Binary Variables Change Intercept-Two Categories (cont.) • F. Example See regression output for binary variables as IVs – case #1A. (note)
Binary Variables Change Intercept-Two Categories (cont.) • How interpret p-value on LARGE in model? • Interpret coefficient on LARGE. (see note p.) • a) "Between 2 teams with same number of wins, the one in the large market is expected to earn $ ??? more (or less?) ” PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE
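Changing Intercept [Figure: PROFIT (vertical axis) vs. WINS (horizontal axis) for PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE. Two parallel lines with slope 0.282: LARGE market above MID/SMALL market. The MID/SMALL intercept is -8.339 and the vertical gap between the lines is 16.524.]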
Coefficient Interpretation Exercise • SEE DRAWING • Questions repeated on next three slides • Q1: Interpret the number –8.339 • Q2: Interpret the number 0.282 • Q3: Interpret the number 16.524
Changing Intercept (figure repeated) • PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE • Q1: Interpret the number -8.339
Changing Intercept (figure repeated) • PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE • Q2: Interpret the number 0.282
Changing Intercept (figure repeated) • PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE • Q3: Interpret the number 16.524
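The three numbers can be explored by plugging values into the estimated equation. This is a minimal Python sketch; the function name `predicted_profit` is illustrative, the 50-win season is an arbitrary example, and profit is taken to be in $ millions as stated earlier for Y.

```python
def predicted_profit(wins, large):
    """Predicted profit in $ millions from PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE."""
    return -8.339 + 0.282 * wins + 16.524 * large

wins = 50  # arbitrary example season
mid_small = predicted_profit(wins, large=0)
large = predicted_profit(wins, large=1)

print(f"Mid/small market: {mid_small:.3f}")
print(f"Large market:     {large:.3f}")
print(f"Gap at equal wins: {large - mid_small:.3f}")          # equals the LARGE coefficient
print(f"Effect of one more win: {predicted_profit(wins + 1, 0) - mid_small:.3f}")
```

The gap between the two markets is the same at any number of wins, which is what "changing intercept" means: the lines are parallel.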
Review • F. Example • A. PRICE = β₁ + β₂SQFT + ε • IGNORE OTHER IVs for this example • 2. Add POOL to model • a) POOL = 1 if house has pool • b) POOL = 0 otherwise
Review (cont.) • A. PRICE = β₁ + β₂SQFT + ε • B. PRICE = β₁ + β₂SQFT + β₅POOL + ε • ESTIMATE MODEL B
Review (cont.)
Variable    Model B
CONSTANT    22.673 (0.09)
SQFT        0.1444 (0.001)
POOL        52.790 (0.03)
Adj. R²     0.890
Review (cont.) • How to interpret the p-value on POOL in Model B? • Interpret coefficient on POOL. (see note p.) • a) “Between 2 houses of same size, the one with a pool is expected to sell for $52,790 more (or less?)”
Review (cont.) • Estimated model: PRICE = 22.673 + 0.1444 SQFT + 52.79 POOL • No pool (POOL = 0): PRICE = 22.673 + 0.1444 SQFT • With pool (POOL = 1): PRICE = 22.673 + 0.1444 SQFT + 52.79 × 1 = (22.673 + 52.79) + 0.1444 SQFT
Review (cont.) • What’s price for 1000 sq. ft house . . . • And NO pool? (notes page) • WITH pool?
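The two predictions can be worked out with a short Python sketch of the estimated Model B. It assumes PRICE is measured in thousands of dollars, which is inferred from the slide reading the POOL coefficient 52.790 as $52,790; the function name `predicted_price` is illustrative.

```python
def predicted_price(sqft, pool):
    """Predicted price in $ thousands from PRICE = 22.673 + 0.1444 SQFT + 52.79 POOL."""
    return 22.673 + 0.1444 * sqft + 52.79 * pool

no_pool = predicted_price(1000, pool=0)
with_pool = predicted_price(1000, pool=1)

# Assumption: PRICE is in $ thousands, so multiply by 1000 for dollars.
print(f"No pool:   ${no_pool * 1000:,.0f}")
print(f"With pool: ${with_pool * 1000:,.0f}")
print(f"Difference: ${(with_pool - no_pool) * 1000:,.0f}")
```

The difference between the two predictions equals the POOL coefficient regardless of the square footage chosen.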
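Review (cont.) [Figure: PRICE (vertical axis) vs. SQFT (horizontal axis). Two parallel lines with slope 0.1444: “Model F: with POOL” above “Model F: no POOL”. The no-POOL intercept is 22.673 and the vertical gap between the lines is 52.790.] • Q2: Interpret the number 52.790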
Exercise Binary Variables #3