Tests of Hypothesis in Linear Regression Models Hendrik Wolff – hgwolff@UW.edu
Simple Tests in Linear Regression Models • t-Test • F-Test • Autocorrelation Test • Heteroskedasticity Test • Chow Test for Structural Breaks
Student t-Test in the Linear Regression Model • Is the result of an experiment • “random” or • “statistically significant”? William Gosset, an employee of Guinness, developed the t-distribution and published it under the pseudonym “Student” in Biometrika in 1908.
What is “Statistically Significant”? Google: “statistically significant”. Context of the linear regression model: how do we know that an estimated coefficient is “statistically significant”? Procedure: specify a “null hypothesis” and perform a t-test!
STATA Example • Are Black American Women Discriminated Against in the U.S. Labor Market? • STATA provides wage data for the year 1988 “for free”. Simply type “sysuse nlsw88” • The 'National Longitudinal Surveys of Young Women and Mature Women' (NLSW) is a dataset of 2229 individuals and has rich information on • - hourly wage • - ethnicity • - grade (years of education) • - tenure (years of job experience) • Regression Equation: wage = β1 + β2*grade + β3*tenure + β4*African + ε, whereby the dummy African is defined as: African = 1 if the person is Black (African American), African = 0 if the person is White
* Do file for the lecture on testing for 'racial discrimination' using the t-test.
* The purpose is to estimate the regression equation
*   hourly wage = beta1 + beta2*'years of education' + beta3*'years of job experience' + beta4*black + epsilon
* and then test for racial discrimination.
* Background: The 'National Longitudinal Surveys of Young Women and Mature Women' (NLSW) comprises two separate surveys. The Young Women's survey includes women who were ages 14–24 when first interviewed in 1968. The Mature Women's survey includes women who were ages 30–44 when first interviewed in 1967. These surveys were discontinued in 2003.
* Here we use the NLSW of the year 1988 for the 'mature' women.

* Load the dataset that ships with STATA
sysuse nlsw88

* Browse the dataset
browse

* Prepare the 'black' dummy variable (race == 2 codes Black respondents)
gen black = 0
replace black = 1 if race == 2

* Regression
reg wage grade tenure black

* Exercises:
* What is the regression equation?
* If I go to school for one more year, by how much will my hourly wage increase?
* A 'white' woman of age 40 with 15 years of schooling and three years of job tenure: what hourly wage is she likely to earn?
* If this woman were black, what would be her hourly wage?
* Which of the parameters are statistically significant?
* How is the t-test for beta4 computed? (Show your calculations.)
* For the two-sided t-test on beta4: what is the H_0 and what is the H_A?
* In the above regression, does STATA report the one-sided or the two-sided t-test?
* Provide a numerical example of a one-sided t-test for beta4. What is the corresponding p-value? How would you define H_0 vs. H_A?
T-Test Intuition [Figure: plot of wage against grade (years of schooling).] The estimate of β4 is -0.74 with a standard error of 0.26: is this statistically different from zero?
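As a quick arithmetic check of the question on this slide, the minimal sketch below divides the point estimate by its standard error and compares the ratio with the large-sample 5% critical value of 1.96. The numbers -0.74 and 0.26 are the ones quoted on the slide; the variable names are illustrative and not part of the lecture's do-file.

```python
# Point estimate and standard error as quoted on the slide.
beta_hat = -0.74
std_err = 0.26

# t-ratio for the null hypothesis that the true coefficient is zero.
t_ratio = (beta_hat - 0.0) / std_err

# Large-sample two-sided 5% critical value of the standard normal.
CRITICAL_5PCT = 1.96

# Reject H0 when the t-ratio lies further from zero than the critical value.
reject_null = abs(t_ratio) > CRITICAL_5PCT

print(f"t-ratio = {t_ratio:.2f}")
print(f"reject H0 at the 5% level: {reject_null}")
```

The ratio lies well outside +/-1.96, so under the large-sample approximation the null of a zero coefficient is rejected at the 5% level.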
Intuition • Under the “null hypothesis” of NO discrimination, the estimate of β4 is distributed as N(0, δβ4), where δβ4 denotes the variance of the estimator • Our point estimate of -0.74 is a random draw from N(0, δβ4) • Plot N(0, δβ4) and mark -0.74: is this ‘significantly’ different from 0? See blackboard. • To avoid having a different test statistic for each parameter estimate, let’s normalize • Standardize the distribution N(0, δβ4) to N(0,1) by dividing the estimate of β4 by SQRT(δβ4) • Crux: we don’t know δβ4 exactly; we only have an estimate of it. This introduces additional noise, which produces fatter tails than those of the normal distribution • From statistics: N(0,1) divided by SQRT(Chi-square(N-K)/(N-K)) = t(N-K) • The estimate of δβ4 is a function of a Chi-square distribution with N-K degrees of freedom
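The standardization argument in the bullets above can be written out step by step. This is a sketch in standard textbook notation, which is an assumption of this note rather than the slides' own: \hat\beta_4 is the point estimate, \sigma^2_{\beta_4} is the variance the slides write as δβ4, \hat\sigma^2_{\beta_4} is its estimate, N is the number of observations and K the number of estimated parameters. The exact result relies on the classical assumption of normally distributed errors.

```latex
\begin{align*}
\text{Under } H_0:\ \beta_4 = 0: \quad
  & \hat\beta_4 \sim N\!\left(0,\ \sigma^2_{\beta_4}\right)
  \;\Longrightarrow\;
  z = \frac{\hat\beta_4}{\sqrt{\sigma^2_{\beta_4}}} \sim N(0,1), \\
\text{variance estimate:} \quad
  & (N-K)\,\frac{\hat\sigma^2_{\beta_4}}{\sigma^2_{\beta_4}} \sim \chi^2(N-K),
  \quad \text{independent of } z, \\
\text{hence:} \quad
  & t = \frac{\hat\beta_4}{\sqrt{\hat\sigma^2_{\beta_4}}}
      = \frac{z}{\sqrt{\chi^2(N-K)/(N-K)}} \sim t(N-K).
\end{align*}
```

The last line is why replacing the unknown variance by its estimate turns the normal reference distribution into a t-distribution with N-K degrees of freedom.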
T-Test Intuition • Trick: to avoid having a different test statistic for each parameter estimate, let’s standardize N(0, δβ4) to N(0,1) by dividing the estimate of β4 by SQRT(δβ4), where δβ4 is the variance of the estimator. • In large samples, the two-sided critical values from N(0,1) are: • 5% critical value is 1.96 • 1% critical value is 2.58 • In small samples: Crux: we don’t know δβ4 exactly; we only have an estimate of it. This introduces additional noise, which produces fatter tails than those of the normal distribution • From statistics: N(0,1) divided by SQRT(Chi-square(N-K)/(N-K)) = t(N-K) • The estimate of δβ4 is a function of a Chi-square distribution with N-K degrees of freedom
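The critical values and the “fatter tails” claim can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the lecture material: it uses only the Python standard library, recovers 1.96 and 2.58 from the standard normal, and simulates the ratio Z / sqrt(Chi-square(df)/df) to see how often it exceeds 1.96 for a small and a larger number of degrees of freedom. The function names, the seed, and the choices df = 5 and df = 60 are hypothetical.

```python
import math
import random
from statistics import NormalDist


def draw_t(df: int, rng: random.Random) -> float:
    """One draw of Z / sqrt(chi2(df) / df), which follows a t(df) distribution."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square(df) draw is a sum of df squared standard normal draws.
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(chi2 / df)


def tail_share(df: int, cutoff: float, draws: int, rng: random.Random) -> float:
    """Share of simulated t(df) draws whose absolute value exceeds cutoff."""
    hits = sum(abs(draw_t(df, rng)) > cutoff for _ in range(draws))
    return hits / draws


# Two-sided critical values of N(0,1): 2.5% and 0.5% in each tail.
std_normal = NormalDist()
crit_5 = std_normal.inv_cdf(0.975)
crit_1 = std_normal.inv_cdf(0.995)
print(f"5% critical value: {crit_5:.2f}, 1% critical value: {crit_1:.2f}")

# How often does |t| exceed the normal 5% cutoff? (hypothetical df choices)
rng = random.Random(12345)
share_small_df = tail_share(5, crit_5, 10_000, rng)
share_large_df = tail_share(60, crit_5, 10_000, rng)
print(f"share beyond cutoff, df=5:  {share_small_df:.3f}")
print(f"share beyond cutoff, df=60: {share_large_df:.3f}")
```

With few degrees of freedom the simulated ratio lands beyond 1.96 noticeably more often than 5% of the time, which is the fat-tail effect; with many degrees of freedom the share moves close to 5%, so the normal critical values become a reasonable approximation.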
What is significant enough? “Statisticians are people, whose aim in life is to be wrong 5% of the time” (Kempthorne and Doerfler, 1969)
Two-Sided Test vs. One-Sided Test Repeat: the two-sided test is today the ‘standard’ and is more conservative: • H0: β = 0, • HA: β ≠ 0 • Critical value for the 5% significance level is +/-1.96 (asymptotically) A one-sided test, however, absolutely makes sense too: • H0: β = 0, • HA: β > 0 • Then the 5% critical value is +1.64 • Generally, we can also formalize any null with β = b0, and test for this distance as • t = [estimate of β - b0] / s.e.(estimate of β)
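The difference between the two kinds of test, and the general null β = b0, can be sketched in a few lines. This is an illustration using the large-sample normal approximation only (not the exact t(N-K) distribution that STATA reports); the function names are hypothetical, the estimate -0.74 and standard error 0.26 are the numbers from the earlier slide, and the alternative null value b0 = -0.5 is made up purely for demonstration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def t_statistic(beta_hat: float, b0: float, std_err: float) -> float:
    """Distance between the estimate and the null value b0, in standard errors."""
    return (beta_hat - b0) / std_err


def p_two_sided(t: float) -> float:
    """P-value for HA: beta != b0 (both tails), normal approximation."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(t)))


def p_lower_tail(t: float) -> float:
    """P-value for the one-sided HA: beta < b0, normal approximation."""
    return STD_NORMAL.cdf(t)


def p_upper_tail(t: float) -> float:
    """P-value for the one-sided HA: beta > b0, normal approximation."""
    return 1.0 - STD_NORMAL.cdf(t)


# Estimate and standard error quoted on the earlier slide; null value b0 = 0.
t_zero = t_statistic(-0.74, 0.0, 0.26)
p2 = p_two_sided(t_zero)
p_low = p_lower_tail(t_zero)
print(f"t = {t_zero:.2f}, two-sided p = {p2:.4f}, one-sided (lower) p = {p_low:.4f}")

# Same estimate tested against a hypothetical null value b0 = -0.5.
t_alt = t_statistic(-0.74, -0.5, 0.26)
print(f"t against b0 = -0.5: {t_alt:.2f}, two-sided p = {p_two_sided(t_alt):.4f}")
```

When the estimate falls on the side named by the alternative, the one-sided p-value is half the two-sided one, which is why the one-sided critical value (1.64) is smaller than the two-sided one (1.96).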
Summary of Terms: by now you should be familiar with the following terms: • Null hypothesis • Alternative hypothesis • Critical values • t-value • p-value • Two-sided t-test • One-sided t-test • Statistical significance
Homework: using NLSW88 and the Racial Discrimination Equation discussed in class • Run the STATA do file. What is the regression equation? • Define the null hypothesis and the alternative hypothesis for whether or not Black women are discriminated against. • If I go to school for one more year, by how much will my hourly wage increase? • A 'white' woman of age 40 with 15 years of schooling and 3 years of job tenure: what hourly wage is she likely to earn? • If this woman were not white but black, what would be her hourly wage? • Which of the estimated parameters [beta1, beta2, beta3, beta4] are statistically significant? • How is the t-test for beta4 computed? (Show your calculations!) • For the two-sided t-test on beta4: what is the H_0 and what is the H_A? • In the above regression, does STATA report the one-sided or the two-sided t-test? • Provide a numerical example of a one-sided t-test for beta4. What is the corresponding p-value? How would you define H_0 vs. H_A for such a one-sided test?