ADVANCED ECONOMETRICS LECTURE SLIDES, 2012 SPRING
The distinction between qualitative and quantitative data • The microeconomist’s data on sales will have a number corresponding to each firm surveyed (e.g. last month’s sales in the first company surveyed were £20,000). This is referred to as quantitative data. • The labor economist, when asking whether or not each surveyed employee belongs to a union, receives either a Yes or a No answer. These answers are referred to as qualitative data. Such data arise often in economics when choices are involved (e.g. the choice to buy or not buy a product, to take public transport or a private car, to join or not to join a club).
Economists will usually convert these qualitative answers into numeric data. For instance, the labor economist might set Yes = 1 and No = 0. Hence, Y1 = 1 means that the first individual surveyed does belong to a union, Y2 = 0 means that the second individual does not. When variables can take on only the values 0 or 1, they are referred to as dummy (or binary) variables.
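As an illustration (not part of the original slides), here is a minimal sketch of this Yes/No-to-dummy conversion in Python with pandas; the survey responses below are made up.

```python
# Converting qualitative Yes/No survey answers into a 0/1 dummy variable.
import pandas as pd

# Hypothetical survey responses on union membership
survey = pd.DataFrame({"union_member": ["Yes", "No", "No", "Yes", "No"]})

# Map the qualitative answers to a numeric dummy: Yes = 1, No = 0
survey["union_dummy"] = (survey["union_member"] == "Yes").astype(int)

print(survey)   # first respondent gets 1 (union member), second gets 0, and so on
```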
What is Descriptive Research? • Involves gathering data that describe events and then organizes, tabulates, depicts, and describes the data. • Uses description as a tool to organize data into patterns that emerge during analysis. • Often uses visual aids such as graphs and charts to aid the reader.
Descriptive Research takes a “what is” approach • What is the best way to provide access to computer equipment in schools? • Do teachers hold favorable attitudes toward using computers in schools? • What have been the reactions of school administrators to technological innovations in teaching?
Descriptive Research • We will want to see if a value or sample comes from a known population. That is, if I were to give a new cancer treatment to a group of patients, I would want to know if their survival rate, for example, was different than the survival rate of those who do not receive the new treatment. What we are testing then is whether the sample patients who receive the new treatment come from the population we already know about (cancer patients without the treatment).
Economic and Econometric Models • The model of economic behavior that has been derived from a theory is the economic model. • After the unknown parameters have been estimated by using economic data for the variables and by using an appropriate econometric estimation method, one has obtained an econometric model. • It is common to use Greek characters to denote the parameters. • C = f(Y) economic model • C = β1 + β2 Y, e.g. C = 15.4 + 0.81 Y econometric model
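A hedged sketch of how such an econometric model could be estimated with OLS in Python/statsmodels. The income and consumption figures below are invented for illustration; they are not the data behind C = 15.4 + 0.81Y.

```python
# Estimating the consumption function C = beta1 + beta2*Y on hypothetical data.
import numpy as np
import statsmodels.api as sm

income = np.array([100, 120, 140, 160, 180, 200], dtype=float)       # Y (hypothetical)
consumption = np.array([95, 112, 128, 145, 160, 178], dtype=float)   # C (hypothetical)

X = sm.add_constant(income)            # adds the column of ones for beta1
results = sm.OLS(consumption, X).fit()
print(results.params)                  # [beta1_hat, beta2_hat]: intercept and marginal propensity to consume
```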
Economic data • Time series data, historical data • Cross-sectional data • Panel data Variables of an economic model • Dependent variable • Independent variable, explanatory variable, control variable The nature of economic variables can be endogenous, exogenous or lagged dependent.
A variable is an endogenous variable if the variable is determined in the model. Therefore the dependent variable is always an endogenous variable. • The exogenous variables are determined outside of the model. • Lagged dependent or lagged endogenous variables are predetermined in the model. • The model is not necessarily a model of only one equation: if more equations have been specified to determine other endogenous variables of the system, then the system is called a simultaneous equation model (SEM). • If the number of equations is identical to the number of endogenous variables, then the system of equations is called complete. A complete model can be solved for the endogenous variables.
Static model • A change of an explanatory variable in period t is fully reflected in the dependent variable in the same period t. • Dynamic model • The equation is called a dynamic model when lagged variables have been specified. • Structural equations • The equations of the economic model, as specified by the economic theory, are called the structural equations. • Reduced-form model • A complete SEM can be solved for the endogenous variables. The solution is called the reduced-form model. The reduced form will be used to simulate a policy or to compute forecasts for the endogenous variables.
Parameters and elasticities • The parameters are estimated by an estimator and the result is called an estimate. • The log transformation is a popular transformation in econometric research, because it removes non-linearities to some extent. • Stochastic term • A disturbance term will be included at the right-hand side of the equation and is not observed. • At the right-hand side of the equation there are two parts of the specification: the systematic part, which concerns the specification of variables based on the economic theory; and the non-systematic part, which is the remaining random, non-systematic variation.
Applied quantitative economic research • The deterministic assumptions: • These concern the specification of the economic model, which is the formulation of the null hypothesis about the relationship between the economic variables of interest. The basic specification of the model originates from the economic theory. An important decision is made about the size of the model, i.e. whether one or more equations have to be specified. • The choice of which variables have to be included in the model stems from the economic theory. • The availability and frequency of the data can influence the assumptions that have to be made. • The mathematical form of the model has to be determined: linear or nonlinear. Linear is more convenient to analyze.
Evaluation of the estimation results • The evaluation concerns the verification of the validity and reliability of all the assumptions that have been made. • A first evaluation is obtained by using common sense and economic knowledge. This is followed by testing the stochastic assumptions by using a normality test, autocorrelation tests, heteroscedasticity tests, etc. Looking at a plot of the residuals can be very informative about cycles or outliers. • If the stochastic assumptions have not been rejected, the deterministic assumptions can be tested by using statistical tests on restrictions on the parameters. The t-test and F-test can be performed, and the coefficient of determination R2 can be interpreted.
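The sketch below shows, on simulated data, how such diagnostic tests can be run with statsmodels. The particular tests chosen here (Jarque-Bera for normality, Breusch-Godfrey for autocorrelation, Breusch-Pagan for heteroscedasticity) are common examples, not necessarily the exact ones used in the lectures.

```python
# Residual diagnostics on a fitted OLS model with simulated data.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey, het_breuschpagan
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 2)))          # constant + two simulated regressors
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=100)
results = sm.OLS(y, X).fit()

jb_stat, jb_pvalue, _, _ = jarque_bera(results.resid)                            # normality test
bg_stat, bg_pvalue, _, _ = acorr_breusch_godfrey(results, nlags=2)               # autocorrelation test
bp_stat, bp_pvalue, _, _ = het_breuschpagan(results.resid, results.model.exog)   # heteroscedasticity test
print(jb_pvalue, bg_pvalue, bp_pvalue)   # small p-values signal violations of the stochastic assumptions
```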
Hypothesis Testing Purpose: make inferences about a population parameter by analyzing differences between observed sample statistics and the results one expects to obtain if some underlying assumption is true. • Null hypothesis (H0): the assumption to be tested. • Alternative hypothesis (H1): the opposite of the null hypothesis. If the null hypothesis is rejected, then the alternative hypothesis is accepted.
Hypothesis Testing A sample of 50 files from a file system is selected. The sample mean is 12.3 Kbytes. The standard deviation is known to be 0.5 Kbytes. Confidence: 0.95. Critical value = NORMINV(0.05, 0, 1) = -1.645. Region of non-rejection: Z ≥ -1.645. Since Z exceeds the critical value, it falls in the region of non-rejection, so do not reject Ho.
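A sketch of the Z computation behind this example. The slide does not state the hypothesized population mean, so mu0 below is only a placeholder assumption, chosen so the conclusion matches the slide.

```python
# One-sample Z test with known population standard deviation.
from math import sqrt
from scipy.stats import norm

n, xbar, sigma = 50, 12.3, 0.5    # sample size, sample mean, known std. dev. (from the slide)
mu0 = 12.4                        # hypothetical null value, NOT given on the slide
alpha = 0.05

z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit = norm.ppf(alpha)               # lower-tail critical value, about -1.645
reject = z < z_crit                    # reject H0 only if Z falls below the critical value
print(z, z_crit, reject)
```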
Steps in Hypothesis Testing 1. State the null and alternative hypotheses. 2. Choose the level of significance α. 3. Choose the sample size n. Larger samples allow us to detect even small differences between sample statistics and true population parameters. For a given α, increasing n decreases β (the probability of a Type II error). 4. Choose the appropriate statistical technique and test statistic to use (Z or t).
5. Determine the critical values that divide the regions of rejection and non-rejection. 6. Collect the data and compute the sample mean and the appropriate test statistic (e.g., Z). 7. If the test statistic falls in the non-rejection region, Ho cannot be rejected. Otherwise Ho is rejected.
Z-test versus t-test • 1. A Z-test is a statistical hypothesis test that follows the normal distribution, while a t-test follows a Student's t-distribution. • 2. A t-test is appropriate when you are handling small samples (n < 30), while a Z-test is appropriate when you are handling moderate to large samples (n > 30). • 3. A t-test is more adaptable than a Z-test, since a Z-test will often require certain conditions to be reliable; additionally, the t-test has many variants that will suit most needs. • 4. t-tests are more commonly used than Z-tests. • 5. Z-tests are preferred to t-tests when the standard deviations are known.
One tail versus two tail • We were only looking at one “tail” of the distribution at a time (either on the positive side above the mean or the negative side below the mean). With two-tail tests we will look for unlikely events on both sides of the mean (above and below) at the same time. • So, we have learned four critical values:

            1-tail    2-tail
α = .05     1.64      ±1.96
α = .01     2.33      ±2.58

• Notice that you have two critical values for a 2-tail test, both positive and negative. You will have only one critical value for a one-tail test (which could be negative).
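These critical values can be reproduced with the inverse normal CDF; a quick check (not part of the original slides):

```python
# Reproducing the one-tail and two-tail critical values of the table above.
from scipy.stats import norm

for alpha in (0.05, 0.01):
    one_tail = norm.ppf(1 - alpha)        # about 1.645 and 2.326
    two_tail = norm.ppf(1 - alpha / 2)    # about 1.960 and 2.576
    print(f"alpha={alpha}: one-tail {one_tail:.2f}, two-tail ±{two_tail:.2f}")
```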
Which factors affect the accuracy of the estimate • 1. Having more data points improves accuracy of estimation. • 2. Having smaller errors improves accuracy of estimation. Equivalently, if the SSR is small or the variance of the errors is small, the accuracy of the estimation will be improved. • 3. Having a larger spread of values (i.e. a larger variance) of the explanatory variable (X) improves accuracy of estimation.
Calculating a confidence interval for β • If the confidence interval is small, it indicates accuracy. Conversely, a large confidence interval indicates great uncertainty over β's true value. The confidence interval for β is [b − tc·sb , b + tc·sb], or equivalently b ± tc·sb, where b is the OLS estimate and tc the critical value. • sb is the standard deviation (standard error) of the estimate b. • Large values of sb will imply large uncertainty.
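A minimal sketch of computing such an interval; the estimate, standard error, sample size and number of parameters below are placeholder values, not taken from the slides.

```python
# 95% confidence interval for a regression coefficient.
from scipy.stats import t

beta_hat, s_b = 0.81, 0.05      # hypothetical OLS estimate and its standard error
n, k = 40, 2                    # hypothetical sample size and number of estimated parameters
t_crit = t.ppf(0.975, df=n - k) # two-sided 95% critical value

lower, upper = beta_hat - t_crit * s_b, beta_hat + t_crit * s_b
print(f"95% CI for beta: [{lower:.3f}, {upper:.3f}]")
```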
Hypothesis testing involving R2: the F-statistic • R2 is a measure of how well the regression line fits the data or, equivalently, of the proportion of the variability in Y that can be explained by X. If R2 = 0 then X does not have any explanatory power for Y. The test of the hypothesis R2 = 0 can therefore be interpreted as a test of whether the regression explains anything at all. • We use the P-value to decide what is “large” and what is “small” (i.e. whether R2 is significantly different from zero or not).
F-Test • Usage of the F-test • We use the F-test to evaluate hypotheses that involve multiple parameters. Let’s use a simple setup: • Y = β0 + β1X1 + β2X2 + β3X3 + εi
F-Test • For example, if we wanted to know how economic policy affects economic growth, we may include several policy instruments (balanced budgets, inflation, trade openness, etc.) and see if all of those policies are jointly significant. After all, our theories rarely tell us exactly which variable is important, but rather point to a broad category of variables.
F-statistic • The test is performed according to the following strategy: • 1. If Significance F is less than 5% (i.e. 0.05), we conclude R2 ≠ 0. • 2. If Significance F is greater than 5% (i.e. 0.05), we conclude R2 = 0.
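A sketch of how such a joint test might look in statsmodels, using simulated data; the variable names are placeholders. The overall F-statistic and its p-value (what Excel calls "Significance F") come with the fitted results, and subsets of coefficients can be tested with f_test.

```python
# Joint significance tests with simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                                # X1, X2, X3
y = 1.0 + 0.4 * X[:, 0] + 0.0 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(size=200)

results = sm.OLS(y, sm.add_constant(X)).fit()
print(results.fvalue, results.f_pvalue)        # overall test of beta1 = beta2 = beta3 = 0

# Joint test on a subset of coefficients, e.g. beta2 = beta3 = 0:
print(results.f_test("x2 = 0, x3 = 0"))
```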
Multiple regression model • The model is interpreted as describing the conditional expected wage of an individual given his or her gender, years of schooling and experience. • The coefficient β2 for malei measures the difference in expected wage between a male and a female with the same schooling and experience. • The coefficient β3 for schooli gives the expected wage difference between two individuals with the same experience and gender, where one has one additional year of schooling. • The coefficients in a multiple regression model can only be interpreted under a ceteris paribus condition.
Estimation by OLS gives the following results:

Dependent variable: wage

Variable     Estimate    Standard error    t-ratio
Constant     -3.38       0.4650            -7.2692
Male          1.3444     0.1077            12.4853
School        0.6388     0.0328            19.4780
Exper         0.1248     0.0238             5.2530

s = 3.0462, R2 = 0.1326, adjusted R2 = 0.13, F = 167.63

The coefficient for malei suggests that if we compare an arbitrary male and female with the same years of schooling and experience, the expected wage differential is $1.34 with a standard error of 0.1077, which is statistically highly significant. The null hypothesis that schooling has no effect on a person's wage, given gender and experience, can be tested using the t-test with a test statistic of 19.48. The estimated wage increase from one additional year of schooling, keeping years of experience fixed, is $0.64. The joint hypothesis that all three partial slope coefficients are zero, that is, that wages are not affected by gender, schooling or experience, has to be rejected as well.
R2 is 0.1326, which means that the model is able to explain 13.3% of the within-sample variation in wages. • A joint test of the hypothesis that the two variables, schooling and experience, both have zero coefficients can be carried out by performing an F-test. • With a 5% critical value of 3.0, the null hypothesis is clearly rejected. We can thus conclude that the model that includes gender, schooling and experience performs significantly better than a model which only includes gender.
Hypothesis testing • When a hypothesis is statistically tested, two types of errors can be made. The first is that we reject the null hypothesis while it is actually true (a Type I error). The second is that the null hypothesis is not rejected while the alternative is true (a Type II error). • The probability of a Type I error is directly controlled by the researcher through the choice of the significance level α. When a test is performed at the 5% level, the probability of rejecting the null hypothesis while it is true is 5%. • The probability of a Type II error depends upon the true parameter values. If the truth deviates much from the stated null hypothesis, the probability of such an error will be relatively small, while it will be quite large if the null hypothesis is close to the truth. • The probability of rejecting the null hypothesis when it is false is known as the power of the test. It indicates how powerful a test is in finding deviations from the null hypothesis.
Ordinary Least Squares (OLS) • Objective of OLS: minimize the sum of squared residuals, Σ ei² = Σ (Yi − b0 − b1Xi)², where ei = Yi − Ŷi is the residual for observation i. • Remember that OLS is not the only possible estimator of the βs. • But OLS is the best estimator under certain assumptions…
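A minimal sketch of this objective on simulated data: the normal-equations solution below is the coefficient vector that minimizes the sum of squared residuals.

```python
# OLS via the normal equations on simulated data.
import numpy as np

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.normal(size=100)])   # constant + one regressor
beta_true = np.array([2.0, 0.5])
y = X @ beta_true + rng.normal(size=100)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^(-1) X'y
residuals = y - X @ beta_hat
ssr = residuals @ residuals                    # the minimized sum of squared residuals
print(beta_hat, ssr)
```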
CAPM • Expected returns on individual assets are linearly related to the expected return on the market portfolio. • Regression without an intercept:

CAPM regression without intercept
Dependent variable: excess industry portfolio returns

                         food       durables    construction
Excess market return     0.790      1.113       1.156
                        (0.028)    (0.029)     (0.025)
Adj. R2                  0.601      0.741       0.804
s                        2.902      2.959       2.570
CAPM regression with intercept
Dependent variable: excess industry portfolio returns

                         food       durables    construction
Constant                 0.339      0.064       -0.053
                        (1.128)    (0.131)     (0.114)
Excess market return     0.783      1.111       1.157
                        (0.028)    (0.029)     (0.025)
Adj. R2                  0.598      0.739       0.803
s                        2.885      2.961       2.572
CAPM regression with intercept and January dummy
Dependent variable: excess industry portfolio returns

                         food       durables    construction
Constant                 0.417      0.069       -0.094
                        (0.133)    (0.137)     (0.118)
January dummy           -0.956     -0.063       0.498
                        (0.456)    (0.473)     (0.411)
Excess market return     0.788      1.112       1.155
                        (0.028)    (0.029)     (0.025)
Adj. R2                  0.601      0.739       0.804
s                        2.876      2.964       2.571
OLS Assumptions • But OLS is the best estimator under certain assumptions… • 1. Regression is linear in parameters • 2. Error term has zero population mean • 3. Error term is not correlated with X’s (exogeneity) • 4. No serial correlation • 5. No heteroskedasticity • 6. No perfect multicollinearity and we usually add: • 7. Error term is normally distributed
Exogeneity • All explanatory variables are uncorrelated with the error term: E(εi | X1i, X2i, …, XKi) = 0 • Explanatory variables are determined outside of the model (they are exogenous). • What happens if assumption 3 is violated? • Suppose we have the model Yi = β0 + β1Xi + εi • Suppose Xi and εi are positively correlated: when Xi is large, εi tends to be large as well.
Exogeneity • We estimate the relationship using the following model: • salesi= β0+β1pricei+εi • What’s the problem?
Assumption 3: Exogeneity • What’s the problem? • What else determines sales of hamburgers? • How would you decide between buying a burger at McDonald’s ($0.89) or a burger at TGI Fridays ($9.99)? • Quality differs • salesi= β0+β1pricei+εi quality isn’t an X variable even though it should be. • It becomes part of εi
Assumption 3: Exogeneity • What’s the problem? • But price and quality are highly positively correlated. • Therefore X and ε are also positively correlated. • This means that the estimate of β1 will be too high. • This is called “Omitted Variables Bias” (more in Chapter 6).
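A small simulation sketch (not from the slides) of this omitted variables bias in the hamburger example. All numbers are invented; quality raises both price and sales, so leaving it out biases the price coefficient upward, as argued above.

```python
# Omitted variables bias: quality is correlated with price and belongs in the error term.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
quality = rng.normal(size=500)
price = 5 + 2 * quality + rng.normal(size=500)               # price and quality move together
sales = 100 - 10 * price + 25 * quality + rng.normal(size=500)

short = sm.OLS(sales, sm.add_constant(price)).fit()                           # quality omitted
long = sm.OLS(sales, sm.add_constant(np.column_stack([price, quality]))).fit()

print(short.params[1])   # biased upward: well above the true -10
print(long.params[1])    # close to the true -10 once quality is controlled for
```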
Serial Correlation • Serial correlation: the error terms across observations are correlated with each other, i.e. ε1 is correlated with ε2, etc. • This is most important in time series. • If errors are serially correlated, an increase in the error term in one time period affects the error term in the next. • Homoskedasticity: the error has a constant variance. This is what we want… as opposed to • Heteroskedasticity: the variance of the error depends on the values of the Xs.
Perfect Multicollinearity • Two variables are perfectly collinear if one can be determined perfectly from the other (i.e. if you know the value of x, you can always find the value of z). • Example: If we regress income on age, and include both age in months and age in years.
Adjusted/Corrected R2 • R2 = SSR/SST. As before, R2 measures the proportion of the sum of squares of deviations of Y that can be explained by the relationship we have fitted using the explanatory variables. • Note that adding regressors can never cause R2 to decrease, even if the regressors do not seem to have a significant effect on the response of Y. • Adjusted (sometimes called “corrected”) R2 takes into account the number of regressors included in the model; in effect, it penalizes us for adding in regressors that don’t “contribute their part” to explaining the response variable. • Adjusted R2 is given by the following, where k is the number of regressors and n the number of observations: Adjusted R2 = 1 − (1 − R2)(n − 1)/(n − k − 1)
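A quick numerical check of this formula, with hypothetical values of R2, n and k (not taken from the slides):

```python
# Adjusted R-squared: penalize R2 for the number of regressors k (excluding the constant).
def adjusted_r2(r2: float, n: int, k: int) -> float:
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(r2=0.40, n=100, k=5))   # about 0.368, slightly below the unadjusted 0.40
```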
Interpreting Regression Results • β measures the expected change in Yi if Xi changes by one unit while all other variables in the model do not change. • In a multiple regression model, single coefficients can only be interpreted under ceteris paribus conditions. • It is not possible to interpret a single coefficient in a regression model without knowing what the other variables in the model are. • The other variables in the model are called control variables. • Economists are often interested in elasticities rather than marginal effects. An elasticity measures the relative change in the dependent variable due to a relative change in one of the Xi variables.
The linear model implies that elasticities are non-constant and vary with Xi, while the loglinear model imposes constant elasticities. • Explaining log Yi rather than Yi may help to reduce heteroscedasticity problems. • If Xi is a dummy variable (or another variable that may take negative values) we cannot take its logarithm, and we include the original variable in the model. Thus we estimate, for example, log Yi = β1 + β2 Xi + εi. • It is possible to include some explanatory variables in logs and some in levels. The interpretation of a coefficient β is then the relative change in Yi due to an absolute change of one unit in Xi. If Xi is a dummy variable for males, β is the (ceteris paribus) relative wage differential between men and women.
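A sketch of such a mixed log/level wage equation on simulated data, illustrating the dummy-coefficient interpretation; the coefficient values and sample are made up.

```python
# Log wage on a male dummy and years of schooling (simulated data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
male = rng.integers(0, 2, size=1000)
school = rng.integers(8, 18, size=1000)
log_wage = 1.0 + 0.15 * male + 0.08 * school + rng.normal(scale=0.3, size=1000)

X = sm.add_constant(np.column_stack([male, school]).astype(float))
results = sm.OLS(log_wage, X).fit()
print(results.params)   # the male coefficient is roughly 0.15, i.e. about a 15% ceteris paribus wage differential
```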
Selecting regressors • What happens when a relevant variable is excluded from the model, and what happens when an irrelevant variable is included in the model? • Omitted variable bias: including irrelevant variables in your model, even though they have a zero coefficient, will typically increase the variance of the estimators for the other model parameters, while including too few variables carries the danger of biased estimates. • To find potentially relevant variables we can use economic theory. • General-to-specific modelling approach (the LSE methodology): this approach starts by estimating a general unrestricted model (GUM), which is subsequently reduced in size and complexity by testing restrictions.
In presenting your estimation results, it is not a sin to have insignificant variables included in your specification. • Be careful: including many variables in your model can cause multicollinearity, so that in the end almost none of the variables appears individually significant. • Another way to select a set of regressors is to use R2, which measures the proportion of the sample variation in Yi that is explained by variation in Xi. • If we were to extend the model by including Zi in the set of regressors, the explained variation would never decrease, so that R2 will never decrease when we include additional variables in the model. • This is not optimal, because with too many variables we will not be able to say very much about the model's coefficients, and R2 does not punish the inclusion of many variables. • It is better to use a measure that trades off goodness of fit against the number of regressors employed in the model: use adjusted R2.
Adjusted R2 provides a trade-off between goodness of fit, as measured by R2, and the simplicity of the model, as measured by the number of parameters. • Akaike's Information Criterion (AIC) • Schwarz's Bayesian Information Criterion (BIC) • Models with lower AIC and BIC values are preferred.
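A sketch of comparing two specifications by AIC and BIC with statsmodels on simulated data; the regressor x2 is deliberately irrelevant, so the smaller model is typically preferred by both criteria.

```python
# Model selection by AIC and BIC on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)                  # irrelevant regressor
y = 1.0 + 0.5 * x1 + rng.normal(size=300)

small = sm.OLS(y, sm.add_constant(x1)).fit()
large = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

print("small model: AIC", small.aic, "BIC", small.bic)
print("large model: AIC", large.aic, "BIC", large.bic)   # typically higher: x2 adds nothing
```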