
Empirical Analysis


joella


Presentation Transcript


  1. Empirical Analysis: Doing and interpreting empirical work

  2. Rules on Data • You should never use data that you don’t know • Know the source and how it was collected • Understand the coding • Know how variables are measured • Know the time frame, locations and other relevant information • You should be able to describe the data • Summary statistics • Be very clear about definitions • Understand its limitations and special features • Know why and how it addresses your research question

  3. Rules for good empirical analysis • Never use a technique you don’t understand • Plan your approach • You are looking for something like “a” causes “b” • Find a technique that leads to that conclusion; don’t just automatically run a regression • Think about how your results could provide the answer to your hypotheses. In fact, plan your approach from the hypotheses • Know whose behavior you are modeling • Understand your causation. Be careful of spurious correlation, and of bi-directional causation. • Defend the exogeneity of RHS variables.

  4. Specifics on Empirical Models • Describe what economic mechanism caused the dispersion in your right-hand-side (RHS) variables. Remember, natural experiments are rare. Do you really have one? • Understand what economic mechanism constitutes the error term. What variation in the dependent variable is not covered by your predetermined variables? • Understand why your error term is uncorrelated with RHS variables, or how you will fix it • There should be economic as well as statistical reasons. • If you use instrumental variables • Understand the difference between an instrument and a control. Should it be an additional variable, not an instrument? • If it is an IV, why is it a good instrument and uncorrelated with the error?

  5. Think about the specification in terms of your hypotheses • Do you need a nonlinear model? • Remember that a high R² can be bad (left shoes = b0 + b1 right shoes) • Watch out for estimating identities
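The left shoes / right shoes warning can be illustrated with a minimal sketch. The shoe counts below are hypothetical numbers made up for the illustration, and `fit_line` is a helper written here, not a library function. Because every household owns as many left shoes as right shoes, the regression recovers an identity rather than any behavior.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return b0, b1, 1.0 - ss_res / ss_tot


# Hypothetical data: right shoes owned by five households.
right_shoes = [3, 5, 8, 12, 20]
# An identity: left shoes always equal right shoes.
left_shoes = list(right_shoes)

b0, b1, r2 = fit_line(right_shoes, left_shoes)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, R^2 = {r2:.3f}")
```

The fit is perfect (slope 1, intercept 0, R² of 1), yet it says nothing about why anyone buys shoes, which is the sense in which a high R² can be bad.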

  6. Significance • There is a distinct difference between economic significance and statistical significance. You need to understand the difference. • Statistical significance is a measure of the strength of the signal relative to background noise. • Economic significance is a measure of the importance of the finding to supporting or disproving your hypotheses. • Use statistical significance to test your null hypothesis. • Use economic significance to test your research question.

  7. Significance - Rules to Remember • Statistical significance does not indicate causality. It simply measures how precisely the degree of correlation between the variables can be measured. • Statistical significance is influenced by sample size. With a sufficiently large sample, almost any estimate can be found to be statistically significant at some level. Hence, the significance level to reject a null hypothesis should vary inversely with your sample size. This is why the 5% level is not absolute. • Statistical significance is not the same as economic significance. Size matters!
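The sample-size rule can be sketched numerically. The example below assumes a simple two-sided z-test of a zero mean with a known standard deviation, and it holds the observed effect fixed at a hypothetical, economically trivial 0.01 while only the sample size changes. The function name and all numbers are illustrative assumptions.

```python
import math


def two_sided_p_value(effect, sd, n):
    """Two-sided p-value for a z-test of H0: mean = 0, with known sd."""
    z = effect / (sd / math.sqrt(n))
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


effect, sd = 0.01, 1.0  # hypothetical tiny effect and noise level
p_values = {n: two_sided_p_value(effect, sd, n) for n in (100, 10_000, 1_000_000)}

for n, p in p_values.items():
    print(f"n = {n:>9,d}   p = {p:.4g}")
```

The same tiny effect is far from significant at n = 100 and n = 10,000 but overwhelmingly significant at n = 1,000,000, so a fixed 5% threshold rewards sample size rather than economic importance.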

  8. Statistical Significance • Statistical (Fisherian) significance has little to do with economic significance. • What is statistical significance? What does it measure? Which of the following is the correct interpretation? • Given the data, the p-value tells us the probability that the null hypothesis is true: P(H0|D) • Given that the null hypothesis is true, the p-value tells us the probability of getting these data: P(D|H0) • Does P(H0|D) = P(D|H0)? No.

  9. Example About 2% of a population has a disease. A test for the disease is accurate 95% of the time when the disease is present (called sensitivity) and will indicate the absence of the disease 97% of the time when the disease is absent (called specificity). Let H0 be that the disease is absent in the patient, and let D be a positive test result. A healthy patient tests positive only 3% of the time, so P(D|H0) = 0.03 < 0.05. But when a test is done on a patient and we get an indication that the disease is present, we need to adjust for the fact that most people don’t have the disease, and there are false positives.

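  10. Example (continued). For every 1000 tests, about 20 people have the disease and about 19 of them test positive, while about 29 of the 980 people without the disease also test positive. A positive test is therefore more likely a false positive than a true positive. The recent suggestion to do away with PSA tests in men reflects this sort of result. Think about the example in Cohen about his colleague. Does anyone recall what that example is?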
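The disease-test example can be checked with a short calculation. This is a minimal sketch using only the figures stated in the example (2% prevalence, 95% sensitivity, 97% specificity); the function and variable names are made up for the illustration.

```python
def screening_counts(n_tests, prevalence, sensitivity, specificity):
    """Expected true and false positives among n_tests people tested."""
    diseased = n_tests * prevalence
    healthy = n_tests - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1.0 - specificity)
    return true_pos, false_pos


true_pos, false_pos = screening_counts(1000, 0.02, 0.95, 0.97)

# P(D | H0): chance of a positive test when the disease is absent.
p_d_given_h0 = 1.0 - 0.97
# P(H0 | D): chance the disease is absent given a positive test (Bayes' rule).
p_h0_given_d = false_pos / (true_pos + false_pos)

print(f"true positives per 1000:  {true_pos:.1f}")
print(f"false positives per 1000: {false_pos:.1f}")
print(f"P(D|H0) = {p_d_given_h0:.2f},  P(H0|D) = {p_h0_given_d:.2f}")
```

With these inputs P(D|H0) is 0.03 while P(H0|D) is roughly 0.61, which is why a small p-value cannot be read as the probability that the null hypothesis is true.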

  11. Pearson v. Fisher • Buchanan-Wollaston claimed that in any hypothesis test, there is a large region in the distribution of the criterion for which neither the hypothesis nor its reverse can be assumed true. He focused his comments on the χ² as a goodness of fit test. • Pearson responded that the test is useful only to choose between two distributions, not to validate which distribution (hypothesis) is correct. • Fisher responded that Buchanan-Wollaston was correct; rejection of the null is not equivalent to acceptance of the alternative. Indeed, it is the source of type II errors.

  12. Pearson v. Fisher (continued) • Pearson responded again, arguing that the conclusions and significance value used should be chosen with regard to the problem at hand. The test is a measure of the adequacy of some model to explain the observed data (Power of the test). Anything beyond that is inferential, not statistical. • Both agree that the failure to reject the null hypothesis does not mean it is true. The only hypothesis proved true by a statistical test is the negation of a hypothesis that the observed sample has zero possibility. Think about the colleague example in the Smith paper.

  13. Pearson v. Fisher (continued) • Fisher tells us that the value of significance tests is only in that they tell us what to ignore. This is why null results (that the parameter is not significantly different from zero) are useful. • Pearson tells us that scientists do not seek the truth, only ways to summarize the data; that is what significance testing is doing. • What are we doing with econometrics? (discuss from class) • Conclusion: statistical inference and economic inference are different things.

  14. McCloskey and McCloskey and Ziliak • What is their point? • Fisherian significance is meaningless in almost all cases of economic application, yet economists often report statistical significance as important. • Size matters! • A well-measured but trivial economic effect may be neglected even if statistically significant. • But an economically large but poorly measured effect should not be rejected because the signal is noisy.

  15. Blank’s study in AER (1991) about acceptance rates and blind refereeing. • She found that the acceptance rates of papers written by women were lower than those written by men when refereeing was not blind. But the difference was statistically insignificant (at p<0.05) • If the difference were statistically significant with the sample, we know it would be so with a larger sample too. • But because it is not statistically significant at an acceptable level, the test tells us nothing about the truth.

  16. Example 2 Suppose you estimate the equation Investment = a + b1*interest rate + b2*tax rates + e. If tax rates and b2 are large, then tax rates are economically important. But if there is little variation in tax rates, you would find b2 not different from 0 at conventional levels of statistical significance.
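The effect of low variation in a regressor can be sketched with the textbook standard-error formula. To keep it short, the example assumes a regression with a single regressor and a known error standard deviation, so the standard error of the slope is sigma / (sd_x * sqrt(n)). In the two-regressor equation above the same logic applies to the variation in tax rates left over after controlling for the interest rate. All numbers are hypothetical.

```python
import math


def slope_t_ratio(b, sigma, sd_x, n):
    """t-ratio of a slope b when se(b) = sigma / (sd_x * sqrt(n)).

    sigma is the error standard deviation and sd_x is the (population)
    standard deviation of the regressor; both are assumed known here.
    """
    se = sigma / (sd_x * math.sqrt(n))
    return b / se


b, sigma, n = 2.0, 1.0, 100  # hypothetical large coefficient, noise, sample

t_wide = slope_t_ratio(b, sigma, sd_x=1.0, n=n)      # regressor varies a lot
t_narrow = slope_t_ratio(b, sigma, sd_x=0.005, n=n)  # regressor barely varies

print(f"t-ratio with wide variation:   {t_wide:.2f}")
print(f"t-ratio with narrow variation: {t_narrow:.2f}")
```

The coefficient is identical in both cases; only the spread of the regressor differs, and that alone moves the t-ratio from far above to far below the conventional cutoff of about 1.96.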

  17. So how big is big? There is no real answer, but if you do empirical analysis, keep the following rules in mind. • Statistical significance is about sample size. • The level of statistical significance “used” should fit your data, problem and importance of the variable to your hypotheses and conclusion, and to the context of your research question. • Your assumptions matter, but you can’t just assume something to be true. For example, you can’t just assume a variable is exogenous. You need to argue for it, and sometimes test it.
