ECO 420 Advanced Empirical Methods
Instructor • Jing Li • Second year at Miami • Taught undergraduate and graduate econometrics before • Married with two kids
Books • Required: Introductory Econometrics, a Modern Approach by Jeffrey M. Wooldridge • http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=Introductory+Econometrics%2C+a+Modern+Approach+by+Jeffrey+M.+Wooldridge • Recommended: Mostly Harmless Econometrics: An Empiricist's Companion
Webpage • http://fsb.muohio.edu/lij14/ • Notes, data, and code will be posted
Critical Thinking • Example: someone tries to show that Canadians like girls more than boys • How? He shows that more baby girls than baby boys were born in 2011. Ok? • Next, he shows that more girls than boys were adopted in 2011. Ok? • What do you think?
Critical Thinking • A president of a private high school wants to prove that private school is worth the money. • How? He shows that more students from his school go to Ivy League universities than from the best public school in town. • What do you think?
Causality • How to interpret regression? • Regression can show association • Under stricter assumption (ceteris paribus), regression can prove causality • Econometrics focuses on causality
Two Examples • Does wearing a safety belt cause fewer deaths? • Did the Great Recession of 2007-2009 cause fewer marriages?
Ceteris Paribus • It means all other things being equal • Ideally, causality can be proved if Ceteris Paribus holds • The president of the private school is wrong because the family backgrounds of students in private and public schools are not equal, so Ceteris Paribus fails.
Key Issues • How to design experiment to ensure ceteris paribus? • How to find natural experiment? • How to deal with non-experimental data?
STATA • Available at FSB computer lab Room 2037 • Do file puts commands together. Click File---Do, and then choose the do file. • Log file puts results (no graphs) together. The text format is recommended. You can use any text editor to open the file.
A Typical Do File • clear • capture log close • log using logfilename.txt, text replace • insheet using datafilename.csv, clear • … • log close
Most Important Commands (Stata commands are case-sensitive and written in lowercase) • insheet: read data into memory • list: display data on screen • sort: sort data • des: describe the variables in memory (names, types, labels) • sum: summarize quantitative variable • tab: summarize qualitative variable • reg: run regression • gen: generate new variable • egen: generate fancy thing
Tips for Reading Data • Double check the following items • The variable names (letter and number only) • The missing values (NA, space, a dot) • Dollar sign, comma, etc
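The checks above can be sketched in a few Stata lines. This is only an illustration: the file name datafilename.csv is the placeholder used in the do file slide, and the variable price is hypothetical. It assumes price was read as a string because the raw column contains entries such as "NA", "$" or ",".

```stata
* Sketch: read a CSV file and clean a column that was read as text
* (datafilename.csv and the variable price are hypothetical)
clear
insheet using datafilename.csv, clear
* Check variable names and storage types (str = string, i.e. not yet numeric)
describe
* Turn the text marker NA into an empty string so it becomes a missing value
replace price = "" if price == "NA"
* Strip dollar signs and commas, then convert the string to a number
destring price, replace ignore("$,")
* Count how many observations are missing after the conversion
count if missing(price)
```

If price is already numeric after reading, the replace line would fail with a type mismatch, so run describe first and only clean the columns that show a string type.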
Tips for Summarizing Data • Pay attention to min, max, obs • Median (the 50th percentile) is more robust to extreme values than mean • Skewness is zero for symmetric distribution • Kurtosis is three for normal distribution • Variance and standard deviation measure the dispersion of the distribution
Tips for Plotting • help twoway • A scatter plot only shows association, not causality • Pay attention to scale • It is not uncommon to plot the log of data
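A minimal sketch of how these statistics can be obtained in Stata. It uses the auto dataset that ships with Stata purely for illustration; that dataset is an assumption here, not part of the course data.

```stata
* Example data shipped with Stata (used only for illustration)
sysuse auto, clear
* The detail option adds percentiles, variance, skewness and kurtosis
summarize price, detail
* The results are stored in r() and can be reused
display "mean = " r(mean) "   median = " r(p50)
display "skewness = " r(skewness) "   kurtosis = " r(kurtosis)
```

A mean well above the median together with positive skewness suggests a long right tail, which is one case where plotting the log of the data may help.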
Tips for Running Regression • In general, regression shows association, not causality • Pay attention to following: • Outliers • Structural Change • Omitted Variables
Review • The mean is a good guess because it minimizes the mean squared error • The conditional mean is a better guess than the unconditional mean • The conditional mean E(y|x) is a random variable (it is a function of x) • Law of iterated expectation
STATA • Unconditional mean: sum y • Conditional mean: by x: sum y (you need to sort the data by x first, using sort x)
Compare Means • Example: Are the exam scores in section B greater than those in section C? • If there are two groups and x is the group indicator, use ttest y, by(x)
Regression with Dummy Variables • Alternatively, we can run a regression on dummies to compare means. This approach is better than ttest • gen d1 = (x == 1) • gen d2 = (x == 2) • … • reg y d1 d2 …
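For reference, the review points can be written as formulas. This is a sketch in standard notation, with y the variable to be guessed and x the conditioning variable; the slide itself does not fix the notation.

```latex
% Law of iterated expectation
\[ E(y) = E\big[\, E(y \mid x) \,\big] \]
% The unconditional mean minimizes mean squared error among constant guesses c
\[ E\big[(y - E(y))^2\big] \le E\big[(y - c)^2\big] \quad \text{for any constant } c \]
% The conditional mean does at least as well as the unconditional mean
\[ E\big[(y - E(y \mid x))^2\big] \le E\big[(y - E(y))^2\big] \]
```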
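A short sketch of both commands, again using Stata's shipped auto dataset as an assumed example, with price in the role of y and foreign in the role of x.

```stata
sysuse auto, clear
* Unconditional mean of price
summarize price
* Conditional mean of price given foreign (0 = domestic, 1 = foreign)
sort foreign
by foreign: summarize price
* bysort does the sorting and the by step in one line
bysort foreign: summarize price
```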
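The link between the two approaches can be checked on simulated data. In this sketch the group sizes, the score level of 70 and the gap of 3 points are all made-up values. Only one dummy is included because the regression already has a constant, so the omitted group serves as the baseline.

```stata
clear
set seed 12345
set obs 200
* x is a hypothetical group indicator: 1 for the first 100 rows, 2 for the rest
gen x = cond(_n <= 100, 1, 2)
* Simulated scores: group 2 has a mean 3 points higher (assumed values)
gen y = 70 + 3*(x == 2) + 10*rnormal()
* Mean-comparison test (assumes equal variances by default)
ttest y, by(x)
* Same comparison by regression: the constant is the mean of group 1,
* the coefficient on d2 is mean(group 2) - mean(group 1)
gen d2 = (x == 2)
reg y d2
```

ttest reports the difference as mean(group 1) minus mean(group 2), so it has the opposite sign of the coefficient on d2, while the t statistics agree in absolute value. The regression version extends naturally to more than two groups or to additional control variables.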
Question (Optional, Just for fun) • Suppose we draw $z \sim U(0,1)$. After we know $z$, we draw $y \sim U(z,1)$. Find $E(y)$ • Solve this problem using the Law of Iterated Expectation.
Simulation • help uniform • set obs 10000 • gen z = uniform() • gen y = z+(1-z)*uniform() • sum y
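A worked derivation to compare with the simulated mean. It reads the setup off the simulation code: z is uniform on (0,1) and, given z, y = z + (1-z)u with u uniform on (0,1), so y is uniform on (z,1).

```latex
% Step 1: conditional mean of y given z (midpoint of the interval (z, 1))
\[ E(y \mid z) = z + (1 - z)\,E(u) = z + \frac{1 - z}{2} = \frac{1 + z}{2} \]
% Step 2: law of iterated expectation, using E(z) = 1/2
\[ E(y) = E\big[\, E(y \mid z) \,\big] = \frac{1 + E(z)}{2} = \frac{1 + 1/2}{2} = \frac{3}{4} \]
```

With 10000 draws, the mean reported by sum y should be close to 0.75.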
Mean and Sample Mean • We use sample mean (estimator) to estimate the population mean (parameter) • The sample mean has nice properties of (1) unbiasedness; (2) consistency
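Unbiasedness • $E(\bar{y}) = \mu$, where $\bar{y}$ is the sample mean and $\mu$ is the population mean
Law of Large Numbers (Consistency) • If $y_1, \dots, y_n$ is a random sample with $E(y_i) = \mu$, then $\bar{y} \to \mu$ in probability as $n \to \infty$. In words, the sample mean gets closer and closer to the true population mean as the size of the random sample rises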
Simulation (Show Consistency) • clear • set obs 1000 • gen y = 4 + 2*invnormal(uniform()) • sum y in 1/10 • sum y in 1/100 • sum y in 1/1000
Discuss • How to use simulation to show unbiasedness?
Answer • We need to generate many samples. • Compute sample mean for each sample. • Unbiasedness means the average of those sample means should be close (or in theory identical) to the population mean
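The steps above can be sketched with Stata's simulate command. The program name onesample, the sample size of 20 and the 1000 replications are arbitrary choices for illustration. The data-generating process (mean 4, standard deviation 2) mirrors the consistency simulation, with rnormal() used in place of invnormal(uniform()).

```stata
clear all
set seed 2024
* Hypothetical helper: draw one sample of size 20 and return its sample mean
program define onesample, rclass
    drop _all
    set obs 20
    gen y = 4 + 2*rnormal()
    summarize y
    return scalar ybar = r(mean)
end
* Repeat 1000 times; each replication stores one sample mean in ybar
simulate ybar = r(ybar), reps(1000) nodots: onesample
* The average of the 1000 sample means should be close to the population mean 4
summarize ybar
```

Each individual sample mean can be far from 4 because the sample is small. Unbiasedness is a statement about the average across samples, not about any single sample.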
Causality • Most often, an economist's goal is to infer that one variable (x) has a causal effect on another variable (y) • Example 1: x = price, y = quantity demanded • Example 2: x = wearing safety belt, y = death rate • Example 3: x = hosting the Olympic Games, y = GDP
Ceteris Paribus • By definition, inferring causality requires ceteris paribus (CP) • CP means all other factors being equal (fixed, constant) • Without CP, we are not sure the change in Y is due to the change in X. • Exercise: what does CP mean when x = price, y = quantity demanded?
Real Problem • Someone thinks that driving on left (passing) lane causes more accidents • How to prove? • One (bad) answer: let’s pick an interstate, say I275. Consider I275 between exit 33 and exit 41. Each day between 8 am and 10 am we record the number of accidents that happen on the right lane and left lane. Then we do the mean-comparison test.
Discuss • Is I275 representative? • Is traffic between exit 33 and exit 41 representative? • Does the time (rush hour, night time) matter? • How to run a regression? • How about using the percentage (# accident / # traffic) rather than the number of accidents?
Fundamental Drawback • CP fails since the bad answer uses the observed data • Reckless drivers (W) tend to drive on left lane (X), and reckless driving causes more accidents (Y). • The observed association between X and Y tells nothing about causality since W is not held constant (CP fails).
Solutions • (1) use an experiment: randomly assign reckless drivers to left and right lanes. Then compare the means using the experimental data. • (2) still use observed data, but run a multiple regression which includes as a regressor the number of reckless drivers on the left lane. • (3) use fancy econometric models such as instrumental variable regression if the number of reckless drivers on the left lane is unobserved.
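A small simulation can illustrate why solution (2) helps. Everything in this sketch is assumed for illustration: data are generated at the driver level, w marks a reckless driver, x marks the left lane, and the lane is given no causal effect on the accident measure y by construction.

```stata
clear
set seed 42
set obs 5000
* w = 1 for a reckless driver (assumed to be 30% of drivers)
gen w = (runiform() < 0.3)
* Reckless drivers pick the left lane more often (assumed probabilities)
gen x = (runiform() < cond(w == 1, 0.8, 0.3))
* By construction the lane has no causal effect; only recklessness raises y
gen y = 1 + 0*x + 2*w + rnormal()
* Simple regression: the coefficient on x picks up the effect of the omitted w
reg y x
* Multiple regression holding w fixed: the coefficient on x should be near zero
reg y x w
```

The first regression shows a positive association between the left lane and accidents even though the true causal effect is zero, which is the failure of ceteris paribus described above.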
Discrimination Paper • (Race) Discrimination means that someone is treated unfairly just because of his skin color (even if he has high ability) • Using observed data cannot ensure ceteris paribus
Experimental Data • Obtained by sending out fake resumes • Factors (characteristics) other than names (signal for skin colors) are made comparable • In other words, the name (skin color) is independent of ability. • E(xu)=0, so the key regressor is exogenous • Ceteris Paribus is ensured by using experimental data
Policy Implications • The punch line of this research is that job training programs may help African Americans little, because such programs may improve their skills but cannot change their skin color.
Discuss • This is a very smart paper, why? • How about using observed data? Can we draw conclusion based on the observed salary difference between black and white? • How about market heterogeneity? Can we generalize the finding to other markets such as the market for college faculty? Why?
Tax Paper • Supply-Side Economics says that people work more (and so GDP rises) after tax cut • So the theory implies a causal effect of tax cut on labor supply
Discuss • Consider using the observed data, and run the regression of labor supply ($y$) on the tax rate ($x$): $y = \beta_0 + \beta_1 x + u$ • What does $u$ represent? • Is E(xu)=0, or is $x$ exogenous? • What is the consequence of using observed data?
Natural Experiment • There is a tax reform (natural experiment) • In 1987-1988, Iceland moved from a system under which taxes were paid on previous year's income to a pay-as-you-earn system. • So the tax rate for 1987 income became zero in an exogenous manner (has nothing to do with $u$)
Policy Implications • Figure 1 shows that cutting tax leads to higher employment • Figure 2 shows a hike in GDP in 1987-1988 • Another paper that uses natural experiment: Home-equity Lending and Retail Spending: Evidence from a Natural Experiment in Texas
Discuss • Where is natural experiment? Reform, Law Change, Natural Event… • Q: how to show the causal effect of the number of children on labor hours of women? How to design a pure experiment? How about natural experiment?