Econometrics Session 1 – Introduction Amine Ouazad, Asst. Prof. of Economics
Session 1 - Introduction Preliminaries
Introduction • Who I am • Trade-offs • Textbooks • Grading • Homework • Implementation Session 1 • The two econometric problems • Randomization as the Golden Benchmark Outline of the Course
Who I am • Applied empirical economist. • Work on urban economics, economics of education, applied econometrics in accounting. • Emphasis on the identification of causal effects. • Careful empirical work: clean data work, correct identification of causal effects. • Large datasets: • 100+ million observations, administrative datasets, geographic information software. • Implementation of econometric procedures in Stata/Mata.
Trade-offs • Classroom is heterogeneous. • In tastes, mathematics level, needs, prior knowledge. • Different fields have different habits. • E.g. “endogeneity” is not an issue, or not the same issue, in OB, Finance, Strategy, or TOM. • Conclusion: • Course provides a particular spin on econometrics, with mathematics when needed and with applications. • This is a difficult course, even for students with a prior course in econometrics.
Textbooks • *William H. Greene, Econometric Analysis, 6th edition. • Jeffrey Wooldridge, Econometric Analysis of Cross Section and Panel Data. • Joshua Angrist and Jörn-Steffen Pischke, Mostly Harmless Econometrics. • A. Colin Cameron and Pravin K. Trivedi, Microeconometrics Using Stata.
Prerequisites • I assume you know: • Statistics • Random variables. • Moments of random variables (mean, variance, kurtosis, skewness). • Probabilities. • Real analysis • Integral of functions, derivatives. • Convergence of a function at x or at infinity. • Matrix algebra • Inverse, multiplication, projections.
Grading • Exam: 60% • Participation: 10% • Homework: 30% • One problem set in-between Econometrics A and B.
Implementation • STATA version 12. • License for PhD students. Ask IT. 5555 or Alina Jacquet. • Interactive mode, Do files, Mata programming. • Compulsory for this course. • MATLAB, not for everybody. • Coding econometric procedures yourself, e.g. GMM.
Outline for Session 1 - Introduction • Correlation and Causation • The Two Econometric Problems • Treatment Effects
Session 1 - Introduction 1. Correlation and Causation
1. The perils of confounding correlation and causation • How can we boost children’s reading scores? • Shoe size is correlated with IQ. • Women earn less than men. • Sign of discrimination? • Health is negatively correlated with the number of days spent in hospital. • Do hospitals kill patients?
Potential outcomes framework • A.k.a. the “Rubin causal model”. • Outcome with the treatment Y(1), outcome without the treatment Y(0). • Treatment status D=0 or 1. • FUNDAMENTAL PROBLEM OF ECONOMETRICS: Either Y(1) or Y(0) is observed, never both; equivalently, only Y = Y(1) D + Y(0) (1-D) is observed. • What would have happened if a given subject had received a different treatment?
Naïve estimator of the treatment effect • Δ = E(Y|D=1) – E(Y|D=0). • Does that identify any relevant parameter? • Notice that: • Δ = E(Y|D=1) – E(Y|D=0) = E(Y(1)|D=1) – E(Y(0)|D=0) • What are we looking for?
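One way to approach the question "What are we looking for?" is to add and subtract the unobserved counterfactual mean E(Y(0)|D=1). The following is a sketch of that standard decomposition, writing Δ for the naïve difference in means; the labels under the braces are descriptive names added here for illustration.

```latex
\begin{aligned}
\Delta &= E(Y \mid D=1) - E(Y \mid D=0) \\
       &= E(Y(1) \mid D=1) - E(Y(0) \mid D=0) \\
       &= \underbrace{E(Y(1) - Y(0) \mid D=1)}_{\text{effect on the treated}}
        + \underbrace{E(Y(0) \mid D=1) - E(Y(0) \mid D=0)}_{\text{selection bias}}
\end{aligned}
```

The first term is a causal quantity; the second is zero only if treated and untreated subjects would have had the same average outcome without the treatment.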
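Ignorable Treatment (Rubin 1983) • Assume (Y(1),Y(0)) ⊥ D, i.e. the potential outcomes are independent of the treatment status. • Then E(Y(0)|D=1)=E(Y(0)|D=0)=E(Y(0)). • Similarly for Y(1). • Then: E(Y|D=1) – E(Y|D=0) = E(Y(1)) – E(Y(0)) = E(Y(1) – Y(0)), so the naïve estimator identifies the average effect of the treatment.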
Another Interpretation • Assume Y(D)=a+bD+e. • e captures the “unobservables”. • The naïve estimator E(Y|D=1) – E(Y|D=0) = b + E(e|D=1) – E(e|D=0). • Selection bias: S=E(e|D=1)-E(e|D=0). • Overestimates the effect if S>0. • Underestimates the effect if S<0.
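The relation "naïve estimator = b + S" can be checked on simulated data. The sketch below is only an illustration under stated assumptions: the function name, the parameter values, and the self-selection rule (a subject takes the treatment when e plus independent noise is positive) are hypothetical and not part of the course material.

```python
import random
from statistics import fmean


def simulate_naive_estimator(n=100_000, a=1.0, b=2.0, seed=0):
    """Simulate Y = a + b*D + e with self-selection on the unobservable e.

    Hypothetical selection rule: D = 1 when e plus independent noise is
    positive, so that E(e|D=1) > E(e|D=0) and the selection bias S is positive.
    Returns (naive estimator, selection bias S).
    """
    rng = random.Random(seed)
    y_treated, y_control, e_treated, e_control = [], [], [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)                       # unobservable
        d = 1 if e + rng.gauss(0.0, 1.0) > 0 else 0   # self-selection
        y = a + b * d + e                             # observed outcome
        if d == 1:
            y_treated.append(y)
            e_treated.append(e)
        else:
            y_control.append(y)
            e_control.append(e)
    naive = fmean(y_treated) - fmean(y_control)       # E(Y|D=1) - E(Y|D=0)
    selection = fmean(e_treated) - fmean(e_control)   # S
    return naive, selection


naive, selection = simulate_naive_estimator()
print(f"naive estimator: {naive:.3f}, b + S: {2.0 + selection:.3f}")
```

In this design S>0, so the naïve estimator overestimates b, even though the simulated dataset is large.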
Definitions • Treatment Effect. • Y(1)-Y(0) • Average Treatment Effect (ATE). • E(Y(1)-Y(0)) • Average Treatment Effect on the Treated (ATT). • E(Y(1)-Y(0)|D=1) • Average Treatment Effect on the Untreated (ATU). • E(Y(1)-Y(0)|D=0)
Randomization as the Golden Benchmark • Effect of a medical treatment. • Treatment and control group. • Randomization of the assignment to the treatment and to the control. • Why randomize? • … effect of jumping without a parachute on the probability of death.
With ignorability… • If the treatment is ignorable (e.g. if the treatment has been randomly assigned to subjects) then • ATE = ATT = ATU
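A small simulation can illustrate why ATE = ATT = ATU under random assignment and why the equality can fail otherwise. This is a sketch under assumptions chosen for illustration only: the function name, the distribution of the gains Y(1)-Y(0), and the alternative rule in which subjects with above-average gains select into the treatment are all hypothetical.

```python
import random
from statistics import fmean


def treatment_effects(n=100_000, randomized=True, seed=0):
    """Simulate both potential outcomes and compute ATE, ATT, ATU.

    The simulation sees Y(1) and Y(0) for everyone, which real data never
    allow. If randomized is False, subjects with a gain above 2.0
    (a hypothetical rule) select into the treatment.
    """
    rng = random.Random(seed)
    gains_treated, gains_untreated = [], []
    y_treated, y_untreated = [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)            # outcome without the treatment
        gain = 2.0 + rng.gauss(0.0, 1.0)    # heterogeneous effect Y(1)-Y(0)
        y1 = y0 + gain                      # outcome with the treatment
        if randomized:
            d = rng.random() < 0.5          # coin-flip assignment
        else:
            d = gain > 2.0                  # self-selection on the gain
        if d:
            gains_treated.append(gain)
            y_treated.append(y1)            # only Y(1) would be observed
        else:
            gains_untreated.append(gain)
            y_untreated.append(y0)          # only Y(0) would be observed
    return {
        "ATE": fmean(gains_treated + gains_untreated),
        "ATT": fmean(gains_treated),
        "ATU": fmean(gains_untreated),
        "naive": fmean(y_treated) - fmean(y_untreated),
    }


print(treatment_effects(randomized=True))
print(treatment_effects(randomized=False))
```

With the coin flip, the three averages and the naïve difference in means should agree up to sampling noise; with self-selection on the gain, ATT exceeds ATU.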
Selection bias • Why is there a selection bias? • In medicine, in economics, in management? • Self-selection of subjects into the treatment. • Correlation between unobservables and observables, e.g. industry, gender, income.
Session 1 - Introduction 2. The Two Econometric Problems
2. The Two Econometric Problems • Identification and Inference • “Studies of identification seek to characterize the conclusions that could be drawn if one could use the sampling process to obtain an unlimited number of observations.” • “Studies of statistical inference seek to characterize the generally weaker conclusions that can be drawn from a finite number of observations.”
Identification vs inference • Consider a survey of a random subset of 1,302 French individuals. • Identification: • Can you identify the average income in France? • Inference: • How close to the true average income is the mean income in the sample? • i.e. what is the confidence interval around the estimate of the average income in France?
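The inference question can be made concrete with a confidence interval around a sample mean. The sketch below uses a normal approximation; the simulated lognormal incomes and their parameters are hypothetical placeholders, not survey data, and the function name is an assumption.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Return (mean, lower, upper) using the normal approximation.

    The standard error is the sample standard deviation divided by sqrt(n).
    """
    n = len(sample)
    center = fmean(sample)
    std_error = stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for 95%
    return center, center - z * std_error, center + z * std_error


# Hypothetical incomes for 1,302 respondents (simulated, not real data).
rng = random.Random(0)
incomes = [rng.lognormvariate(10.0, 0.5) for _ in range(1302)]
mean_income, lower, upper = mean_confidence_interval(incomes)
print(f"mean: {mean_income:.0f}, 95% CI: [{lower:.0f}, {upper:.0f}]")
```

The interval shrinks at rate 1/sqrt(n): random sampling identifies the average income, and the sample size governs how precisely it is estimated.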
Identification vs inference • Consider a lab experiment with 9 rats, randomly assigned to a treatment group and a control group. • Identification: • Can you identify the effect of the medication on the rats using the random assignment? • Inference: • With 9 rats, can you say anything about the effectiveness of the medication?
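With so few rats, one option is an exact randomization (permutation) test, which enumerates every possible assignment instead of relying on large-sample approximations. The sketch below is an illustration only: the outcome values and the split of 4 treated and 5 control rats are hypothetical, and the function name is an assumption.

```python
from itertools import combinations
from statistics import fmean


def permutation_p_value(outcomes, treated_idx):
    """Exact two-sided randomization p-value for a difference in means.

    Enumerates all ways of choosing len(treated_idx) treated units and
    counts assignments whose absolute difference in means is at least as
    large as the observed one.
    """
    n = len(outcomes)
    k = len(treated_idx)

    def mean_difference(treated):
        treated_set = set(treated)
        treated_y = [outcomes[i] for i in range(n) if i in treated_set]
        control_y = [outcomes[i] for i in range(n) if i not in treated_set]
        return fmean(treated_y) - fmean(control_y)

    observed = abs(mean_difference(treated_idx))
    assignments = list(combinations(range(n), k))
    extreme = sum(
        1 for a in assignments if abs(mean_difference(a)) >= observed - 1e-12
    )
    return extreme / len(assignments)


# Hypothetical outcomes for 9 rats; rats 5 to 8 received the medication.
outcomes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
print(permutation_p_value(outcomes, (5, 6, 7, 8)))
```

With 4 treated rats out of 9 there are only 126 possible assignments, so the p-value can never fall below 1/126: random assignment secures identification, but the small sample limits what can be concluded.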
This session • This session has focused on identification. • i.e. I assume we have a potentially infinite dataset. • I focus on the conditions for the identification of the causal effect of a variable. • Next session: what problems appear because we have a limited number of observations?
Session 1 - Introduction Looking forward: Outline of the course
Outline of the course • Introduction: Identification • Introduction: Inference • Linear Regression • Identification Issues in Linear Regressions • Inference Issues in Linear Regressions
Identification in Simultaneous Equation Models • Instrumental variable (IV) estimation • Finding IVs: Identification strategies • Panel data analysis
Bootstrap • Generalized Method of Moments (GMM) • GMM: Dynamic Panel Data estimation • Maximum Likelihood (ML): Introduction • ML: Probit and Logit
ML: Heckman selection models • ML: Truncation and censoring + Exercise/Review session + Exam