QE1 Review Session STATISTICS Giovanni Oppenheim April 29, 2004 oppy@princeton.edu
QE1 Review Session STATISTICS Based on California Electricity Market QE1 2001
Structure of the session • Detailed analysis of stats questions in QE1 2001 • General overview of 507 material • Stats Q&A • Summary analysis of Economics and Politics/Psychology questions in QE1 2001
Stats questions in QE1 2001 • Q4: One of Gee’s main points is that if retail prices are allowed to increase when electricity is scarce then Californians will demand less energy. To support his point he conducted a statistical analysis, the results of which are in his memo. It’s been a while since I was in school and I could use your help in understanding what he did. (a) First, would you please interpret the results from the regression for me. In English, what do the results tell us? Is there evidence of a strong relationship between prices and energy consumption?
Q4 in QE1 2001 • Regression
log(electricity consumption per capita) = 4.173 - 0.899 log(cents per kWh)
Standard errors: (0.182) for the intercept, (0.096) for the coefficient on log price
Number of observations: 50
R² = 0.641
Q4 in QE1 2001 • 2 points to be answered: • 1. What do the results tell us? • 2. Is there evidence of a strong relationship between prices and energy consumption? • Follow Laity’s handout: • How to do statistical analysis on the QE1 • Posted as “Hints” on the QE Information Web page: http://www.wws.princeton.edu/~grad/qeinfo/
Q4 in QE1 2001 • What do the results tell us? • 1. Units of the variables determine the “meaning” of the relationship. Here both Y and X are in logs, so the coefficient on price is an elasticity. • The sign of the coefficient determines the direction of the relationship: here it is negative.
log(electricity consumption per capita) = 4.173 - 0.899 log(cents per kWh), with standard errors (0.182) and (0.096)
• A 1% increase in the price of electricity (measured in cents per kWh) is associated on average with an estimated 0.9% decrease in the consumption of energy (measured, presumably, in kWh per year)
Q4 in QE1 2001 • Try to picture the relationship (be careful: units in Green memo are NOT logs) • Regressions always show average associations. • Always ask yourself: “What is an observation?” Here the observations are the 50 states, so the association shown is across states in 1999. • “Association” means “correlation”, not “causation”.
Q4 in QE1 2001 • Is there evidence of a strong relationship between prices and energy consumption? • Strong means both “statistically strong” and “economically strong”. • Statistically: • For every single independent variable: could a different sample give me a “very different” relationship (say positive)? • t-stat = coefficient / (standard error) • Here: |t-stat| = |-0.899 / 0.096| = 9.36 > 1.96 • The estimated coefficient on price is statistically significant at the 5% level.
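The t-stat check can be reproduced in a few lines. This is a minimal sketch that uses only the coefficient and standard error reported on the regression slide; the 1.96 critical value is the large-sample figure used in this session, and the confidence interval is an extra step not shown on the slides.

```python
# Values taken from the Q4 regression output.
coef = -0.899   # coefficient on log(cents per kWh)
se = 0.096      # standard error of that coefficient
z_crit = 1.96   # 5% two-sided critical value (large-sample approximation)

t_stat = coef / se

# 95% confidence interval for the elasticity (added for illustration).
ci_low = coef - z_crit * se
ci_high = coef + z_crit * se

print(f"t-stat = {t_stat:.2f}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print("significant at 5%:", abs(t_stat) > z_crit)
```

The interval excludes zero, which is the same conclusion as |t-stat| > 1.96. It also contains -1, so these data alone do not rule out a unit-elastic relationship.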
Q4 in QE1 2001 • Is there evidence of a strong relationship between prices and energy consumption? • Strong means both “statistically strong” and “economically strong”. • Statistically: • For all independent variables as a whole: are my variables explaining a good portion of variation in the dependent variable? • R² = 0.641: price alone accounts for about 64% of the variation in log consumption per capita, which is very high.
Q4 in QE1 2001 • Is there evidence of a strong relationship between prices and energy consumption? • Strong means both “statistically strong” and “economically strong”. • Economically: • The elasticity of -0.9 identifies an almost unit elastic relationship.
Q4 in QE1 2001 • Q4: One of Gee’s main points is that if retail prices are allowed to increase when electricity is scarce then Californians will demand less energy. To support his point he conducted a statistical analysis, the results of which are in his memo. It’s been a while since I was in school and I could use your help in understanding what he did. (b) Second, Gee asserts (quite confidently, I might add) that a 25% increase in electricity prices would have reduced demand by 22.5%. Please, walk me through his calculation. How did he reach such a conclusion?
Q4 in QE1 2001 • Elasticity = -0.9 means that if prices go up by 1%, consumption goes down by 0.9%. • If prices go up by (25)∙(1%), consumption goes down by (25)∙(0.9%), or 22.5%.
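Gee's figure scales the 1% interpretation linearly. As a hedged side check, the sketch below compares that shortcut with what a constant-elasticity (log-log) model implies exactly for a 25% price change. The variable names are illustrative, and the exact formula is a standard property of log-log models rather than something stated in the memo.

```python
elasticity = -0.9       # rounded coefficient from the log-log regression
price_increase = 0.25   # the 25% price increase in Gee's claim

# Gee's shortcut: multiply the elasticity by the percentage change in price.
approx_change = elasticity * price_increase

# Exact implication of a constant-elasticity model:
# new_consumption / old_consumption = (new_price / old_price) ** elasticity
exact_change = (1 + price_increase) ** elasticity - 1

print(f"linear shortcut: {approx_change:.1%}")
print(f"log-log exact:   {exact_change:.1%}")
```

The shortcut reproduces Gee's 22.5%, while the exact calculation gives a reduction of roughly 18%. The "1% leads to 0.9%" reading is a small-change approximation, so it overstates the effect for a change as large as 25%.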
Q4 in QE1 2001 • Q4: One of Gee’s main points is that if retail prices are allowed to increase when electricity is scarce then Californians will demand less energy. To support his point he conducted a statistical analysis, the results of which are in his memo. It’s been a while since I was in school and I could use your help in understanding what he did. (c) Finally, given the sensitivity of the situation I would hate to predict publicly that we could reduce electricity consumption by 22.5% by allowing prices to increase by 25% and then be proven wrong. Should I be concerned about making such a pronouncement based on Gee’s analysis? Please explain to me why or why not.
Q4 in QE1 2001 • General points about validity of regression • Regression shows correlation, not causation • The prediction is unwarranted because we cannot interpret the estimated relationship as causal. • Causation can be clearly identified in an experimental setting. • An alternative, in regression analysis, is to find variables that are exogenous to the relationship being estimated and use them as “instruments” for the causal variable. • If you want to identify a demand curve, you must instrument price using factors that affect only supply.
Q4 in QE1 2001 • General points about validity of regression • Omitted variable(s) bias Is the coefficient really measuring the relationship between the two variables, or is it picking up (even partially) the effect of some omitted variable? Example: unanticipated weather conditions.
Q4 in QE1 2001 • General points about validity of prediction • Are you trying to predict “out-of-sample”? Look at the graph: California is already a very low consumer of electricity, so increasing prices by 25% will bring you to the margins of your sample.
Q4 in QE1 2001 • General points about validity of prediction • Are you using the right sample? Observations here are 50 states at one point in time. You want predictions for CA over time. Are these the right data to look at?
Q4 in QE1 2001 • Questions on Q4?
Stats questions in QE1 2001 • Q6: Gee is absolutely right that we have to keep our eye on public opinion. However, it seems to me that Gee has it backwards regarding what the polling data are telling us. Specifically, Gee writes that the data suggest that households would rather face higher prices and reliable electricity than face lower prices but have unreliable electricity. Don’t the data actually indicate that households in SD are less satisfied with my policies towards electricity than are those in SF? Could you please explain Gee’s reasoning? Do you agree with his approach and his conclusions?
Q6 in QE1 2001 • This is a mixed question (politics/stats). This year (like last year) questions will be identified. • Look at the table in the Green memo.
Q6 in QE1 2001 • What distinguished SD from SF in 2000? In SD electricity retail prices were market-driven (SDG&E paid off its “stranded costs” in mid-1999). In SF retail prices were regulated (PG&E had not paid off its “stranded costs”). • What distinguishes mid-March 2000 from mid-August 2000? The energy crisis started in May 2000!
Q6 in QE1 2001 • General approach to policy change: “difference-in-difference” • Compare outcomes for “treatment group” to outcomes for “control group”, pre- and post- policy intervention • Outcome: approval ratings. • Pre-: March 2000. Post-: August 2000. • Treatment group: SD. Control group: SF. • When does it work? When there are no other “differential” changes in the same period across groups
Q6 in QE1 2001 • What does Gee say? “Clearly, these data indicate that households would rather face higher prices but reliable electricity (SD) than have lower (fixed) prices but unreliable electricity (SF). This can be seen by comparing the change in approval ratings before and after the start of the electricity crisis in SD to the change in SF.”
Q6 in QE1 2001 • What was the change in SD? From 40% to 35%, a drop of 5 percentage points. • What was the change in SF? From 60% to 50%, a drop of 10 percentage points. • What is Gee implying when he compares these two changes?
Q6 in QE1 2001 • Are the two changes (-0.05 in SD and -0.10 in SF) statistically different? Start from the variance of each sample proportion, p(1-p)/n:
• Var(pre, SD) = (0.40)(1 - 0.40)/200 = 0.0012
• Var(post, SD) = (0.35)(1 - 0.35)/100 = 0.0023
• Var(pre, SF) = (0.60)(1 - 0.60)/150 = 0.0016
• Var(post, SF) = (0.50)(1 - 0.50)/150 = 0.0017
Q6 in QE1 2001 • Are the two changes (-0.05 in SD and -0.10 in SF) statistically different? Treating the four polls as independent samples, the variances add:
• Var(ΔSD) = Var(post, SD) + Var(pre, SD) = 0.0035
• Var(ΔSF) = Var(post, SF) + Var(pre, SF) = 0.0033
• Your aim is to determine the SE of the difference-in-difference: Var(ΔSD - ΔSF) = Var(ΔSD) + Var(ΔSF) = 0.0068, so SE = sqrt(0.0068) = 0.0825
Q6 in QE1 2001 • Are the two changes (-0.05 in SD and -0.10 in SF) statistically different? • Test the null hypothesis that ΔSD - ΔSF = 0
Z-score = [(-0.05) - (-0.10) - 0] / sqrt(0.0068) = 0.05 / 0.0825 = 0.606
• In absolute value it is lower than the critical value at 5% (1.96), so you cannot reject the null hypothesis.
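The whole difference-in-difference test can be run end to end as a check on the arithmetic. This is a minimal sketch using the approval rates and sample sizes quoted on the slides; the helper names (`prop_var`, `change_and_var`) are illustrative, and the four polls are assumed to be independent samples.

```python
import math

def prop_var(p, n):
    """Variance of a sample proportion p estimated from n respondents."""
    return p * (1 - p) / n

# (approval rate, sample size) before and after the start of the crisis.
polls = {
    "SD": {"pre": (0.40, 200), "post": (0.35, 100)},
    "SF": {"pre": (0.60, 150), "post": (0.50, 150)},
}

def change_and_var(city):
    """Return (post - pre change, variance of that change) for one city."""
    (p0, n0), (p1, n1) = polls[city]["pre"], polls[city]["post"]
    return p1 - p0, prop_var(p0, n0) + prop_var(p1, n1)

d_sd, v_sd = change_and_var("SD")
d_sf, v_sf = change_and_var("SF")

did = d_sd - d_sf                # difference-in-difference estimate
se_did = math.sqrt(v_sd + v_sf)  # variances add under independence
z = did / se_did

print(f"diff-in-diff = {did:.3f}, SE = {se_did:.4f}, z = {z:.2f}")
print("reject H0 at 5%:", abs(z) > 1.96)
```

Without rounding the intermediate variances, the z-score comes out at about 0.61 rather than 0.606. The conclusion is unchanged: the null hypothesis cannot be rejected.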
Q6 in QE1 2001 • Conclusion? • Gee’s approach is DIFF-IN-DIFF, and it is in general valid, but there could have been other differential changes between SD and SF in Spring 2000 besides the energy crisis. • Even if there were no other changes, Gee’s conclusion is wrong: the diff-in-diff estimate is not significantly different from zero.
Q6 in QE1 2001 • Questions on Q6?
General overview of 507 material • Probability: Conditional Probability and Bayes’ Theorem • Update of prior beliefs
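As a reminder of how a prior is updated, here is a minimal sketch of Bayes' Theorem for a hypothesis H and a piece of evidence E. The function name and all numbers are hypothetical, chosen only for illustration; they do not come from QE1 2001.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) from the prior P(H) and the two likelihoods."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1 - prior)  # P(E)
    return numerator / evidence

# Hypothetical inputs: P(H) = 0.30, P(E | H) = 0.80, P(E | not H) = 0.20.
updated = posterior(prior=0.30, p_e_given_h=0.80, p_e_given_not_h=0.20)
print(f"posterior P(H | E) = {updated:.3f}")
```

With these made-up inputs, observing E raises the belief in H from 0.30 to about 0.63.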
General overview of 507 material • Estimation of means: Sample mean and SE • Example: approval rate for Gov. Davis. (Binary variable)
General overview of 507 material • Confidence intervals and hypothesis testing • 2-sided tests v. 1-sided test • Confidence intervals & 2-sided tests • Example: Approval rate for Gov. Davis
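The approval-rate example can be made concrete with the March 2000 SD poll from Q6 (40% approval, 200 respondents). The sketch below builds a 95% confidence interval and reads a two-sided test off it. Taking 50% as the null value is an assumption made here for illustration.

```python
import math

p_hat = 0.40   # SD approval rate in March 2000 (from the Q6 table)
n = 200        # SD sample size in March 2000
z_crit = 1.96  # 5% two-sided critical value

se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
ci_low = p_hat - z_crit * se
ci_high = p_hat + z_crit * se

# A two-sided 5% test rejects exactly the null values outside the 95% CI.
null_value = 0.50  # hypothetical null: half of households approve
reject = not (ci_low <= null_value <= ci_high)

print(f"SE = {se:.4f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print("reject H0: p = 0.50 at 5%:", reject)
```

A 1-sided test at the same 5% level would instead use a critical value of about 1.645 and put all of the rejection region in one tail.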
General overview of 507 material • Difference in means between two samples • How to compute the standard error for the difference
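For two independent samples, the standard error of the difference is the square root of the sum of the two variances. As a sketch, the snippet below applies this to the March 2000 approval rates in SD and SF from Q6; treating the two polls as independent is an assumption.

```python
import math

# March 2000 polls from Q6: (approval rate, sample size).
p_sd, n_sd = 0.40, 200
p_sf, n_sf = 0.60, 150

diff = p_sd - p_sf
se_diff = math.sqrt(p_sd * (1 - p_sd) / n_sd + p_sf * (1 - p_sf) / n_sf)
z = diff / se_diff

print(f"difference = {diff:.2f}, SE = {se_diff:.4f}, z = {z:.2f}")
print("significant at 5%:", abs(z) > 1.96)
```

The pre-crisis gap in approval levels between SD and SF is statistically significant, even though the difference in the changes between the two cities is not.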
General overview of 507 material • Regression • Specifications: LIN-LIN, LIN-LOG, LOG-LOG, Polynomial • Dummy variables • Interaction terms • LPM
QE1 Review Session STATISTICS THANK YOU and GOOD LUCK!