Estimating Counterfactual Densities with DFL Ado-file: A Constructive Critique

This article discusses the estimation of counterfactual densities using the DFL Ado-file in STATA. It provides a constructive critique and demonstrates its application in predicting distributions in different subgroups.


Presentation Transcript


  1. 1st Stata Group Meeting, Mexico. Discussion of user-written Stata programs. Predicting counterfactual densities with the DFL ado-file: A pertinent constructive critique. Luis Huesca Reynoso, Centro de Investigación en Alimentación y Desarrollo, A.C., Department of Economics. Email: lhuesca@ciad.mx. April 23, 2009, Universidad Iberoamericana, Campus Mexico.

  2. It is not an easy task dealing with distributions (and so with densities!). Problems to face: Scale: log or numeric. Comparison: unit of measurement (in economics and the social sciences: constant prices, among others). Selection of the right window width (chosen by eye or the optimal one); see, for instance, bandw by Salgado-Ugarte, Shimizu and Taniuchi. Joint: compute them together (see, for instance, nbins or the number of grid points in akdensity). Stata makes it easier! Goal: the estimation of well-dimensioned kernel density functions and counterfactuals with a semiparametric technique, that is, estimating densities that recover the real shape not only of the total distribution but also of a number of subgroups belonging to it.

  3. Probability density function (PDF). Any function $f(y)$ can serve as a density function as long as $f(y) \ge 0$ and $\int f(y)\,dy = 1$. A general kernel function $K(u)$ used to weight the density must then satisfy $\int K(u)\,du = 1$, since then $\int \hat{f}(y)\,dy = 1$. • By definition, the PDF must integrate to one, and so must the Gaussian or any other well-behaved kernel function (Duclos, 2001; Silverman, 1986), for instance the Epanechnikov, biweight, triangular and cosine kernels.
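The unit-integral property can be checked numerically. The lines below are a minimal sketch, not part of the original slides: they assume the auto.dta example data shipped with Stata as a stand-in for earnings data, and the variable names x and fx are illustrative. Only the built-in commands kdensity and integ are used.

```stata
* Sketch: check numerically that a Gaussian kernel density estimate
* integrates to about one.
* Assumption: auto.dta (shipped with Stata) stands in for the earnings data.
sysuse auto, clear

* Evaluate the estimate on kdensity's default grid and store grid and density
kdensity price, kernel(gaussian) generate(x fx) nograph

* Numerical integral of fx over x (trapezoidal rule)
integ fx x, trapezoid
display "Integral of the estimated density: " %6.4f r(integral)
```

The displayed value should be close to one; any gap reflects the finite evaluation grid and the tail mass that lies outside it.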

  4. Kernel density estimation: letting the data speak for themselves as follows: $\hat{f}(y) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{y - y_i}{h}\right)$, with $y_i$ as a vector of earnings, $h$ the optimal window width and $K$ a Gaussian kernel function. Following Jenkins and Van Kerm (2005) for decompositions: • $f(y) = \sum_{k} v^k f^k(y)$, a weighted sum of the PDFs of each sub-group $k$, where $v^k$ stands for the population share of group $k$ and $f^k(y)$ for the PDF of group $k$. • In the empirical example an adaptive kernel estimator is used (Van Kerm, 2003).
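The subgroup decomposition of Jenkins and Van Kerm (2005), in which the total density is the sum of subgroup densities weighted by population shares, can be illustrated with built-in commands. This is only a sketch under stated assumptions: nlsw88.dta and its collgrad split stand in for the ENEU data and its sectors, and every variable and macro name is hypothetical. With a common grid and a common fixed bandwidth, the pooled kernel estimate and the share-weighted sum should agree up to rounding.

```stata
* Sketch: pooled kernel density versus share-weighted sum of subgroup densities.
* Assumptions: nlsw88.dta and the collgrad split are stand-ins for the ENEU
* data and its formal/informal groups; all names below are illustrative.
sysuse nlsw88, clear
keep if !missing(wage, collgrad)
generate double lw = ln(wage)

* Pooled density; keep its grid and bandwidth for the subgroups
kdensity lw, kernel(gaussian) generate(grid f_all) nograph
local h = r(bwidth)

* Subgroup densities on the same grid with the same bandwidth
kdensity lw if collgrad == 0, kernel(gaussian) bwidth(`h') at(grid) generate(f0) nograph
kdensity lw if collgrad == 1, kernel(gaussian) bwidth(`h') at(grid) generate(f1) nograph

* Population share of group 1
summarize collgrad, meanonly
local v1 = r(mean)

* Share-weighted sum and its gap from the pooled estimate
generate double f_sum = (1 - `v1') * f0 + `v1' * f1
generate double gap = abs(f_all - f_sum)
summarize gap
```

Plotting `v1'*f1 rather than f1 against f_all keeps the subgroup curve under the total, which is the rescaling the do-file on the Syntax slide applies by hand.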

  5. DiNardo, Fortin and Lemieux (1996). Counterfactual estimation compares the distribution of the objective variable (depvar) with the depvar distribution that would have prevailed if the group had been paid like the comparison group (the counterpart). Figure: actual wage distributions for A and B (actual and counterfactual). DFL (1996) rewrite the counterfactual density for B as a reweighted version of an actual density. In Stata: w = 1-Prob(Depvar=1)/Prob(Depvar=0), which can be computed using Bayes' theorem. • The conditional treatment probability (the propensity score) is estimated by the program under a specification using a logistic regression (the DFL command can switch to probit as well). For comparison I use the pscore ado-file written by Becker & Ichino (2002), which follows the nearest-neighbour technique.
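The reweighting idea can be sketched with built-in commands only. This is not the code inside the dfl ado-file; it uses the standard DFL identity, in which the weight for a comparison-group observation is $\psi(x) = \frac{P(D=1 \mid x)}{P(D=0 \mid x)} \cdot \frac{P(D=0)}{P(D=1)}$, obtained from Bayes' theorem. The data set (nlsw88.dta), the union indicator playing the role of the group dummy, the covariates, and all new variable names are assumptions made for illustration.

```stata
* Sketch of DFL-style reweighting with built-in commands (not the dfl ado-file).
* Assumptions: nlsw88.dta, union as the group dummy, and the covariate list
* are stand-ins; psi, f_cf, etc. are illustrative names.
sysuse nlsw88, clear
keep if !missing(wage, union, grade, ttl_exp, tenure, south)
generate double lw = ln(wage)

* Propensity score: P(union = 1 | x)
logit union grade ttl_exp tenure south
predict double p, pr

* Unconditional share P(union = 1)
summarize union, meanonly
local pbar = r(mean)

* Weight that gives non-union workers the covariate mix of union workers
generate double psi = (p / (1 - p)) * ((1 - `pbar') / `pbar') if union == 0

* Actual and counterfactual densities on a common grid
kdensity lw, generate(grid f_pool) nograph
kdensity lw if union == 1, at(grid) generate(f_union) nograph
kdensity lw if union == 0, at(grid) generate(f_nonunion) nograph
kdensity lw if union == 0 [aweight = psi], at(grid) generate(f_cf) nograph

line f_union f_nonunion f_cf grid, sort ///
    legend(order(1 "Union, actual" 2 "Non-union, actual" 3 "Union, paid like non-union")) ///
    ytitle("Density") xtitle("Log hourly wage")
```

Because kdensity normalizes analytic weights, the constant factor (1 - pbar)/pbar does not change f_cf; it is kept only so that psi matches the formula in the lead-in.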

  6. Empirical case (a semi-parametric approach). Estimation of the Mexican earnings distribution and decompositions by sub-population of workers in the formal and informal sectors (defined by compliance with social security coverage). (Let us assume that self-selection bias does not affect workers' individual location decisions.) Models are estimated separately for each category. Logit has a practical advantage over probit, since the sum of predicted values equals the sum of the empirically observed values (Butcher and DiNardo, 1998). ENEU: Encuesta Nacional de Empleo Urbano (National Survey of Urban Employment). Males aged 16 to 65. Occupations = (1, …, 4): 1: Formal self-employed; 2: Informal self-employed; 3: Formal wage-earners; 4: Informal wage-earners. Model 1 pooled; Model 2 pooled.
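The logit property cited from Butcher and DiNardo (1998) is easy to verify: in a logit with a constant, the mean of the fitted probabilities equals the mean of the observed outcome on the estimation sample, which does not hold exactly for probit. The sketch below assumes nlsw88.dta and an arbitrary covariate list as stand-ins for the ENEU specification; p_logit and p_probit are illustrative names.

```stata
* Sketch: compare the mean of fitted probabilities with the observed mean.
* Assumptions: nlsw88.dta and these covariates are stand-ins for the
* specification used in the slides.
sysuse nlsw88, clear

logit union grade ttl_exp tenure
predict double p_logit if e(sample), pr

probit union grade ttl_exp tenure
predict double p_probit if e(sample), pr

* The mean of p_logit should match the mean of union; p_probit only approximately
summarize union p_logit p_probit if e(sample)
```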

  7. Syntax
     1. Compute the earnings distribution using the DFL command.
        dfl depvar indepvars [if exp] [in range], outcome(varname) [nbins(integer) w(bandwidth) adaptive gauss quietly probit [logit default] graph(cfactual) graph_combine axis_selection_options axis_scale_options title_options]
        dfl informal esceda eda2 jefe dmiembros dwmenor drama1 drama3 ///
            drama4 dregion1 dregion2 dregion3 dregion4 dregion6 ///
            if sex==1 & logitp>=1 & logitp<=2, outcome(logwm) nbins(50) ///
            adaptive gauss graph(cfactual)
     2. Compute the earnings distribution using a do-file (example with my do-file).
        pscore informalb esceda eda2 jefe dmiembros dwmenor drama1 drama3 drama4 ///
            dregion1 dregion2 dregion3 dregion4 dregion6 if sex==1 & logitp>=1 & logitp<=2, ///
            pscore(mypscore) logit level(0.001)
        akdensity logwm if sex==1 & logitp==4 [aw = mypscore], gau s(i) ///
            gen(hai92c dhai92c)
        label var dhai92c "Informal wage-earner"
        replace dhai92c = dhai92c*.24

  8. Figure 1.

  9. Figure 2. Wage-earners in Mexico working in a formal world, 1992. Panels: DFL command; do-file, rescaled.

  10. Figure 2a. Wage-earners in Mexico working in a formal world, 1992. Panel: do-file, rescaled, adjusting ranges.

  11. Figure 3. Self-employed in Mexico working in a formal world, 1992. Panels: DFL command; do-file, rescaled.

  12. Figure 3a. Self-employed in Mexico working in a formal world, 1992. Panel: do-file, rescaled, adjusting ranges.

  13. Conclusions: The user-written DFL command is useful; just watch out when using sub-groups or log scales. DFL (1996) use the subgroup decomposability property of the aggregate PDF. A suggestion: when computing densities, weight them by population shares if necessary. The problem of obtaining over-dimensioned densities is most severe when dealing with logarithmic scales for the data. For kernel densities, estimation with the adaptive technique is more time-consuming but also seems to be more accurate (it avoids smoothing more than needed). Adaptive kernel estimation depicts bimodal or multimodal distributions better.
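One way to read the advice on population shares is through the decomposability property itself. The short derivation below is a sketch in assumed notation ($v^k$ for the population share and $f^k$ for the density of subgroup $k$; neither symbol is confirmed by the transcript): a subgroup density scaled by its share can never rise above the total density, whereas the unscaled subgroup density integrates to one on its own and can.

```latex
f(y) = \sum_{k=1}^{K} v^k f^k(y), \qquad v^k \ge 0, \qquad \sum_{k=1}^{K} v^k = 1
\;\Longrightarrow\;
v^k f^k(y) \le f(y) \ \text{for all } y, \qquad \int v^k f^k(y)\,dy = v^k,
\qquad \text{while} \quad \int f^k(y)\,dy = 1 .
```

If the factor .24 in the do-file is read as a population share (an assumption), the rescaled curve encloses an area of 0.24 and sits under the total density at every point.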

  14. References
     Azevedo, Joao Pedro (2005), "DiNardo, Fortin and Lemieux Counterfactual Kernel Density" (DFL user-written command).
     Becker, Sascha O., and Andrea Ichino (2002), "Estimation of average treatment effects based on propensity scores", The Stata Journal, 2(4), 358-377.
     Butcher, K. F., and John DiNardo (1998), "The immigrant and native-born wage distributions: Evidence from United States census", NBER Working Paper No. 6630.
     DiNardo, John, Nicole Fortin, and Thomas Lemieux (1996), "Labor Market Institutions and the Distribution of Wages, 1973-1992: A semi-parametric approach", Econometrica, 64(5), 1001-44.
     Duclos, Jean-Yves (2001), "Non-parametric estimation for distributive analysis", Poverty and Equity: Theory and Estimation, Departament d'Economia Aplicada, Universitat Autònoma de Barcelona, mimeo, March, 37-44.
     Heckman, James, H. Ichimura, and P. E. Todd (1998), "Matching as an Econometric Evaluation Estimator", Review of Economic Studies, 65, 261-294.
     Huesca, Luis, and Mario Camberos (2009), "El mercado laboral mexicano 1992 y 2002: Un análisis contrafactual de los cambios en la informalidad", Economía Mexicana, Vol. XVIII, Núm. 1, primer semestre, pp. 5-43.
     INEGI (2006), Encuesta Nacional de Empleo Urbano, 1992 and 2002, ENEU, INEGI, Ags., México, Bases de datos.
     Jenkins, Stephen, and Philippe Van Kerm (2005), "Accounting for income distribution trends: A density function decomposition approach", Journal of Economic Inequality, 3, pp. 43-62.
     Silverman, B. W. (1986), Density Estimation for Statistics and Data Analysis, Chapman and Hall, London.
     Van Kerm, Philippe (2003), "Adaptive kernel density estimation" (akdensity), The Stata Journal, 3(2), 148-56.
