
Methodological Workshop 1: Research Design

Methodological Workshop 1: Research Design. Yu Xie, University of Michigan. Otis Dudley Duncan: “But sociology is not like physics. Nothing but physics is like physics, because any understanding of the world that is like the physicist’s understanding becomes part of physics…”


Presentation Transcript


  1. Methodological Workshop 1: Research Design • Yu Xie, University of Michigan

  2. Otis Dudley Duncan • “But sociology is not like physics. Nothing but physics is like physics, because any understanding of the world that is like the physicist’s understanding becomes part of physics…” • (Otis Dudley Duncan. 1984. Notes on Social Measurement. p.169)

  3. First Principle of Social Science • Variability is the very essence of social science research. • “Variability Principle.” • We are interested in understanding how social outcomes vary across members in a human population and over time. • Mortality example.

  4. Second Principle • Social grouping reduces such variability. • “Social Grouping Principle.” • We seek to understand patterns of “between-group” variations in social outcomes. • Mortality example.

  5. Third Principle • Patterns of population variability vary with social context, which is often defined by time and space. • “Social Context Principle” • Patterns of between-group variations vary by social context. • Mortality example: is the education-mortality relationship reduced or eliminated through social policy?

  6. Different “Regimes” of Variability • Social contexts are different from social groups in that the former are self-contained social systems with natural boundaries, for example by time and space. • Patterns of individual variability may be governed by “relationships” between individuals that are not reducible to individuals’ attributes. • Patterns of individual variability may be governed by macro-level conditions such as “social structure,” “political structure,” or “culture,” which may be discontinuous and fixed. • Collective action may lead to changes in macro-level conditions and human relationships, which are major sources of social change.

  7. Population Thinking and Statistics • In typological thinking, deviations from the mean are nothing but “errors,” with the mean approaching the true cause. (Example: measurement of the speed of sound.) • In population thinking, deviations are the reality of substantive importance; the mean is a property of a population.

  8. Two Views of Regression • Gaussian View (Typological Thinking): • Observed Data = Constant Model + Measurement Error • Example: y_i = m + e_i, where m is a true constant. • Galtonian View (Population Thinking): • Observed Data = Systematic (between-group) Variability + Remaining (within-group) Variability • Example: y_i = m + e_i, where m = E(Y), the expected value of Y in the population.

  9. Potential Biases in Regression Analysis • Y_i = a + d_i D_i + e_i • There are two types of variability that may cause biases: • (1) Pre-treatment heterogeneity bias (through e_i): if corr(e, D) ≠ 0, there is pre-treatment heterogeneity bias. • (2) Treatment-effect heterogeneity bias (through d_i): if corr(d, D) ≠ 0, there is treatment-effect heterogeneity bias.
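The two biases can be illustrated with a small simulation of Y_i = a + d_i D_i + e_i. This is only a sketch: the sample size, the distributions of e and d, the average effect of 2.0, and the three assignment rules are assumptions chosen for illustration and do not come from the slides. It compares the naive treated-minus-untreated difference in means under random assignment, under selection on e, and under selection on d.

```python
import random
from statistics import mean

def naive_estimate(assign, n=20000, seed=0):
    """Difference in mean Y between treated and untreated units.

    assign(e, d, rng) returns the treatment indicator D (0 or 1) for a unit
    with pre-treatment disturbance e and individual treatment effect d.
    All numeric settings below are illustrative assumptions.
    """
    rng = random.Random(seed)
    a = 1.0
    treated, untreated = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)   # pre-treatment heterogeneity
        d = rng.gauss(2.0, 1.0)   # individual effects; the average effect is 2.0
        D = assign(e, d, rng)
        y = a + d * D + e
        (treated if D else untreated).append(y)
    return mean(treated) - mean(untreated)

# Random assignment: D is unrelated to both e and d.
randomized = naive_estimate(lambda e, d, rng: int(rng.random() < 0.5))
# corr(e, D) != 0: units with high e select into treatment.
pre_treatment = naive_estimate(lambda e, d, rng: int(e > 0))
# corr(d, D) != 0: units with large individual gains select into treatment.
effect_selection = naive_estimate(lambda e, d, rng: int(d > 2.0))

print(round(randomized, 2), round(pre_treatment, 2), round(effect_selection, 2))
```

Under these assumptions, only the randomized comparison should land near the average effect of 2.0; the other two overstate it, each through a different channel.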

  10. Comment • When the first form of heterogeneity bias is present, we may have a “spurious” causal effect. • “Omitted variable bias” • “Correlation does not equal causation.” • Example: [path diagram with nodes D, Y, U, and e]

  11. Comment • The second form of heterogeneity bias may result from rational “anticipatory behavior.” • Problem of “self-selection.” • Example: [path diagram with nodes D, Y, U, and e]

  12. Yu Xie’s “Fundamental Paradox in Social Science” • There is always variability at the individual level. • Causal inference is impossible at the individual level and thus always requires statistical analysis at the group level on the basis of some homogeneity assumption.

  13. Key Difficulties of a Research Design • (1) How do we know that results based on your “comparison” are valid?  • “Internal validity” • (2) How do we know that results based on your “comparison” hold true in other settings?  • “External validity”

  14. Research Design Possibilities • Social Experiments (Randomization) • Structural Approach • Multivariate Analysis (Social Grouping Principle) • Multi-level Analysis (Social Context Principle) • “Quasi-Experimental Designs” or “Natural Experiments”. • Instrumental Variables (Randomization) • Regression Discontinuity (Social Context Principle) • Utilizing Spatial Variation (Social Context Principle) • Utilizing Temporal Variation (Social Context Principle) • Clustering Design • Fixed Effects Model (Social Grouping Principle)

  15. Three Key Features of a Good Paper • The harmonious trio: Theory, Design, and Evidence. All need to be in place. • A good theoretical/conceptual framework -> research question. • A good research design -> matching empirical data to the research question. • Good data analysis -> results that address the research question. • Tight integration of the three.

  16. Why Focus on Small Topics? • Socratic method of inquiry in the western tradition. • True knowledge can stand harsh criticisms. • Many important, big questions are not researchable questions, such as value of life. • From small to big, accumulation of knowledge. • “Demographic tradition” under Duncan’s influence.

  17. Experimental Approach • Experimental design eliminates both forms of heterogeneity bias. • Example: High/Scope Perry Preschool study conducted in Ypsilanti. • Manski and Garfinkel (1992): experimental designs suffer from shortcomings that are often overlooked. • Manski and Garfinkel refer to the experimental approach as “reduced-form.”

  18. Shortcomings of Experimental Approach • We cannot always extrapolate results from an experimental setting to a natural setting. • Thus, Manski and Garfinkel openly criticize experimental designs: "In fact, reduced-form experimental evaluation actually requires that a highly specific and suspect structural assumption hold: Individuals and organizations must respond in the same way to the experimental version of a program as they would to the actual version." (p.17) • I.e., experimental results may lack “external validity.”

  19. Structural Approach • Manski and Garfinkel propose the "structural" approach as an alternative. • Definition: structural approach refers to statistical methods that model causal processes based on observational data. • Head Start example: control on SES, parental involvement, etc. • Requires strong social science theories.

  20. Comparison of the Two Approaches • Advantages of Structural Approach: • Since it is conducted in a natural setting, its findings are directly relevant to the whole population. In contrast, results from an experimental design need to be extrapolated. • It is less costly. In contrast, experimental research is very expensive. • It builds upon and contributes to theory. In contrast, the reduced-form approach only yields simple answers to simple questions.

  21. Advantages of Reduced-form Approach • Biases due to unobservables can be eliminated through randomization. • It requires fewer assumptions. • It does not require complicated statistical models that the public and government officials have difficulty understanding.

  22. Beyond the Variability Principle • Use of social grouping principle allows us to better understand group-specific properties, i.e., between-group analyses. • Useful as a descriptive tool. No assumption is needed. • Application of Galtonian regression: • Regression = E(Y|X), X denotes group
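As a descriptive illustration of Regression = E(Y|X), the sketch below computes group means and splits total variability into a between-group part and a within-group part. The three groups and their outcome values are hypothetical, made up purely for this example.

```python
from statistics import mean, pvariance

# Hypothetical outcome values for three hypothetical groups (not from the slides).
groups = {
    "A": [2.0, 3.0, 4.0],
    "B": [6.0, 7.0, 8.0],
    "C": [10.0, 11.0, 12.0],
}

all_y = [y for ys in groups.values() for y in ys]
n = len(all_y)
grand_mean = mean(all_y)

# Galtonian regression on a grouping variable: E(Y | X = g) is the group mean.
cond_mean = {g: mean(ys) for g, ys in groups.items()}

# Systematic (between-group) variability: spread of group means around the grand mean.
between = sum(len(ys) * (cond_mean[g] - grand_mean) ** 2 for g, ys in groups.items()) / n
# Remaining (within-group) variability: spread of individuals around their group mean.
within = sum((y - cond_mean[g]) ** 2 for g, ys in groups.items() for y in ys) / n
total = pvariance(all_y)

print(cond_mean)
print(between, within, total)
```

The between and within components add up to the total variance, and the within-group component is never larger than the total, which is the sense in which grouping reduces variability.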

  23. Using Social Grouping to Control for Heterogeneity • Social grouping always reduces variability => implies within-group homogeneity. • We may assume that meaningful heterogeneity and endogeneity can be captured by social grouping (still wishful thinking). • Assumptions (comment 5) are more plausible after social grouping than before.

  24. Multiple Regression • Change regression to: • Y_i = a + d D_i + b’X_i + e_i • Interpretation of d: • Treatment effect within levels of X, or controlling for X. • [Path diagram with nodes D, Y, X, and e]

  25. Comment • For X to do this, it needs to be correlated with D (“correlation condition,” c1) and to affect Y (“relevance condition,” c2). • X should be pre-treatment, determining both D and Y structurally. • [Path diagram with nodes D, Y, X, and e; paths labeled c1 and c2]
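A minimal sketch of what "treatment effect within levels of X" means, assuming a single binary pre-treatment X that satisfies both conditions. The coefficients (a = 1, d = 2, b = 3), the treatment probabilities, and the sample size are illustrative assumptions. Instead of fitting the regression, the code stratifies on X and compares treated and untreated units within each level, which serves the same purpose in this simple setting.

```python
import random
from statistics import mean

rng = random.Random(1)
a, d, b = 1.0, 2.0, 3.0          # assumed coefficients; d is the true treatment effect
rows = []
for _ in range(20000):
    x = int(rng.random() < 0.5)                    # pre-treatment covariate
    p_treat = 0.8 if x else 0.2                    # c1: X is correlated with D
    D = int(rng.random() < p_treat)
    y = a + d * D + b * x + rng.gauss(0.0, 1.0)    # c2: X affects Y
    rows.append((x, D, y))

def diff_in_means(data):
    """Mean Y of treated units minus mean Y of untreated units."""
    treated = [y for _, D, y in data if D == 1]
    control = [y for _, D, y in data if D == 0]
    return mean(treated) - mean(control)

naive = diff_in_means(rows)                        # ignores X
within = {x: diff_in_means([r for r in rows if r[0] == x]) for x in (0, 1)}
adjusted = mean(list(within.values()))             # average of within-level contrasts

print(round(naive, 2), round(adjusted, 2))
```

With these settings the naive contrast is pushed well above 2 because X raises both the chance of treatment and the outcome, while the within-level contrasts stay close to the assumed effect.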

  26. Examples: Quasi-Experiment Design Utilizing Spatial Variation • Certain policies are introduced in State A but not in State B. • States A and B are otherwise comparable. • Observe how outcome Y differs between State A and State B. • Pace of economic reforms in China differs greatly by region. • Associate regional variation in returns to education with regional variation in depth of economic reforms.

  27. Examples: Quasi-Experiment Design Utilizing Temporal Variation • Declining significance of race? • Examine temporal changes in SES differences by race • Hope to see a narrowing of racial gaps, particularly after the civil rights movement. • Effect of a new instructional method:

  28. INSTRUMENTAL VARIABLES • WHAT ARE INSTRUMENTS? • Intuitively, instruments are variables that move around the probability of participation but do not affect outcomes other than through their effect on participation. • Put more statistically, instruments are variables that are correlated with the endogenous variable – in this context the treatment indicator – but not correlated with the unobservable in the outcome equation.

  29. Instrumental-Variable Approach • Condition: the instrument Z affects Y only through X, meaning: • Z is correlated with Y but does not affect Y directly (called the “exclusion restriction”). • Z is also correlated with X but not perfectly. • It’s very hard to find a good Z. • [Path diagram with nodes Y, X, Z, and U]

  30. WHERE DO INSTRUMENTS COME FROM? • Theory combined with clever data collection • Ex: Lottery number of military enlistment (Angrist 1990) • Ex: distance as in Card (1995)

  31. COMMON EFFECT IV EXAMPLE I • A training center serves two towns: the near town and the far town. • The impact of training on those who take it is 10, while the outcome in the absence of training is 100. • For those in the near town, the cost is zero for everyone. In the far town, for those with a car the cost is essentially zero; for those without one the cost is 10. • Assume that a random half of the eligible persons have a car and that there are 200 eligible persons in each town. • Assume also that everyone knows their cost of training and their benefits from training, and participates only when the benefits exceed the costs.

  32. COMMON EFFECT IV EXAMPLE II • Let Z = 1 denote residence in the near town and Z = 0 denote residence in the far town. • Using our standard notation: • Pr(D=1|Z=1) = 1.0 • Pr(D=1|Z=0) = 0.5 • E(Y|Z=1) = Y_C + d Pr(D=1|Z=1) = 100 + 10*1.0 = 110 • E(Y|Z=0) = Y_C + d Pr(D=1|Z=0) = 100 + 10*0.5 = 105

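  33. COMMON EFFECT IV EXAMPLE – III • The IV estimator in this simple case is given by: d_IV = [E(Y|Z=1) - E(Y|Z=0)] / [Pr(D=1|Z=1) - Pr(D=1|Z=0)] • Inserting the numbers from the example into the formula gives: d_IV = (110 - 105) / (1.0 - 0.5) = 10, which recovers the common training impact of 10.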
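The two-town example can be checked numerically. The sketch below builds the 400 eligible persons, applies the participation rule (participate only when benefits exceed costs), and computes the ratio of the difference in mean outcomes to the difference in participation rates between towns. One simplification is an assumption of this sketch: car ownership is assigned to exactly the first half of each town rather than drawn at random. The function and variable names are hypothetical.

```python
from statistics import mean

def build_town(z, n=200):
    """Return (Z, D, Y) records for one town under the example's assumptions."""
    people = []
    for i in range(n):
        has_car = i < n // 2                      # exactly half own a car (simplification)
        cost = 0 if (z == 1 or has_car) else 10   # only carless far-town residents pay 10
        benefit = 10                              # common impact of training
        d = 1 if benefit > cost else 0            # participate only if benefit exceeds cost
        y = 100 + 10 * d                          # outcome without training is 100
        people.append((z, d, y))
    return people

population = build_town(1) + build_town(0)        # Z = 1 near town, Z = 0 far town

def cond_mean(index, z):
    """Mean of field `index` (1 = D, 2 = Y) among residents with Z = z."""
    return mean(rec[index] for rec in population if rec[0] == z)

# Ratio of the outcome contrast to the participation contrast across towns.
iv_estimate = (cond_mean(2, 1) - cond_mean(2, 0)) / (cond_mean(1, 1) - cond_mean(1, 0))
print(cond_mean(2, 1), cond_mean(2, 0), cond_mean(1, 1), cond_mean(1, 0), iv_estimate)
```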

  34. A CONTINUOUS INSTRUMENT IN A COMMON EFFECT WORLD • The two-stage least squares estimator is commonly used in this case. • In the first stage, the endogenous variable (i.e., the treatment indicator) is regressed on all the exogenous variables, including the instrument. • The second-stage outcome equation regression then includes the predicted value of the endogenous variable rather than the endogenous variable itself. • Standard errors must be corrected to account for the first-stage estimation. Most software packages now do this for you (the ivreg command in Stata, superseded by ivregress in Stata 10 and later).
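The two stages can be sketched by hand for the simplest case of one continuous instrument and no other exogenous variables. Everything about the data-generating process below (the uniform instrument, the unobservable u, the common effect of 10, the sample size) is an assumption made for illustration. The helper `ols` is a hypothetical name for a plain one-regressor least-squares fit.

```python
import random
from statistics import mean

def ols(x, y):
    """Intercept and slope from a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(2)
true_effect = 10.0                       # assumed common treatment effect
Z, D, Y = [], [], []
for _ in range(20000):
    z = rng.uniform(0.0, 1.0)            # continuous instrument
    u = rng.gauss(0.0, 1.0)              # unobservable that drives both D and Y
    d = int(z + 0.5 * u > 0.5)           # participation rises with z and with u
    y = 100.0 + true_effect * d + 5.0 * u + rng.gauss(0.0, 1.0)
    Z.append(z)
    D.append(d)
    Y.append(y)

# First stage: regress the treatment indicator on the instrument.
a1, b1 = ols(Z, D)
D_hat = [a1 + b1 * z for z in Z]

# Second stage: regress the outcome on the predicted treatment.
_, tsls = ols(D_hat, Y)

# For comparison: regressing Y directly on D ignores the endogeneity.
_, naive = ols(D, Y)

print(round(tsls, 2), round(naive, 2))
```

The second-stage slope is the point estimate only. The standard errors from this manual second stage would be wrong for the reason given on the slide, which is why packaged routines are preferable in practice.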

  35. A Complication: When Treatment Effects are Heterogeneous • The IV estimator then estimates the Local Average Treatment Effect (LATE): the average treatment effect for those persons whose treatment status is affected by random assignment. • Also called the “principal stratification approach” (Angrist, Imbens, and Rubin 1996; Little and Yau 1998).

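  36. Classification of Compliance Status • Rows: assignment R (0 = control, 1 = treatment); columns: treatment received T (0 or 1). • R = 0, T = 0: compliers or never-takers. • R = 0, T = 1: defiers or always-takers. • R = 1, T = 0: defiers or never-takers. • R = 1, T = 1: compliers or always-takers.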
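A small simulation can show which group the IV estimate speaks to when effects differ by compliance type. The shares of each type, the type-specific effects, and the absence of defiers are all assumptions of this sketch, not values from the slides. The estimate is the ratio of the assignment effect on Y to the assignment effect on T.

```python
import random
from statistics import mean

rng = random.Random(3)

# Hypothetical shares and type-specific effects; defiers are assumed absent.
names = ["complier", "always-taker", "never-taker"]
shares = [0.5, 0.2, 0.3]
effect = {"complier": 5.0, "always-taker": 8.0, "never-taker": 0.0}

records = []
for _ in range(20000):
    kind = rng.choices(names, weights=shares)[0]
    R = int(rng.random() < 0.5)          # random assignment (1 = treatment arm)
    if kind == "complier":
        T = R                            # takes treatment only if assigned to it
    elif kind == "always-taker":
        T = 1                            # takes treatment regardless of assignment
    else:
        T = 0                            # never takes treatment
    Y = 100.0 + effect[kind] * T + rng.gauss(0.0, 1.0)
    records.append((R, T, Y))

def mean_by_assignment(index, r):
    """Mean of field `index` (1 = T, 2 = Y) among units assigned R = r."""
    return mean(rec[index] for rec in records if rec[0] == r)

itt_y = mean_by_assignment(2, 1) - mean_by_assignment(2, 0)   # effect of assignment on Y
itt_t = mean_by_assignment(1, 1) - mean_by_assignment(1, 0)   # effect of assignment on T
late = itt_y / itt_t

print(round(itt_t, 3), round(late, 2))
```

Under these assumptions the ratio should sit near the compliers' effect of 5.0 rather than the always-takers' 8.0, and the difference in take-up between arms should sit near the complier share of 0.5.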

  37. References • Angrist, Joshua. 1990. “Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records.” American Economic Review 80: 313-36. • Angrist, J. D., G. W. Imbens, and D. B. Rubin. 1996. “Identification of Causal Effects Using Instrumental Variables.” Journal of the American Statistical Association 91(434): 444-455. • Card, David. 1995. “Using Geographic Variation in College Proximity to Estimate the Return to Schooling.” Pp. 201-222 in Aspects of Labour Market Behavior: Essays in Honour of John Vanderkamp, edited by Louis Christofides, E. Kenneth Grant, and Robert Swidinsky. Toronto: University of Toronto Press. • Little, Roderick J., and Linda H. Y. Yau. 1998. “Statistical Techniques for Analyzing Data from Prevention Trials: Treatment of No-shows Using Rubin's Causal Model.” Psychological Methods 3(2): 147-159. • Manski, C. F., and I. Garfinkel. 1992. “Introduction.” Pp. 1-21 in Evaluating Welfare and Training Programs, edited by Charles F. Manski and Irwin Garfinkel. Cambridge, MA: Harvard University Press.
