H615 Fall 2013 – Week 1
• Introduction
• Cause and Effect
  • Dan and Alysia (slides 1-7)
  • Confounding, mediation, moderation (2-12)
  • History of research design (13-25)
• Basics of Research Design
  • Dan & Alysia (slides 8-20, including activity)
  • Review and further thoughts on experimentation (26-36)
THE THEORY OF TRIADIC INFLUENCE – What is the CAUSE?
[Figure: the Theory of Triadic Influence matrix. Rows are levels of causation: ultimate causes (cultural environment, social situation, biology/personality), distal influences (social/personal nexus, expectancies and evaluations), and proximal predictors (attitudes toward the behavior, social normative beliefs, self-efficacy/behavioral control). Columns are three streams: intrapersonal, social/normative, and cultural/attitudinal. The streams converge on decisions/intentions, then trial behavior, then experiences (psychological/physiological, social reinforcements, expectancies), which feed back into related behaviors.]
Closely and less related behaviors: What are the Effects?
[Figure: schematic comparing the predictors of two closely related behaviors (e.g., smoking and drinking), several less related behaviors (e.g., smoking, drug abuse, sex, exercise), and a single behavior (e.g., smoking). Ultimate causes may be the same across behaviors; distal predictors less so.]
Counterfactual Reasoning
• Is fundamentally qualitative
  • Rubin, Holland, and others have developed statistical approaches
• The central task for all cause-probing research (experimentation) is to create approximations to the counterfactual, which is impossible to actually observe
• Two central tasks:
  • Create a high-quality, though imperfect, source of counterfactual inference
  • Understand how this source differs from the treatment condition
Conditions for Inferring Cause
• Due to John Stuart Mill:
  • Temporal precedence: the cause preceded the effect
  • Relationship: the cause was related to the effect
  • Plausibility: there is no plausible alternative explanation for the effect other than the cause
• In experiments, we:
  • Manipulate the cause and observe the subsequent effect
  • See whether the cause and effect are related
  • Use various methods during and after the experiment to rule out plausible alternative explanations
Causation, Correlation, Confounds
• Correlation does not prove causation!
  • Why not? Confounding variables
  • Third variables (in addition to the presumed cause and effect) that could be the real cause of both the presumed cause and the presumed effect
Manipulation & Causal Description
• Experiments explore the effects of things that can be manipulated
  • Experiments identify the effects (DV) of a (manipulated) cause (IV)
• Causal description, not explanation:
  • Knowledge of the effects tells us nothing about the mechanisms or causal processes, only that they occur
  • Descriptive causation is usually between molar causes and molar outcomes
  • Molar = a package that contains many components
Mediation and Moderation
• Many causal explanations consist of causal chains
  • Mediator variables and mediation
• Some experiments vary the conditions under which a treatment is provided
  • Moderator variables and moderation
• Some interventions work better for some subgroups (e.g., age, race, SES) than others
  • Moderation
What is Experimental Design?
• The creation of a counterfactual
  • Creation or identification of a comparison/control group or condition
  • A method for assigning units to the treatment and counterfactual conditions
• Includes both:
  • Strategies for organizing data collection, and
  • Data analysis procedures matched to those data collection strategies
• Classical treatments of design stress procedures based on the analysis of variance (ANOVA)
• Other procedures, such as those based on hierarchical linear models or on the analysis of aggregates (e.g., class or school means), are also appropriate
Why Do We Need Experimental Design?
• Because of variability
• We would not need a science of experimental design if:
  • all units (students, teachers, and schools) were identical, and
  • all units responded identically to treatments
• We need experimental design to control variability so that treatment effects can be identified
A Little History
• The idea of controlling variability through design has a long history
• In 1747, James Lind conducted his studies of scurvy:
  "Their cases were as similar as I could have them. They all in general had putrid gums, spots and lassitude, with weakness of their knees. They lay together in one place … and had one diet common to all" (Lind, 1753, p. 149)
• Lind then assigned six different treatments to groups of patients
A Little History
• The idea of random assignment was not obvious and took time to catch on:
  • In 1648, van Helmont carried out one randomization in a trial of bloodletting for fevers
  • In 1904, Karl Pearson suggested matching and alternation in typhoid trials
  • Amberson et al. (1931) carried out a trial with one randomization
  • In 1937, Sir Bradford Hill advocated alternation of patients in trials
  • Diehl et al. (1938) carried out a trial that is sometimes referred to as randomized, but it actually used alternation
A Little History
• Studies in Crop Variation I–VI (1921–1929)
• In 1919, a statistician named Fisher was hired at the Rothamsted agricultural station
  • Rothamsted had a lot of observational data on crop yields and hoped a statistician could analyze it to find the effects of various treatments
  • All he had to do was sort out the effects of confounding variables
Studies in Crop Variation I (1921)
• Fisher does regression analyses (lots of them) to study, and get rid of, the effects of confounders:
  • soil fertility gradients
  • drainage
  • effects of rainfall
  • effects of temperature and weather, etc.
• Fisher does qualitative work to sort out anomalies
• Conclusion: the effects of confounders are typically larger than those of the systematic effects we want to study
Studies in Crop Variation II (1923)
• Fisher invents:
  • the basic principles of experimental design
  • control of variation by randomization
  • the analysis of variance
Studies in Crop Variation IV and VI
• Studies in Crop Variation IV (1927): Fisher invents the analysis of covariance, combining statistical control with control by randomization
• Studies in Crop Variation VI (1929): Fisher refines the theory of experimental design, introducing most of the other key concepts known today
Principles of Experimental Design
• Experimental design controls background variability so that systematic effects of treatments can be observed
• Three basic principles, in order of importance:
  • Control by matching
  • Control by randomization
  • Control by statistical adjustment
Control by Matching
• Known sources of variation may be eliminated by matching
  • Eliminating genetic variation: compare animals from the same litter of mice
  • Eliminating district or school effects: compare students within districts or schools
• However, matching is limited:
  • matching is only possible on observable characteristics
  • perfect matching is not always possible
  • matching inherently limits generalizability by removing (possibly desired) variation
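As a concrete sketch, matching within an observable stratum (here, school) can be simulated in a few lines. All records below are invented for illustration; note that strata lacking one of the conditions must be dropped, one way matching limits generalizability:

```python
from collections import defaultdict

# Invented example: student outcomes tagged with school and condition.
# Comparing within schools removes between-school variation.
records = [
    ("A", "treat", 78), ("A", "control", 74),
    ("A", "treat", 81), ("A", "control", 75),
    ("B", "treat", 62), ("B", "control", 60),
    ("B", "treat", 65), ("B", "control", 61),
    ("C", "treat", 90),  # no control students in school C: stratum dropped
]

by_school = defaultdict(lambda: {"treat": [], "control": []})
for school, arm, score in records:
    by_school[school][arm].append(score)

# Average the within-school (matched) treat-minus-control differences
# across schools that contain both conditions.
diffs = []
for arms in by_school.values():
    if arms["treat"] and arms["control"]:
        t = sum(arms["treat"]) / len(arms["treat"])
        c = sum(arms["control"]) / len(arms["control"])
        diffs.append(t - c)

matched_effect = sum(diffs) / len(diffs)
print(f"average within-school difference = {matched_effect:.2f}")  # → 4.00
```

Here school A contributes a difference of 5.0, school B contributes 3.0, and school C contributes nothing, so the matched estimate is 4.0.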
Control by Matching
• Matching ensures that the groups compared are alike on specific known and observable characteristics (in principle, everything we have thought of)
• Wouldn't it be great if there were a method for making groups alike not only on everything we have thought of, but on everything we didn't think of too?
• There is such a method
Control by Randomization
• Matching controls for the effects of variation due to specific observable characteristics
• Randomization controls for the effects of all characteristics, known or unknown, observable or not
• Randomization makes groups equivalent (on average) on all such variables
• Randomization also gives us a way to assess whether differences observed after treatment are larger than would be expected by chance
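That last point can be sketched as a permutation test: re-randomize the group labels many times and see how often a difference at least as large as the observed one arises by chance alone. The outcome values below are invented for illustration:

```python
import random

# Hypothetical outcomes from a toy two-arm randomized study.
treated = [12.1, 9.8, 11.4, 13.0, 10.5]
control = [9.2, 10.1, 8.7, 9.9, 10.3]

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(treated) - mean(control)

# Permutation test: reshuffle the pooled outcomes into two groups of the
# same sizes, mimicking the original random assignment under the null
# hypothesis of no treatment effect.
pooled = treated + control
n_treat = len(treated)
reps = 10_000
random.seed(0)
extreme = 0
for _ in range(reps):
    random.shuffle(pooled)
    diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / reps
print(f"observed difference = {observed:.2f}, permutation p ≈ {p_value:.3f}")
```

The logic mirrors Fisher's argument: because assignment was a physical random process, the re-randomization distribution is a valid benchmark for the observed difference.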
Control by Randomization
• Random assignment is not assignment with no particular rule; it is a purposeful process:
  "Assignment is made at random. This does not mean that the experimenter writes down the names of the varieties in any order that occurs to him, but that he carries out a physical experimental process of randomization, using means which shall ensure that each variety will have an equal chance of being tested on any particular plot of ground" (Fisher, 1935, p. 51)
Control by Randomization
• Likewise, random assignment of schools or classrooms is not assignment with no particular rule; it is a purposeful process:
  "Assignment of schools to treatments is made at random. This does not mean that the experimenter assigns schools to treatments in any order that occurs to her, but that she carries out a physical experimental process of randomization, using means which shall ensure that each treatment will have an equal chance of being tested in any particular school" (Hedges, 2007)
The Randomized Experiment – The “Gold Standard”
• Random assignment creates two or more groups of units (people, classrooms, schools, places) that are probabilistically similar to each other on average
  • if done properly, and
  • when Ns are large enough
• A randomized experiment yields:
  • An estimate of the size of the treatment effect (the “population average causal effect”) that has desirable statistical properties
  • An estimate of the probability that the true effect falls within a defined confidence interval
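Under these assumptions (proper randomization, reasonably large groups), the effect estimate and a large-sample 95% confidence interval can be computed directly. The data below are simulated with a true effect of 4, purely for illustration:

```python
import math
import random

random.seed(2)

# Simulated outcomes for a randomized two-arm study: the treated group's
# mean is shifted up by 4 (the true effect), both groups have SD 10.
treated = [random.gauss(52, 10) for _ in range(200)]
control = [random.gauss(48, 10) for _ in range(200)]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# The difference in group means estimates the population average causal effect.
effect = mean(treated) - mean(control)

# Large-sample standard error of a difference in means, and a 95% CI.
se = math.sqrt(var(treated) / len(treated) + var(control) / len(control))
ci = (effect - 1.96 * se, effect + 1.96 * se)
print(f"estimated effect = {effect:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With 200 units per arm the standard error is about 1, so the interval typically covers the true effect of 4.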
Quasi-Experiments
• The cause is manipulated
• The cause occurs before the effect
• However, there is less compelling support for counterfactual inferences
  • That is, there are more plausible alternative explanations
• Take a “falsification” approach
  • That is, researchers must identify and rule out the plausible alternative explanations that could falsify the causal claim
Fallible Falsification
• Disconfirmation is not always complete
  • It may lead to modification of aspects of a theory
• Falsification requires measures that are perfectly valid reflections of the theory being tested
  • This is difficult to achieve
• Observations are more fact-like when they are repeated:
  • across multiple conceptions of the construct
  • across multiple measures
  • at multiple times
• Which alternatives count as plausible depends on social consensus, shared experience, and empirical data
Generalization
• The major limitation of RCTs: most experiments are localized and particularistic
• Causal generalization (Cronbach) concerns:
  • Units (people or places)
  • Treatments (Tx)
  • Observations of the units with and without the treatment
  • Settings in which the treatments are provided and the observations made
• Two or more levels of generalization:
  • Generalizing to the domain UTOS (construct validity)
  • Generalizing to utos not directly observed in the experiment, *UTOS (external validity)
Causal Generalization
• Total generalization would require random sampling of units, treatments, observations (measures), and settings
  • This is never achieved
  • Note the difference between random sampling of units (people) and random assignment of units to conditions
• Causal generalization requires:
  • Surface similarity
  • Ruling out irrelevancies
  • Making discriminations
  • Interpolation and extrapolation
  • Causal explanation
Control by Statistical Adjustment
• Control by statistical adjustment is a form of pseudo-matching
  • It uses statistical relations to simulate matching
• Statistical control is important for increasing precision, but it should not be relied upon to control biases that exist prior to assignment
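A minimal sketch of this idea, using simulated data and an ANCOVA-style regression (all numbers invented): when assignment is randomized, adjusting for a strong pretreatment covariate does not change what is being estimated, but it soaks up background variability and so tightens the estimate around the true effect, here set to 1.5:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated randomized study: a pretreatment covariate x strongly
# predicts the outcome y; the true treatment effect is 1.5.
n = 500
x = rng.normal(size=n)                        # pretreatment covariate
z = rng.integers(0, 2, size=n)                # randomized treatment indicator
y = 3.0 * x + 1.5 * z + rng.normal(size=n)    # outcome

# Unadjusted estimate: simple difference in group means.
unadjusted = y[z == 1].mean() - y[z == 0].mean()

# Adjusted estimate: regress y on an intercept, treatment, and covariate.
X = np.column_stack([np.ones(n), z, x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = coef[1]  # coefficient on the treatment indicator

print(f"unadjusted = {unadjusted:.2f}, covariate-adjusted = {adjusted:.2f}")
```

Both estimators are unbiased here because assignment was randomized; the adjusted one simply has a much smaller standard error. In a non-randomized study, by contrast, the adjustment carries the whole burden of removing bias, which is exactly what the slide warns against.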
Rubin’s Causal Model (RCM)
• The causal effect is the difference between what would have happened to a participant in the treatment condition and what would have happened to the same participant had he or she instead been in the control condition
• A counterfactual account of causality?
  • No: Rubin does not characterize the RCM as a counterfactual account
  • He prefers the “potential outcomes” conceptualization, a formal statistical model rather than a philosophy of causation
• This difference is, of course, impossible to observe, so all manipulations and statistical adjustments are designed to approximate it
• Unobserved potential outcomes are treated as missing data (missing completely at random in RCTs)
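The potential-outcomes logic can be illustrated with a toy simulation (all values invented): generate both potential outcomes for every unit, compute the true average causal effect from the full table, and then let random assignment reveal only one outcome per unit, which is exactly the missing-data situation just described:

```python
import random

random.seed(1)

# Toy potential-outcomes table: each unit has y1 (outcome under
# treatment) and y0 (outcome under control); the true effect averages 2.
units = [(random.gauss(12, 1), random.gauss(10, 1)) for _ in range(2000)]

# Only a simulation can see both potential outcomes at once.
true_ace = sum(y1 - y0 for y1, y0 in units) / len(units)

# In a real study each unit reveals one potential outcome; randomize
# which one, as in an RCT.
assignments = [True] * 1000 + [False] * 1000
random.shuffle(assignments)

treated = [y1 for (y1, y0), z in zip(units, assignments) if z]
control = [y0 for (y1, y0), z in zip(units, assignments) if not z]

estimate = sum(treated) / len(treated) - sum(control) / len(control)
print(f"true average causal effect ≈ {true_ace:.2f}, estimate ≈ {estimate:.2f}")
```

No unit-level effect y1 − y0 is ever observed, yet the randomized difference in means recovers the average causal effect, because the missing potential outcomes are missing completely at random.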
The Statistical Approach
• Targets the population average causal effect
• Relies on the assumption of independence between pretreatment characteristics and condition assignment
• In non-randomized studies, the RCM relies on statistical approaches to adjust for potential confounders
• However, as Raudenbush (2005) points out: “No matter how many potential confounders [analysts] identify and control, the burden of proof is always on the [analysts] to argue that no important confounders have been omitted” (p. 28)
SUTVA
• The Stable Unit Treatment Value Assumption
• Asserts that the representation of potential outcomes and effects reflects all values that could be observed
• Requires no interference between units: the outcome observed for one unit is not affected by the treatment received by another unit
• Nesting of units (e.g., students within classrooms or schools) is a common source of violations
Different Purposes of the CCM and RCM
• The CCM (Campbell’s causal model) aspires to be a theory of generalized causal inference, covering many kinds of inferences from all sorts of cause-probing studies
• The RCM aspires to define an effect clearly and precisely, so that the effect can be better measured in a single experiment
• It is a bandwidth-versus-fidelity trade-off
CCM and RCM both like:
• Causal description rather than explanation
• Discovering the effects of known causes, rather than the unknown causes of known effects (as in epidemiology)
• A central role for manipulable causes
• The use of strong designs, especially RCTs
• Matching (the RCM added propensity scores)
• Applying extra effort to causal inference in non-randomized experiments and observational studies