The Classic Experiment (and Its Limitations)
Class 6
Stages of the Research Process
• The research process begins with a hypothesis about a presumed (causal?) relationship between an independent and a dependent variable
• We might also assume that there are conditioning variables
• The elements of a test of this hypothesis are:
  • A research design to assess the relationships between the variables
  • Recruiting subjects for testing the hypotheses
  • Valid and reliable measurement of the variables
  • Appropriate methods of statistical analysis that permit inferential conclusions about the hypothesis
Research Designs
• Today we discuss research designs, focusing on experiments.
• Contrast this with an epidemiological model, in which we infer that group differences in a population are attributable to the hypothesized effect.
• In an experiment, we attempt to control for differences between groups, so that any differences we observe are attributable to the treatment and not to pre-existing group differences.
• This is why experiments are considered a “gold standard” for identifying a causal relationship between a dependent and an independent variable.
• Obviously, experiments are not always feasible.
• Their strengths and limitations fuel endless debates, and they have become a battleground for litigants seeking to assess a pattern of facts.
• Examples: video games, alcohol and car crashes.
Types of Research Designs
• Case studies
  • Good for generating hypotheses and for understanding and illustrating causal linkages
  • Not good for testing hypotheses or for generalizing to other populations
• Correlational studies
  • Studies that assess simultaneous changes in independent and dependent variables (see the sketch after this list)
  • Example: income levels and voter preferences on surveys
  • Example: diet and disease (the epidemiological model of causation)
  • You can still make predictions from correlational studies if you have ruled out other causes, but you cannot achieve “control” without understanding the direction of the effect
• True experiments
  • Random assignment of subjects to groups; unequal treatment of similarly situated people; “but for” causation
  • Examples: Perry Preschool, clinical drug trials
• Quasi-experiments
  • Nonrandom assignment, with approximations of and controls for between-group differences
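To make the correlational idea concrete, here is a minimal sketch with hypothetical numbers (the income and preference values are invented for illustration): it computes a correlation between two variables, which quantifies co-variation but, by itself, establishes neither the direction of the effect nor “control.”

```python
import numpy as np

# Hypothetical survey-style data -- illustration only: household income
# (in $1,000s) and a 0-100 preference score for a candidate.
income = np.array([22, 35, 41, 48, 55, 63, 70, 82, 95, 110])
preference = np.array([30, 35, 42, 40, 50, 48, 55, 61, 58, 66])

# A correlational study measures whether the two variables move together.
r = np.corrcoef(income, preference)[0, 1]
print(f"Pearson r = {r:.2f}")

# The coefficient alone says nothing about direction of effect or about
# third variables (age, region, education) that may drive both.
```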
Why are experiments the gold standard?
• An experiment is a design for testing hypotheses regarding the empirical relationship between an independent and a dependent variable.
• It is the most efficient and reliable way to rule out spurious causation (rival hypotheses) through random assignment of individuals to test conditions, and therefore to establish the conditions for causal inference.
• Causality is critical for the scientific goals of “explanation,” “prediction,” and “control.”
Why Random Assignment?
• Random assignment (RA) assigns units to conditions based on chance alone (see the sketch below).
• Not the same as random sampling – we return to this later, as an example of a validity threat or strength.
• Avoids correlation of causes with treatment conditions.
• When is randomization feasible? This is an ethical decision:
  • When demand outstrips supply
  • When the supply of X is short
  • When isolation or separation of the experimental group is possible
  • Mandatory change (legislation)
  • No preferences
  • No advantages (denial of a possibly beneficial service)
  • When new organizations are created
  • Lotteries
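A minimal sketch of random assignment, assuming a simple two-condition design and hypothetical subject IDs: because the only thing determining a subject's condition is the random draw, subject characteristics cannot be systematically correlated with the treatment condition.

```python
import random

def randomly_assign(subject_ids, conditions=("treatment", "control"), seed=None):
    """Assign each subject to a condition by chance alone.

    Because assignment depends only on the random draw, subject
    characteristics (age, motivation, prior record, ...) cannot be
    correlated with the condition a subject ends up in -- which is
    what lets us rule out rival explanations for a treatment effect.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {conditions[0]: ids[:half], conditions[1]: ids[half:]}

# Hypothetical pool of 20 subjects -- illustration only.
assignment = randomly_assign(range(1, 21), seed=42)
print(assignment["treatment"])
print(assignment["control"])
```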
Types of Experiments
• The Classic Experimental Design
• The Post-test-Only Experimental Design
  • Strengths – no test effects, no desensitization
  • Weaknesses – problems in attributing effects; does not eliminate rival causal factors such as history
• The Solomon Four-Group Design (Fig. 8.5)
  • Provides estimates of test effects; avoids reactivity and test effects
  • Expensive and difficult to implement, especially under field conditions
• Nested, or Hierarchical, Designs
  • Allow for identification of contextual effects
  • Common in school research
Natural Experiments
• Natural disasters, policy or legislative changes
• Examples:
  • Flipping Coins in the Courtroom
  • Damage Caps
  • Disaster Research – Highway 880
  • Waiver Laws in Adjacent Areas
Some Limitations of Experiments
• Generalizability of X – complex realities vs. single variables
• Representations of theory – e.g., the meaning of arrest
• Period effects – problems of the day; factors related to crimes or behaviors at one time may not be salient at another (e.g., drug eras and drug–crime relationships)
• Political limitations (e.g., overrides)
• Organizational resistance
When You Can’t Randomize: Quasi-Experiments
• Theory and logic
• Adjusting for selection differences
  • This can be done by design controls, statistical controls, or both
• No-control quasi-experimental designs
  • Time series before and after an intervention (see the sketch below)
  • Removed treatment (satisfies the essentialist view of causation)
  • Critiques of multiple pretest observations:
    • Test effects (sensitization, etc.) – works best if the pretest observations are unobtrusive
    • Change over time in the status of subjects vis-à-vis the preconditions for treatment
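A minimal sketch of the before-and-after time-series idea (an interrupted time series), using simulated data and a simple least-squares fit. The outcome, the intervention month, and the effect size are hypothetical illustrations, not a prescribed analysis.

```python
import numpy as np

# Hypothetical monthly counts of an outcome (e.g., incidents), with an
# intervention introduced at month 24 -- simulated data for illustration.
rng = np.random.default_rng(0)
months = np.arange(48)
intervention = (months >= 24).astype(float)
y = 50 + 0.2 * months - 8 * intervention + rng.normal(0, 2, size=48)

# Segmented ("interrupted time series") regression:
# outcome ~ intercept + pre-existing trend + level shift at the intervention.
X = np.column_stack([np.ones_like(months), months, intervention])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"baseline level: {coef[0]:.1f}")
print(f"pre-intervention trend per month: {coef[1]:.2f}")
print(f"estimated level shift after intervention: {coef[2]:.1f}")
```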
Quasi-Experimental Designs That Use Control Groups
• Matched strategies
  • Matched cases (case-control designs) – housing discrimination
  • Matched samples – Bishop waiver study
  • Weaknesses and strengths (omitted-variable biases)
• Difficulties and problems with matching
  • Endogeneity of cause and effect
• Strategies for better matches
  • Use stable variables (avoid measurement errors)
  • Avoid confounding the matching variables with the dependent variables (outcomes)
  • Use “deep” matches – longitudinally measured or stable variables, for example, rather than single-state variables
• Statistical solutions
  • Instrumental-variables approach
  • “Propensity score matching” – model the underlying differences between experimental and control groups (see the sketch below)
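A minimal sketch of propensity score matching, assuming scikit-learn is available and using invented covariates and treatment indicators: a logistic model estimates each case's probability of receiving treatment from pre-treatment covariates, and each treated case is then paired with the untreated case whose estimated propensity is closest.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical data: 200 untreated and 60 treated cases with two
# pre-treatment covariates -- illustration only.
X_control = rng.normal(0.0, 1.0, size=(200, 2))
X_treated = rng.normal(0.5, 1.0, size=(60, 2))
X = np.vstack([X_control, X_treated])
treated = np.concatenate([np.zeros(200), np.ones(60)])

# Step 1: model the probability of receiving treatment (the "propensity")
# from the observed pre-treatment covariates.
ps_model = LogisticRegression().fit(X, treated)
scores = ps_model.predict_proba(X)[:, 1]

# Step 2: for each treated case, pick the untreated case with the closest
# propensity score (nearest-neighbor matching with replacement).
control_idx = np.where(treated == 0)[0]
treated_idx = np.where(treated == 1)[0]
matches = {
    int(t): int(control_idx[np.argmin(np.abs(scores[control_idx] - scores[t]))])
    for t in treated_idx
}
print(list(matches.items())[:5])
```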
Experimental Validity
• Validity – whether an experiment produces “true” or “accurate” answers
• Threats to internal validity
  • Threats posed by the design of the experiment itself – whether the observational procedures may have produced the results. Internal validity refers to the soundness of the design in justifying the conclusions reached.
• Threats to external validity
  • Threats due to the limitations of the sample – whether the research is generalizable or applicable only to the population studied. External validity refers to the extent to which the results can be generalized.
Internal Validity Threats
• History – local factors
• Maturation of subjects – they change
• Test effects – subjects figure out the test
• Instrumentation – biased instruments
• Regression to the mean – “what goes up…”
• Selection bias I – non-equivalent groups
• Mortality – subjects leave the experiment
• Testing effects – you know you’re being studied
• Reactivity – reactions to the researcher rather than to the stimulus
External Validity Threats
• Selection bias II – groups are unrepresentative of general populations
• Multiple-treatment interference – more than one independent variable operating
• Halo effects – conferring a status or label that influences behavior
• Local history – changing contexts
• Diffusion of treatment – controls imitate experimental subjects
• Compensatory equalization of treatment – controls want to receive the experimental treatment
• Decay – erosion of the treatment
• Contamination – controls receive some of the experimental treatment
Tradeoffs
• Must we trade internal validity for external validity in experiments?