Holland on Rubin’s Model Part II
Formalizing These Intuitions

In the 1920s and 1930s Jerzy Neyman, a Polish statistician, developed a mathematical model that allowed him to make sense of intuitions like those I have been discussing. Neyman applied this model to the analysis of results from randomized experiments, which had only recently been invented by R. A. Fisher, a British statistician. In the 1970s Donald Rubin, an American statistician, expanded this model to cover the more complicated cases of non-randomized "observational" studies. I will give a brief introduction to the Neyman-Rubin model and show its connection to the ideas I have been discussing.
It is easiest to talk about experiments and observational studies where there are only two causes or treatment conditions, t and c. In this setting there is a sample of “units” (these are samples of material, people, parts of agricultural fields, etc.) each of which is “subjected to” one of the two treatment conditions, t or c. Denote by x the treatment given to a unit, xi = t or c for unit i. Later on we record the value of some outcome, yi, for unit i. So the data are pairs, (yi, xi), for each unit, i. That is all we get to observe.
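To make the setup concrete, here is a minimal Python sketch (not part of the original presentation) of the observed data: one (yi, xi) pair per unit, with purely hypothetical numbers.

# Hypothetical observed data: one (y_i, x_i) pair per unit, x_i is "t" or "c".
observed = [
    (7.1, "t"),
    (5.4, "c"),
    (6.8, "t"),
    (5.9, "c"),
]

for i, (y_i, x_i) in enumerate(observed):
    print(f"unit {i}: treatment x_i = {x_i}, outcome y_i = {y_i}")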
Data Analysis

In a situation like this, about the only thing that a sensible person knows how to do is to compute the mean value of y for those units for which xi = t and compare it to the mean value of y for those units for which xi = c. At the population level (i.e., with large samples) this is a comparison of

E(y | x = t) and E(y | x = c).   (1)

When does the difference,

E(y | x = t) - E(y | x = c),   (2)

have an interpretation as a Causal Effect?
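As a sketch of this comparison (again with made-up numbers), the two group means and their difference in (2) can be computed directly:

# Hypothetical observed (y_i, x_i) pairs, as in the earlier sketch.
observed = [(7.1, "t"), (5.4, "c"), (6.8, "t"), (5.9, "c")]

y_t = [y for y, x in observed if x == "t"]   # outcomes of units with x_i = t
y_c = [y for y, x in observed if x == "c"]   # outcomes of units with x_i = c

mean_t = sum(y_t) / len(y_t)   # sample analogue of E(y | x = t)
mean_c = sum(y_c) / len(y_c)   # sample analogue of E(y | x = c)
print(mean_t - mean_c)         # the difference in (2)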
Return to the Idea of a Minimal Ideal Comparative Experiment

It has three parts:
• Two identical units of study.
• Two precisely defined and executed experimental conditions.
• A precisely measured outcome observed on each unit an appropriate time after exposure to the experimental conditions.

We never thought seriously that we could find "two identical units," but the Neyman-Rubin model replaces this impossible idea with one that is possible to think about.
We can imagine a unit being exposed to either of the two treatment conditions. If we do, then there are two Potential Outcomes that we might observe for unit i:

Yti = the outcome for unit i if it is exposed to t,
Yci = the outcome for unit i if it is exposed to c.

Once we go this far, it is not hard to realize that the observed outcome, yi, is actually the realization of one of two different potential outcomes:

if xi = t, then yi = Yti, and
if xi = c, then yi = Yci.

Thus, the observed outcome, yi, is not the simple datum it might first appear to be. This is all a result of thinking about causation and goes well beyond the data to the causal interpretation of the data.
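A small sketch of this bookkeeping (the potential outcomes below are invented for illustration) shows how the observed yi is selected from the pair (Yti, Yci) by xi:

import random

# Hypothetical potential outcomes for three units; only one per unit is observed.
units = [
    {"Y_t": 7.0, "Y_c": 5.0},
    {"Y_t": 6.5, "Y_c": 6.0},
    {"Y_t": 8.0, "Y_c": 5.5},
]

for i, u in enumerate(units):
    x_i = random.choice(["t", "c"])             # treatment actually given
    y_i = u["Y_t"] if x_i == "t" else u["Y_c"]  # y_i = Y_ti or Y_ci
    print(f"unit {i}: x_i = {x_i}, observed y_i = {y_i}")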
Now let's go back to (1) and (2), specified previously

In terms of the Potential Outcomes, Yt and Yc, the difference in (2) is

E(Yt | x = t) - E(Yc | x = c).   (3)

We are not done yet. We need to define the Average Causal Effect (ACE) of t relative to c. Since we envision every real unit as having two potential values, Yti and Yci, from the two Potential Outcomes, we can certainly entertain the idea of their difference,

Yti - Yci.   (4)

This difference is the Causal Effect of t relative to c on Y for unit i.
If we average this difference over all of the units in the population we get the Average Causal Effect (ACE) of t relative to c on y, i.e.,

ACE = E(Yt - Yc).   (5)

For simplicity, I will introduce the idea of the ACE on the treated group, or the effect of the treatment on the treated, that is,

ACE(x = t) = E(Yt - Yc | x = t),   (6)

so that we may re-express (6) as

ACE(x = t) = E(Yt | x = t) - E(Yc | x = t).   (7)
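The unit-level effect in (4) and the average in (5) can be sketched directly, bearing in mind that in reality only one potential outcome per unit is ever observed (the numbers here are hypothetical):

# Hypothetical potential outcomes for four units.
Y_t = [7.0, 6.5, 8.0, 5.5]
Y_c = [5.0, 6.0, 5.5, 5.0]

unit_effects = [yt - yc for yt, yc in zip(Y_t, Y_c)]   # (4): Y_ti - Y_ci for each unit
ace = sum(unit_effects) / len(unit_effects)            # (5): average of Y_t - Y_c
print(unit_effects, ace)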
Returning now to (3), we see that

ACE(x = t) = [E(Yt | x = t) - E(Yc | x = c)]   (8)
           + [E(Yc | x = c) - E(Yc | x = t)].   (9)

(8) is the difference between the means of the t and c groups at the population level. But (9) involves something we can know, namely E(Yc | x = c), and something that is impossible to know directly, namely E(Yc | x = t). The latter, E(Yc | x = t), is an example of a counterfactual expected value.
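A simulation sketch can make the decomposition concrete. The data-generating rule below is purely hypothetical: units with larger Yc are more likely to be treated, so the counterfactual term E(Yc | x = t) differs from E(Yc | x = c), and the naive difference (8) by itself misses the ACE on the treated.

import math
import random

random.seed(0)
units = []
for _ in range(100_000):
    y_c = random.gauss(5.0, 1.0)
    y_t = y_c + 2.0                               # true effect of 2 for every unit
    p_treat = 1 / (1 + math.exp(-(y_c - 5.0)))    # larger Y_c -> more likely treated
    x = "t" if random.random() < p_treat else "c"
    units.append((y_t, y_c, x))

def mean(vals):
    return sum(vals) / len(vals)

E_Yt_t = mean([yt for yt, yc, x in units if x == "t"])   # E(Y_t | x = t)
E_Yc_c = mean([yc for yt, yc, x in units if x == "c"])   # E(Y_c | x = c)
E_Yc_t = mean([yc for yt, yc, x in units if x == "t"])   # counterfactual E(Y_c | x = t)

naive = E_Yt_t - E_Yc_c                                  # (8)
bias = E_Yc_c - E_Yc_t                                   # (9)
ace_treated = mean([yt - yc for yt, yc, x in units if x == "t"])   # (7)
print(naive, bias, naive + bias, ace_treated)            # (8) + (9) recovers (7)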
Counterfactuals come up in discussions of causation by writers who do not have a model that lets them be very concrete about such quantities. The difference E(Yc | x = c) - E(Yc | x = t) will be zero if E(Yc | x = c) = E(Yc | x = t). A condition that ensures this is that Yc is statistically independent of x. How can this occur?
The Effect of Randomization

When we randomize units to treatments we use an external mechanism, such as the toss of a coin or a table of random numbers, to assign treatments to units. This has the effect of making the assignment variable, x, statistically independent of any variable defined on the units, including Yc. Thus, under random assignment, at the level of the population we have E(Yc | x = c) = E(Yc | x = t).
Hence, we have the equality

ACE(x = t) = E(Yt | x = t) - E(Yc | x = c),   (10)

or, using the original notation,

E(y | x = t) - E(y | x = c) = ACE(x = t).   (11)

Equation (11) is very important. It shows that a causal parameter, the ACE, is equal to something that we can estimate with data, and thus do statistical inference about.
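A companion sketch (the same hypothetical set-up as before, but with a coin-flip assignment) shows (11) at work: the plain difference in observed group means comes out close to the true ACE.

import random

random.seed(1)
Y_c = [random.gauss(5.0, 1.0) for _ in range(100_000)]
Y_t = [yc + 2.0 for yc in Y_c]                        # true ACE is 2

x = [random.choice(["t", "c"]) for _ in Y_c]          # randomized assignment
y = [yt if xi == "t" else yc for yt, yc, xi in zip(Y_t, Y_c, x)]   # observed outcomes

def mean(vals):
    return sum(vals) / len(vals)

diff = mean([yi for yi, xi in zip(y, x) if xi == "t"]) - \
       mean([yi for yi, xi in zip(y, x) if xi == "c"])
print(diff)   # close to 2: the left side of (11) estimates ACE(x = t)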
From this point of view, causal inference is statistical inference about causal parameters and not something about ethereal quantities that have little reality. Note also that the admonition that a "cause" should be something that could be a treatment in some experiment is given more force by the identity between an ACE and a difference between means that can be estimated from data.
Causal Models

A causal model is an assumption about the Potential Outcomes. Here are two very common ones.

Homogeneous Units: Yti = Yt and Yci = Yc for all i. So it does not matter which unit i we look at, we get the same outcome under t or under c. This is the basic tool of most lab science, where the units are carefully created samples of material for study.
Constant Causal Effects: Yti - Yci = k for all i. The effect of t relative to c is the same for every unit. Clearly Homogeneous Units implies Constant Causal Effects, but the converse is not necessarily true, so Constant Causal Effects is a weaker assumption than Homogeneous Units. Constant Causal Effects may be thought of as a formalization of Hume's "constant conjunction" condition for causality. It is also the reason you learn about statistical models in which the distributions have the same shape but different means.
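A sketch contrasting the two assumptions (all numbers hypothetical): under Homogeneous Units every unit carries the same pair of potential outcomes, while under Constant Causal Effects only the difference Yti - Yci is fixed and the levels vary.

import math
import random

# Homogeneous units: every unit has the same (Y_t, Y_c) pair.
homogeneous = [{"Y_t": 7.0, "Y_c": 5.0} for _ in range(5)]

# Constant causal effect: Y_ti - Y_ci = k for every unit, but the levels vary,
# which is why one studies distributions with the same shape and shifted means.
k = 2.0
constant_effect = []
for _ in range(5):
    y_c = random.gauss(5.0, 1.0)
    constant_effect.append({"Y_t": y_c + k, "Y_c": y_c})

print(all(math.isclose(u["Y_t"] - u["Y_c"], k) for u in constant_effect))   # True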
Without going any further, let me just end by saying that the Neyman-Rubin model can be used to illuminate any causal discussion or idea and should be part of any scientist's tool kit.