Delve into methods for estimating treatment effects on outcomes, exploring randomization, quasi-experimental designs, and statistical testing for treatment effects with a focus on clinical trials. Learn about Type I and II errors, power analysis, and dispersion of statistics.
Estimating the Effects of Treatment on Outcomes with Confidence Sebastian Galiani Washington University in St. Louis
Parameters of Interest • Two parameters of interest widely used in the literature: • Average Treatment Effect • Average Treatment Effect on the Treated • Under randomization and full compliance, they coincide.
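In potential-outcomes notation (an assumption here, since the slides do not show their own notation), let Y_1 and Y_0 be the outcomes with and without treatment and D the treatment indicator. The two parameters can then be sketched as:

```latex
\[
\mathrm{ATE} = E[Y_1 - Y_0], \qquad
\mathrm{ATT} = E[Y_1 - Y_0 \mid D = 1]
\]
```

Under randomization with full compliance, D is independent of (Y_0, Y_1), so conditioning on D = 1 changes nothing and the two parameters coincide.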
Randomization • In the absence of difficulties such as noncompliance or loss to follow-up, assumptions play a minor role in randomized experiments, and no role at all in randomization tests of the hypothesis of no treatment effect. • In contrast, inference in a nonrandomized experiment requires assumptions that are not at all innocuous.
Quasi-Experimental Designs • If randomization is not feasible, we need to rely on quasi-experimental methods. • In our case, the most promising strategy would be a Generalized Difference in Differences strategy.
Parameters of Interest • We might want to answer the following questions: • What is the effect of the intervention on a given outcome in a given population? • What is the effect of the intervention on a given outcome for those who self-select as users of the facilities? • The power for identifying the first parameter might be lower than the power for the second parameter, which is identified by IV methods.
Distance to Facilities and Sampling • Think about stratifying the sample by distance to facilities, and over-sample households residing near facilities.
Testing Absence of Treatment Effects: Type I Error • Once we have chosen a Type I error rate α, the null hypothesis (pT − pC = 0) is rejected whenever the test statistic satisfies |t| > t*α/2, where t*α/2 is the critical value of t that leaves probability α/2 in each tail of the distribution of t under the null. [Figure: distribution of t centered at 0, with "Reject Null" regions below −t*α/2 and above t*α/2 and a "No rejection" region between them.]
Type II Error • Now consider an alternative hypothesis: pT − pC = δ • Under this alternative hypothesis, the t-statistic will have a different distribution. • If the alternative hypothesis is true, we want to reject the null hypothesis as often as possible. • To fail to do so would be a Type II error. • We want to restrict the probability of this type of error to β. • Then β will be the Type II error rate of the test. • And 1 − β will be the power of the statistical test. • The power of a statistical hypothesis test measures the test's ability to reject the null hypothesis when it is actually false.
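Type II Error • There is a trade-off between Type I and Type II errors. [Figure: distributions of t under the null and under the alternative, with critical values ±t*α/2; the area of the alternative distribution inside the "No rejection" region is β, and its area inside the rejection region is the power, 1 − β.]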
Type I and II Errors • Both errors can be simultaneously reduced if the dispersion of the statistic is reduced. [Figure: distributions of t under the null and under the alternative, with critical values ±t*α/2, the Type II error rate β, and the power 1 − β marked.]
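The effect of dispersion on the error rates can be illustrated numerically. The sketch below holds the Type I error rate fixed and shows how the Type II error rate falls as the standard error of the estimated effect shrinks. It uses a normal approximation to the t distribution, and the effect size and standard errors are hypothetical values, not taken from the slides.

```python
from statistics import NormalDist

def power_two_sided(delta, se, alpha=0.05):
    """Approximate power of a two-sided test of H0: pT - pC = 0.

    Uses a normal approximation to the t distribution (an assumption
    that is reasonable only when the degrees of freedom are large).
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen Type I rate
    shift = delta / se                # mean of the statistic under the alternative
    # Probability of landing in either rejection region under the alternative.
    return z.cdf(-crit - shift) + 1 - z.cdf(crit - shift)

# Same effect and same alpha; only the standard error changes.
for se in (0.05, 0.035, 0.025):
    p = power_two_sided(delta=-0.10, se=se)
    print(f"se={se:.3f}  power={p:.3f}  type II rate={1 - p:.3f}")
```

With less dispersion one could instead keep the power fixed and tighten α, which is the sense in which both error rates can be reduced together.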
Simple Clinical Trial • In this design m members are allocated to each condition: treatment and control. • The observed value for the i-th member in the l-th condition is a function of the grand mean and the effect of the l-th condition; any difference between the observed and the predicted value is allocated to the residual error. • The intervention effect is Cl. Its estimate is:
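The slide's equations did not survive extraction. A plausible reconstruction from the verbal description, in notation assumed here (μ for the grand mean, C_l for the condition effect, e for the residual error, and bars for condition means), is:

```latex
\[
Y_{i:l} = \mu + C_l + e_{i:l}, \qquad
\hat{C}_l = \bar{Y}_{T} - \bar{Y}_{C}
\]
```

That is, the intervention effect is estimated by the difference between the treatment and control means.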
Simple Clinical Trial • Under the null hypothesis, H0: Cl = 0, we need to estimate the variance of the estimated intervention effect. • Assuming that the residual error is Gaussian, the intervention effect is evaluated using a t-statistic with the appropriate df. • The researcher determines the desired Type I and II error rates (say 5% and 20%, respectively). • The researcher expects a negative intervention effect but would be concerned about a positive effect; as a result, she chooses a two-tailed test.
Simple Clinical Trial • Given the random assignment of members to treatment and control conditions, it is reasonable to assume that the two study conditions are independent. Then: • The estimated variance of a single condition mean is: • If we assign the same number of members in each condition, the variances in the two conditions are assumed to be equal.
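The missing expressions are presumably the standard ones for the difference of two independent means. A sketch, assuming a common residual variance σ_e² and m members per condition (the symbols are assumptions, not recovered from the slide):

```latex
\[
\operatorname{Var}(\bar{Y}_T - \bar{Y}_C)
  = \operatorname{Var}(\bar{Y}_T) + \operatorname{Var}(\bar{Y}_C),
\qquad
\widehat{\operatorname{Var}}(\bar{Y}_l) = \frac{\hat{\sigma}_e^2}{m}
\]
\[
\widehat{\operatorname{Var}}(\bar{Y}_T - \bar{Y}_C) = \frac{2\,\hat{\sigma}_e^2}{m}
\]
```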
Simple Clinical Trial • Then, the t-statistic is estimated as: • The parameters appearing in this formula are relatively easy to estimate using data from previous reports, from analyses of existing data or from preliminary studies.
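Under the equal-size, equal-variance setup described for this design, the t-statistic would take the familiar two-sample form. The notation below (a pooled residual variance estimate and m members per condition) is assumed rather than recovered from the slide:

```latex
\[
t = \frac{\bar{Y}_T - \bar{Y}_C}{\sqrt{2\,\hat{\sigma}_e^2 / m}}
\]
```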
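Simple Clinical Trial • The true Type I and II error rates will be α and β, respectively, if: [Figure: distributions of t under H0 (centered at 0) and under HA (centered at −t*α/2 − t*β), with critical values ±t*α/2 separating the "No rejection" and "Reject Null" regions; the area of the HA distribution in the "No rejection" region is β and the rest is the power, 1 − β.]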
General Expression • Or: • This expression is general to any design. • We need: • Desired type I and II error rates. • The expected magnitude of the treatment effect. • The expression for the variance of the estimated treatment effect, which is a function of the sample size. • We can express any of these variables in terms of the others.
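The expression itself is missing from the transcript. The standard form of this condition, assumed here to be what the slide showed, places the alternative's distribution t_{α/2} + t_β standard errors away from zero, where both terms are upper-tail critical values:

```latex
\[
\frac{|\delta|}{\sqrt{\operatorname{Var}(\hat{\delta})}} = t_{\alpha/2} + t_{\beta}
\quad\Longleftrightarrow\quad
\delta^2 = \operatorname{Var}(\hat{\delta})\,\bigl(t_{\alpha/2} + t_{\beta}\bigr)^2
\]
```

Because the variance of the estimated effect depends on the sample size, fixing any three of α, β, δ, and the sample size determines the fourth.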
Simple Clinical Trial • Sample size:
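A minimal sketch of the sample-size calculation for this design, assuming the variance of the estimated effect is 2σ²/m and replacing the t critical values with normal quantiles. The function name and the planning values (a control prevalence of 0.30 and an expected drop of 0.10) are hypothetical.

```python
import math
from statistics import NormalDist

def members_per_condition(delta, sigma2, alpha=0.05, beta=0.20):
    """Members needed in each condition for a two-sided test.

    Solves delta^2 = (2 * sigma2 / m) * (t_{alpha/2} + t_beta)^2 for m,
    with normal quantiles standing in for the t critical values
    (an approximation; the exact t values depend on m through the df).
    """
    z = NormalDist()
    t_alpha = z.inv_cdf(1 - alpha / 2)
    t_beta = z.inv_cdf(1 - beta)
    m = 2 * sigma2 * (t_alpha + t_beta) ** 2 / delta ** 2
    return math.ceil(m)

# Hypothetical planning values; for a binary outcome the variance is
# taken as p * (1 - p) at the control prevalence (an assumption).
p = 0.30
print(members_per_condition(delta=-0.10, sigma2=p * (1 - p)))
```

Since m scales with 1/δ², halving the expected effect roughly quadruples the required sample.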
Sample Size in GRT • Assume that there is only one individual per household. • Probability that an individual has diarrhea: • Individual i, in group k, assigned to condition l. • Within each condition, the variance of any given observation is the sum of two components: the variance within groups and the variance between groups.
Sample Size in GRT • Consider first the group mean. • If that mean were based on m independent observations, the variance of that mean would be estimated as: • However, because the members within an identifiable group almost always show positive intra-class correlation, those observations are not independent. • In fact, only the variance attributable to the individual effect will vanish as m increases. The variance attributable to the group effect will remain unaffected.
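The claim that only the individual-level variance vanishes as m grows can be checked with a small simulation. This is an illustrative sketch: the variance components (0.1 between groups, 0.9 within groups) and the Gaussian effects are hypothetical choices, not values from the slides.

```python
import random
from statistics import fmean, pvariance

def variance_of_group_means(m, n_groups=1000, var_group=0.1, var_member=0.9, seed=0):
    """Simulate group means and return their empirical variance.

    Each observation is a shared group effect plus an independent member
    effect; the variance components are hypothetical values chosen so
    that the intra-class correlation is 0.1.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_groups):
        group_effect = rng.gauss(0, var_group ** 0.5)   # shared by all members
        members = [group_effect + rng.gauss(0, var_member ** 0.5) for _ in range(m)]
        means.append(fmean(members))
    return pvariance(means)

for m in (5, 50, 200):
    print(m, round(variance_of_group_means(m), 3))
```

As m increases, the variance of the group means should settle near the between-group component rather than near zero.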
Sample Size • Then, the variance of the group mean is: • where m stands for the number of households per group and ICC for the intra-class correlation. • The variance of the condition mean is: • where g is the number of groups in each condition.
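The two variance expressions are missing from the transcript. The standard results for a two-level variance-components model, written in assumed notation (σ_e² within groups, σ_g² between groups, σ_y² for their sum), are:

```latex
\[
\sigma_y^2 = \sigma_e^2 + \sigma_g^2, \qquad
\mathrm{ICC} = \frac{\sigma_g^2}{\sigma_e^2 + \sigma_g^2}
\]
\[
\operatorname{Var}(\bar{Y}_{k:l}) = \frac{\sigma_e^2}{m} + \sigma_g^2
  = \frac{\sigma_y^2}{m}\,\bigl[1 + (m-1)\,\mathrm{ICC}\bigr], \qquad
\operatorname{Var}(\bar{Y}_{l}) = \frac{\sigma_y^2}{m\,g}\,\bigl[1 + (m-1)\,\mathrm{ICC}\bigr]
\]
```

The bracketed factor is the variance inflation due to clustering; it equals 1 when ICC = 0.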
Sample Size • When ICC>0, the variance of the condition mean is always larger in a GRT than in a study based on random assignment of the same number of individuals to the study conditions. • Statistic of interest: • Variance of the statistic:
Sample Size • Given a moderate number of groups per condition, the t-statistic to assess the difference between condition means is: • It follows a Student's t distribution with gT + gC − 2 degrees of freedom.
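A sketch of the missing statistic and its variance, assuming g groups of m households in each condition, total variance σ_y², and intra-class correlation ICC (symbols assumed, following the standard group-randomized-trial formulas):

```latex
\[
\operatorname{Var}(\bar{Y}_T - \bar{Y}_C)
  = \frac{2\,\sigma_y^2}{m\,g}\,\bigl[1 + (m-1)\,\mathrm{ICC}\bigr]
\]
\[
t = \frac{\bar{Y}_T - \bar{Y}_C}
         {\sqrt{\dfrac{2\,\hat{\sigma}_y^2}{m\,g}\,\bigl[1 + (m-1)\,\widehat{\mathrm{ICC}}\bigr]}}
\]
```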
Sample Size • Sample size: • Number of groups per condition: • Number of households per group (it requires a couple of iterations):
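A hedged sketch of both calculations, assuming the variance of the estimated effect is (2σ²/(mg))[1 + (m − 1)ICC] and using normal quantiles in place of t critical values. The function names and the planning values in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

def _z_sum(alpha, beta):
    """Sum of the two critical values (normal approximation to t)."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)

def groups_per_condition(delta, sigma2, icc, m, alpha=0.05, beta=0.20):
    """Groups per condition, given m households per group."""
    deff = 1 + (m - 1) * icc                      # variance inflation from clustering
    g = 2 * sigma2 * deff * _z_sum(alpha, beta) ** 2 / (m * delta ** 2)
    return math.ceil(g)

def households_per_group(delta, sigma2, icc, g, alpha=0.05, beta=0.20):
    """Households per group, given g groups per condition.

    Returns None when no m is large enough: the between-group variance
    alone already exceeds what g groups can support.
    """
    k = 2 * sigma2 * _z_sum(alpha, beta) ** 2
    denom = g * delta ** 2 - k * icc
    if denom <= 0:
        return None
    return math.ceil(k * (1 - icc) / denom)

# Hypothetical planning values: prevalence 0.30, effect -0.10, ICC 0.02.
g = groups_per_condition(delta=-0.10, sigma2=0.21, icc=0.02, m=20)
print(g, households_per_group(delta=-0.10, sigma2=0.21, icc=0.02, g=g))
```

With exact t critical values the degrees of freedom depend on the number of groups, so in practice one would recompute the critical values at the resulting design and repeat until the answer stabilizes.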
Sample Size • When each household has more than one observation, we need to perform the following correction: • where a is the number of observations per household and r is the intra-household correlation. Consider the extreme cases r = 1 and r = 0.
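The correction formula itself is not preserved. One standard candidate, assumed here, is the variance of a mean of a equally correlated observations, which reproduces the two extreme cases mentioned above. The function name is hypothetical.

```python
def household_mean_variance(sigma2, a, r):
    """Variance of a household mean built from `a` equally correlated observations.

    sigma2 is the variance of a single observation and r the
    intra-household correlation (assumed common to all pairs).
    """
    return sigma2 * (1 + (a - 1) * r) / a

# r = 1: extra observations in a household add no information.
print(household_mean_variance(0.21, a=4, r=1.0))
# r = 0: the household behaves like `a` independent observations.
print(household_mean_variance(0.21, a=4, r=0.0))
```

Intermediate values of r interpolate between these two cases, so additional observations per household help less the more alike household members are.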
Pretest - Posttest: Repeat Observations on Groups • Data are collected in each condition before and after the intervention has been delivered in the intervention condition. • There are repeated observations on the same groups
Pretest - Posttest: Repeat Observations on Groups • The model: • The observed value for the i-th member nested within the k-th group and l-th condition and observed at the j-th time is expressed as a function of the grand mean, the effect of the l-th condition, the effect of the j-th time, the joint effect of condition and time, the realized value of the k-th group, and the joint effect of group and time. • Differences between this predicted value and the observed value are allocated to the residual error.
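The model equation is missing from the transcript. Translating the verbal description term by term, in assumed notation (C for condition, T for time, G for group, with colons denoting nesting), gives:

```latex
\[
Y_{i:jk:l} = \mu + C_l + T_j + TC_{jl} + G_{k:l} + TG_{jk:l} + e_{i:jk:l}
\]
```

Here TC_{jl} is the condition-by-time effect, which carries the intervention effect, and TG_{jk:l} is the group-by-time effect.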
Pretest - Posttest: Repeat Observations on Groups • Treatment effect: difference-in-differences • There are two sources of variation among the groups: • Variation due to the group effect • Variation due to the group × time interaction • The first difference eliminates the first source of variation. • The group mean is: • This model can easily be transformed into the basic GRT model.
Pretest - Posttest: Repeat Observations on Groups • The variance of the group mean is: • Following the same steps as before… • The variance of the intervention effect can be written as: • Sample size can be solved as before
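A sketch of the derivation, under assumptions not confirmed by the slides: different members are sampled from each group at each time (so residuals are independent across time), there are m members per group-time cell and g groups per condition, and σ_tg² and σ_e² denote the group-by-time and residual variances.

```latex
\[
\bar{Y}_{jk:l} = \mu + C_l + T_j + TC_{jl} + G_{k:l} + TG_{jk:l} + \bar{e}_{jk:l}
\]
\[
\operatorname{Var}\bigl(\bar{Y}_{2k:l} - \bar{Y}_{1k:l}\bigr)
  = 2\left(\sigma_{tg}^2 + \frac{\sigma_e^2}{m}\right)
\]
\[
\operatorname{Var}(\hat{\Delta})
  = \frac{4}{g}\left(\sigma_{tg}^2 + \frac{\sigma_e^2}{m}\right)
  = \frac{4\,\bigl(\sigma_e^2 + m\,\sigma_{tg}^2\bigr)}{m\,g}
\]
```

The group effect G_{k:l} cancels in the within-group difference, and the final expression has the same structure as the basic GRT variance with σ_tg² playing the role of the between-group component.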
Pretest - Posttest: Repeat Observations on Members • Data are collected in each condition before and after the intervention has been delivered in the intervention condition. • There are repeated observations on the same members
Pretest - Posttest: Repeat Observations on Members • The model: • The observed value for the i-th member nested within the k-th group and l-th condition and observed at the j-th time is expressed as a function of the grand mean, the effect of the l-th condition, the effect of the j-th time, the joint effect of condition and time, the realized value of the k-th group, the realized value of the i-th member, the joint effect of group and time, and the joint effect of member and time. • Differences between this predicted value and the observed value are allocated to the residual error.
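The model equation is missing from the transcript. Translating the verbal description term by term, in assumed notation (M for member and MT for the member-by-time effect, with colons denoting nesting), gives:

```latex
\[
Y_{ij:k:l} = \mu + C_l + T_j + TC_{jl} + G_{k:l} + M_{i:k:l}
           + TG_{jk:l} + MT_{ij:k:l} + e_{ij:k:l}
\]
```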
Pretest - Posttest: Repeat Observations on Members • Treatment effect: difference-in-differences • There are three sources of variation among the members: • Variation due to the member effect • Variation due to the member × time interaction • Error term • The first difference eliminates the first source of variation.
Pretest - Posttest: Repeat Observations on Members • Taking differences by members: • This model can easily be transformed into the basic GRT model. • The variance of the intervention effect can be written as: • Sample size can be solved as before.
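A sketch of the differencing step, in assumed notation: TC, TG, and MT are the condition-by-time, group-by-time, and member-by-time effects, with variances σ_tg² and σ_mt² for the latter two and σ_e² for the residual, m members per group, and g groups per condition.

```latex
\[
D_{i:k:l} = Y_{i2:k:l} - Y_{i1:k:l}
  = (T_2 - T_1) + (TC_{2l} - TC_{1l}) + (TG_{2k:l} - TG_{1k:l})
    + (MT_{i2:k:l} - MT_{i1:k:l}) + (e_{i2:k:l} - e_{i1:k:l})
\]
\[
\operatorname{Var}(\hat{\Delta})
  = \frac{4}{g}\left(\sigma_{tg}^2 + \frac{\sigma_{mt}^2 + \sigma_e^2}{m}\right)
\]
```

Both the group and the member effects cancel in the difference. With a single observation per member at each time, σ_mt² and σ_e² cannot be separated, so in practice they are estimated as one residual component.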