To Add • Make it clearer that excluding variables from a model because they are not “predictive” removes all meaning from a CI, since a CI is defined over infinite repetitions of the study • Talk about stats being based on sampling variability: this assumes the sample is a random sample of some super-population (even if narrowly defined), but subjects are not a random sample, they self-select, so we couldn’t have infinite samples
Random Error I: p-values, confidence intervals, hypothesis testing, etc. Matthew Fox Advanced Epidemiology
What is a relative risk? What is a p-value?
Which result is more precise? • RR 2.0 (95% CI: 1.0 – 4.0) • RR 5.0 (95% CI: 2.5 – 10.0)
RR 2.0 (95% CI: 1.0 – 4.0) • What are the chances the true result is between 1.0 and 4.0?
In a randomized trial, could the finding be by chance? If yes, what does it mean to be “by chance?” What is it that is caused by chance?
This Morning • Randomization • Why do we do it? • P-values • What are they? • How do we calculate them? • What do they mean? • Confidence Intervals
Last Session • Selection bias • Results from selection into or out of the study related to both exposure and outcome • Structural: conditioning on common effects • Adjustment for selection proportions • Weighting for LTFU • Matching • In a case-control study, creates selection bias by design, must be controlled in analysis
“There’s a certain feeling of ease and pleasure for me as a scientist that any way you slice the data, it’s statistically significant,” said Dr. Anthony S. Fauci, a top AIDS expert in the United States government, which paid most of the trial’s costs.
Randomization • Randomization lends meaning to likelihoods, p-values and confidence intervals • It can reduce the probability of severe confounding to an acceptable level • But randomization does not prevent confounding
Greenland: Randomization, statistics and causal inference • Objective: • Clarify the meaning and limitations of inferential statistics in the absence of randomization • Example — lidocaine therapy after acute MI • Patient 1: doomed • Patient 2: immune • lidocaine therapy assigned at random • two results are equally likely
Greenland: Randomization, statistics and causal inference • True RD = 0, so both possible results are confounded • Expectation = 0 = (1 + -1)/2 • Statistically unbiased (expectation equals truth) • Conclusions • Randomization does not prevent confounding • Randomization does provide a known probability distribution for the possible results under a specified hypothesis about the effect • Statistical unbiasedness of randomized exposure corresponds to an average confounding of zero over the distribution of results
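The two-patient lidocaine example can be checked by enumerating both possible randomizations. This is a minimal sketch, assuming the risk difference is computed as treated minus untreated; the dictionary keys and the function name are illustrative, not from the original slides.

```python
from fractions import Fraction

# Potential outcomes (1 = dies, 0 = survives). They are the same with or
# without lidocaine, so the true risk difference is 0.
outcomes = {"patient_1_doomed": 1, "patient_2_immune": 0}

def risk_difference(treated):
    """Observed RD (treated minus untreated) when `treated` gets lidocaine."""
    (untreated,) = [p for p in outcomes if p != treated]
    return outcomes[treated] - outcomes[untreated]

# Randomization: each patient is equally likely to be the treated one.
results = {treated: risk_difference(treated) for treated in outcomes}
expectation = sum(Fraction(rd, len(results)) for rd in results.values())

print(results)      # {'patient_1_doomed': 1, 'patient_2_immune': -1}
print(expectation)  # 0
```

Each single result (+1 or -1) is confounded, yet the average over the randomization distribution equals the true value of 0.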
Probability Theory • With an assigned probability distribution, can calculate expectation • The expectation does not have to be in the set of possible outcomes • Here, the expectation equals zero
Probability • If we randomize and assume null is true (as we do when calculating p-values) • We expect half of the subjects to be exposed and half the events to be among the exposed • If truly no effect of exposure, all data combinations, permutations are possible • Everyone was either type 1 or 4 • All the events (deaths) would occur regardless of whether assigned the exposure or not
Probability Theory • The probability of each possible data result in a 2x2 table is: • A function of the number of combinations (permutations implies order matters) • Probability of each event is number of ways to assign X subjects to exposure out of Y and A events out of a total of B total events • Assumes the margins are fixed
Fixed margins, how many parameters (cells) do I need to estimate to fill in the entire table?
Greenland: Randomization, statistics and causal inference • What comfort does this provide scientists trying to interpret a single result? • Can make probability of severe confounding small by increasing the sample size
Greenland: Randomization, statistics and causal inference • Given there were 100 cases and an even distribution of exposed and unexposed, how many cases would we expect to be exposed?
Greenland: Randomization, statistics and causal inference • What comfort does this provide scientists trying to interpret a single result? • Can make probability of severe confounding small by increasing the sample size Probability under the null that randomization would yield a result with at least as much downward confounding as the observed result
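To illustrate how sample size shrinks the probability of severe chance imbalance, the sketch below computes, under the null with fixed margins, the probability that the exposed share of cases is at least 20 percentage points away from one half. The setup (half the subjects exposed, half of them cases, the 20-point threshold, and the function name) is a hypothetical choice for illustration, not taken from Greenland's paper.

```python
from fractions import Fraction
from math import comb

def prob_severe_imbalance(total):
    """P(|exposed share of cases - 1/2| >= 1/5) under the null.

    Hypothetical setup: `total` subjects, half randomized to exposure,
    half of them cases regardless of exposure (margins fixed).
    """
    exposed = cases = total // 2
    denom = comb(total, exposed)
    prob = Fraction(0)
    for x in range(cases + 1):  # x = number of exposed cases
        if abs(Fraction(x, cases) - Fraction(1, 2)) >= Fraction(1, 5):
            ways = comb(cases, x) * comb(total - cases, exposed - x)
            prob += Fraction(ways, denom)
    return float(prob)

probs = {total: prob_severe_imbalance(total) for total in (20, 100, 500)}
for total, p in probs.items():
    print(total, p)
```

With 20 subjects the probability is roughly 0.18; it falls sharply as the trial grows, which is the comfort randomization offers for interpreting a single result.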
Back to the counterfactual • If association we measure differs from the truth, even if by chance, what explains it? • Unexposed can’t stand in for what would have happened to exposed had they been unexposed • This is confounding • But on average, zero confounding • This gives us a probability distribution to calculate the probability of confounding explaining the results • This is a p-value
Randomized trial of E on D in 4 patients • We find:
Randomized trial of E on D in 4 patients • If the null is true, what CST types must they be?
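A randomization distribution for a four-patient trial can be built by listing every possible allocation. The outcomes below are hypothetical (two doomed patients, two immune patients, two of the four randomized to E); they are assumed for illustration and are not the data from the slide.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical fixed outcomes under the null: patients 0 and 1 are doomed
# (die regardless of E), patients 2 and 3 are immune (survive regardless).
outcome = [1, 1, 0, 0]
patients = range(len(outcome))

distribution = Counter()
allocations = list(combinations(patients, 2))  # 2 of 4 randomized to E
for exposed in allocations:
    unexposed = [p for p in patients if p not in exposed]
    risk_exposed = Fraction(sum(outcome[p] for p in exposed), 2)
    risk_unexposed = Fraction(sum(outcome[p] for p in unexposed), 2)
    rd = risk_exposed - risk_unexposed
    distribution[rd] += Fraction(1, len(allocations))

# Average confounding over all equally likely allocations.
expectation = sum(rd * prob for rd, prob in distribution.items())

for rd, prob in sorted(distribution.items()):
    print(rd, prob)
print("expectation:", expectation)
```

Of the six equally likely allocations, one gives RD = 1, one gives RD = -1, and four give RD = 0, so the expectation is 0 even though two of the six results are confounded.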
Hypergeometric distribution • The hypergeometric distribution: $\Pr(X = x) = \dfrac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$, with $\binom{M}{x} = \dfrac{M!}{x!\,(M-x)!}$ • Where X = random variable, x = exposed cases, n = exposed population, M = total cases, and N = total population.
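The hypergeometric probability can be computed directly with the standard library. The sketch below uses the slide's symbols (x, n, M, N); the function name and the 2x2 margins in the example are hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, n, M, N):
    """Pr(X = x): x exposed cases, n exposed, M total cases, N total."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

# Hypothetical margins: N = 4 patients, n = 2 exposed, M = 2 cases.
pmf = {x: hypergeom_pmf(x, n=2, M=2, N=4) for x in range(3)}
for x, prob in pmf.items():
    print(x, prob)  # 0 1/6, then 1 2/3, then 2 1/6
```

With the margins fixed, the single count x determines the whole table, and the probabilities over its possible values sum to 1.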
Greenland: Randomization, statistics and causal inference • When treatment is assigned by the physician, Expectation depends on physician behavior • Expectation does not necessarily equal truth • For observational data we DON’T have probability distribution for confounding • When E isn’t randomized, statistics don’t provide valid probability statements about exposure effects because • p-values, CIs, & likelihoods calculated with assumption all data interchanges are equally likely
Greenland: Randomization, statistics and causal inference • Alternatives • Limit statistics to data description (e.g., visual summaries, tables of risks or rates, etc.) • Influence analysis: explore degree to which effect estimates would change under small perturbations of the data, such as interchanging a few subjects • Employ more elaborate statistical models • Sensitivity analysis • At the very least, interpret conventional statistics as minimum estimates of the error
(1) The p-value is: • Probability under the test hypothesis (usually the null) that a test statistic would be greater than or equal to its observed value, assuming no bias in data collection or analysis • Why the null? Our job is to measure • 1-sided upper p-value is test stat ≥ observed value • 1-sided lower p-value is test stat ≤ observed value • Mid-p assigns only half the probability of the observation to the 1-sided upper p-value • 2-sided p-value is twice the smaller of the two 1-sided p-values
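The four p-value definitions above can be sketched for a 2x2 table with fixed margins, using the hypergeometric distribution as the null distribution of the number of exposed cases. The table (20 subjects, 10 exposed, 10 cases, 8 exposed cases) and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def pmf(x, n, M, N):
    """Hypergeometric Pr(X = x): n exposed, M cases, N total."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

def p_values(x_obs, n, M, N):
    """One-sided, mid-p, and two-sided p-values for x_obs exposed cases."""
    support = range(max(0, n - (N - M)), min(n, M) + 1)
    upper = sum(pmf(x, n, M, N) for x in support if x >= x_obs)
    lower = sum(pmf(x, n, M, N) for x in support if x <= x_obs)
    half_observed = pmf(x_obs, n, M, N) / 2
    return {
        "upper": upper,                        # P(X >= observed)
        "lower": lower,                        # P(X <= observed)
        "mid_p_upper": upper - half_observed,  # only half of P(X = observed)
        "two_sided": 2 * min(upper, lower),    # twice the smaller one-sided
    }

# Hypothetical table: N = 20, n = 10 exposed, M = 10 cases, 8 exposed cases.
result = p_values(x_obs=8, n=10, M=10, N=20)
for name, value in result.items():
    print(name, float(value))
```

Note that the upper and lower one-sided p-values both include the probability of the observed table, so they sum to more than 1.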
(2) The p-value is not: • Probability that a test hypothesis (null hypothesis) is true • Calculated assuming that test hypothesis is true. • Cannot calculate probability of an event that is assumed in the calculation • Probability of observing the result under the test hypothesis (null) [likelihood] • Also includes probability of results more extreme
(3) The p-value is not: • An α-level (the Type 1 error rate) • More on that later • A significance level • Used to refer to both p-values and Type 1 error rates • Should be avoided to prevent confusion
(4) The 2-sided p-value is not: • Because the 2-sided p-value is twice the smaller of the lower and upper 1-sided p-values (which may not be equal), and so may exceed 1, it is not the: • Probability that the data would show as strong an association as observed or stronger if the null hypothesis were true; • Probability that a point estimate would be as far or further from the test value as observed
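A small worked case shows how doubling the smaller one-sided p-value can give a number greater than 1, which no probability can be. The table here (4 subjects, 2 exposed, 2 cases, 1 exposed case) is hypothetical.

```python
from fractions import Fraction
from math import comb

def pmf(x, n, M, N):
    """Hypergeometric Pr(X = x): n exposed, M cases, N total."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

# Hypothetical table: N = 4, n = 2 exposed, M = 2 cases, 1 exposed case.
n, M, N, x_obs = 2, 2, 4, 1
upper = sum(pmf(x, n, M, N) for x in range(0, 3) if x >= x_obs)
lower = sum(pmf(x, n, M, N) for x in range(0, 3) if x <= x_obs)
two_sided = 2 * min(upper, lower)

print(upper, lower, two_sided)  # 5/6 5/6 5/3
```

Because both one-sided p-values include the probability of the observed table, each can exceed one half, and doubling the smaller one then exceeds 1.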
Significance testing: • Compares p-value to an arbitrary or conventional Type 1 error rate • α = 0.05 • Emphasizes decision making, not measurement • Derives from agricultural and industrial applications of statistics • Reflects the roots of epidemiology as the union of statistics and medicine
Response • They acknowledge the definitions were incorrect, however: • “We were not convinced that working journalists would find these definitions user-friendly, so we sacrificed precision for utility. We will add references to standard textbooks for journalists who want to learn more.”
Alternatives to p-values • Two studies: which is more precise? • RR 10.0, p = 0.039 • RR 1.3, p = 0.062 • The p-value conflates the size of the effect and its precision • RR 10.0, p = 0.039, 95% CI: 1.5-66.7 • RR 1.3, p = 0.062, 95% CI: 0.99-1.7
Frequentist intervals (1) • Definition: • If the statistical model is correct and there is no bias, a confidence interval derived from a valid test will, over unlimited repetitions of the study, contain the true parameter with a frequency no less than its confidence level (e.g., 95%). • But the statistical model is only correct under randomization • CAN’T say that the probability the interval includes the truth equals the interval’s coverage probability (e.g., 95%).
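The "unlimited repetitions" in the definition can be approximated by simulation. The sketch below is a simplified stand-in, not an epidemiologic analysis: it assumes a normal outcome with known standard deviation, a correct model, and no bias, and every constant (true mean, sample size, number of repetitions, seed) is an arbitrary choice for illustration.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is reproducible
TRUE_MEAN, SIGMA = 1.0, 2.0          # assumed "truth" for the simulation
N_PER_STUDY, REPETITIONS = 50, 4000  # study size and number of repeated studies

z = NormalDist().inv_cdf(0.975)      # about 1.96 for a 95% interval
half_width = z * SIGMA / N_PER_STUDY ** 0.5

covered = 0
for _ in range(REPETITIONS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N_PER_STUDY)]
    estimate = sum(sample) / len(sample)
    # Does this study's interval contain the true parameter?
    if estimate - half_width <= TRUE_MEAN <= estimate + half_width:
        covered += 1

coverage = covered / REPETITIONS
print(f"share of intervals containing the truth: {coverage:.3f}")
```

The share should land close to 0.95, but that is a statement about the procedure across repetitions. Any single interval either contains the truth or does not.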
Frequentist intervals (2) • Advantages • Provides more information than significance tests or p-values: direction, magnitude, and variability • Economical compared with p-value function • Disadvantages • Less information than the p-value function • Underlying assumptions (valid statistical model, no bias, repeated experiments)
How do we measure precision? • Width of the confidence interval • Measured how? • If I tell you the 95% CI for an RR is 2 to 8, can you tell me the point estimate? • Sqrt(U*L) • Difference measures, just subtract • Remember relative measures are on the log scale, so width of a CI is measured by the RATIO of the upper to the lower CI
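The two rules for ratio measures can be applied to the intervals quoted in these slides. This is a minimal sketch, assuming each interval is symmetric around the point estimate on the log scale (as a Wald-type interval is); the function names are illustrative.

```python
from math import sqrt

def point_estimate(lower, upper):
    """Geometric mean of the limits: the midpoint on the log scale."""
    return sqrt(lower * upper)

def ci_ratio(lower, upper):
    """Upper-to-lower ratio: the width of the interval on the log scale."""
    return upper / lower

# 95% CI of 2 to 8 for an RR: recover the point estimate.
print(point_estimate(2, 8))  # 4.0

# RR 2.0 (1.0 - 4.0) versus RR 5.0 (2.5 - 10.0): same ratio, same precision.
print(ci_ratio(1.0, 4.0), ci_ratio(2.5, 10.0))  # 4.0 4.0

# RR 10.0 (1.5 - 66.7) versus RR 1.3 (0.99 - 1.7).
ratio_large_rr = ci_ratio(1.5, 66.7)
ratio_small_rr = ci_ratio(0.99, 1.7)
print(round(ratio_large_rr, 1), round(ratio_small_rr, 1))
```

The RR of 1.3 has the far narrower interval on the ratio scale and is the more precise estimate, even though its p-value is the larger of the two.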
Conclusion about confidence intervals • A CI used for hypothesis testing is an abuse of the CI • The goal of epi is precision, not significance • A precise null estimate is just as important as a precise significant estimate • An imprecise, statistically significant estimate is as useless as a non-statistically significant, imprecise estimate