Analysis of matched data; plus, diagnostic testing


Correlated Observations • Correlated data arise when pairs or clusters of observations are related and thus are more similar to each other than to other observations in the dataset. • Ignoring correlations will: • overestimate p-values for within-person or within-cluster comparisons • underestimate p-values for between-person or between-cluster comparisons

Pair Matching: Why match? • Pairing can control for extraneous sources of variability and increase the power of a statistical test. • Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.

Example • Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:
                 Tonsillectomy   None
Hodgkin’s             41          44
Sib control           33          52
OR=1.47; chi-square=1.53 (NS). From John A. Rice, “Mathematical Statistics and Data Analysis.”

Example • But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze the data as 85 pairs:
                           Control (sib)
                           Tonsillectomy   None
Case    Tonsillectomy           26          15
        None                     7          37
OR=15/7=2.14; chi-square=2.91 (p=.09). From John A. Rice, “Mathematical Statistics and Data Analysis.”
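The two analyses of the Hodgkin’s data can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, the unpaired statistic is assumed to be the uncorrected Pearson chi-square, and the paired statistic is McNemar’s chi-square computed from the two discordant cells.

```python
def unpaired_analysis(a, b, c, d):
    """Odds ratio and uncorrected Pearson chi-square for the 2x2 table
    [[a, b], [c, d]], treating the two rows as independent samples."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi_square


def paired_analysis(case_only, control_only):
    """Odds ratio and McNemar chi-square from the two discordant-pair counts."""
    odds_ratio = case_only / control_only
    chi_square = (case_only - control_only) ** 2 / (case_only + control_only)
    return odds_ratio, chi_square


# Rows: Hodgkin's patients, sibling controls; columns: tonsillectomy, none.
or_unpaired, chi_unpaired = unpaired_analysis(41, 44, 33, 52)
# 15 pairs where only the patient had a tonsillectomy, 7 where only the sibling did.
or_paired, chi_paired = paired_analysis(15, 7)

print(f"ignoring pairing:   OR={or_unpaired:.2f}, chi-square={chi_unpaired:.2f}")
print(f"respecting pairing: OR={or_paired:.2f}, chi-square={chi_paired:.2f}")
```

The concordant pairs (26 and 37) never enter the paired calculation, which is why the two analyses disagree.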

Pair Matching: example Match each MI case to an MI control based on age and gender. Ask about history of diabetes to find out if diabetes increases your risk for MI.

Pair Matching: example. Which cells are informative? Each of the 144 matched pairs falls into one cell:
                          MI controls
                          Diabetes   No diabetes   Total
MI cases   Diabetes           9          37          46
           No diabetes       16          82          98
           Total             25         119         144
Just the discordant cells (37 and 16) are informative!

Pair Matching: the OR estimate comes only from discordant pairs! The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control? If more discordant pairs “favor” the case, this indicates OR>1.

P(“favors” case | discordant pair) = $\frac{37}{37+16} = \frac{37}{53} = .70$

odds(“favors” case | discordant pair) = $\frac{37/53}{16/53} = \frac{37}{16}$

OR estimate comes only from discordant pairs! OR = 37/16 = 2.31. Makes sense!

McNemar’s Test. Null hypothesis: P(“favors” case | discordant pair) = .5 (note: equivalent to OR=1.0 or cell b = cell c). Under the null, the number of discordant pairs that favor the case is binomial with n = 37+16 = 53 and p = .5. By normal approximation to binomial:
$$Z = \frac{37 - 53(.5)}{\sqrt{53(.5)(.5)}} = \frac{10.5}{3.64} = 2.88; \quad p < .01$$

McNemar’s Test: generally
                     controls
                     exp   No exp
cases   exp           a      b
        No exp        c      d
By normal approximation to binomial:
$$Z = \frac{b - (b+c)(.5)}{\sqrt{(b+c)(.5)(.5)}} = \frac{b-c}{\sqrt{b+c}}$$
Equivalently:
$$\chi^2_1 = \frac{(b-c)^2}{b+c}$$

McNemar’s Test for the diabetes and MI data (b = 37, c = 16):
$$\chi^2_1 = \frac{(37-16)^2}{37+16} = \frac{441}{53} = 8.32; \quad p < .01$$
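The normal-approximation form of McNemar’s test can be sketched in Python using only the standard library. The function name `mcnemar` is illustrative, and no continuity correction is applied, which is an assumption consistent with the statistic (b-c)²/(b+c).

```python
import math


def mcnemar(b, c):
    """McNemar's test from the discordant counts.

    b = pairs where only the case is exposed,
    c = pairs where only the control is exposed.
    Returns (odds ratio, Z, chi-square, two-sided p-value).
    """
    n = b + c                                    # number of discordant pairs
    z = (b - n * 0.5) / math.sqrt(n * 0.5 * 0.5)  # binomial(n, .5) standardized
    chi_square = (b - c) ** 2 / n                # algebraically equal to z**2
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal tail
    return b / c, z, chi_square, p_value


odds_ratio, z, chi_square, p_value = mcnemar(37, 16)
print(f"OR={odds_ratio:.2f}, Z={z:.2f}, chi-square={chi_square:.2f}, p={p_value:.4f}")
```

Because Z² equals the chi-square statistic exactly, either form gives the same p-value.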

Example: McNemar’s EXACT test • Split-face trial: • Researchers assigned 56 subjects to apply SPF 85 sunscreen to one side of their faces and SPF 50 to the other prior to engaging in 5 hours of outdoor sports during mid-day. The outcome is sunburn (yes/no). • Unit of observation = side of a face • Are the observations correlated? Yes. Russak JE et al. JAAD 2010; 62: 348-349.

Results ignoring correlation: Table I -- Dermatologist grading of sunburn after an average of 5 hours of skiing/snowboarding (P = .03; Fisher’s exact test) Fisher’s exact test compares the following proportions: 1/56 versus 8/56. Note that individuals are being counted twice!

Correct analysis of data: Table 1. Correct presentation of the data (P = .016; McNemar’s exact test). McNemar’s exact test: Null hypothesis: X~binomial (n=7, p=.5)
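The exact test can be reproduced from the binomial null distribution. This is a sketch under a stated assumption: the pair table is not shown here, so the split of the 7 discordant pairs (7 with sunburn only on the SPF 50 side, 0 with sunburn only on the SPF 85 side) is inferred from the reported proportions 8/56 and 1/56, not quoted from the paper. The function name is illustrative.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    Under the null, the count in one discordant cell is binomial(n=b+c, p=.5).
    The two-sided p-value doubles the smaller tail and is capped at 1.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n  # P(X <= k)
    return min(1.0, 2 * tail)


# Assumed split of the 7 discordant pairs: 7 vs. 0 (inferred, see lead-in).
p_exact = mcnemar_exact(7, 0)
print(f"exact p = {p_exact:.3f}")
```

With all 7 discordant pairs in one direction the p-value is 2 × (.5)⁷, which rounds to the reported .016.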

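RECALL: 95% confidence interval for a difference in INDEPENDENT proportions. The standard error of the difference of two proportions can be estimated by:
$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
95% confidence interval for the difference between two proportions:
$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \times SE(\hat{p}_1 - \hat{p}_2)$$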

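95% CI for difference in dependent proportions. Variance of the difference of two random variables is the sum of their variances minus 2*covariance:
$$Var(\hat{p}_1 - \hat{p}_2) = Var(\hat{p}_1) + Var(\hat{p}_2) - 2\,Cov(\hat{p}_1, \hat{p}_2)$$
For n matched pairs with cell proportions $\hat{p}_{11}$ (both exposed), $\hat{p}_{12}$ and $\hat{p}_{21}$ (discordant), and $\hat{p}_{22}$ (neither exposed), $Cov(\hat{p}_1, \hat{p}_2) = (\hat{p}_{11}\hat{p}_{22} - \hat{p}_{12}\hat{p}_{21})/n$, so the interval is:
$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) - 2(\hat{p}_{11}\hat{p}_{22} - \hat{p}_{12}\hat{p}_{21})}{n}}$$

95% CI for difference in dependent proportions: diabetes and MI example
                          MI controls
                          Diabetes   No diabetes   Total
MI cases   Diabetes           9          37          46
           No diabetes       16          82          98
           Total             25         119         144
Proportion with diabetes: cases $\hat{p}_1 = 46/144 = .319$; controls $\hat{p}_2 = 25/144 = .174$; difference = 21/144 = .146.
$$.146 \pm 1.96 \sqrt{\frac{(.319)(.681) + (.174)(.826) - 2[(.0625)(.569) - (.257)(.111)]}{144}} = .146 \pm 1.96(.049) = (.05, .24)$$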
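The interval for a difference in dependent proportions can be computed directly from the four pair counts. The sketch below assumes the usual large-sample (Wald) interval built from the variance rule quoted above; the function name and argument order are illustrative.

```python
import math


def paired_difference_ci(a, b, c, d, z=1.96):
    """Wald CI for p1 - p2 from a pair table.

    a = both exposed, b = case only, c = control only, d = neither.
    p1 = proportion of cases exposed, p2 = proportion of controls exposed.
    """
    n = a + b + c + d
    p11, p12, p21, p22 = a / n, b / n, c / n, d / n
    p1, p2 = p11 + p12, p11 + p21
    # Var(p1 - p2) = Var(p1) + Var(p2) - 2 Cov(p1, p2)
    covariance = (p11 * p22 - p12 * p21) / n
    variance = p1 * (1 - p1) / n + p2 * (1 - p2) / n - 2 * covariance
    se = math.sqrt(variance)
    diff = p1 - p2
    return diff, se, (diff - z * se, diff + z * se)


diff, se, (lower, upper) = paired_difference_ci(9, 37, 16, 82)
print(f"difference={diff:.3f}, SE={se:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

The same standard error can be written using only the discordant counts, as sqrt(b + c - (b-c)²/n)/n, which is a useful cross-check.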

The connection between McNemar and Cochran-Mantel-Haenszel Tests

View each pair as its own “age-gender” stratum. Example: a pair concordant for exposure (cell “a” from before):
               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0

The 144 pairs give four kinds of strata:
                      Case (MI)   Control
x 9    Diabetes           1          1
       No diabetes        0          0
x 37   Diabetes           1          0
       No diabetes        0          1
x 16   Diabetes           0          1
       No diabetes        1          0
x 82   Diabetes           0          0
       No diabetes        1          1

Mantel-Haenszel for pair-matched data We want to know the relationship between diabetes and MI controlling for age and gender (the matching variables). Mantel-Haenszel methods apply.

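RECALL: The Mantel-Haenszel Summary Odds Ratio
               Case   Control
Exposed          a       b
Not Exposed      c       d
$$OR_{MH} = \frac{\sum_k a_k d_k / T_k}{\sum_k b_k c_k / T_k}$$
where $T_k = a_k + b_k + c_k + d_k$ is the total in stratum k.

For pair-matched data, each stratum is one pair (T = 2):
                      Case (MI)   Control    ad/T    bc/T
x 9    Diabetes           1          1        0       0
       No diabetes        0          0
x 37   Diabetes           1          0       1/2      0
       No diabetes        0          1
x 16   Diabetes           0          1        0      1/2
       No diabetes        1          0
x 82   Diabetes           0          0        0       0
       No diabetes        1          1

Mantel-Haenszel Summary OR
$$OR_{MH} = \frac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \frac{37}{16} = 2.31$$

Mantel-Haenszel Test Statistic (same as McNemar’s). Each discordant pair has observed-expected = +.5 or -.5 and variance = .25; concordant pairs contribute 0 to both sums:
$$\chi^2_{MH} = \frac{\left[\sum_k (a_k - E(a_k))\right]^2}{\sum_k Var(a_k)} = \frac{[37(.5) - 16(.5)]^2}{53(.25)} = \frac{(10.5)^2}{13.25} = 8.32$$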
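To see that the Mantel-Haenszel formulas reduce to the discordant-pair results, the four kinds of pair strata can be fed through the general stratified formulas. This is a sketch: the function name and the `(a, b, c, d, count)` tuple layout are illustrative, with a = exposed case, b = exposed control, c = unexposed case, d = unexposed control, and no continuity correction.

```python
def mantel_haenszel(strata):
    """Mantel-Haenszel summary OR and chi-square.

    strata: iterable of (a, b, c, d, count), where count is the number of
    identical strata with that 2x2 layout.
    """
    or_num = or_den = 0.0
    obs_minus_exp = variance = 0.0
    for a, b, c, d, count in strata:
        t = a + b + c + d
        or_num += count * a * d / t
        or_den += count * b * c / t
        expected = (a + b) * (a + c) / t  # row total * column total / T
        var = (a + b) * (c + d) * (a + c) * (b + d) / (t ** 2 * (t - 1))
        obs_minus_exp += count * (a - expected)
        variance += count * var
    return or_num / or_den, obs_minus_exp ** 2 / variance


# Diabetes and MI pairs: 9 both exposed, 37 case only, 16 control only, 82 neither.
pair_strata = [
    (1, 1, 0, 0, 9),
    (1, 0, 0, 1, 37),
    (0, 1, 1, 0, 16),
    (0, 0, 1, 1, 82),
]
or_mh, chi_mh = mantel_haenszel(pair_strata)
print(f"OR_MH={or_mh:.2f}, chi-square_MH={chi_mh:.2f}")
```

The result matches 37/16 and (37-16)²/53, the values obtained from the discordant cells alone.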

Concordant cells contribute nothing to Mantel-Haenszel statistic (observed=expected):
               Case (MI)   Control        Case (MI)   Control
Diabetes           1          1               0          0
No diabetes        0          0               1          1

Discordant cells:
               Case (MI)   Control        Case (MI)   Control
Diabetes           1          0               0          1
No diabetes        0          1               1          0

Example: Salmonella Outbreak in France, 1993. From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study.” BMJ 312: 91-94; Jan 1996.

Epidemic Curve

Matched Case Control Study Case = Salmonella gastroenteritis. Community controls (1:1) matched for: • age group (< 1, 1-4, 5-14, 15-34, 35-44, 45-54, 55-64, or >= 65 years) • gender • city of residence

Results

In 2x2 table form: any goat’s cheese
                            Controls
                            Goat’s cheese   None   Total
Cases   Goat’s cheese            23          23      46
        None                      6           7      13
        Total                    29          30      59

In 2x2 table form: Brand A goat’s cheese
                            Controls
                            Brand A   None   Total
Cases   Brand A                8       24      32
        None                   2       25      27
        Total                 10       49      59

Each matched pair as its own stratum (Brand A goat’s cheese):
                  Case   Control
x 8    Brand A      1       1
       None         0       0
x 24   Brand A      1       0
       None         0       1
x 2    Brand A      0       1
       None         1       0
x 25   Brand A      0       0
       None         1       1

Using Agresti notation here! Summary: 8 concordant-exposed pairs (=strata) contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0). Summary: 25 concordant-unexposed pairs contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).

Summary: 2 discordant “control-exposed” pairs contribute -.5 each to the numerator (observed-expected= -.5) and .25 each to the denominator (variance= .25). Summary: 24 discordant “case-exposed” pairs contribute +.5 each to the numerator (observed-expected= +.5) and .25 each to the denominator (variance= .25).
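The per-pair contributions summarized above can be tallied directly. This is a sketch with an illustrative function name; the resulting odds ratio and chi-square are computed here from the pair counts, not quoted from the paper.

```python
def pair_matched_mh(case_only, control_only):
    """Mantel-Haenszel OR and chi-square for 1:1 matched pairs.

    Each case-exposed discordant pair adds +.5 to the numerator sum,
    each control-exposed discordant pair adds -.5, and every discordant
    pair adds .25 to the variance. Concordant pairs add nothing.
    """
    numerator = 0.5 * case_only - 0.5 * control_only
    variance = 0.25 * (case_only + control_only)
    odds_ratio = case_only / control_only
    chi_square = numerator ** 2 / variance
    return odds_ratio, chi_square


# Brand A goat's cheese: 24 case-exposed and 2 control-exposed discordant pairs.
or_brand_a, chi_brand_a = pair_matched_mh(24, 2)
print(f"OR={or_brand_a:.1f}, chi-square={chi_brand_a:.1f}")
```

The 8 concordant-exposed and 25 concordant-unexposed pairs do not appear anywhere in the calculation.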

Diagnostic Testing and Screening Tests

Characteristics of a diagnostic test Sensitivity= Probability that, if you truly have the disease, the diagnostic test will catch it. Specificity=Probability that, if you truly do not have the disease, the test will register negative.

Calculating sensitivity and specificity from a 2x2 table
                            Screening Test
                            +      -      Total
Truly have disease    +     a      b      a+b
                      -     c      d      c+d
Sensitivity = a/(a+b): Among those with true disease, how many test positive?
Specificity = d/(c+d): Among those without the disease, how many test negative?

Hypothetical Example
                                   Mammography
                                   +       -      Total
Breast cancer (on biopsy)    +     9       1        10
                             -   109     881       990
Sensitivity = 9/10 = .90 (1 false negative out of 10 cases)
Specificity = 881/990 = .89 (109 false positives out of 990)
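The two definitions translate into a short helper. This is a minimal sketch; the function and argument names are illustrative, and the counts are the hypothetical mammography numbers from this example.

```python
def sensitivity_specificity(true_pos, false_neg, false_pos, true_neg):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    sensitivity = true_pos / (true_pos + false_neg)  # among the truly diseased
    specificity = true_neg / (false_pos + true_neg)  # among the truly disease-free
    return sensitivity, specificity


# Hypothetical mammography example: 9 true positives, 1 false negative,
# 109 false positives, 881 true negatives.
sens, spec = sensitivity_specificity(9, 1, 109, 881)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```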

What factors determine the effectiveness of screening? • The prevalence (risk) of disease. • The effectiveness of screening in preventing illness or death. • Is the test any good at detecting disease/precursor (sensitivity of the test)? • Is the test detecting a clinically relevant condition? • Is there anything we can do if disease (or pre-disease) is detected (cures, treatments)? • Does detecting and treating disease at an earlier stage really result in a better outcome? • The risks of screening, such as false positives and radiation.

Positive predictive value • The probability that if you test positive for the disease, you actually have the disease. • Depends on the characteristics of the test (sensitivity, specificity) and the prevalence of disease.
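The dependence of positive predictive value on prevalence can be sketched with Bayes’ rule. The function name is illustrative; the inputs reuse the hypothetical mammography counts (9 true positives, 1 false negative, 109 false positives, 881 true negatives), and the 10% prevalence in the second call is an arbitrary value chosen only for contrast.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test) by Bayes' rule."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1 - specificity) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Hypothetical mammography example: prevalence 10/1000, sensitivity 9/10,
# specificity 881/990. This equals 9 / (9 + 109) from the table.
ppv_low = positive_predictive_value(9 / 10, 881 / 990, 10 / 1000)

# Same test at an assumed prevalence of 10% (illustrative value only).
ppv_high = positive_predictive_value(9 / 10, 881 / 990, 0.10)

print(f"PPV at 1% prevalence:  {ppv_low:.3f}")
print(f"PPV at 10% prevalence: {ppv_high:.3f}")
```

Even with 90% sensitivity and 89% specificity, fewer than 1 in 10 positive tests reflects true disease at 1% prevalence, because false positives from the large disease-free group outnumber the true positives.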
