Biostatistics course. Part 13: Effect measures in 2 x 2 tables. Dr. Sc. Nicolas Padilla Raygoza, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, University of Guanajuato, Campus Celaya-Salvatierra
Biosketch • Medical Doctor, Autonomous University of Guadalajara. • Pediatrician certified by the Mexican Council of Certification in Pediatrics. • Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London. • Master of Sciences with emphasis in Epidemiology, Atlantic International University. • Doctor of Sciences with emphasis in Epidemiology, Atlantic International University. • Associate Professor B, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, University of Guanajuato, Campus Celaya-Salvatierra, Mexico. • padillawarm@gmail.com
Competencies • The reader will obtain the Risk Ratio or Odds Ratio from a 2 x 2 table. • The reader will calculate the 95% confidence interval for the RR or OR. • The reader will identify potential confounders and/or interactions. • The reader will apply the Mantel-Haenszel method for the RR, the OR and the Chi-squared test.
Introduction • In part 12 of the course, we tested the association between two categorical variables. • Now, we review the methods used to measure the association. • We will work with binary variables, so we will use 2 x 2 tables.
Example • A nurse in a poor area of Mexico was informed that many local children attending the nursery were sick with respiratory infections. • She designed a cohort study to investigate the problem. • During the following years, 1000 children were followed. • The main research question was: • Is attending nursery associated with respiratory infection?
Risk Ratio (RR) • In health research, the term "risk" is used instead of proportion. • For example: the risk of infection among children attending day care was 33.9%. • Thus, the risk ratio is the ratio of two proportions. • The risk of respiratory infection for those attending the nursery is: 37 / (37 + 72) = 37/109 = 0.339 • The risk of respiratory infection in children not attending day care is: 43 / (43 + 848) = 43/891 = 0.048 • The risk ratio (RR) is the ratio of these two risks. • Risk ratio = 0.339 / 0.048 = 7.06 (7.03 if the two risks are not rounded first)
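Risk Ratio (RR) • In general, the risk ratio can be obtained with the following formula, where a, b, c and d are the frequencies in the 2 x 2 table (a = exposed with the outcome, b = exposed without the outcome, c = unexposed with the outcome, d = unexposed without the outcome): • $RR = \dfrac{a/(a+b)}{c/(c+d)}$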
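The general formula can be checked with a few lines of Python. This is a minimal sketch: the function name `risk_ratio` is an assumption for illustration, and the counts are those of the nursery example. Without intermediate rounding the result is about 7.03, slightly below the 7.06 obtained from the rounded risks.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2 x 2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Nursery example: 37 of 109 attending and 43 of 891 not attending were infected.
rr = risk_ratio(37, 72, 43, 848)
print(f"RR = {rr:.2f}")
```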
Odds Ratio (OR) • The Odds Ratio (OR) is the ratio of the odds of the outcome among the exposed to the odds of the outcome among the non-exposed. Odds are not a probability: they are the number with the outcome divided by the number without it. • The odds of infection among children attending the nursery are: 37 / 72 = 0.514 • The odds of infection among children not attending day care are: 43 / 848 = 0.051 • The Odds Ratio is the ratio of these two odds: OR = 0.514 / 0.051 = 10.08 (10.13 if the two odds are not rounded first) • In general, the Odds Ratio is found with the following formula: • $OR = \dfrac{a/b}{c/d} = \dfrac{ad}{bc}$
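As a quick check of the formula, the sketch below computes the OR both as a ratio of odds and as the cross-product ad/bc. The function name `odds_ratio` is an assumption for illustration; the counts are those of the nursery example.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table (same cell layout as for the risk ratio)."""
    odds_exposed = a / b      # odds of the outcome among the exposed
    odds_unexposed = c / d    # odds of the outcome among the unexposed
    return odds_exposed / odds_unexposed


a, b, c, d = 37, 72, 43, 848
or_ = odds_ratio(a, b, c, d)
cross_product = (a * d) / (b * c)  # equivalent form ad/bc

print(f"OR = {or_:.2f}, cross-product = {cross_product:.2f}")
```

Both forms give the same value; the small difference from 10.08 comes from rounding the two odds before dividing.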
Confidence intervals • In the analysis of the data on children who do or do not attend day care, we can use either the RR or the OR to measure the effect of attendance at the nursery. • Each value is only an estimate, so it should be reported with a confidence interval. • An approximate 95% confidence interval for the RR is found using the following formula, where EF is the error factor: • Minimum value: RR / EF • Maximum value: RR x EF • $EF = \exp\left(1.96\sqrt{\dfrac{1}{a} - \dfrac{1}{a+b} + \dfrac{1}{c} - \dfrac{1}{c+d}}\right)$
Confidence intervals • The CI for the data on children who do or do not attend day care is: • EF = exp(1.96 x √(1/37 - 1/109 + 1/43 - 1/891)) = 1.48 • RR = 7.06 • Minimum value: 7.06 / 1.48 = 4.77 • Maximum value: 7.06 x 1.48 = 10.45 • 95% CI = 4.77 to 10.45
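The error-factor calculation for the RR can be sketched in Python as follows. The function name `rr_confidence_interval` and the default `z=1.96` are assumptions for illustration. Because the code does not round RR and EF before combining them, the limits differ slightly in the second decimal from the hand calculation.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% CI for the risk ratio using the error factor (EF)."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ef = math.exp(z * se_log_rr)
    return rr, ef, rr / ef, rr * ef


rr, ef, lower, upper = rr_confidence_interval(37, 72, 43, 848)
print(f"RR = {rr:.2f}, EF = {ef:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```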
Confidence intervals • An approximate 95% confidence interval for the OR is found using the following formula: • Minimum value: OR / EF • Maximum value: OR x EF • $EF = \exp\left(1.96\sqrt{\dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c} + \dfrac{1}{d}}\right)$
Confidence intervals • The CI for the data on children who do or do not attend day care is: • EF = exp(1.96 x √(1/37 + 1/72 + 1/43 + 1/848)) = 1.65 • OR = 10.08 • Minimum value: 10.08 / 1.65 = 6.11 • Maximum value: 10.08 x 1.65 = 16.63 • 95% CI = 6.11 to 16.63
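Dividing and multiplying by the error factor is the same as adding and subtracting 1.96 standard errors on the log scale. The sketch below computes the interval on the log scale and compares it with OR / EF and OR x EF; the function name `or_confidence_interval` is an assumption for illustration, and the unrounded OR (about 10.13) is used, so the limits differ slightly from the hand calculation.

```python
import math


def or_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% CI for the odds ratio, computed on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


or_, lower, upper = or_confidence_interval(37, 72, 43, 848)

# Same interval through the error factor used in the slides
ef = math.exp(1.96 * math.sqrt(1 / 37 + 1 / 72 + 1 / 43 + 1 / 848))
print(f"OR = {or_:.2f}, EF = {ef:.2f}")
print(f"log scale:    {lower:.2f} to {upper:.2f}")
print(f"error factor: {or_ / ef:.2f} to {or_ * ef:.2f}")
```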
Which measure is best? • Risk Ratios are calculated for cross-sectional and cohort studies. • The formula for the 95% confidence interval for the RR requires larger sample sizes than the one for the OR. • ORs are calculated for case-control and cross-sectional studies. • In case-control studies it is not possible to calculate risks, and therefore the RR cannot be calculated. • There is an advantage in using the OR: it is a consistent measure of effect, unlike the RR.
Example (Cont…) • The data on the Mexican children showed a strong association between exposure (attending nursery) and outcome (respiratory infection). • However, such an association may be confounded by other factor(s). • For example, although children who attend day care seem to have a 7 times higher risk of respiratory infection, the cause of the infection could be something else that is associated with going to day care. • In other words, attending the nursery may be a marker for another exposure that causes respiratory infection. • If this is true, we say that the association between respiratory infection and attendance at the nursery is confounded.
How to identify a potential confounder? • To evaluate a potential confounder, we should consider three aspects: • The exposure • The outcome • The confounder
Example • The nurse is interested in the association between day care attendance and presence of respiratory infection, but is aware that children might be exposed to other factors that cause respiratory infection. • For example, overcrowding at home is a risk factor for respiratory infection. • It is therefore a potential confounder of the association between attendance at day care and respiratory infections.
Confounders • For a variable to be a potential confounder, it must meet three conditions: • It must be an independent risk factor for the outcome of interest. • It must be associated with the exposure of interest. • It must not be on the causal pathway between exposure and outcome.
Confounders • How do we check these conditions in the study of Mexican children? • Condition 1 for confounding: • Risk factor for the outcome of interest • Is there an association between overcrowding and respiratory infection? • RR = 25, 95% CI = 15.72 to 39.75, χ² = 311.67, p << 0.05
Confounders • How do we check these conditions in the study of Mexican children? • Condition 2 for confounding: • Association with the exposure • Is there an association between overcrowding and day care attendance? • χ² = 170.39, p << 0.05
Confounders • How do we check these conditions in the study of Mexican children? • Condition 3 for confounding: • Is the potential confounder on the causal pathway between exposure and outcome? • In this example, it is unlikely that overcrowding at home is a consequence of day care attendance, so it is not on the causal pathway.
Do we have a confounder? • In this study, overcrowding satisfies the three conditions necessary for a confounding variable: • It is an independent risk factor for the outcome of interest. Overcrowding is associated with respiratory infection. • It is associated with the exposure of interest. Overcrowding is associated with attendance at the nursery. • It is not on the causal pathway. Overcrowding is unlikely to be a consequence of attendance at the nursery.
Stratified tables • We now know that the data need further analysis to take account of the effect of overcrowding. • To adjust for the confounding variable, we stratify the 2 x 2 table of interest. • The unstratified table is called the raw (crude) table. • It can be divided into strata defined by the confounding variable. • The sample is divided into two groups, within each of which the overcrowding status is the same. • The two groups are: • With overcrowding and without overcrowding
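Stratified tables • We want to find out whether day care attendance is associated with respiratory infection when comparing children within the same category of overcrowding. • The raw table for the relationship between respiratory infection and day care attendance is: • Attending day care: 37 with respiratory infection, 72 without (total 109) • Not attending day care: 43 with respiratory infection, 848 without (total 891)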
Stratified tables • The tables stratified by overcrowding give the following results: • Overcrowding: RR = 4.23, 95% CI 1.91 to 9.37, χ² = 32.88, p < 0.0001 • Without overcrowding: RR = 63.6, 95% CI 21.01 to 192.56, χ² = 178.84, p < 0.0001
Stratified tables • Do you think that attendance at nursery is a risk factor for respiratory infection among children without overcrowding? • Yes, children attending day care have about 63 times the risk of respiratory infection of those who do not attend nursery (RR = 63.6). • The p value indicates a strong association between attendance at day care and respiratory infection in the group without overcrowding.
Stratified tables • Do you think that attendance at nursery is a risk factor for respiratory infection in the group with overcrowding? • Yes, children attending day care have more than 4 times the risk of respiratory infection of those not attending the nursery (RR = 4.23). • The p value indicates a strong association between attendance at day care and respiratory infection in this group. • Within each stratum, the association between attendance at day care and respiratory infection is now independent of overcrowding at home.
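The stratum-specific risk ratios can be reproduced with a short Python sketch. The cell counts are the ones used in the Mantel-Haenszel calculation later in this part (61, 14, 5, 21 with overcrowding; 10, 24, 4, 861 without); the dictionary layout and the function name are assumptions for illustration.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio for one 2 x 2 table: [a/(a+b)] / [c/(c+d)]."""
    return (a / (a + b)) / (c / (c + d))


# (a, b, c, d) per stratum: exposed ill, exposed not ill,
# unexposed ill, unexposed not ill
strata = {
    "overcrowding": (61, 14, 5, 21),
    "without overcrowding": (10, 24, 4, 861),
}

stratum_rr = {name: risk_ratio(*cells) for name, cells in strata.items()}
for name, value in stratum_rr.items():
    print(f"{name}: RR = {value:.2f}")
```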
Comparison of results • How do these results compare with those of the raw table? • The raw table shows a strong relationship between attendance at day care and respiratory infection. The RR differs between the two stratified tables, but a statistically significant association remains in both.
Adjusted Risk Ratios • The nurse does not want to show the data divided into strata; she prefers a global estimate of the effect of nursery attendance on respiratory infection, adjusted for overcrowding. • This can be done by calculating the RR with the Mantel-Haenszel method. • First, look at the 2 x 2 table in each stratum.
Risk Ratios from Mantel-Haenszel • The adjusted (summary) RR can be obtained with the following formula, where the sums run over the strata and n is the total of each stratum's table: • $RR_{MH} = \dfrac{\sum a(c+d)/n}{\sum c(a+b)/n}$ • This gives a weighted average of the RRs estimated in each table, giving more weight to the tables with a larger sample size.
Adjusted Risk Ratio • We calculate the overcrowding-adjusted RR with the Mantel-Haenszel formula, adding the overcrowding stratum (first term) and the non-overcrowding stratum (second term): • $RR_{MH} = \dfrac{61(5+21)/101 + 10(4+861)/899}{5(61+14)/101 + 4(10+24)/899} = \dfrac{15.70 + 9.62}{3.71 + 0.15} = \dfrac{25.32}{3.86} = 6.56$
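The same calculation can be sketched in Python. The function name `mantel_haenszel_rr` and the tuple layout (a, b, c, d) are assumptions for illustration. Without intermediate rounding the result is about 6.55, against 6.56 from the hand calculation.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio over a list of (a, b, c, d) tables."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


strata = [
    (61, 14, 5, 21),    # overcrowding, n = 101
    (10, 24, 4, 861),   # non-overcrowding, n = 899
]
rr_mh = mantel_haenszel_rr(strata)
print(f"Adjusted RR = {rr_mh:.2f}")
```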
Adjusted Odds Ratio • The adjusted OR is calculated in a similar way to the adjusted RR: • $OR_{MH} = \dfrac{\sum ad/n}{\sum bc/n}$
Adjusted Odds Ratio • In a cross-sectional study on the use of quinfamide after amoebic dysentery, the number of carriers of Entamoeba histolytica was reported.
Adjusted Odds Ratio • We calculate the OR adjusted for area of residence with the Mantel-Haenszel formula, adding the urban stratum (first term) and the rural stratum (second term): • $OR_{MH} = \dfrac{(35 \times 51/135) + (65 \times 21/105)}{(39 \times 10/135) + (14 \times 5/105)} = \dfrac{13.2 + 13}{2.89 + 0.67} = \dfrac{26.2}{3.56} = 7.4$
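A minimal Python sketch of this calculation follows. The function name `mantel_haenszel_or` is an assumption for illustration. The arithmetic above fixes only the products ad and bc, so which of the two off-diagonal counts is b and which is c is also an assumption here; it does not affect the result.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


strata = [
    (35, 39, 10, 51),  # urban, n = 135 (order of b and c assumed)
    (65, 14, 5, 21),   # rural, n = 105 (order of b and c assumed)
]
or_mh = mantel_haenszel_or(strata)
print(f"Adjusted OR = {or_mh:.1f}")
```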
Mantel-Haenszel χ² • The nurse now knows that the association between respiratory infection and nursery attendance remains after adjusting for overcrowding, the confounding variable. • Now she wants to calculate a Chi-squared test of the significance of this association, adjusted for the confounder. • This can be done by calculating the Mantel-Haenszel χ² test.
Mantel-Haenszel χ² • To calculate the Chi-squared test adjusted for the confounder, we calculate the Mantel-Haenszel Chi-squared. • The null hypothesis is that there is no association between nursery attendance and respiratory infection, H0: OR = 1. • $\chi^2_{MH} = \dfrac{\left[\sum a_e - \sum E(a_e)\right]^2}{\sum V(a_e)}$
Mantel-Haenszel χ² • We proceed step by step, beginning with the 2 x 2 table of each stratum.
Mantel-Haenszel χ² • The Mantel-Haenszel Chi-squared test pools the information from the individual table of each stratum. • To calculate it, we need three values from each table: • $a_e$: number of ill and exposed • $E(a_e)$: expected value of $a_e$ • $V(a_e)$: variance (standard error squared) of $a_e$ • where: • $E(a_e) = \dfrac{\text{row total} \times \text{column total}}{\text{grand total}} = \dfrac{(a_e + b_e)(a_e + c_e)}{n_e}$ • $V(a_e) = \dfrac{(a_e + b_e)(c_e + d_e)(a_e + c_e)(b_e + d_e)}{n_e^2\,(n_e - 1)}$
Example • Overcrowding table: • a = 61 • E(a) = 75 x 66 / 101 = 49.01 • V(a) = (75 x 26 x 66 x 35) / (101² x (101 - 1)) = 4.42 • Non-overcrowding table: • a = 10 • E(a) = 34 x 14 / 899 = 0.53 • V(a) = (34 x 865 x 14 x 885) / (899² x (899 - 1)) = 0.50 • To obtain the Mantel-Haenszel Chi-squared test (Chi-squared adjusted for overcrowding), we add these values from the two strata, using the formula: • $\chi^2_{MH} = \dfrac{\left[\sum a_e - \sum E(a_e)\right]^2}{\sum V(a_e)}$
Example • To obtain the Mantel-Haenszel Chi-squared test (Chi-squared adjusted for overcrowding), we add these values, using the formula: • Overcrowding: a = 61, E(a) = 49.01, V(a) = 4.42 • Non-overcrowding: a = 10, E(a) = 0.53, V(a) = 0.50 • Total: a = 71, E(a) = 49.54, V(a) = 4.92 • χ² Mantel-Haenszel = (71 - 49.54)² / 4.92 = 93.60
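The steps above can be sketched in Python. The function name `mantel_haenszel_chi2` is an assumption for illustration, and the p value line is an addition that is not in the slides: it uses the identity that, for 1 degree of freedom, the upper tail of the Chi-squared distribution equals erfc(√(χ²/2)). Without intermediate rounding the statistic is about 93.65, against 93.60 by hand.

```python
import math


def mantel_haenszel_chi2(strata):
    """Mantel-Haenszel Chi-squared (1 df) over a list of (a, b, c, d) tables."""
    sum_a = sum_expected = sum_variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # Expected value of a: row total x column total / grand total
        sum_expected += (a + b) * (a + c) / n
        # Variance of a under the null hypothesis
        sum_variance += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
    return (sum_a - sum_expected) ** 2 / sum_variance


strata = [
    (61, 14, 5, 21),    # overcrowding
    (10, 24, 4, 861),   # non-overcrowding
]
chi2_mh = mantel_haenszel_chi2(strata)
p_value = math.erfc(math.sqrt(chi2_mh / 2))  # upper tail, 1 degree of freedom

print(f"Mantel-Haenszel Chi-squared = {chi2_mh:.2f}, p = {p_value:.2e}")
```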
Confounding or no confounding • How do we decide whether there is confounding? • There are no statistical tests to demonstrate confounding. • We calculate statistical tests and measures of effect for the raw and the stratified tables. • Then we calculate the summary (adjusted) estimates, compare them with the raw ones, and conclude whether or not there is confounding.
Confounding or no confounding • If there is an important difference between the raw and adjusted estimates, we say that the association of interest is confounded by another factor. • Consider the data on nursery attendance and respiratory infection. • After adjusting for overcrowding, the RR decreases from 7.06 to 6.56.
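One way to put a number on the difference between the raw and adjusted estimates is the relative change, sketched below. The function name and the 10% cut-off are assumptions: 10% is a commonly quoted rule of thumb, not a value given in these slides, and the decision remains a matter of judgment.

```python
def relative_change(crude, adjusted):
    """Change in the estimate after adjustment, as a fraction of the crude value."""
    return (crude - adjusted) / crude


crude_rr = 7.06     # raw table
adjusted_rr = 6.56  # Mantel-Haenszel RR adjusted for overcrowding
change = relative_change(crude_rr, adjusted_rr)

# Hypothetical cut-off (rule of thumb, not taken from the slides)
threshold = 0.10
verdict = "important difference" if abs(change) > threshold else "small difference"
print(f"Change after adjustment: {change:.1%} ({verdict})")
```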
Possible effects of confounding • Generally there is more than one confounder. • Confounders can have different effects: • The association under study may be significant before adjusting for a confounder and not significant after. • The association may still be significant after adjusting for a confounder, but with a less significant p value. • Strata can show opposite results; in this case it is better to show the stratified results. This is interaction, or effect modification. • A confounder can hide an existing relationship.