Thinking About Data: A Simple Principle to Help You Improve your Scientific Data Analysis
Scott A. Venners, Ph.D., MPH
November 13, 2003 1
PowerPoint slides available at: www.artima.com/AMU/lecture.ppt (Try tomorrow) 2
? Classes First Data Set 3
Y = Outcome Variable
X = Predictor of Interest
Cov1…N = Potential Confounders (Covariates)
Y = X + Cov1 + Cov2 + Cov3 + … + Cov(n)
X p-value < 0.05? Yes - Write a paper. 4
Simple Principle:
• Your model represents only one possible explanation of the data.
• You must actively think of all possible alternative explanations and test them.
• Those that are not testable define the uncertainty of your analysis. 5
Possible Explanations of Data 6-9
[Figure, built up over four slides: possible explanations of the data, labeled (Can Test) and (Cannot Test); the last build adds the label Model]
Do not stop here! 10
[Figure: Model, (Can Test), (Cannot Test)]
Skills you need:
• Thinking of possible explanations
• Knowing how to test them. 11
Example 1: Simple model. Skill: Visualizing Confounding 12
Example 1: Does an inactive lifestyle increase the risk of low bone density? 13
[Figure: Active Lifestyle and Inactive Lifestyle groups; legend: Active Lifestyle, Inactive Lifestyle, Low Bone Density] 14-15
Active Lifestyle Inactive Lifestyle What else could cause this result? Female, Smoking, Excessive Alcohol, Old Age… 16
% of each group with the characteristic:
              Active Lifestyle   Inactive Lifestyle
Female        49%                51%
Smoking       21%                19%
Ex Alcohol    1%                 1%
Old Age       30%                50%
17
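The comparison on this slide can be sketched in a few lines of Python. The percentages are those shown on the slide, read as Active Lifestyle first and Inactive Lifestyle second (an assumption about the slide layout). The function name and the 5-point threshold are hypothetical, chosen only for illustration. An imbalance does not prove confounding; it only marks a covariate worth testing.

```python
# Covariate prevalence (%) in each lifestyle group, as read from the slide.
# Assumption: first value = Active Lifestyle, second = Inactive Lifestyle.
COVARIATES = {
    "Female": (49, 51),
    "Smoking": (21, 19),
    "Ex Alcohol": (1, 1),
    "Old Age": (30, 50),
}


def imbalanced_covariates(covariates, threshold=5):
    """Return covariates whose prevalence differs between the two groups
    by more than `threshold` percentage points.

    The threshold is an illustrative choice, not a rule from the lecture.
    """
    flagged = {}
    for name, (active, inactive) in covariates.items():
        difference = inactive - active
        if abs(difference) > threshold:
            flagged[name] = difference
    return flagged


if __name__ == "__main__":
    for name, diff in imbalanced_covariates(COVARIATES).items():
        print(f"{name}: {diff:+d} percentage points (Inactive minus Active)")
```

With these numbers only Old Age is flagged, which is the covariate the following slides examine.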
Active Lifestyle Inactive Lifestyle 30% Old Age 50% Is the association between inactive lifestyle and low bone density confounded by old age? 18
Active Lifestyle Inactive Lifestyle 30% Old Age 50% Is the association between inactive lifestyle and low bone density confounded by old age? No 19
% with Low Bone Density, by age group:
              Active   Inactive
Older Age     30%      50%
Younger Age   30%      50%
20
Active Lifestyle Inactive Lifestyle 30% Old Age 50% Is the association between inactive lifestyle and low bone density confounded by old age? Yes 21
% with Low Bone Density, by age group:
              Active   Inactive
Older Age     100%     100%
Younger Age   0%       0%
22
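A minimal sketch of why both answers fit the same crude data. It assumes the two bar charts are read as follows: under "No", each age group shows 30% (active) and 50% (inactive); under "Yes", every older person and no younger person has low bone density. The function and variable names are hypothetical. In both scenarios the crude percentages come out the same, so only the stratified view can tell them apart.

```python
def crude_risk(percent_old, risk_old, risk_young):
    """Crude % with low bone density in one lifestyle group: the
    age-weighted average of the stratum-specific risks (all in percent)."""
    share_old = percent_old / 100
    return share_old * risk_old + (1 - share_old) * risk_young


# % Old Age in each lifestyle group, from the slides.
PERCENT_OLD = {"Active": 30, "Inactive": 50}

# Stratum-specific risks (%) as (older, younger) for each lifestyle group.
# Assumption: this is how the two bar charts are read.
NOT_CONFOUNDED = {"Active": (30, 30), "Inactive": (50, 50)}
CONFOUNDED = {"Active": (100, 0), "Inactive": (100, 0)}


def crude_risks(scenario):
    """Crude % with low bone density per lifestyle group, rounded to 0.1."""
    return {
        group: round(crude_risk(PERCENT_OLD[group], *scenario[group]), 1)
        for group in PERCENT_OLD
    }


if __name__ == "__main__":
    print("Not confounded:", crude_risks(NOT_CONFOUNDED))
    print("Confounded:    ", crude_risks(CONFOUNDED))
```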
Independent Effect(s): % with Low Bone Density
Coding: Active (0), Inactive (1); Younger Age (0), Older Age (1)

                                                  Younger Age (0)   Older Age (1)
Model                                             Active  Inactive  Active  Inactive
Inactive Only                                     10%     30%       10%     30%
  10 + 0(Old) + 20(Inactive)
Older Age Only                                    10%     10%       30%     30%
  10 + 20(Old) + 0(Inactive)
Both Older Age and Inactive                       10%     30%       30%     50%
  10 + 20(Old) + 20(Inactive)
Older Age and Inactive Interaction                10%     30%       30%     60%
  10 + 20(Old) + 20(Inactive) + 10(Old*Inactive)
23-27
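The four models on these slides differ only in their coefficients, so they can be checked with one small function. This is a sketch using the 0/1 coding and the coefficients from the slides; the function names and the dictionary layout are hypothetical.

```python
def low_bone_density_percent(old, inactive, b_old, b_inactive,
                             b_interaction=0, baseline=10):
    """Predicted % with low bone density under 0/1 coding:
    baseline + b_old*Old + b_inactive*Inactive + b_interaction*Old*Inactive
    """
    return (baseline
            + b_old * old
            + b_inactive * inactive
            + b_interaction * old * inactive)


# Coefficients taken from the formulas on the slides.
MODELS = {
    "Inactive Only": dict(b_old=0, b_inactive=20),
    "Older Age Only": dict(b_old=20, b_inactive=0),
    "Both Older Age and Inactive": dict(b_old=20, b_inactive=20),
    "Older Age and Inactive Interaction": dict(
        b_old=20, b_inactive=20, b_interaction=10),
}


def cell_percents(coefficients):
    """Predictions in the order: younger/active, younger/inactive,
    older/active, older/inactive."""
    return [
        low_bone_density_percent(old, inactive, **coefficients)
        for old in (0, 1)
        for inactive in (0, 1)
    ]


if __name__ == "__main__":
    for name, coefficients in MODELS.items():
        print(f"{name}: {cell_percents(coefficients)}")
```

The interaction term only changes the cell where both Old and Inactive equal 1, which is why the last model differs from the additive one in a single bar.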
Example 2: Sometimes just putting potential confounders into model is not correct. 28
Example 2: Does passive smoking increase the risk of chronic cough? 29
[Figure: No Passive Smoking and Passive Smoking groups; Cough: 25% and 25%; legend: No Passive Smoking, Passive Smoking, Chronic Cough] 30-31
No Passive Smoking Passive Smoking 25% Cough 25% What else could cause this result? Active Smoking… 32
No Passive Smoking Passive Smoking 45% Active Smoking 17% Is the association between passive smoking and cough confounded by active smoking? 33
% with Cough:
              No Active Smoking   Active Smoking
No Passive    7%                  47%
Passive       20%                 47%
34
How to model? No Active Smoking Active Smoking 47% 47% 7% 20% Cough Cough No Passive Passive No Passive Passive 35
How to model? No Active Smoking (0) Active Smoking (1) 47% 47% 7% 20% Cough Cough No Passive (0) Passive (1) No Passive (0) Passive (1) ? Cough% = 7 + 40(Smoke) + 13(Passive) - 13(Smoke*Passive) No 36
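A short sketch of how the interaction model reproduces both the stratified percentages and the equal crude percentages from the earlier slides. It assumes the 45% and 17% are the shares of active smokers in the no-passive and passive groups, respectively; the function names are hypothetical.

```python
def cough_percent(smoke, passive):
    """Model from the slide, with 0/1 coding for both exposures:
    Cough% = 7 + 40(Smoke) + 13(Passive) - 13(Smoke*Passive)
    """
    return 7 + 40 * smoke + 13 * passive - 13 * smoke * passive


# % active smokers in each passive-smoking group (key: passive 0/1).
# Assumption: 45% belongs to No Passive Smoking, 17% to Passive Smoking.
PERCENT_ACTIVE_SMOKERS = {0: 45, 1: 17}


def crude_cough_percent(passive):
    """Crude cough % in a passive-smoking group: the average of the
    model's predictions weighted by the share of active smokers."""
    share_active = PERCENT_ACTIVE_SMOKERS[passive] / 100
    return (share_active * cough_percent(1, passive)
            + (1 - share_active) * cough_percent(0, passive))


if __name__ == "__main__":
    for smoke in (0, 1):
        for passive in (0, 1):
            print(f"Smoke={smoke} Passive={passive}: "
                  f"{cough_percent(smoke, passive)}%")
    for passive in (0, 1):
        print(f"Crude, Passive={passive}: "
              f"{crude_cough_percent(passive):.1f}%")
```

Both crude values land at about 25%, even though passive smoking raises cough from 7% to 20% among people who do not smoke. Adding active smoking as a plain covariate would miss that the passive effect is absent among active smokers; the interaction term is what captures it.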
Example 3: Sometimes explanations for data are not so clear. 37
Odds ratios of early pregnancy loss.

Husband’s current smoking   Crude OR   p      Adjusted* OR   p
None                        Ref               Ref
<20 cigs/day                1.19       .429   1.04           .854
>20 cigs/day                2.18       .013   1.81           .049

* Adjusted for husband and wife’s ages, education, stress, exposure to dust and noise, husband’s alcohol use, previous smoking, and exposure to toxins, and wife’s body-mass index.
38
If husband’s education is removed from the model:

Husband’s current smoking   Crude OR   p      Adjusted* OR   p
None                        Ref               Ref
<20 cigs/day                1.19       .429   1.14           .576
>=20 cigs/day               2.18       .013   2.02           .022
39
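One simple way to quantify how much the adjusted odds ratios move when husband's education is dropped is the relative change in the estimate. The sketch below is an illustration only, not necessarily the analysis used in the lecture; the function name and the dictionary are hypothetical, and the odds ratios are the adjusted values from the two tables.

```python
def percent_change(with_covariate, without_covariate):
    """Relative change (%) in an odds ratio when a covariate is dropped."""
    return 100 * (without_covariate - with_covariate) / with_covariate


# Adjusted ORs as (with husband's education, without husband's education).
ADJUSTED_OR = {
    "<20 cigs/day": (1.04, 1.14),
    ">=20 cigs/day": (1.81, 2.02),
}

if __name__ == "__main__":
    for level, (with_edu, without_edu) in ADJUSTED_OR.items():
        change = percent_change(with_edu, without_edu)
        print(f"{level}: {with_edu} -> {without_edu} ({change:+.1f}%)")
```

A shift of this size from a single covariate suggests that education is tied to both smoking and pregnancy loss, which is what the next slides explore.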
Husband’s Smoking   None   <20 cigs/day   >=20 cigs/day
High School         79%    59%            50%
40
Husband’s Smoking None <20 cigs/day >20 cigs/day 44% 29% 30% % Early Pregnancy Loss 20% 22% 21% < High School >= High School 41
Husband’s Smoking None <20 cigs/day >20 cigs/day 79% 59% 50% High School 44% 29% 30% % Early Pregnancy Loss 20% 22% 21% < High School >= High School 42
Main Points:
• Whether your results are good or bad, always think beyond your preferred explanation for the data.
• Explore all possibilities before choosing your preferred model.
• Acknowledge what you cannot test as the limitations of your analysis. 43