Biostat Didactic Seminar Series Correlation and Regression Part 2 Robert Boudreau, PhD

Presentation Transcript


  1. Biostat Didactic Seminar Series: Correlation and Regression, Part 2
     Robert Boudreau, PhD
     Co-Director of Methodology Core, PITT-Multidisciplinary Clinical Research Center for Rheumatic and Musculoskeletal Diseases
     Core Director for Biostatistics, Center for Aging and Population Health
     Dept. of Epidemiology, GSPH

  2. Previous Biostat Didactics, Fall 2009 – Spring 2010
     • Descriptive Statistics: Examining Your Data
       • Data types: Qualitative (Categorical), Ordinal, Quantitative
       • Mean, SD, medians, quartiles, IQR, skewness, histograms, boxplots
     • Group Comparisons: Part 1
       • Normal distribution (mean, SD: 68%, 95%, 99.7% interpretation for 1, 2, 3 SD)
       • t-distribution, degrees of freedom (n-1)
       • Confidence interval for the mean
     • Group Comparisons: Part 2
       • Comparing means: two-sample independent t-test
       • Pooled and unequal variance (Satterthwaite) versions
       • Interpretation of p-values, type I (false positive) and type II (false negative) error

  3. Previous Biostat Didactics, Fall 2009 – Spring 2010
     • Group Comparisons Part 3: Nonparametric Tests, Chi-squares and Fisher Exact
       • Comparing groups having small sample sizes (< 20) or non-normal distributions
         => Use the Wilcoxon Rank-Sum Test (nonparametric), which is based on rank order when sorted rather than on the actual numeric values
       • Comparing groups in the % falling into different categories
         => Use Chi-square, or Fisher’s Exact if any cell n < 5
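
A rough SAS sketch of these two recommendations, borrowing the kneeOA_vs_noOA dataset used later in this seminar. The obese indicator (BMI of 30 or more) and the knee_flags dataset name are assumptions made purely for illustration; they are not part of the original analysis.

```sas
/* obese and knee_flags are hypothetical names introduced for this sketch */
data knee_flags;
  set kneeOA_vs_noOA;
  if bmi ne . then obese = (bmi >= 30);   /* 1 = obese, 0 = not; stays missing if BMI is missing */
run;

/* Wilcoxon rank-sum test: compares logCRP between groups using ranks */
proc npar1way data=knee_flags wilcoxon;
  class KneeOA;
  var logCRP;
run;

/* Chi-square test of % obese by group; the fisher option adds Fisher's exact test */
proc freq data=knee_flags;
  tables KneeOA*obese / chisq fisher;
run;
```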

  4. Previous Biostat Didactics, Fall 2009 – Spring 2010
     • Correlation, Regression and Covariate-Adjusted Group Comparisons
       • Pearson vs Spearman correlation => linear vs monotone association
       • Regression: interpretation of beta coefficients
         • Standard errors, p-values
         • Continuous predictor => beta coeff is a slope
         • Dichotomous predictor (e.g. group “dummy” 0,1 valued variable) => beta coeff is the difference in response vs the “referent”
           Example: treatment_group = 1 for knockout mouse, = 0 for wild mouse (referent)
       • Adjusting for important covariates when comparing groups

  5. Flow chart for group comparisons
     Measurements to be compared:
     • continuous => Distribution approx normal or N ≥ 20?
       • Yes => T-tests
       • No => Non-parametrics
     • discrete (binary, nominal, ordinal with few values) => Chi-square, Fisher’s Exact

  6. Flow chart for regression models (includes adjusted group comparisons)
     Outcome variable continuous or dichotomous?
     • continuous => Predictor variable categorical?
       • No => Multiple linear regression
       • Yes (e.g. groups) => ANCOVA (multiple linear regression using dummy variable(s) for categorical var(s))
     • dichotomous => Time-to-event available (or relevant)?
       • No => Multiple logistic regression
       • Yes => Cox proportional hazards regression
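
One way the four endpoints of this flow chart might map onto SAS procedures is sketched below. Every dataset and variable name (study, y, x1, x2, group, event, time, censored) is a hypothetical placeholder, and the choice of procedures is an assumption, since the slide names the models but not the software.

```sas
/* All dataset and variable names here are hypothetical placeholders. */

/* Continuous outcome, continuous predictors: multiple linear regression */
proc reg data=study;
  model y = x1 x2;
run;

/* Continuous outcome, categorical predictor plus covariate: ANCOVA.
   The class statement builds the dummy variables for group. */
proc glm data=study;
  class group;
  model y = group x1 / solution;
run;

/* Dichotomous outcome, no time-to-event: multiple logistic regression */
proc logistic data=study;
  class group / param=ref;
  model event(event='1') = group x1;
run;

/* Dichotomous outcome with time-to-event: Cox proportional hazards.
   censored=1 marks censored observations. */
proc phreg data=study;
  class group / param=ref;
  model time*censored(1) = group x1;
run;
```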

  7. Analysis From Last Didactic …
     • In the Health, Aging and Body Composition Knee-OA Substudy: examine the association between SxRxKOA (knee OA) and CRP, adjusted for BMI.
     Motivation:
     • Sowers M, Hochberg M, et al. C-reactive protein as a biomarker of emergent osteoarthritis. Osteoarthritis and Cartilage, Volume 10, Issue 8, August 2002, Pages 595-601.
       Conclusion: “CRP is highly associated with Knee OA; however, its high correlation with obesity limits its utility as an exclusive marker for knee OA”

  8. All White Females in HABC (N=844) [includes SxRxKOA (n=93); also rest of parent study cohort]
     N=5 had CRP > 30 (max=63.2)

  9. log CRP

  10. White Females
      Difference in average logCRP: 0.76 - 0.43 = 0.33

  11. Two-Group Unadjusted Comparison of Means Using Regression with Dummy-coded Groups
        proc reg data=kneeOA_vs_noOA;
          model logCRP = KneeOA;
          where female=1 and white=1;
        run;
      * No OA is the “referent” group (i.e. kneeOA=0)

      HABCID    logCRP  kneeOA      BMI
        1000   1.10972       0  22.5922
        1001   0.16526       0  22.2751
        1002   1.50988       0  26.1207
        1003  -0.62048       0  26.9536
        1014   0.65657       1  26.5266
        1017   0.82039       1  30.2526
        1033   0.84323       1  29.8458
        1048   1.67787       1  39.8597
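
For readers who want to try the dummy-coded regression without the full HABC file, here is a small sketch that reads in only the eight rows printed on this slide. The dataset name habc_excerpt is hypothetical, and estimates from eight rows will not reproduce the full-sample results reported in the seminar.

```sas
/* Only the eight rows printed on the slide; not the full N=844 dataset. */
data habc_excerpt;   /* hypothetical dataset name */
  input HABCID logCRP kneeOA BMI;
  datalines;
1000  1.10972 0 22.5922
1001  0.16526 0 22.2751
1002  1.50988 0 26.1207
1003 -0.62048 0 26.9536
1014  0.65657 1 26.5266
1017  0.82039 1 30.2526
1033  0.84323 1 29.8458
1048  1.67787 1 39.8597
;
run;

/* Same model as on the slide; the where clause is dropped because
   the excerpt carries no female or white columns. */
proc reg data=habc_excerpt;
  model logCRP = kneeOA;
run;
```

In the output, the intercept should equal the mean logCRP of the four kneeOA=0 rows and the kneeOA coefficient should equal the difference between the two group means.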

  12. White Females: 2-Group Comparison Using Dummy-coded Groups
        * No OA is “referent” group (KneeOA=0);
        proc reg data=kneeOA_vs_noOA;
          model logCRP = KneeOA;
          where female=1 and white=1;
        run;
      Annotations on the regression output:
      • Intercept estimate = “No OA” mean
      • KneeOA estimate = “kneeOA” mean difference from referent
      • Same p-value as the equal variance t-test
      Note: Regression using a dummy (0, 1) for the group variable (e.g. KneeOA=0,1). In regression, equal (pooled) variance is assumed.
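
The equivalence with the equal-variance t-test can be checked directly. The sketch below assumes the same kneeOA_vs_noOA dataset and variables as the slide; PROC TTEST is a suggested cross-check and was not shown in the original deck.

```sas
/* The "Pooled" row of the t-test output corresponds to the regression
   test of the KneeOA coefficient; the "Satterthwaite" row is the
   unequal-variance version and will generally differ. */
proc ttest data=kneeOA_vs_noOA;
  class KneeOA;
  var logCRP;
  where female=1 and white=1;
run;
```

PROC TTEST reports the difference as the first class level minus the second (here KneeOA=0 minus KneeOA=1), so the t statistic may have the opposite sign from the regression while the two-sided p-value is the same.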

  13.   proc reg data=kneeOA_vs_noOA;
          model logCRP = KneeOA;
          where female=1 and white=1;
        run;
      Model: logCRP = 0.42682 (intercept) + 0.33091*kneeOA
      KneeOA=0 => logCRP = 0.42682 + 0.33091*0 = 0.42682
      KneeOA=1 => logCRP = 0.42682 + 0.33091*1 = 0.75773

  14. ANCOVA (Analysis of Covariance): Compare logCRP adjusted for BMI

  15. ANCOVA (Analysis of Covariance): Compare logCRP adjusted for BMI
        proc reg data=kneeOA_vs_noOA;
          model logCRP = KneeOA bmi;
          where female=1 and white=1;
        run;
      • The unadjusted difference was 0.33; BMI partially “explains” this difference
      Note: Equal BMI slopes in each group are being modeled

  16. Notice: At any BMI level, the mean logCRP difference between KneeOA vs Not is smaller than the unadjusted difference
      [Figure annotation: Unadjusted Mean Difference]
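
Why the adjusted difference is the same at every BMI level follows from the form of the equal-slopes ANCOVA model. The short derivation below uses generic coefficient labels (beta_0, beta_1, beta_2) that are introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
E[\text{logCRP} \mid \text{KneeOA}, \text{BMI}]
  &= \beta_0 + \beta_1\,\text{KneeOA} + \beta_2\,\text{BMI} \\
E[\text{logCRP} \mid \text{KneeOA}=1, \text{BMI}]
  - E[\text{logCRP} \mid \text{KneeOA}=0, \text{BMI}]
  &= (\beta_0 + \beta_1 + \beta_2\,\text{BMI}) - (\beta_0 + \beta_2\,\text{BMI})
   = \beta_1
\end{align*}
```

The BMI terms cancel, so the KneeOA coefficient is the group difference at any fixed BMI, which is the vertical gap between the two parallel fitted lines.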

  17. logCRP between KneeOA vs Not, Adjusted for BMI, Age and Anti-inflammatory Meds
      Note: age is not significant (caveat: narrow HABC study age range: 69-80)
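
The slide does not show the code behind this model. A plausible sketch, assuming the covariates are stored as age and antiinflam (both hypothetical variable names, with antiinflam coded 1 for users of anti-inflammatory meds and 0 otherwise), would extend the earlier PROC REG call:

```sas
/* age and antiinflam are hypothetical variable names;
   the slide does not show the code used for this model. */
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA bmi age antiinflam;
  where female=1 and white=1;
run;
```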

  18. White Females: 2-Group Comparison Using Dummy-coded Groups
        * No OA is “referent” group (KneeOA=0);
        proc reg data=kneeOA_vs_noOA;
          model logCRP = KneeOA;
          where female=1 and white=1;
        run;
      Annotations on the regression output:
      • Intercept estimate = “No OA” mean
      • KneeOA estimate = “kneeOA” mean difference from referent
      Note: Regression using a dummy (0, 1) for the group variable (e.g. KneeOA=0,1). In regression, equal (pooled) variance is assumed.

  19. Pearson Correlation
      Pearson Correlation = a measure of linear association
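
For reference, the standard sample formula for the Pearson correlation between paired observations is given below. The symbols x_i, y_i and n are generic notation, not taken from the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

r ranges from -1 to 1, and reaches either extreme only when the points fall exactly on a straight line.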

  20. Pearson vs Spearman Correlation
      • Spearman:
        • A measure of rank order correlation
        • Works for any general trend that is increasing or decreasing and not necessarily linear

  21. Pearson vs Spearman Correlation
      • Spearman:
        • A measure of rank order correlation
        • Works for any general trend that is increasing or decreasing and not necessarily linear
        • Equals the Pearson Correlation using the ranks of the observations instead of the actual values
        • Heuristically: Spearman measures the degree to which low goes with low, middle with middle, high with high
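
The statement that Spearman equals Pearson on the ranks can be checked with a short sketch. It assumes the kneeOA_vs_noOA dataset from the regression slides; the choice of BMI and logCRP as the pair, the ranked dataset, and the rank variable names are hypothetical.

```sas
/* Pearson and Spearman correlations between BMI and logCRP */
proc corr data=kneeOA_vs_noOA pearson spearman;
  var bmi logCRP;
  where female=1 and white=1;
run;

/* Replace each value by its rank (ties receive the mean rank by default) */
proc rank data=kneeOA_vs_noOA out=ranked;
  var bmi logCRP;
  ranks bmi_rank logCRP_rank;   /* hypothetical names for the rank variables */
  where female=1 and white=1;
run;

/* Pearson correlation of the ranks */
proc corr data=ranked pearson;
  var bmi_rank logCRP_rank;
run;
```

When neither variable has missing values, the Pearson correlation of the ranks in the last step should match the Spearman value from the first step.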

  22. Effect of Centering BMI at 25
        proc reg data=kneeOA_vs_noOA;
          model logCRP = bmi_minus25;
          where female=1 and white=1 and kneeOA=1;
        run;
      => logCRP = 0.58144 + 0.04699*(BMI-25)
                = 0.58144 at BMI=25 (see graphic)

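  23. Effect of Centering BMI at 25
      => Model 2: logCRP = 0.58144 + 0.04699*(BMI-25)
                         = (0.58144 - 25*0.04699) + 0.04699*BMI
                         = -0.59337 + 0.04699*BMI
      Note: with the rounded coefficients shown, 0.58144 - 25*0.04699 = -0.59331; the slide's intercept of -0.59337 presumably reflects the unrounded slope.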
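
The slides use bmi_minus25 without showing how it was created. A minimal sketch of that step, with kneeOA_centered as a hypothetical name for the output dataset, followed by the uncentered and centered fits for comparison:

```sas
/* Create the centered covariate used on the slides */
data kneeOA_centered;            /* hypothetical output dataset name */
  set kneeOA_vs_noOA;
  bmi_minus25 = bmi - 25;
run;

/* Uncentered fit: intercept is the predicted logCRP at BMI=0 */
proc reg data=kneeOA_centered;
  model logCRP = bmi;
  where female=1 and white=1 and kneeOA=1;
run;

/* Centered fit: same slope, intercept is the predicted logCRP at BMI=25 */
proc reg data=kneeOA_centered;
  model logCRP = bmi_minus25;
  where female=1 and white=1 and kneeOA=1;
run;
```

Centering changes only the intercept and its interpretation. The slope, its standard error, and the fitted values are unchanged.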

  24. [Figure annotation: Unadjusted Mean Difference]

  25. ANCOVA (Analysis of Covariance): Centering BMI at 25
        proc reg data=kneeOA_vs_noOA;
          model logCRP = KneeOA bmi_minus25;
          where female=1 and white=1;
        run;
      Note: Equal BMI slopes in each group are being modeled

  26. Check of ANCOVA Assumption: Equality of BMI slopes: KneeOA vs Not
        proc reg data=knee_vs_noOA;
          model logCRP = KneeOA bmi BMI_x_KneeOA;
          where female=1 and white=1;
        run;
      BMI_x_KneeOA is the “interaction term”.

      HABCID    logCRP  kneeOA      BMI  BMI_x_KneeOA
        1000   1.10972       0  22.5922        0.0000
        1001   0.16526       0  22.2751        0.0000
        1002   1.50988       0  26.1207        0.0000
        1003  -0.62048       0  26.9536        0.0000
        1014   0.65657       1  26.5266       26.5266
        1017   0.82039       1  30.2526       30.2526
        1033   0.84323       1  29.8458       29.8458
        1048   1.67787       1  39.8597       39.8597
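
PROC REG has no CLASS statement and does not accept product terms in its MODEL statement, so the interaction column has to be computed in a data step first. A minimal sketch, using the input dataset name as written on this slide and a hypothetical output name:

```sas
data knee_with_interaction;      /* hypothetical output dataset name */
  set knee_vs_noOA;
  BMI_x_KneeOA = bmi * KneeOA;   /* 0 when KneeOA=0, equals BMI when KneeOA=1 */
run;

proc reg data=knee_with_interaction;
  model logCRP = KneeOA bmi BMI_x_KneeOA;
  where female=1 and white=1;
run;
```

The product reproduces the pattern in the table: zeros in the referent group and the BMI value itself in the KneeOA group.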

  27. Check of ANCOVA Assumption: Equality of BMI slopes: KneeOA vs Not
        proc reg data=knee_vs_noOA;
          model logCRP = KneeOA bmi BMI_x_KneeOA;
          where female=1 and white=1;
        run;
      The “BMI” slopes are not significantly different (p=0.8019) => no evidence against parallel slopes, so the equal-slopes ANCOVA model is reasonable
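
How the interaction coefficient tests equality of slopes can be seen by writing the model out for each group. The coefficient labels beta_0 through beta_3 are generic notation introduced here, not taken from the slides.

```latex
\begin{align*}
E[\text{logCRP}] &= \beta_0 + \beta_1\,\text{KneeOA} + \beta_2\,\text{BMI}
                    + \beta_3\,(\text{BMI} \times \text{KneeOA}) \\
\text{KneeOA}=0:\quad E[\text{logCRP}] &= \beta_0 + \beta_2\,\text{BMI} \\
\text{KneeOA}=1:\quad E[\text{logCRP}] &= (\beta_0 + \beta_1) + (\beta_2 + \beta_3)\,\text{BMI}
\end{align*}
```

The BMI slope is beta_2 in the referent group and beta_2 + beta_3 in the KneeOA group, so the p-value on the interaction term tests whether beta_3 = 0, that is, whether the two lines are parallel.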

  28. Thank you
      • Questions, comments, suggestions or insights?
      • Remaining time: Open consultation …
