
Statistics for the Terrified Talk 4: Analysis of Clinical Trial data 30th September 2010




Presentation Transcript


  1. Statistics for the Terrified, Talk 4: Analysis of Clinical Trial data. 30th September 2010. Janet Dunn, Louise Hiller

  2. Data types

  3. Data types

  4. 2-level categorical (binary) data. Frequency table of Variable 1 against Variable 2 [table not preserved]

  5. 2-level categorical (binary) data - Test of association. Null hypothesis: the 2 factors are independent. Chi-squared test, with continuity correction: χ²=11.4, p=0.0007 → Treatment and gender are NOT independent. [Table: Treatment by Gender]
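
A minimal Python sketch of the continuity-corrected chi-squared test for a 2x2 table. The slide's table did not survive, so the counts below are hypothetical and the function names are illustrative only. For 1 degree of freedom the upper-tail p-value can be written as erfc(sqrt(χ²/2)), which also lets the slide's own χ²=11.4 be converted back to a p-value of about 0.0007.

```python
import math

def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def yates_chi2(a, b, c, d):
    """Continuity-corrected chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Yates' correction: shrink |ad - bc| by n/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts (NOT the slide's data): rows = treatment arm, columns = gender.
chi2 = yates_chi2(30, 20, 15, 35)
p = chi2_p_value_1df(chi2)

# The statistic reported on the slide, converted to a p-value.
slide_p = chi2_p_value_1df(11.4)
print(round(chi2, 2), round(p, 4), round(slide_p, 4))
```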

  6. 2-level categorical (binary) data - Test of association. Null hypothesis: the 2 factors are independent. Fisher's exact test (commonly used with small numbers): p=0.51 → No evidence against the null hypothesis, so treatment and gender are treated as independent. [Table: Treatment by Gender]
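
A short sketch of a two-sided Fisher's exact test, assuming the common convention of summing the probabilities of every table that is no more likely than the observed one. The 2x2 counts are hypothetical, chosen small because that is where the exact test is typically used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # with all row and column totals held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is at most as likely as the observed one.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical small table (NOT the slide's data).
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```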

  7. 2-level categorical (binary) data - Measure of agreement. Kappa: a measure of agreement between reviewers, above that expected by chance. κ=0.71 (95% CI 0.60-0.83) → There is good agreement between reviewers. [Table: Reviewer 1 by Reviewer 2] Altman guidelines: <0.20 poor; 0.21-0.40 fair; 0.41-0.60 moderate; 0.61-0.80 good; 0.81-1.00 very good

  8. 2-level categorical (binary) data - Measure of agreement. Kappa: a measure of agreement between reviewers, above that expected by chance. κ=-0.04 (95% CI -0.24 to 0.15) → There is poor agreement between reviewers. [Table: Reviewer 1 by Reviewer 2] Altman guidelines: <0.20 poor; 0.21-0.40 fair; 0.41-0.60 moderate; 0.61-0.80 good; 0.81-1.00 very good
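
The kappa statistic on the two slides above can be sketched as follows. The agreement table is hypothetical, and `altman_label` is an illustrative helper that encodes the Altman bands quoted on the slides (treating exactly 0.20 as "poor" is an assumption about the boundary).

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: reviewer 1, columns: reviewer 2)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(table[i][i] for i in range(k)) / n            # observed agreement
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2  # chance agreement
    return (observed - expected) / (1 - expected)

def altman_label(kappa):
    """Map a kappa value to the Altman guideline bands quoted on the slides."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"

# Hypothetical reviewer-by-reviewer counts (NOT the slide's data).
kappa = cohens_kappa([[40, 10], [5, 45]])
print(round(kappa, 2), altman_label(kappa))
```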

  9. 2-level categorical (binary) data - Exploring patterns in the data. Odds ratio (OR): the ratio of the odds of an event occurring in the 1st group to the odds of it occurring in the 2nd group. OR=1: event is equally likely to occur in both groups; OR>1: event is more likely to occur in the 1st group; OR<1: event is less likely to occur in the 1st group. OR=4.1 (95% CI 2.2-7.9) → The odds of a male having a response are about 4 times those of a female having a response. [Table: Gender by Response]
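
A small sketch of how an odds ratio and its 95% CI might be computed from a 2x2 table. The counts are hypothetical, and the interval uses the Woolf (log-scale) approximation, which is an assumption: the slide does not say which method produced its interval.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for the 2x2 table [[a, b], [c, d]].

    Row 1 is the 1st group, row 2 the 2nd group; column 1 counts events.
    """
    odds_ratio = (a / b) / (c / d)
    # Woolf's standard error of log(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts (NOT the slide's data): rows = male/female, columns = response yes/no.
or_est, or_low, or_high = odds_ratio_ci(30, 20, 15, 35)
print(round(or_est, 2), round(or_low, 2), round(or_high, 2))
```

An interval that lies entirely above 1, as on the slide, indicates higher odds in the 1st group.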

  10. 2-level categorical (binary) data - Exploring patterns in the data. Relative Risk (RR): the ratio of the risk of an event occurring in the 1st group to the risk of it occurring in the 2nd group. RR=1: event is equally likely to occur in both groups; RR>1: event is more likely to occur in the 1st group; RR<1: event is less likely to occur in the 1st group. RR=1.7 (95% CI 0.64-4.50) → New treatment patients are 1.7 times as likely to suffer an SAE as control patients, although the 95% CI includes 1. [Table: Treatment by SAE suffered]
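
The relative risk can be sketched the same way. The counts are hypothetical and the CI uses the usual log-scale approximation, which is an assumption about the method. As with the slide's interval of 0.64-4.50, an interval that spans 1 is compatible with no difference between the groups.

```python
import math

def relative_risk_ci(events1, n1, events2, n2, z=1.96):
    """Relative risk of group 1 versus group 2 with an approximate 95% CI."""
    rr = (events1 / n1) / (events2 / n2)
    # Standard error of log(RR).
    se = math.sqrt(1 / events1 - 1 / n1 + 1 / events2 - 1 / n2)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Hypothetical counts (NOT the slide's data): 10/100 SAEs on new treatment, 6/100 on control.
rr_est, rr_low, rr_high = relative_risk_ci(10, 100, 6, 100)
print(round(rr_est, 2), round(rr_low, 2), round(rr_high, 2))
```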

  11. Odds Ratio/Relative Risk plots [plot not preserved; axis ticks at 0.5 and 2]

  12. Exploring patterns in multivariate data - Logistic Regression • A statistical modelling method that describes the relationship between a categorical response variable and 1 or more categorical and/or continuous variables e.g. Association between bearing grudges & medical conditions
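
A minimal sketch of logistic regression with a single binary explanatory variable, fitted by Newton-Raphson on grouped counts. The data are hypothetical and the function name is illustrative; in practice a statistics package would be used. The point of the sketch is that the exponentiated slope equals the odds ratio from the corresponding 2x2 table.

```python
import math

def fit_logistic_binary(events1, n1, events0, n0, iterations=25):
    """Fit logit(p) = b0 + b1*x for a binary predictor x using Newton-Raphson.

    events1/n1 are events and total for x=1; events0/n0 for x=0.
    """
    groups = [(1.0, events1, n1), (0.0, events0, n0)]
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, n in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = events - n * p          # score contribution
            w = n * p * (1.0 - p)           # information weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: 30/50 events when x=1, 15/50 events when x=0.
b0, b1 = fit_logistic_binary(30, 50, 15, 50)
print(round(math.exp(b1), 3))
```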

  13. Ordered categorical data - Test for trend. Null hypothesis: no linear trend between groups. Chi-squared test for trend: χ²=10.8, p=0.001 → There is a linear trend between groups. [Table: Treatment by Toxicity]

  14. Ordered categorical data - Test for trend (>2 rows & columns). Null hypothesis: no linear trend between rows and columns. Chi-squared test for trend: χ²=7.1, p=0.008 → There is a linear trend between rows & columns. [Table: Treatment dose by Toxicity]
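
One common form of the chi-squared test for trend is the linear-by-linear association statistic M² = (N - 1)r², with 1 degree of freedom, where r is the correlation between row and column scores. The sketch below assumes that form and equally spaced scores (0, 1, 2, ...); the slides do not say which variant was used, and the table is hypothetical.

```python
import math

def linear_trend_chi2(table):
    """Linear-by-linear association statistic and p-value (1 df) for an ordered table."""
    cells = [(i, j, count) for i, row in enumerate(table) for j, count in enumerate(row)]
    n = sum(count for _, _, count in cells)
    # Equally spaced scores 0, 1, 2, ... for rows and columns (an assumption).
    mean_u = sum(i * count for i, _, count in cells) / n
    mean_v = sum(j * count for _, j, count in cells) / n
    s_uv = sum((i - mean_u) * (j - mean_v) * count for i, j, count in cells)
    s_uu = sum((i - mean_u) ** 2 * count for i, _, count in cells)
    s_vv = sum((j - mean_v) ** 2 * count for _, j, count in cells)
    chi2 = (n - 1) * s_uv ** 2 / (s_uu * s_vv)
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-squared with 1 df
    return chi2, p

# Hypothetical counts (NOT the slide's data): rows = treatment, columns = toxicity grade.
trend_chi2, trend_p = linear_trend_chi2([[30, 20, 10], [10, 20, 30]])
print(round(trend_chi2, 2), trend_p < 0.001)
```

The same 1 df conversion turns the slides' χ² values of 10.8 and 7.1 into p-values of roughly 0.001 and 0.008.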

  15. Ordered categorical data - Measure of agreement. Weighted kappa: a measure of agreement between reviewers, above that expected by chance. κ=0.38 (95% CI 0.27-0.49) → There is fair agreement between reviewers. [Table: Reviewer 1 by Reviewer 2] Altman guidelines: <0.20 poor; 0.21-0.40 fair; 0.41-0.60 moderate; 0.61-0.80 good; 0.81-1.00 very good
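
A sketch of weighted kappa for ordered categories, assuming linear disagreement weights |i - j| / (k - 1). The slide does not state which weighting scheme was used, and the table below is hypothetical.

```python
def weighted_kappa(table):
    """Linearly weighted kappa for a square table of ordered categories."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)                 # 0 on the diagonal, 1 at the extremes
            observed += weight * table[i][j]              # observed weighted disagreement
            expected += weight * row_tot[i] * col_tot[j] / n  # chance weighted disagreement
    return 1.0 - observed / expected

# Hypothetical 3-category counts (NOT the slide's data).
kappa_w = weighted_kappa([[20, 5, 0], [5, 20, 5], [0, 5, 20]])
print(round(kappa_w, 3))
```

Near misses (one category apart) are penalised less than disagreements at opposite ends of the scale, which is why a weighted version suits ordered data.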

  16. Non-ordered categorical data - Test of association. Null hypothesis: the 2 factors are independent. Chi-squared test: χ²=0.51, p=0.78 → No evidence against the null hypothesis, so treatment and disease site are treated as independent. [Table: Treatment by Disease site]

  17. Non-ordered categorical data - Measure of agreement. Kappa: a measure of agreement between reviewers, above that expected by chance. κ=0.31 (95% CI 0.20-0.42) → There is fair agreement between reviewers. [Table: Reviewer 1 by Reviewer 2] Altman guidelines: <0.20 poor; 0.21-0.40 fair; 0.41-0.60 moderate; 0.61-0.80 good; 0.81-1.00 very good

  18. Categorical data – RECAP.

  19. Data types

  20. Normally distributed data • Data forms a bell-shaped curve • Non-significant Shapiro-Wilk test result

  21. Mean & Standard Deviation graph [plot not preserved; axes: Treatments, Change over time in QOL (%)]

  22. Parametric tests • Differences between means of 2 groups: t-tests • Differences between means of >2 groups: ANOVA • Linear regression • Correlation: Pearson's correlation coefficient, r

  23. Non-normally distributed data

  24. Box and Whisker graphs • Outliers (observations that lie outside of the 95% CIs) are sometimes plotted individually

  25. Box and Whisker graphs • Parallel box plots show the differences between groups

  26. Non-parametric tests • Differences between medians of 2 groups: Wilcoxon rank sum test • Differences between medians of >2 groups: Kruskal-Wallis 1-way analysis of variance test • Correlation: Spearman's rank order correlation coefficient, ρ
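
As an illustration of the first test listed, here is a sketch of the Wilcoxon rank sum test using the normal approximation without a tie correction (both simplifying assumptions; exact tables are preferable for very small samples). The data are hypothetical.

```python
import math

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rank_sum_test(group1, group2):
    """Wilcoxon rank sum W for group1, z-score and two-sided p (normal approximation)."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    var_w = n1 * n2 * (n1 + n2 + 1) / 12.0   # no tie correction applied
    z = (w - mean_w) / math.sqrt(var_w)
    return w, z, math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical measurements for two groups.
w, z, p_rank = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(w, round(z, 2), round(p_rank, 3))
```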

  27. Transforming data • Can transform non-normally distributed data (e.g. logarithm, square root, reciprocal) to create normally distributed data • Then analyse the transformed data using parametric methods
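
A tiny sketch of the idea, assuming a log transformation and hypothetical right-skewed values: the data are summarised on the log scale and the result is back-transformed, which gives the geometric mean.

```python
import math
from statistics import mean

def geometric_mean_via_log(values):
    """Mean on the log scale, back-transformed to the original scale."""
    logs = [math.log(v) for v in values]   # transform (values must be positive)
    return math.exp(mean(logs))            # analyse on the log scale, then back-transform

# Hypothetical right-skewed data.
skewed = [1.0, 10.0, 100.0]
arithmetic = mean(skewed)
geometric = geometric_mean_via_log(skewed)
print(arithmetic, round(geometric, 6))
```

The back-transformed mean is pulled far less by the single large value than the arithmetic mean is.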

  28. Data types

  29. Time-to-event data • Why is this different to other continuous data? • Censoring. TNO: 1, 2, 3, 4, 5, 6. Time: 20*, 8, 8*, 14, 1*, 16*. [Figure key: randomisation date, date of event, censor date]

  30. What time? What event? • Start date? Diagnosis; Surgery; Randomisation; Start/End of treatment • Event? Onset / worsening of pain; Hospital discharge; Death (OS); Relapse (RFI/DFI/Plateau); Relapse or death (RFS/DFS) • You need to know what you're looking at to know how to interpret it / what to compare it to

  31. Time-to-event data analysis ('Survival Analysis') • Can be used to measure time to any event, e.g. arthritic joint remaining pain-free post steroid injections; elderly patient with a fractured hip remaining in hospital • Calculate 'survival' time for each patient (some may be censored times) • Recruitment takes place over time so varying lengths of follow-up are expected • Rank these times and calculate proportions alive at certain points, with due allowance for incomplete follow-up • These proportions and times are plotted and overall distributions of curves compared

  32. Time-to-event data • Why is this different to other continuous data? • Censoring. TNO: 1, 2, 3, 4, 5, 6. Time: 20*, 8, 8*, 14, 1*, 16*. [Figure key: randomisation date, date of event, censor date]
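
The ranking-and-proportions procedure can be sketched as a Kaplan-Meier estimate for the six patients listed on this slide. The sketch assumes that an asterisk marks a censored time, which the transcript does not state explicitly, and that an event and a censoring at the same time are handled with the censored patient still counted as at risk.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival)] at each distinct event time.

    times  - follow-up time for each patient
    events - True if the event was observed, False if the time is censored
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)   # still under observation at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Patients 1-6 from the slide; '*' is ASSUMED to mean censored.
times = [20, 8, 8, 14, 1, 16]
events = [False, True, False, True, False, False]
curve = kaplan_meier(times, events)
print([(t, round(s, 3)) for t, s in curve])
```

Censored patients never cause a drop in the curve, but they leave the at-risk set, which is how incomplete follow-up is allowed for.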

  33. Kaplan-Meier Curves • Minimum & median follow-up (FU) indicate the maturity of the data • Median survival = 1.3 years

  34. Kaplan-Meier Curves [plot not preserved; curve annotations: 84%, 78%] Numbers at Risk: ECMF: 1189, 1171, 1120, 1073, 1020, 965, 826, 606, 380, 196, 53. CMF: 1202, 1178, 1099, 1024, 957, 888, 759, 564, 352, 176, 55

  35. Undesirable comparisons of survival rates

  36. Statistical tests for time-to-event data • Log-rank tests compare the overall distributions of the curves (χ² and p-value presented). Null hypothesis: all curves are samples from populations with the same risk of the event. Compares the number of deaths observed on each treatment arm with the number expected under the null hypothesis that the 2 survival distributions are identical • Cox proportional hazards model (Hazard Ratio, 95% CIs and p-value presented). Identifies which variables from a group of several are independently related to survival, in what order of importance, and gives you a measure of their relation to survival
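
The observed-versus-expected comparison behind the log-rank test can be sketched for two arms as follows. The follow-up data are hypothetical, the function name is illustrative, and the sketch does not guard against degenerate inputs (for example, no events at all).

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-arm log-rank chi-squared statistic (1 df) and p-value."""
    data = [(t, e, "A") for t, e in zip(times_a, events_a)]
    data += [(t, e, "B") for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both arms
        n_a = sum(1 for u, _, g in data if u >= t and g == "A")  # at risk, arm A
        d = sum(1 for u, e, _ in data if u == t and e)           # events at t, both arms
        d_a = sum(1 for u, e, g in data if u == t and e and g == "A")
        observed_a += d_a
        expected_a += d * n_a / n          # expected under identical survival
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical data: every patient has an event; arm A fails early, arm B late.
lr_chi2, lr_p = logrank_test([1, 2, 3, 4, 5], [True] * 5,
                             [6, 7, 8, 9, 10], [True] * 5)
print(round(lr_chi2, 2), round(lr_p, 4))
```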

  37. Forest plots [Bars=95% confidence interval. Size of boxes can represent sample size]

  38. Longitudinal data analysis • A variable can be measured on the same patient over time (e.g. Baseline, 3 month, 6 month …) • Can be any type of data (categorical, continuous)

  39. Longitudinal data analysis - Summary Measures [plot not preserved: change from baseline in global QOL (improvement vs deterioration); arm labels: TRT A, TRT B, CMF, ECMF; change at 1 year p=0.01; change at 2 years p=0.06]

  40. Longitudinal data analysis - Modelling. Pulmonary function (TLCO score) over time. Graphs show each patient as a separate line (solid line = Trt A patients, dashed line = Trt B patients). Random effects modelling predicts the average patient score on each treatment arm

  41. Cluster Randomised Trial data • Patients within 1 cluster are often more likely to respond in a similar manner, and thus cannot be assumed to act independently • ICC = Intracluster Correlation Coefficient: a statistical measure of this dependence • Takes values between 0 and 1 • Higher values = greater between-cluster variation, e.g. management within sites is consistent but, across different sites, there is wide variation • Analysis must incorporate the effects of clustering, i.e. the values of the ICC and design effect
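
The link between the ICC and the design effect can be sketched with the standard formula for equal cluster sizes, design effect = 1 + (m - 1) × ICC, where m is the cluster size. The formula itself is not on the slide, and the numbers below are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation due to clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1.0) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent patients the clustered sample is 'worth'."""
    return n_total / design_effect(cluster_size, icc)

# Hypothetical trial: 400 patients in clusters of 20, ICC = 0.05.
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(400, 20, 0.05)
print(round(deff, 2), round(n_eff, 1))
```

Even a small ICC can noticeably reduce the effective sample size when clusters are large.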

  42. Useful References • Gore & Altman - Statistics in Practice • Bland - An Introduction to Medical Statistics • Altman - Practical Statistics for Medical Research • Peto et al - Design and Analysis of Randomized Clinical Trials Requiring Prolonged Observation of Each Patient. 1/ Introduction and Design. British Journal of Cancer 1976; 34(6): 585-612. 2/ Analysis and Examples. British Journal of Cancer 1977; 35(1): 1-39
