Medical statistics for cardiovascular disease, Part 1. Giuseppe Biondi-Zoccai, MD, Sapienza University of Rome, Latina, Italy. giuseppe.biondizoccai@uniroma1.it gbiondizoccai@gmail.com. Learning milestones. Key concepts. Bivariate analysis. Complex bivariate analysis
Medical statistics for cardiovascular disease, Part 1
Giuseppe Biondi-Zoccai, MD
Sapienza University of Rome, Latina, Italy
giuseppe.biondizoccai@uniroma1.it gbiondizoccai@gmail.com
Learning milestones
• Key concepts
• Bivariate analysis
• Complex bivariate analysis
• Multivariable analysis
• Specific advanced methods
Why do you need to know statistics? CLINICIAN RESEARCHER
The EBM 3-step approach
How an article should be appraised, in 3 steps:
Step 1: Are the results of the study (internally) valid?
Step 2: What are the results?
Step 3: How can I apply these results to patient care?
Guyatt and Rennie, Users’ guide to the medical literature, 2002
The Cochrane Collaboration Risk of Bias Tool http://www.cochrane.org
The ultimate goal of any clinical or scientific observation is the appraisal of causality
Bradford Hill causality criteria
• Strength:* precisely defined (p<0.05, weaker criterion) and with a strong relative risk (≤0.83 or ≥1.20) in the absence of multiplicity issues (stronger criterion)
• Consistency:* results in favor of the association must be confirmed in other studies
• Temporality: exposure must precede the event in a realistic fashion
• Coherence: the hypothetical cause-effect relationship is not in contrast with other biologic or natural history findings
*statistics is important here
Mente et al, Arch Intern Med 2009
Bradford Hill causality criteria (continued)
• Biologic gradient:* exposure dose and risk of disease are positively (or negatively) associated on a continuum
• Experimental: experimental evidence from laboratory studies (weaker criterion) or randomized clinical trials (stronger criterion)
• Specificity: exposure is associated with a single disease (does not apply to multifactorial conditions)
• Plausibility: the hypothetical cause-effect relationship makes sense from a biologic or clinical perspective (weaker criterion)
• Analogy: the hypothetical cause-effect relationship is based on analogic reasoning (weaker criterion)
*statistics is important here
Mente et al, Arch Intern Med 2009
Randomization
• Randomization is the technique which defines experimental studies in humans (but not only in them), and enables the correct application of statistical tests of hypothesis in a frequentist framework (according to Ronald Fisher's theory)
• Randomization means assigning a patient (or a study unit) at random to one of the treatments
• Over large numbers, randomization minimizes the risk of imbalances in patient or procedural features, but this does not hold true for small samples or for a large set of features
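The last point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 30% prevalence of a baseline feature, the trial sizes, the number of simulated trials and the function name `mean_imbalance` are all hypothetical choices, not taken from the slides.

```python
import random

def mean_imbalance(n_patients, prevalence=0.3, n_trials=200, seed=7):
    """Average absolute between-arm difference in the proportion of patients
    carrying a baseline feature, over many simulated 1:1 randomized trials."""
    rng = random.Random(seed)
    half = n_patients // 2
    total = 0.0
    for _ in range(n_trials):
        # Each simulated patient carries the feature with the given prevalence.
        has_feature = [rng.random() < prevalence for _ in range(n_patients)]
        # A random order, split in half, is a random 1:1 treatment assignment.
        rng.shuffle(has_feature)
        arm_a, arm_b = has_feature[:half], has_feature[half:]
        total += abs(sum(arm_a) / len(arm_a) - sum(arm_b) / len(arm_b))
    return total / n_trials

small_trial = mean_imbalance(20)     # 10 patients per arm
large_trial = mean_imbalance(2000)   # 1000 patients per arm
print(f"average imbalance, n=20:   {small_trial:.3f}")
print(f"average imbalance, n=2000: {large_trial:.3f}")
```

Under these assumptions the average imbalance is markedly larger in the small trial, which is the reason baseline tables still deserve attention in small randomized studies.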
Any clinical or scientific comparison can be viewed as…
• A battle between an underlying hypothesis (null, H0), stating that there is no meaningful difference or association (beyond random variability) between 2 or more populations of interest (from which we are sampling), and an alternative hypothesis (H1), which implies that there is a non-random difference between such populations.
• Any statistical test is a test trying to convince us that H0 is false (thus implying the working truthfulness of H1).
Falsifiability
• Falsifiability or refutability of a statement, hypothesis, or theory is an inherent possibility to prove it to be false.
• A statement is called falsifiable if it is possible to conceive an observation or an argument which proves the statement in question to be false.
• In this sense, falsify is synonymous with nullify, meaning not "to commit fraud" but "to show to be false".
Statistical or clinical significance?
• Statistical and clinical significance are 2 very different concepts.
• A clinically significant difference, if demonstrated beyond the play of chance, is clinically relevant and thus merits subsequent action (provided costs and tolerability issues are not prohibitive).
• A statistically significant difference is a probabilistic concept and should be viewed in light of the distance from the null hypothesis and the chosen significance threshold.
Descriptive statistics 100 100 AVERAGE
Inferential statistics If I become a scaffolder, how likely am I to eat well every day? Confidence intervals P values
Samples and populations This is a sample
Samples and populations And this is its universal population
Samples and populations This is another sample
Samples and populations And this might be its universal population
Samples and populations But what if THIS is its universal population?
Samples and populations Any inference thus depends on our confidence in its likelihood
Alpha and type I error Whenever I perform a test, there is a risk of a FALSE POSITIVE result, ie REJECTING A TRUE null hypothesis. This error is called type I, and its probability is denoted alpha: the significance threshold against which the p value is compared. The lower the alpha (and hence the lower the p value required to reject the null hypothesis), the lower the risk of falling into a type I error (ie the HIGHER the SPECIFICITY of the test).
Alpha and type I error Type I error is like a MIRAGE, because I see something that does NOT exist
Beta and type II error Whenever I perform a test, there is also a risk of a FALSE NEGATIVE result, ie NOT REJECTING A FALSE null hypothesis. This error is called type II, and its probability is denoted beta. The complement of beta (1-beta) is called power. The lower the beta, the lower the risk of missing a true difference (ie the HIGHER the SENSITIVITY of the test).
Beta and type II error Type II error is like being BLIND, because I do NOT see something that exists
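Both error rates can be estimated by simulation. The sketch below assumes a one-sample Z test with known SD; the sample size (25), the true difference (0.5 SD), the number of simulations and all function names are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist, mean

def two_sided_p(sample, mu0, sigma):
    """Two-sided p value of a one-sample Z test (population SD assumed known)."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated studies in which H0 (mean = 0) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if two_sided_p(sample, 0.0, sigma) < alpha:
            rejections += 1
    return rejections / n_sim

# H0 is true: every rejection is a false positive (type I error).
type_i_rate = rejection_rate(true_mean=0.0)
# H0 is false: every rejection is a true positive, so this estimates power.
power = rejection_rate(true_mean=0.5)
beta = 1 - power  # estimated risk of a false negative (type II error)
print(type_i_rate, power, beta)
```

With these settings the false positive rate should sit close to the chosen alpha, while beta depends on the size of the true difference and on the sample size.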
Accuracy and precision [Figure labels: true value, measurement, distance, spread] Accuracy measures the distance from the true value. Precision measures the spread in the measurements.
Accuracy and precision
Thus:
• Precision expresses the extent of RANDOM ERROR
• Accuracy expresses the extent of SYSTEMATIC ERROR (ie bias)
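The two concepts can be quantified separately, as in the sketch below. The true value and both sets of measurements are invented for illustration, and `bias_and_spread` is a hypothetical helper name.

```python
from statistics import mean, stdev

TRUE_VALUE = 100.0  # hypothetical true value of the quantity being measured

def bias_and_spread(measurements, true_value):
    """Return (bias, spread): systematic error and random error."""
    bias = mean(measurements) - true_value   # accuracy: distance from the truth
    spread = stdev(measurements)             # precision: dispersion of the values
    return bias, spread

# Hypothetical device 1: tightly clustered values, but systematically too high.
precise_but_biased = [104.0, 105.0, 106.0, 105.0, 105.0]
# Hypothetical device 2: centered on the truth, but widely scattered.
accurate_but_imprecise = [90.0, 110.0, 95.0, 105.0, 100.0]

bias_1, spread_1 = bias_and_spread(precise_but_biased, TRUE_VALUE)
bias_2, spread_2 = bias_and_spread(accurate_but_imprecise, TRUE_VALUE)
print(bias_1, spread_1)
print(bias_2, spread_2)
```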
Validity Internal validity entails both PRECISION and ACCURACY (ie does a study provide a truthful answer to the research question?). External validity expresses the extent to which the results can be applied to other contexts and settings. It corresponds to the distinction between SAMPLE and POPULATION.
Intention-to-treat analysis
• Intention-to-treat (ITT) analysis is an analysis based on the initial treatment intent, irrespective of the treatment eventually administered.
• ITT analysis is intended to avoid various types of bias that can arise in intervention research, especially procedural, compliance and survivor bias.
• However, ITT dilutes the power to achieve statistically and clinically significant differences, especially as drop-in and drop-out rates rise.
Per-protocol analysis
• In contrast to the ITT analysis, the per-protocol (PP) analysis includes only those patients who complete the entire clinical trial or other particular procedure(s), or have complete data.
• In PP analysis each patient is categorized according to the actual treatment received, and not according to the originally intended treatment assignment.
• PP analysis is highly prone to bias, and is useful almost only in equivalence or non-inferiority studies.
ITT vs PP
100 pts enrolled, then RANDOMIZATION:
• 50 pts to group A (more toxic). ACTUAL THERAPY: 45 pts treated with A, 5 shifted to B because of poor global health (all 5 died)
• 50 pts to group B (conventional Rx, less toxic). ACTUAL THERAPY: 50 patients treated with B (none died)
Results:
• ITT: 10% (5/50) mortality in group A vs 0% (0/50) in group B, p=0.021 in favor of B
• PP: 0% (0/45) mortality in group A vs 9.1% (5/55) in group B, p=0.038 in favor of A
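The two comparisons above can be recomputed from the counts. The slides do not say which test produced the p values; the sketch below assumes a Pearson chi-square test without continuity correction, which gives values close to those quoted (about 0.022 and 0.038). The function name `chi_square_2x2` is a hypothetical helper.

```python
import math

def chi_square_2x2(deaths_a, n_a, deaths_b, n_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.
    Returns (chi2, two-sided p value). The choice of test is an assumption."""
    a, b = deaths_a, n_a - deaths_a   # group A: dead, alive
    c, d = deaths_b, n_b - deaths_b   # group B: dead, alive
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Intention-to-treat: patients counted in the arm they were randomized to.
itt_chi2, itt_p = chi_square_2x2(deaths_a=5, n_a=50, deaths_b=0, n_b=50)
# Per-protocol (as on the slide): patients counted by treatment received.
pp_chi2, pp_p = chi_square_2x2(deaths_a=0, n_a=45, deaths_b=5, n_b=55)

print(f"ITT: chi2={itt_chi2:.2f}, p={itt_p:.3f}")
print(f"PP:  chi2={pp_chi2:.2f}, p={pp_p:.3f}")
```

The same 5 deaths point in opposite directions depending on how patients are classified. With expected counts this small, an exact test would usually be preferred and would give larger p values.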
Mean (arithmetic)
• Characteristics:
  - summarises information well
  - discards a lot of information (eg dispersion)
• Assumptions:
  - data are not skewed: skewness distorts the mean, and outliers make the mean very different
  - data are measured on a measurement scale: one cannot find the mean of a categorical measure, and an ‘average’ stent diameter may be meaningless
Median
• What is it? The one in the middle: place the values in order, and the median is the central one
• Definition: the value with as many observations below it as above it
• Used for:
  - ordinal data
  - skewed data / outliers
Standard deviation
$$S = SD = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$$
(the quantity under the square root is the variance)
• Standard deviation (SD): approximates the population σ as N increases
• Advantages: with the mean, it enables a powerful synthesis
  - mean ± 1 SD: about 68% of data
  - mean ± 2 SD: about 95% of data (exactly 95% at 1.96 SD)
  - mean ± 3 SD: about 99.7% of data (99% at 2.58 SD)
• Disadvantages: is based on normal assumptions
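A short sketch of the sample SD (N-1 in the denominator) and of the share of a normal distribution lying within k SD of the mean. The data values are hypothetical and `normal_coverage` is an illustrative helper name.

```python
import statistics
from statistics import NormalDist

# Hypothetical measurements (eg a continuous variable in 5 patients).
values = [2.5, 3.0, 3.0, 3.5, 4.0]

mean = statistics.mean(values)
variance = statistics.variance(values)   # sum of squared deviations / (N - 1)
sd = statistics.stdev(values)            # square root of the variance

def normal_coverage(k):
    """Share of a normal distribution lying within mean ± k SD."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

print(f"mean={mean:.2f}, variance={variance:.3f}, SD={sd:.3f}")
for k in (1, 1.96, 2, 2.58, 3):
    print(f"mean ± {k} SD covers {normal_coverage(k):.1%}")
```

These coverage figures hold only if the data are approximately normal, which is the disadvantage noted above.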
Interquartile range
• 25th to 75th percentile, or
• 1st to 3rd quartile
Example (from the figure): 1st quartile = 16.5, 3rd quartile = 23.5, interquartile range = 23.5 − 16.5 = 7.0; the median lies between the two quartiles.
Coefficient of variation
$$CV = \frac{\text{Standard deviation}}{\text{Mean}} \times 100$$
• The coefficient of variation (CV) is an index of relative variability
• CV is dimensionless
• CV enables you to compare the data dispersion of variables with different units of measurement
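Quartiles and the interquartile range can be computed as sketched below. The data are hypothetical and are not the values behind the figure; note also that software packages use different quantile conventions, so results on small samples may differ slightly.

```python
import statistics

# Hypothetical observations, sorted for readability.
data = sorted([12, 15, 17, 19, 20, 22, 24, 25, 30])

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```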
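Because the CV is dimensionless, it does not change when the unit of measurement changes. A minimal sketch, with invented height values and a hypothetical helper name:

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV = SD / mean x 100 (meaningful only for positive, ratio-scale data)."""
    return stdev(values) / mean(values) * 100

# Hypothetical heights, expressed first in centimetres and then in metres.
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(f"CV in cm: {cv_cm:.2f}%, CV in m: {cv_m:.2f}%")  # same relative variability
```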
Learning milestones
• Key concepts
• Bivariate analysis
• Complex bivariate analysis
• Multivariable analysis
• Specific advanced methods
Point estimation & confidence intervals
• Using summary statistics (mean and standard deviation for normal variables, or proportion for categorical variables) and factoring in sample size, we can build confidence intervals or test hypotheses on whether or not we are sampling from a given population
• This can be done by creating a powerful tool, which weighs our dispersion measures by means of the sample size: the standard error
First you need the SE
• We can easily build the standard error of a proportion, according to the following formula:
$$SE = \sqrt{\frac{P \times (1 - P)}{n}}$$
• where variance = P × (1 − P) and n is the sample size
Point estimation & confidence intervals
• We can then create a simple test to check whether the summary estimate we have found is compatible, allowing for random variation, with the corresponding reference population mean
• The Z test (when the population SD is known) and the t test (when the population SD is only estimated) are thus used, and both can be viewed as a signal to noise ratio
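The formula for the standard error of a proportion translates directly into code. The proportion and sample sizes below are hypothetical, as is the function name.

```python
import math

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(P * (1 - P) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical example: 10% event rate observed in 100 and then in 400 patients.
se_100 = se_proportion(0.10, 100)
se_400 = se_proportion(0.10, 400)
print(se_100, se_400)  # quadrupling the sample size halves the SE
```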
Signal to noise ratio
$$\text{Signal to noise ratio} = \frac{\text{Signal}}{\text{Noise}}$$
From the Z test…
$$\text{Signal to noise ratio} = \frac{\text{Signal}}{\text{Noise}} \qquad Z\ \text{score} = \frac{\text{Absolute difference in summary estimates}}{\text{Standard error}}$$
Results of the Z score correspond to a distinct tail probability of the Gaussian curve (eg 1.96 corresponds to a 0.025 one-tailed probability or a 0.050 two-tailed probability)
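The correspondence between a Z score and its tail probabilities can be checked numerically. In this sketch the function names and the example numbers passed to `z_score` are hypothetical.

```python
from statistics import NormalDist

def z_score(estimate, reference, standard_error):
    """Signal (absolute difference) divided by noise (standard error)."""
    return abs(estimate - reference) / standard_error

def tail_probabilities(z):
    """One-tailed and two-tailed probabilities under the standard normal curve."""
    one_tailed = 1 - NormalDist().cdf(z)
    return one_tailed, 2 * one_tailed

one_tailed, two_tailed = tail_probabilities(1.96)
print(f"z=1.96: one-tailed {one_tailed:.3f}, two-tailed {two_tailed:.3f}")

# Hypothetical example: sample mean 130 vs reference mean 120, with SE 5.
example_z = z_score(130, 120, 5)
print(example_z, tail_probabilities(example_z))
```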
…to confidence intervals
Standard error (SE or SEM) can be used to test a hypothesis or create a confidence interval (CI) around a mean for a continuous variable (eg mortality rate)
$$95\%\ CI = \text{mean} \pm 2\,SE \qquad SE = \frac{SD}{\sqrt{n}}$$
(more precisely, mean ± 1.96 SE)
95% means that, if we repeat the study 20 times, on average 19 times out of 20 the interval will include the true population average
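A minimal sketch of a confidence interval around a mean, using the normal approximation described above. The data are invented and `mean_confidence_interval` is a hypothetical helper; for small samples a t quantile would normally replace the normal one, giving a slightly wider interval.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, level=0.95):
    """CI for a mean using SE = SD / sqrt(n) and a normal quantile (assumption)."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% CI
    return m - z * se, m + z * se

# Hypothetical continuous measurements in 10 patients.
values = [52, 55, 48, 60, 57, 50, 53, 58, 49, 56]

low95, high95 = mean_confidence_interval(values, 0.95)
low99, high99 = mean_confidence_interval(values, 0.99)
print(f"mean={mean(values):.1f}, 95% CI {low95:.1f} to {high95:.1f}")
print(f"99% CI {low99:.1f} to {high99:.1f}")  # higher confidence, wider interval
```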
Ps and confidence intervals P values and confidence intervals are strictly connected. Any hypothesis test providing a significant result (eg p=0.045) means that the 95% confidence interval for the population average difference excludes zero (ie the null hypothesis); equivalently, the 95.5% confidence interval just reaches zero.
P values and confidence intervals [Figure: confidence intervals plotted against H0, contrasting an important difference with a trivial difference, and a significant difference (p<0.05) with a non-significant difference (p>0.05)]
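The link between p values and confidence intervals can be verified numerically under a normal approximation. The estimate and standard error below are hypothetical, chosen so that the two-sided p value is about 0.045.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_sided_p(estimate, se):
    """Two-sided p value for H0: true difference = 0 (normal approximation)."""
    return 2 * (1 - STD_NORMAL.cdf(abs(estimate) / se))

def confidence_interval(estimate, se, level):
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

# Hypothetical difference between two groups and its standard error.
estimate, se = 2.0, 1.0

p = two_sided_p(estimate, se)                        # about 0.0455
ci95 = confidence_interval(estimate, se, 0.95)       # excludes zero, since p < 0.05
ci_at_p = confidence_interval(estimate, se, 1 - p)   # lower bound sits at zero

print(f"p={p:.4f}")
print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f}")
print(f"{1 - p:.1%} CI: {ci_at_p[0]:.2f} to {ci_at_p[1]:.2f}")
```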
Power and sample size
Whenever designing a study or analyzing a dataset, it is important to estimate the sample size or the power of the comparison.
SAMPLE SIZE: Setting a specific alpha and a specific beta, you calculate the necessary sample size given the average inter-group difference and its variation.
POWER: Given a specific sample size and alpha, in light of the calculated average inter-group difference and its variation, you obtain an estimate of the power (ie 1-beta).
Hierarchy of analysis
A statistical analysis can be:
• Univariate (e.g. when describing mean or standard deviation)
• Bivariate (e.g. when comparing age in men and women)
• Multivariable (e.g. when appraising how age and gender impact on the risk of death)
• Multivariate (e.g. when appraising how age and gender simultaneously impact on risk of death and hospital costs)
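Both calculations can be sketched for the simplest case: comparing the means of two equally sized groups with a common SD, using the normal approximation n per group = 2 (z_alpha/2 + z_beta)² SD² / difference². The slides do not give a formula, so this choice, the example numbers and the function names are assumptions.

```python
import math
from statistics import NormalDist

Z = NormalDist()

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per group needed to detect a mean difference `delta`
    (two-sided alpha, equal groups, common SD, normal approximation)."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_beta = Z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)

def power_for_n(n, delta, sd, alpha=0.05):
    """Approximate power (1 - beta) with n patients per group.
    The negligible probability of rejecting in the wrong direction is ignored."""
    se = sd * math.sqrt(2 / n)
    return Z.cdf(delta / se - Z.inv_cdf(1 - alpha / 2))

# Hypothetical design: detect a difference of 5 units when the SD is 10.
needed = n_per_group(delta=5, sd=10)                  # alpha 0.05, power 0.80
achieved = power_for_n(needed, delta=5, sd=10)
underpowered = power_for_n(30, delta=5, sd=10)

print(needed, round(achieved, 3), round(underpowered, 3))
```

Halving the expected difference roughly quadruples the required sample size, which is why the assumed inter-group difference has to be clinically justified.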