Statistics 101

Why statistics?
• To understand studies in clinical journals.
• To design and analyze clinical research studies.
• To be better able to explain epidemiologic research to patients.
• To answer questions on board examinations.

Types of Clinical Research Studies
• Cohort: all patients have some condition or something in common (e.g., healthy and living in Framingham, MA).
• Case-control: cases have some condition; controls do not.
  • Often an aspect of a cohort study, in which controls are 'matched' with cases for age, gender, and sometimes other variables such as date of admission or date of encounter.
• Randomized, placebo-controlled treatment trial: all patients have the condition.
  • May be unblinded, single-blinded, or double-blinded.
• Randomized, active-treatment-controlled trial: all patients have the condition.
  • Often a phase 3 trial.
• Meta-analysis: multiple studies of the same condition, although the definition of the condition may vary from study to study.

Types of Variables
• Continuous: age, BP, CRP, AST, CK, glucose, height, weight, BMI, etc.
• Categorical: gender, obese, cure, MI, race, old vs. young, etc.

Basic Statistical Terms
• Range: the two extreme values (min and max).
• Mean: the average value (uses all values).
• Median: the middle value (ignores extreme values), which divides the population into two equal subgroups.
• Quartiles: divide all values into 4 groups.
• Tertiles, quintiles, percentiles.
• Standard deviation (SD): measures the degree of difference among all values, i.e., their spread around the mean (uses all values).
  $SD = \sqrt{\sum (\text{difference from the mean})^2 / (n-1)}$

A simple example of standard deviation
• With $d$ the difference of each value from the mean and $n = 5$ values: $\sum d^2/(n-1) = 58/4 = 14.5$
• $\sqrt{14.5} = 3.8$
• SD = 3.8
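The slide does not list the underlying values, so the sketch below uses five hypothetical values (an assumption, chosen only so that the squared differences from the mean sum to 58) to retrace the arithmetic.

```python
import math
import statistics

# Hypothetical data: not from the slides, chosen so the squared
# differences from the mean add up to 58 with n = 5.
values = [5, 8, 10, 12, 15]

mean = sum(values) / len(values)
sum_sq_dev = sum((x - mean) ** 2 for x in values)  # sum of d^2
variance = sum_sq_dev / (len(values) - 1)          # divide by n - 1
sd = math.sqrt(variance)

print(f"sum d^2 = {sum_sq_dev}, variance = {variance}, SD = {sd:.1f}")

# statistics.stdev uses the same n - 1 denominator.
assert math.isclose(sd, statistics.stdev(values))
```

Dividing by n - 1 rather than n is what makes this the sample standard deviation.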

Serum [Na+] in 135 normals
Mean, 140; median, 140; range, 135-145 mM; standard deviation, 2

The normal (bell-shaped) distribution
• Imagine 2 curves with the same mean but different SDs (one wider and less precise; the other narrower and more precise).
  • Confidence intervals will differ.
• Now imagine two curves with different means and standard deviations from this curve.
  • Statistical tests are designed to tell us to what extent these different curves could have occurred by chance.
[Figure: bell curve centered on the mean; x-axis: number of standard deviations (SD) from the mean.]
95% of values are within 1.96 SD of the mean.
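The 1.96 figure can be checked against the standard normal distribution. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `fraction_within` is illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def fraction_within(k):
    """Fraction of a normal population lying within k SDs of the mean."""
    return z.cdf(k) - z.cdf(-k)

# Multiplier that leaves 2.5% in each tail, i.e. 95% in the middle.
z_95 = z.inv_cdf(0.975)

print(f"within 1.96 SD: {fraction_within(1.96):.3f}")
print(f"multiplier for 95%: {z_95:.2f}")
```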

Some important statistical concepts
• Confidence intervals (usually reported as 95% CI)
• Number needed to treat (or harm)
• Absolute and relative risk or benefit reductions (or increases)
• 2-by-2 tables (chi square, Fisher exact, Mantel-Haenszel, others)
• Odds or hazard ratios
• Type 1 and 2 errors (Statistics 102)
• Estimating sample size needed for a study (Statistics 102)
• Pre- and post-test probabilities and likelihood ratios (Statistics 102)
Ann Int Med 2009; 150: JC6-16

95% Confidence interval (CI): Example 1
H. pylori eradication/NSAID study with an outcome of ulcer or no ulcer (categorical outcome):
• 5 of 51 (10%, or .10) Hp+ pts. who received antibiotics got ulcers when exposed to NSAID.
• 15 of 49 (31%, or .31) Hp+ pts. who did not receive antibiotics got ulcers when exposed to NSAID.
What is the probability that this difference in outcome was due to chance and not the antibiotics?
Lancet 2002; 359: 9-13.

95% CIs
The proportions, p1 and p2, of patients who got ulcers in the 2 groups are estimates of the true rates. From each estimate we can be 95% confident that the actual rate lies between A and B, with the estimate (p1 or p2) at the center of the interval from A to B. A and B are the 95% confidence limits; A→B is the 95% confidence interval.

95% Confidence interval (CI)
To calculate the 95% CI for p (i.e., A and B), use this formula:
$p \pm 1.96\sqrt{p(1-p)/n}$
The larger the n, which is in the denominator, the smaller (more precise) the CI.

5 of 51 (p1 = 10%, or .10) of the antibiotic group got ulcers when exposed to NSAID for a fixed time.
• 95% CI $= .10 \pm 1.96\sqrt{(.1)(.9)/51} = .10 \pm .08 = [.02, .18]$, or [2%, 18%]
15 of 49 (p2 = 31%, or .31) of the placebo group got ulcers when exposed to NSAID for a fixed time.
• 95% CI $= .31 \pm 1.96\sqrt{(.31)(.69)/49} = .31 \pm .13 = [.18, .44]$, or [18%, 44%]
Note: the two 95% CIs barely touch (one ends and the other begins at about 18%), which suggests that the difference is unlikely to be due to chance. But is the ARR significant?
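The two intervals can be reproduced with a few lines of Python. This is a minimal sketch of the normal-approximation formula from the previous slide; the function name `proportion_ci` is illustrative, and it uses the exact proportions (5/51 and 15/49) rather than the rounded .10 and .31, so the third decimal may differ slightly from hand calculation.

```python
import math

def proportion_ci(events, n, z=1.96):
    """Normal-approximation CI for a proportion: p +/- z * sqrt(p(1-p)/n)."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

for label, events, n in [("antibiotics", 5, 51), ("no antibiotics", 15, 49)]:
    p, low, high = proportion_ci(events, n)
    print(f"{label}: p = {p:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

This approximation becomes unreliable when the number of events is very small, so it is best treated as a quick check rather than a replacement for a statistics package.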

Absolute risk reduction (ARR) (and its 95% CI)
• The ARR with antibiotics was 31% minus 10%, or 21%.
• The 95% CI of the ARR $= 21\% \pm 1.96\sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2} = 21\% \pm 15\%$, or [6%, 36%].
• The ARR with antibiotics is somewhere between 6% and 36%, with 95% confidence.
• This CI does not include zero, and thus the reduction is unlikely to be due to chance.
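The same calculation, sketched in Python with the counts from the H. pylori example. The function name `arr_ci` and its argument order are assumptions made for illustration.

```python
import math

def arr_ci(events_ctrl, n_ctrl, events_trt, n_trt, z=1.96):
    """Absolute risk reduction (control risk minus treated risk) and its CI."""
    p_ctrl = events_ctrl / n_ctrl
    p_trt = events_trt / n_trt
    arr = p_ctrl - p_trt
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_ctrl * (1 - p_ctrl) / n_ctrl + p_trt * (1 - p_trt) / n_trt)
    return arr, arr - z * se, arr + z * se

# 15 of 49 ulcers without antibiotics, 5 of 51 with antibiotics.
arr, low, high = arr_ci(15, 49, 5, 51)
print(f"ARR = {arr:.0%}, 95% CI = [{low:.0%}, {high:.0%}]")
```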

Number needed to treat (NNT)
• If the absolute risk reduction (ARR) = 31% - 10% = 21%, the number needed to treat = 1/ARR = 1/.21 = 4.8, rounded up to 5.
• Number needed to harm is the same concept as number needed to treat, except that the intervention causes harm rather than good.
  • e.g., how many patients need to be treated with antibiotics to produce one drug rash.
• Easy to calculate the 95% CI of the NNT:
  • http://www.graphpad.com/quickcalcs/index.cfm

Example: A new protease inhibitor is tested in chronic hepatitis C, genotype 1. The new therapy (added to the standard therapy, interferon alpha/ribavirin) or standard therapy alone is randomly given to 200 patients for 48 weeks. Sustained viral response (SVR) rates were as follows: What is the number needed to treat to achieve 1 additional SVR?

Number (n) needed to treat (NNT)
$NNT = \dfrac{1}{(\text{SVR, new} / \#\,\text{new}) - (\text{SVR, control} / \#\,\text{control})}$
$NNT = \dfrac{1}{(83/99) - (50/101)} = \dfrac{1}{.343} \approx 3$
Note that the denominator, .343 (34.3%), is the absolute risk reduction (ARR). NNT = 1/ARR.
Using http://www.graphpad.com/quickcalcs/index.cfm:
• 95% CI of ARR = 0.222 to 0.465.
• 95% CI of NNT = 2.2 to 4.5.
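A rough way to reproduce these numbers without the online calculator is to compute the 95% CI of the difference in proportions and take reciprocals of its limits. The sketch below assumes that simple normal-approximation approach; whether the GraphPad calculator uses exactly this method is not stated in the slides, and the function name is illustrative.

```python
import math

def nnt_with_ci(events_new, n_new, events_ctrl, n_ctrl, z=1.96):
    """NNT = 1 / difference in proportions, with a CI from the inverted limits."""
    p_new = events_new / n_new
    p_ctrl = events_ctrl / n_ctrl
    diff = p_new - p_ctrl
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ctrl * (1 - p_ctrl) / n_ctrl)
    diff_low, diff_high = diff - z * se, diff + z * se
    if diff_low <= 0:
        # Inverting the limits is only meaningful when the CI excludes zero.
        raise ValueError("CI of the difference includes zero")
    # The larger difference corresponds to the smaller NNT, so the limits swap.
    return 1 / diff, 1 / diff_high, 1 / diff_low

# SVR in 83 of 99 on the new therapy vs. 50 of 101 on standard therapy.
nnt, nnt_low, nnt_high = nnt_with_ci(83, 99, 50, 101)
print(f"NNT = {nnt:.1f} (95% CI {nnt_low:.1f} to {nnt_high:.1f})")
```

In practice the NNT is rounded up to the next whole patient, which gives 3 here.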

RRR
• Relative risk reduction (RRR) = ARR / risk with placebo.
• In the H. pylori/NSAID example, RRR = 21%/31% = 68%.
• Treat 1,000 pts. with NSAID → 310 ulcers (31%).
• Treat 1,000 pts. with NSAID + antibiotics → 100 ulcers (10%).
• Antibiotic use prevented 210 ulcers (210/310 = 68% = RRR).
• Antibiotic use reduced ulcers from 310 to 100, or to 32% of expected, an RRR of 68%.
• Note: Length of exposure to NSAID in this study was identical in the 2 groups. If two groups are not followed for an identical time, as is often the case in trials, outcomes may be higher in the group followed longer, and thus events need to be expressed per unit of time (e.g., events per 100 patient-years).
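The arithmetic on this slide, sketched in Python. The risks (.31 and .10) come from the example; the follow-up figures passed to `rate_per_100_patient_years` are hypothetical and serve only to show how events are scaled to person-time.

```python
def relative_risk_reduction(risk_ctrl, risk_trt):
    """RRR = ARR / risk in the control (placebo) group."""
    return (risk_ctrl - risk_trt) / risk_ctrl

def rate_per_100_patient_years(events, patient_years):
    """Event rate scaled to 100 patient-years of follow-up."""
    return 100 * events / patient_years

risk_ctrl, risk_trt = 0.31, 0.10
rrr = relative_risk_reduction(risk_ctrl, risk_trt)

# Expected ulcers per 1,000 patients treated in each arm.
ulcers_ctrl = round(1000 * risk_ctrl)
ulcers_trt = round(1000 * risk_trt)
prevented = ulcers_ctrl - ulcers_trt
print(f"RRR = {rrr:.0%}; ulcers prevented per 1,000 treated = {prevented}")

# Hypothetical follow-up: 12 events observed over 400 patient-years.
print(rate_per_100_patient_years(12, 400), "events per 100 patient-years")
```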

Example 2: VTE or no VTE (categorical outcome)
14 of 255 (p1 = 5.5%, or .055) patients with VTE switched to low-intensity warfarin developed another VTE.
• 95% CI = [2.6%, 8.4%]
37 of 253 (p2 = 14.6%, or .146) switched to placebo developed another VTE.
• 95% CI = [10.3%, 18.9%]
Is this 9.1% difference in VTE likely to be due to chance?
New Engl. J. Med. 2003; 348: 1425-1434
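One way to approach the question is to apply the ARR confidence-interval formula from Example 1 to these counts and check whether the interval includes zero. The sketch below does only that; it is not the analysis reported in the cited trial.

```python
import math

# Recurrent VTE: 37 of 253 on placebo vs. 14 of 255 on low-intensity warfarin.
p_placebo = 37 / 253
p_warfarin = 14 / 255

diff = p_placebo - p_warfarin
se = math.sqrt(
    p_placebo * (1 - p_placebo) / 253 + p_warfarin * (1 - p_warfarin) / 255
)
low, high = diff - 1.96 * se, diff + 1.96 * se

print(f"difference = {diff:.1%}, 95% CI = [{low:.1%}, {high:.1%}]")
print("CI excludes zero:", low > 0)
```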

Example 3: Chi Square/Fisher Exact Tests (used for categorical outcomes)
• A new treatment for colitis is compared to the standard treatment in 245 patients.
• 120 patients are randomized to the new treatment and 125 to the standard treatment.
• 90 patients given the new treatment go into remission (75%) and 30 (25%) do not.
• 75 patients given the standard treatment go into remission (60%) and 50 (40%) do not.
• Is this a significant improvement in outcome, or to what extent could this have been due to chance? Let's vote!

Step 1: standard 2×2 table

               REMIT    NO REMIT    Total
New Rx         a        b           a+b
Standard Rx    c        d           c+d
Total          a+c      b+d         a+b+c+d = n = total patients in study

Enter the data from our study

               REMIT        NO REMIT     Total
New Rx         90 (a)       30 (b)       120 (a+b)
Standard Rx    75 (c)       50 (d)       125 (c+d)
Total          165 (a+c)    80 (b+d)     245 (a+b+c+d) = n

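Calculate chi square ($\chi^2$) by plugging numbers into a handheld or online calculator
$\chi^2 = \dfrac{n\,(|ad-bc| - n/2)^2}{(a+b)(c+d)(a+c)(b+d)}$
The $n/2$ term is the continuity (Yates) correction. The value reported here, $\chi^2 = 6.264$ (p = 0.0123), is what the formula gives without that term; with the correction the statistic is about 5.60 (p ≈ 0.018).
http://www.graphpad.com/quickcalcs/index.cfm
Fisher exact test, p = 0.0143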
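As a check on the calculator, the statistic can be computed directly from a, b, c, and d. This sketch evaluates the formula both without and with the n/2 continuity correction and converts each to a p-value for 1 degree of freedom using `math.erfc`; the function names are illustrative.

```python
import math

def chi_square_2x2(a, b, c, d, continuity=False):
    """Chi-square statistic for a 2x2 table, optionally continuity-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        diff = max(diff - n / 2, 0)  # subtract n/2, never going below zero
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

a, b, c, d = 90, 30, 75, 50  # new Rx remit/no remit, standard Rx remit/no remit
chi2 = chi_square_2x2(a, b, c, d)
chi2_corrected = chi_square_2x2(a, b, c, d, continuity=True)

print(f"uncorrected: chi2 = {chi2:.3f}, p = {p_value_1df(chi2):.4f}")
print(f"corrected:   chi2 = {chi2_corrected:.3f}, p = {p_value_1df(chi2_corrected):.4f}")
```

Both versions fall below the 0.05 threshold for this table, so the conclusion does not depend on which one is used.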

We could also have calculated the odds ratio (OR) for a remission:

               REMIT     NO REMIT
New Rx         a = 90    b = 30
Standard Rx    c = 75    d = 50

odds ratio = ad/bc
odds ratio = 4,500/2,250 = 2
But this odds ratio of 2 could have occurred by chance. We can calculate the 95% CI of the odds ratio to see whether the CI includes 1 or not. If not, it favors the new treatment with >95% confidence.

95% CI of the odds ratio (OR)
• ln 95% CI $= \ln OR \pm 1.96\sqrt{1/a + 1/b + 1/c + 1/d}$
• The OR = 2.00, and so ln 2.00 = 0.693 (e ≈ 2.72).
• $\sqrt{1/90 + 1/30 + 1/75 + 1/50} = \sqrt{0.0778} = 0.279$, and $1.96 \times 0.279 = 0.547$.
• Thus ln 95% CI $= 0.693 \pm 0.547$ = 0.146, 1.240.
• To find the CI, we need the antilog (base e) of 0.146 and of 1.240.
• Antiln 0.146 $= e^{0.146} = 1.16$; antiln 1.240 $= e^{1.240} = 3.45$.
• ∴ 95% CI = 1.16, 3.45.
• Thus, the odds ratio for a remission with the new treatment is 2.00 (95% CI = 1.16, 3.45).
• As this CI does not include 1.00, the difference is unlikely to be due to chance and is significant at the 0.05 level.
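The steps in this slide can be followed in a few lines of Python, working on the log scale and exponentiating at the end. This is a minimal sketch of the formula as written on the slide; the function name `odds_ratio_ci` is illustrative.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with a CI computed on the natural-log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_low = math.log(odds_ratio) - z * se_log
    log_high = math.log(odds_ratio) + z * se_log
    # Exponentiate to return from the log scale to the odds-ratio scale.
    return odds_ratio, math.exp(log_low), math.exp(log_high)

# New Rx: 90 remit, 30 do not; standard Rx: 75 remit, 50 do not.
odds_ratio, low, high = odds_ratio_ci(90, 30, 75, 50)
print(f"OR = {odds_ratio:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
print("CI excludes 1:", low > 1 or high < 1)
```

The interval is not symmetric around the odds ratio because it is symmetric on the log scale instead.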
