topics in biostatistics: part 1

1. TOPICS IN BIOSTATISTICS: PART 1 Susan S. Ellenberg, Ph.D. Center for Clinical Epidemiology and Biostatistics U Penn School of Medicine

3. GENERAL APPROACH Concepts, not equations. Goal is to increase awareness of statistical considerations. Statistical software is widely available to do basic calculations. Good basic reference: Altman DG, Practical Statistics for Medical Research, Chapman & Hall/CRC.

4. ESTIMATION VS TESTING Sometimes the primary goal is to describe data. Then we are interested in estimation. We estimate parameters such as means, variances, and correlations. When the primary goal is to draw a conclusion about a state of nature or the result of an experiment, we are interested in statistical testing.

5. EXPLORATORY VS CONFIRMATORY STUDIES Exploring patterns in data can be very useful, even if specific hypotheses have not been set up beforehand. Such analyses can generate interesting new hypotheses; they can't generate final conclusions. If you want to be able to make a definitive statement, it is important to specify the hypothesis in very specific terms, in advance of the experiment.

6. TYPES OF DATA Independent: each observation from a different subject. Paired: two observations (eg, before and after some intervention, left and right eyes) in the same subject, or in closely related subjects (eg, siblings for genetics studies). Clustered: multiple observations on each subject. When designing a study and conducting analyses, need to use methods appropriate to the data type.

7. DESIGNING A (CONFIRMATORY) STUDY First requirement: a specific hypothesis to be tested (what will be measured, criteria for "success"). Usual convention: establish a "null hypothesis," then attempt to disprove it. Need to be specific: there should be no ambiguity about the primary hypothesis. Possible ambiguities are not always obvious.

8. EXAMPLE: STUDYING "LIQUID STITCHES" New material to apply to a wound to stop bleeding. Need to study how quickly and effectively bleeding is stopped. Possible outcomes of interest:

9. OTHER AMBIGUITIES Given a specific hypothesis, what statistical test will be used to evaluate that hypothesis? What potentially confounding variables will be accounted for in the analysis? Size of subject, size of wound, other demographics (age, gender, ...). How will missing data be handled?

12. COMMON PHRASES RELATED TO "MULTIPLE TESTING" Testing to a foregone conclusion. Data dredging. Torturing the data until they confess.
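
A small sketch of why these phrases matter, assuming the tests are independent and each is run at the same significance level (both assumptions are ours, not stated on the slide). The function name is illustrative only.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent
    tests, each run at level alpha, when no real effect exists anywhere."""
    return 1.0 - (1.0 - alpha) ** n_tests


# The more comparisons are tried, the more likely a spurious "finding" becomes.
for k in (1, 5, 20):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

With 20 looks at the data, the chance of at least one false positive is well over one half, which is the sense in which data can be "tortured until they confess."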

13. SIGNIFICANCE LEVEL The significance level is also commonly referred to as the alpha level, the false positive rate, or the Type I error rate. It is often loosely called the p-value, although strictly the p-value is computed from the observed data and then compared against the significance level. The significance level is defined as the probability of seeing an effect of a specified size just by chance, that is, if there really were no effect at all (the "null hypothesis"). We must specify a significance level when we design our experiment.
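
The "just by chance" idea can be illustrated by simulation. The sketch below is only an illustration under assumptions we chose: two groups drawn from the same normal distribution with a known SD of 1, compared with a two-sided z-test. The function name, group size, and number of trials are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def false_positive_rate(n_per_group=20, trials=4000, alpha=0.05, seed=0):
    """Fraction of simulated null experiments declared 'significant'."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided cutoff
    se = sqrt(2 / n_per_group)                     # SE of the difference, SD = 1
    hits = 0
    for _ in range(trials):
        # Both groups come from the same distribution: there is no true effect.
        a = fmean([rng.gauss(0, 1) for _ in range(n_per_group)])
        b = fmean([rng.gauss(0, 1) for _ in range(n_per_group)])
        if abs(a - b) / se > z_crit:
            hits += 1
    return hits / trials


print(false_positive_rate())  # should land near the chosen alpha of 0.05
```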

14. Bell-Shaped (Normal) Curve

15. ONE-SIDED OR TWO-SIDED A one-sided (or one-tailed) test is one that looks for effects in only one direction. "Side" or "tail" refers to the extreme end of the normal (bell-shaped) curve. If all of the false positive error is put into one of the tails, the requirement for significance is less stringent. No wide consensus among statisticians as to when one-sided tests are OK or when two-sided tests are needed.
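
To see what "less stringent" means in numbers, the following sketch compares the cutoff on the standard normal curve for the two cases. It assumes a normally distributed test statistic; `critical_z` is an illustrative name, not something from the slides.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_sided: bool = True) -> float:
    """Cutoff on the standard normal curve for a given false positive rate."""
    tail_area = alpha / 2 if two_sided else alpha  # split alpha across tails or not
    return NormalDist().inv_cdf(1 - tail_area)


print(round(critical_z(0.05, two_sided=True), 3))   # two-sided cutoff
print(round(critical_z(0.05, two_sided=False), 3))  # one-sided cutoff is smaller
```

At the same 0.05 level, the one-sided cutoff (about 1.64) is easier to exceed than the two-sided one (about 1.96).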

16. VIEWS ON ONE-SIDED TESTING One view: one-sided tests are OK when an effect is possible in one direction only. Example: a treatment to increase height. Another view: one-sided tests are OK anytime an effect is only of interest in one direction. Example: when evaluating a new drug for regulatory approval, action is taken only if there is a positive effect; negative effects are possible but treated like zero effects.

17. POWER The power of a study is the probability that it will yield a statistically significant result if there truly is an effect. 1 minus power is the Type II error, or false negative rate, or beta error. Power depends on: the size of the effect; the size of the study; the false positive rate you can live with (if we declare all experiments a success, we will have 100% power but a very high false positive rate).
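
The three dependencies listed above can be made concrete with a rough calculation. This is a sketch under assumptions we introduce: two equal-sized groups, a continuous outcome, a two-sided test, and a normal approximation in which the effect is expressed in SD units. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power for comparing two group means (normal approximation).

    effect_size is the true difference divided by the SD of the outcome.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)  # expected test statistic
    # Probability of landing beyond either cutoff when the effect is real.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


print(round(approx_power(0.5, 64), 2))   # a moderate effect with 64 per group
print(round(approx_power(0.5, 20), 2))   # same effect, smaller study
print(round(approx_power(1.0, 20), 2))   # larger effect, same small study
```

When the effect size is zero the function returns alpha itself, which ties the power calculation back to the false positive rate.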

18. FACTS ABOUT POWER There is always some effect for which power is high, even with a small sample size. For a given sample size, the power to detect an effect is higher when the effect is measured by a continuous variable (eg, lab value) than by a yes-no variable (eg, mortality at day 10). One typically wants a hypothesis-testing study to have power of 80-90%.

19. DETERMINING SAMPLE SIZE A study should be large enough so that if there is an effect of a size worth knowing about, the study will demonstrate the effect. To calculate the sample size, need: effect size of interest; error rates we will tolerate; variability of outcome measure.

20. COMPARISON OF CONTINUOUS OUTCOMES "Standardize" the effect size by dividing the effect size you want to confirm by the expected SD. For given power and significance level, sample size increases rapidly as the desired effect size gets smaller.

21. SAMPLE SIZE BY EFFECT SIZE (two-sided 0.05 significance level) [Figure or table not recovered: sample size for standardized effect sizes 0.2, 0.4, 0.6, 0.8]
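
How quickly sample size grows as the standardized effect shrinks can be sketched with the usual normal-approximation formula for comparing two means. The 80% power default and the function name are our assumptions; the values on the original slide were not recovered and may have been computed differently (for example, with a t-based method, which gives slightly larger numbers).

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Subjects needed in each of two groups (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance cutoff
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


for d in (0.2, 0.4, 0.6, 0.8):
    print(d, n_per_group(d))
```

Halving the effect size roughly quadruples the required sample size, because the effect size enters the formula squared.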

22. COMPARISON OF RATES/PROPORTIONS Need larger sample sizes when trying to detect differences between proportions than when comparing continuous outcomes. Reason: 0-1 data are less informative than continuous data. Use the binomial distribution rather than the normal distribution. For the calculation, need to specify the difference of interest, the expected proportion in the control group, and the error probabilities.

23. SAMPLE SIZE BY POWER AND RATES OF INTEREST (two-sided 0.05 significance level)

Event/success rates    pwr=0.80    pwr=0.90
0.20 vs 0.40           182         236
0.40 vs 0.60           214         278
0.10 vs 0.20           438         572
0.20 vs 0.30           626         824
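
The slides do not say how these numbers were obtained. As a hedged sketch, one common textbook approach (a normal approximation with a continuity correction, attributed to Fleiss) gives totals across both groups that appear to match the table when the per-group figure is rounded to the nearest whole subject. Treat the formula choice, the rounding, and the function name as our assumptions.

```python
from math import sqrt
from statistics import NormalDist


def total_sample_size(p1: float, p2: float, power: float = 0.80,
                      alpha: float = 0.05) -> int:
    """Total subjects (two equal groups) to detect p1 vs p2, two-sided test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    # Uncorrected per-group size from the normal approximation.
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    # Continuity correction inflates n to account for the discreteness of counts.
    n_corrected = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return 2 * round(n_corrected)


for rates in ((0.20, 0.40), (0.40, 0.60), (0.20, 0.30)):
    print(rates, total_sample_size(*rates, power=0.80),
          total_sample_size(*rates, power=0.90))
```

Note how the same absolute difference of 0.20 needs more subjects when the rates sit near 0.5, where the variability of a proportion is greatest.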

24. DESCRIBING DATA Two basic aspects of data: centrality and variability. Different measures for each. Optimal measure depends on the type of data being described.

25. CENTRALITY Mean: sum of observed values divided by number of observations; most common measure of centrality; most informative when data follow a normal distribution (bell-shaped curve). Median: "middle" value, where half of all observed values are smaller and half are larger; best centrality measure when data are skewed. Mode: most frequently observed value.

26. MEAN CAN MISLEAD Group 1 data: 1,1,1,2,3,3,5,8,20. Mean: 4.9, Median: 3, Mode: 1. Group 2 data: 1,1,1,2,3,3,5,8,10. Mean: 3.8, Median: 3, Mode: 1. When data sets are small, a single extreme observation will have great influence on the mean and little or no influence on the median. In such cases, the median is usually a more informative measure of centrality.
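
The figures on this slide can be checked with Python's standard `statistics` module. This is simply a recomputation of the slide's own data; the variable names are ours.

```python
from statistics import mean, median, mode

group_1 = [1, 1, 1, 2, 3, 3, 5, 8, 20]
group_2 = [1, 1, 1, 2, 3, 3, 5, 8, 10]

for name, data in (("Group 1", group_1), ("Group 2", group_2)):
    # Only the largest value differs between groups, yet only the mean moves.
    print(name, "mean:", round(mean(data), 1),
          "median:", median(data), "mode:", mode(data))
```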

27. TIME-TO-EVENT DATA In many experiments, the outcome of interest is time to some event: death, resolution of disease/symptom, first symptom manifestation. Such data are typically not normally distributed; they tend to follow an exponential distribution. Data may be censored (eg, all animals sacrificed at day X, so X is the longest observable time). Medians are typically used for such data.

28. VARIABILITY The most commonly used measure to describe variability is the standard deviation (SD). SD is a function of the squared differences of each observation from the mean. If the mean is influenced by a single extreme observation, the SD will overstate the actual variability.

29. ALTERNATIVE TO SD When using the median as the centrality measure, can describe variability by providing the range (min, max) and the interquartile range (25th and 75th percentiles). Graphical presentation often provides the best sense of variability.

30. EXAMPLE Group 1 data: 1,1,1,2,3,3,5,8,20. Mean: 4.9, Median: 3. Group 2 data: 1,1,1,2,3,3,5,8,10. Mean: 3.8, Median: 3. SDs: group 1: 6.1, group 2: 3.3. Interquartile range: 1, 5.
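
A short sketch of how the SD and interquartile range can be computed, taking both groups as listed on slide 26. Two choices here are ours: `stdev` is the sample SD (dividing by n - 1), and quartiles use the "inclusive" interpolation method. Other quartile conventions can give different 75th percentiles for data sets this small.

```python
from statistics import quantiles, stdev

group_1 = [1, 1, 1, 2, 3, 3, 5, 8, 20]
group_2 = [1, 1, 1, 2, 3, 3, 5, 8, 10]

for name, data in (("Group 1", group_1), ("Group 2", group_2)):
    q1, _, q3 = quantiles(data, n=4, method="inclusive")  # 25th, 50th, 75th
    print(name, "SD:", round(stdev(data), 1),
          "range:", (min(data), max(data)),
          "interquartile range:", (q1, q3))
```

The single extreme value of 20 nearly doubles the SD of group 1, while the interquartile range is the same for both groups.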

31. CONFIDENCE INTERVALS A confidence interval is intended to provide a sense of the variability of an estimated mean. It can be defined as the set of possible values that includes, with specified probability, the true mean. Confidence intervals can be constructed for any type of variable, but here we consider the most common case of a normally distributed variable.

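32. CONSTRUCTING A CONFIDENCE INTERVAL First, determine what level of probability should define the interval. Second, find the normal value (or z-value) that corresponds to that probability: 99%: 2.58; 95%: 1.96; 90%: 1.64. Third, multiply the z-value by the standard error of the mean. Finally, subtract this product from the estimated mean and add it to the estimated mean to obtain the lower and upper limits of the interval.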
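
The construction steps can be sketched as a small function. The data values and the function name below are hypothetical, invented only for illustration. The sketch follows the z-based recipe on the slide; for very small samples a t-based multiplier is usually preferred, which this sketch does not attempt.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_confidence_interval(values, level=0.95):
    """z-based confidence interval for the mean of roughly normal data."""
    z_value = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for 95%
    centre = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    half_width = z_value * standard_error
    return centre - half_width, centre + half_width


# Hypothetical lab readings, not taken from the slides.
readings = [4.1, 5.0, 4.7, 5.3, 4.9, 5.6, 4.4, 5.1]
for level in (0.90, 0.95, 0.99):
    low, high = normal_confidence_interval(readings, level)
    print(f"{level:.0%}: {low:.2f} to {high:.2f}")
```

Running it shows the interval widening as the required confidence level rises.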

33. Bell-Shaped (Normal) Curve

34. FACTS ABOUT CONFIDENCE INTERVALS The more sure you want to be that the true value is included in your interval, the wider the interval will be. A 99% confidence interval will be wider than a 95% confidence interval. The most common size of confidence interval is 95%, but 90% and even 80% confidence intervals are sometimes used.

35. VALUE OF CONFIDENCE INTERVALS Two data sets may have the same mean; but if one data set has 5 observations and the second has 500 observations, the two means convey very different amounts of information. Confidence intervals remind us how uncertain our estimate really is.

36. FINAL COMMENTS Statistics are only helpful if the approach taken is appropriate to the problem at hand. Most statistical procedures are based on some assumptions about the characteristics of the data; these need to be checked. Remember GIGO (garbage in, garbage out).
