LCS829 Wilfried Karmaus • Introduction to the determination of the sample size for epidemiologic studies • Objectives • Understand the concept • Learn to apply statistical programs to determine the sample size • Learn the basic steps of using SAS • Collaborate in small groups
Objectives for sample size estimations • Level 6: Evaluation (critique, appraise, judge) • Level 5: Synthesis (integrate, collaborate) (20%) • Level 4: Analysis (hypothesize, structure) • Level 3: Application (utilize, produce) (30%) • Level 2: Comprehension (translate, discuss) • Level 1: Knowledge (define, enumerate, recall) (50%)
Validity and Precision (1) Fundamental concern: avoidance and/or control of error Error = difference between true values and study results Accuracy = lack of error Validity = lack or control of systematic error Precision = lack of random error
Validity and Precision (2) • [Diagram: study results scatter around the actual estimator; the spread of the results corresponds to precision, and the distance between the actual estimator and the target estimator corresponds to validity.]
Random error = part of our experience that we cannot explain/predict • How to increase precision? • increase the size of the study • increase the efficiency of the study
What can go wrong in planning the sample size?

                                         The truth about exposure is:
Data of a study assess the exposure as:  without a health risk     | with a health risk
without a health risk                    correct negative: 1 - α   | β error
with a health risk                       α error                   | correct positive: 1 - β

Statistical power = 1 - β
Alpha error Alpha is traditionally set to 5%, but if the risks involved for the 'exposed' group are high, it can be set to 10%. Conversely, if the subject is not critical, it could be set to 1%. Alpha = 10% means that the difference will be claimed significant on relatively little evidence. Alpha = 1% means that you need very strong evidence to claim significance.
Beta error Beta is defined as 1 - power. The power of a test is a function of the difference between the groups. If the difference is zero, the power equals alpha. If the difference is very large, the power approaches 1. That is why we must specify beta (1 - power) to be small when the difference in prevalence is small. If a study finds no difference, the power plays a major role in defining the maximum size that the true difference in the population may have. A power of 50% is a coin-flip chance of identifying the true association as statistically significant.
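The relationship between the true difference and power can be checked directly. The following SAS DATA step is a minimal sketch (assuming a two-sided z-test for two proportions with an illustrative 500 subjects per group; these inputs are not from the slides). It reproduces the two boundary cases above: power = alpha when the difference is zero, and power near 1 for a large difference.

data _null_;
  alpha = 0.05;
  n     = 500;                  /* illustrative per-group size */
  p1    = 0.10;
  z     = probit(1 - alpha/2);  /* 1.96 */
  do diff = 0, 0.02, 0.05, 0.10, 0.20;
    p2 = p1 + diff;
    se = sqrt(p1*(1-p1)/n + p2*(1-p2)/n);
    /* two-sided power of the z-test; equals alpha when diff = 0 */
    power = (1 - probnorm(z - diff/se)) + probnorm(-z - diff/se);
    put diff= power=;
  end;
run;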
Statistical power or guestimations of the required sample sizes • 1. What are the hypotheses? • 2. What are reasonable scenarios for 'true' parameter values? Check other studies for: • prevalence of exposures (and confounders) • assumed differences • variance of the outcome variable
Sample Size for Surveys - Objectives of the Survey • The objectives of the study can be: • To estimate a parameter: a prevalence, an odds ratio, etc. • To test a hypothesis by comparing two (or more) groups. • To test whether a prevalence, odds ratio, or relative risk is significant, for example between exposed and non-exposed, a 'clean' city versus a 'polluted' city, etc. • Each objective leads to different sample size requirements.
Control statistical power by: • Sample size • Strength of the association • Error variance • Control for confounders or take multiple predictors into account (use SAS, homework) • For the first three: use formulae/programs (homework)
Sample Size for Surveys - Estimation If the objective is an estimation (prevalence or incidence), the point estimate alone is not sufficient. It must be accompanied by a confidence interval (CI). An x% CI means: if surveys with the same characteristics as this one were repeated many times, the intervals constructed this way would contain the true value x% of the time. The percentage is generally 95%, but it can be 90% or 99%. Along with the confidence level, we specify the width of the interval.
Estimation of the sample size to determine the prevalence and its confidence interval The formulas shown with a blue background in the original slides are from a Super Course chapter by R. Heberto Ghezzo, McGill University.
Example: required sample size to estimate a prevalence We want to estimate the prevalence of asthma in city XXX. Since diagnosing asthma implies some tests and questionnaires, it is expensive. So we really want to use as few subjects as possible. First question: What precision do we want in our estimate? Let's say plus/minus 4 percentage points (the half-width d below). Second question: What is our guess of the prevalence? If we do not know it, it is better to err on the safe side and assume 50%.
Worst case scenario: p ≈ 50%; 95% CI [46% - 54%]
n = z²(1-α/2) × p × (1 - p) / d²
n = 1.96 × 1.96 × 0.5 × (1.0 - 0.5) / (0.04 × 0.04)
n = 600
From historical data or another similar city we guess the prevalence: p ≈ 0.10; 95% CI [6% - 14%], then
n = 1.96 × 1.96 × 0.1 × (1.0 - 0.1) / (0.04 × 0.04)
n = 216
Some information is needed to estimate the sample size. The worst case scenario gives a sample size that is too large.
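As a cross-check, the same prevalence formula can be run as a small SAS DATA step. This is a sketch of the formula above; probit() gives the normal quantile and ceil() rounds up, so it prints 601 and 217 where the slide rounds to 600 and 216.

data _null_;
  alpha = 0.05;
  z = probit(1 - alpha/2);      /* 1.96 for a 95% CI */
  d = 0.04;                     /* desired half-width of the CI */
  do p = 0.5, 0.1;              /* worst case, then informed guess */
    n = ceil( z**2 * p * (1 - p) / d**2 );
    put p= n=;                  /* prints n=601 and n=217 */
  end;
run;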
Sample Size for Surveys - Comparisons If the objective is a comparison of two proportions, then we have to specify the level of errors that we can tolerate and the size in absolute units of the smallest difference worth detecting. This is the smallest value that can be credibly attributed to an effect and not to noise, or just random variation. For example, a difference of 5% in the prevalence of a ‘rare’ disease, or an Odds Ratio of, for instance, OR=1.5.
Example: required sample sizes to compare 2 prevalences (p1 and p2)
Example: Sample sizes when testing 2 prevalences We want to test whether the prevalence of a disease is identical in two populations with different exposures. First we must decide which difference we intend to detect. The prevalence in the unexposed group is estimated at 10%, in the exposed group at 15%. Second we need to define the error rates. Alpha: we can simply choose 0.05 like everybody else. Beta: here we have a problem. If we are fairly sure that the prevalences will be different, then beta has almost no role in the testing procedure and we can choose a standard value of 0.20.
Example: Sample sizes when testing 2 prevalences With a power of 80% (beta = .20) and the mean proportion (.10 + .15)/2 = .125, we calculate:
n = {1.96 × SQRT(2 × .125 × .875) + 0.842 × SQRT(.10 × .90 + .15 × .85)}² / {.05}²
n = 686 per group (program SSize: n = 686)
If no difference is a real possibility, then we need a high power (e.g. beta = 0.05):
n = 1134 (program SSize: n = 1135)
We need a substantially larger sample size in each group to be able to say, with only a 5% risk of missing a real effect, that the true difference is smaller than the one assumed.
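SAS can also solve for the per-group size directly. A minimal PROC POWER sketch follows (TEST=PCHI requests the Pearson chi-square test; the missing value for NPERGROUP asks SAS to solve for it); it should return a per-group n close to the 686 computed above, though the exact value may differ slightly from the hand formula.

proc power;
  twosamplefreq test=pchi
    groupproportions = (0.10 0.15)   /* unexposed vs. exposed */
    alpha            = 0.05
    power            = 0.80          /* set 0.95 for the high-power variant */
    npergroup        = .;            /* solve for n per group */
run;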
Example: Sample sizes when testing an odds ratio (1) (What kind of study is this?) Is living downwind from a factory a risk factor for, say, allergic rhinitis? The general population, i.e. upwind from the factory, has an exposure proportion of around 4% (due to shifting wind directions etc.) and a prevalence of rhinitis estimated at 15%. The downwind section has a prevalence of rhinitis estimated at 30% and an exposure of 15%. We do not know the real values; these are guesses. We are interested in an OR of 2.0 and want to estimate it with a relative precision of 50% (ε = 0.5 in the formula below).
Example: Sample sizes when testing an odds ratio (2) We do not know the proportion of exposure among the cases, p1. From the definition of the odds ratio we get: p1 = OR × p2 / [1 + p2 × (OR - 1)] First we calculate p2, the proportion of exposure among non-cases (controls). If we sample equally from the up- and downwind sections, so that we receive the same number of cases and controls from clinics in both sections of the town, the prevalence of exposure among non-cases can be estimated by:
Example: Sample sizes when testing an odds ratio (3) p2 = (prop. exposed upwind × prop. without disease upwind + prop. exposed downwind × prop. without disease downwind) / (prop. without disease upwind + prop. without disease downwind)
p2 = [.04 × .85 + .15 × .70] / [.85 + .70] = 0.089
Now we calculate p1: p1 = OR × p2 / [1 + p2 × (OR - 1)]
Based on our assumption of OR = 2.0, the estimated proportion of exposed cases is:
p1 = 2 × 0.089 / [1 + 0.089 × (2 - 1)] = 0.178 / 1.089
p1 = 0.163
Example: Sample sizes when testing an odds ratio (4) Now we compute the required number of cases:
n = z²(1-α/2) × {1/[p1 × (1-p1)] + 1/[p2 × (1-p2)]} / ln²(1-ε)
n = 1.96 × 1.96 × [1/(0.163 × 0.837) + 1/(0.089 × 0.911)] / ln²(0.5)
n ≈ 158
Therefore we need about 158 subjects with rhinitis and 158 without. To obtain them we could take, say, 500 subjects from downwind (500 × 0.3 = 150 cases) and 500 from upwind (500 × 0.04 = 20 cases), i.e. 170 cases in total, more than the required number. There will be more than enough controls.
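The whole chain (p2, then p1, then n) fits in one SAS DATA step. This is a sketch of the formulas above; note that OR is an operator in SAS, so the odds ratio is stored under another name.

data _null_;
  oddsratio = 2.0;
  p2    = 0.089;                 /* exposure among controls */
  eps   = 0.5;                   /* relative precision of the OR */
  alpha = 0.05;
  z  = probit(1 - alpha/2);
  p1 = oddsratio*p2 / (1 + p2*(oddsratio - 1));   /* exposure among cases */
  n  = ceil( z**2 * ( 1/(p1*(1-p1)) + 1/(p2*(1-p2)) ) / log(1 - eps)**2 );
  put p1= n=;                    /* p1=0.163..., n=158 */
run;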
Example: Sample sizes when testing an odds ratio (5) Sample sizes to estimate a relative risk using an OR: The relative risk can be estimated from follow-up studies. Here p1 and p2 refer to incidences, not prevalences. We need an estimate of the annual rate of disease among the exposed and the non-exposed and then proceed as for the odds ratio. Use the PS program. Testing that the odds ratio is larger than 1: Testing OR > 1 versus OR = 1 is the same as testing p1 > p2 versus p1 = p2, so everything about testing 2 prevalences applies here.
Sample Size for Surveys - true size (1) • The formulae presented are theoretical and rest on two assumptions: • A. A uniform distribution of the variable in the population (no confounders), and • B. Knowledge of the variability of the variable in the population. • These two assumptions are never exactly true. Therefore the calculated sample size is subject to error, and we must take this error into account when interpreting the result of the formulae. • C. We also need to consider the proportion of participation and attrition.
Sample Size for Surveys - true size (2) • As it is better to err by excess, always choose the largest value among the estimates of the population variance, and use a beta error smaller than really needed. • Thus, if you err, the estimate will at least have the required precision. • The cost will be higher, but it is better to spend a little more on observing more subjects than to end up with a set of estimates that are worthless because of their large confidence intervals.
Demonstration of SSize.exe • 1. Double click on Ssize.exe • 2. Choose the scenario that you are interested in. • a) Check one example to understand the scenario. • b) Evaluate that example. You will see all the formulae. • c) Use the table and enter your best guesses. • 3. Go directly to 'Estimate' (you will sometimes find it hidden at the bottom of the page).
New approaches can take confounders or multiple risk factors into account • Key idea: central and non-central F- and χ²-distributions • Assuming the null hypothesis is true (noncentrality λ = 0), in other words we test the null hypothesis: F and χ² follow a central distribution → calculation of the p-value and 95% CI
Assuming the alternative hypothesis is true (λ > 0): F and χ² follow a non-central distribution → calculation of the statistical power
Four steps to determine the sample size. Example: continuous outcome variable (blood pressure etc.) • Step 1: Outline a scenario • Step 2: Estimate the sum of squares (SS) due to the hypothesis (SSH) • Step 3: Set the α-error • Step 4: Calculate the power based on the non-central distribution • Least squares method: sum of squares about the regression line • F-test = mean SSH / mean square residuals
Step 1: Outline a scenario
Outline a scenario for the planned study. Example: potential effects of maternal lead (Pb) exposure on birth weight:

Pb exposure | lower SES | smoking | birth weight | proportion of the target pop.
no          | no        | no      | 3300 gram    | 22.22 %
no          | no        | yes     | 3100 gram    | 11.11 %
no          | yes       | no      | 3250 gram    | 22.22 %
no          | yes       | yes     | 3000 gram    | 11.11 %
yes         | no        | no      | 3250 gram    | 22.22 %
yes         | no        | yes     | 3000 gram    | 11.11 %
Step 2: Estimate the sums of squares (SS) due to the hypothesis (SSH) • Step 3: Set the α-error and a combined standard deviation
Step 4: Calculate the power based on the non-central distribution
Scenario: reduction of the birth weight by 50 gram.

Sample size | SSH (population) | DF error | F critical | Lambda (FNC) | Statistical power
N =  450    |     375,000.00   |    444   |    3.86    |     1.5      |      0.23
N =  600    |     500,000.00   |    594   |    3.86    |     2.0      |      0.29
N =  900    |     750,000.00   |    894   |    3.85    |     3.0      |      0.41
N = 1350    |   1,125,000.00   |   1344   |    3.85    |     4.5      |      0.56
N = 1800    |   1,500,000.00   |   1794   |    3.85    |     6.0      |      0.69
N = 2400    |   2,000,000.00   |   2394   |    3.84    |     8.0      |      0.81
N = 2700    |   2,250,000.00   |   2694   |    3.84    |     9.0      |      0.85
N = 3600    |   3,000,000.00   |   3594   |    3.84    |    12.0      |      0.93
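Step 4 can be reproduced with SAS's non-central F functions. In this sketch the noncentrality is taken as λ = SSH/σ², with a combined standard deviation of 500 gram inferred from the table (not stated on the slide); SSH grows proportionally with N, the hypothesis has 1 numerator degree of freedom, and the scenario has 6 cells, hence DF error = N - 6.

data power_table;
  alpha = 0.05;
  sigma = 500;                               /* combined SD in gram (inferred) */
  do n = 450, 600, 900, 1350, 1800, 2400, 2700, 3600;
    ssh      = n * (375000 / 450);           /* SSH scales with N, as in the table */
    df_error = n - 6;
    lambda   = ssh / sigma**2;               /* non-centrality parameter */
    fcrit    = finv(1 - alpha, 1, df_error); /* central F critical value */
    power    = 1 - probf(fcrit, 1, df_error, lambda);  /* non-central F */
    output;
  end;
run;

proc print data=power_table; var n lambda fcrit power; run;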
Summary (SAS power estimation) If we can outline a scenario of potential distributions of risk factors, outcomes, and confounders, then we can use analytical statistical models for the estimation of the sample size. The advantage is that we can already control for potential distortion due to other risk factors that are not included in traditional power estimations.
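In practice, this kind of scenario-based power analysis is what SAS's PROC GLMPOWER is built for. The sketch below feeds the Step 1 scenario in as an "exemplary" data set, with the cell means and weights from the table; the within-cell standard deviation of 500 gram is the same assumption as above, and the variable names are illustrative.

data scenario;
  input pb $ ses $ smoke $ bweight cellwgt;
  datalines;
no  no  no  3300 22.22
no  no  yes 3100 11.11
no  yes no  3250 22.22
no  yes yes 3000 11.11
yes no  no  3250 22.22
yes no  yes 3000 11.11
;
run;

proc glmpower data=scenario;
  class pb ses smoke;
  model bweight = pb ses smoke;
  weight cellwgt;                 /* relative cell frequencies */
  power
    stddev = 500                  /* assumed within-cell SD */
    alpha  = 0.05
    ntotal = 450 900 1800 3600
    power  = .;                   /* solve for power at each N */
run;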
Demonstration of PS.exe for time-to-event outcomes • Double click on PS.exe • Click “Overview” or “Continue” • Choose the scenario that you are interested in (e.g. Survival). • a) Follow the flow of questions (see script for homework 1) • b) Calculate the sample size and graph your results. • The sample size is for which group? • c) Change some assumptions and check the changes in the sample size in the graph.
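PROC POWER also has a statement for time-to-event outcomes, as an alternative to the PS program. The sketch below is not PS; the inputs (median survival times per group, accrual and follow-up periods in years) are made up for illustration.

proc power;
  twosamplesurvival test=logrank
    groupmedsurvtimes = (5 7)   /* hypothetical median survival, years */
    accrualtime       = 2       /* hypothetical accrual period */
    followuptime      = 3       /* hypothetical follow-up after accrual */
    power             = 0.80
    npergroup         = .;      /* solve for n per group */
run;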
Sample Size for Surveys - true size (3) Non-Response Non-response is a nuisance, not only because of possible bias from the missing responses but also because the sample size shrinks. If you surveyed 500 subjects and 400 responded, the standard error of a percentage increases by about 12% because of the non-response (the SE is proportional to 1/√n, and √(500/400) ≈ 1.12).
Sample Size for Surveys - true size (4) Non-Response Imputation If the non-response is small, some people suggest imputing a number for the missing responses (cave!). This 'fabrication of data' is permissible only if the sole reason for it is the performance of some multivariate technique that requires a complete sample. (I doubt it.) Obviously it cannot be used to reduce the error of the estimators. What are the limitations of the 'fabrication of data'???
Some planning is better than no planning. Remember that the estimated sample sizes are just approximations (best guestimations). The values used in the formulae are themselves estimates and subject to error, and the new sample will rarely have the same prevalence as the samples from which the assumptions were taken. Always try to err in a conservative way, i.e. estimate a sample size greater than what is really needed. Excess precision is not bad, only expensive (but what if the number of cases is restricted?). Low precision is not only bad but a waste of resources, work, and time, and the results obtained have little use for any further application.