What is a sample? Epidemiology matters: a new introduction to methodological foundations Chapter 4
Seven steps • Define the population of interest • Conceptualize and create measures of exposures and health indicators • Take a sample of the population • Estimate measures of association between exposures and health indicators of interest • Rigorously evaluate whether the association observed suggests a causal association • Assess the evidence for causes working together • Assess the extent to which the result matters, that is, whether it is externally valid for other populations Epidemiology Matters – Chapter 1
Why take a sample? • How to take a representative sample • Quantifying sampling variability • How to take a purposive sample • Study design • Summary Epidemiology Matters – Chapter 4
Why take a sample? • Epidemiologists take samples to answer health-related research questions efficiently • A full census is the epidemiologic ideal • Reasons not to take a census all the time include lack of time, lack of money, and waste of resources
To take a sample • Specify population of interest • Specify a research question of interest
Specify population of interest • What are the characteristics of the population in which we would like to understand health? • Example: Do we want to know what the prevalence of diabetes is within New York City? New York State? The United States? Do we want to know the causes of diabetes? • The population of interest has to be specified before the sampling strategy is defined
Specifying a question • The question of interest can help clarify the appropriate way to sample the population of interest • Questions asked can include estimating population parameters or estimating causal effects of exposures on outcomes
Example, estimating population parameters • Questions concerned with population parameters include: What proportion of individuals in the population of interest has breast cancer? What is the mean blood pressure in the population? How many new cases of HIV are diagnosed in the population over three years? • Population parameters include estimates of proportions, means, and standard deviations • Sample required: a representative sample
Example, estimating causal effects of exposures on outcomes • Questions for which these measures are needed include: Does exposure to pollution cause lung cancer? Does suffering abuse in childhood cause depression in adulthood? Does a specific genetic marker cause Alzheimer’s disease? • Parameter of interest: the causal effect of an exposure on a health outcome • Sampling concerns: not representativeness (as for population parameters), but whether individuals exposed to the hypothesized cause of interest are comparable to individuals not exposed • A purposive sample is sufficient
Representative and purposive • A representative sample is one whose characteristics are similar to those of the overall population • A purposive sample selects from the population based on some criterion • A representative sample may or may not include individuals who are comparable with respect to causal identification • A purposive sample may or may not be representative of a particular population of interest
How to take a representative sample • The simplest approach: a simple random sample • Each member of the population has an equal probability of being selected into the sample • A successful simple random sample should have the same basic characteristics as the original population
Taking a simple random sample • Enumerate all potential members of the population of interest • Assign each member a probability of selection • Ensure that the selection of each member is independent
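The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own procedure: the resident IDs, the sample size of 10, and the helper name `simple_random_sample` are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(population, n)

# Step 1: enumerate the population (hypothetical resident IDs 1..30).
residents = list(range(1, 31))

# Steps 2 and 3: equal selection probability, draws made by a random
# number generator rather than by the investigator.
sample = simple_random_sample(residents, n=10, seed=1)
print(sorted(sample))
```

Fixing the seed makes the draw reproducible, which helps when a sample has to be audited later.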
Example: Sampling Farrlandia (30 residents) • Option for random selection: every 4th home, with a dice roll for selection within the home • Challenges include (a) clustered exposures and (b) unequal ‘home’ size
Example: Sampling Farrlandia (30 residents) • Option for random selection: every Nth person in the phone book • Challenges include that not everyone is in the phone book
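Selecting every Nth entry from a list is often called systematic sampling. A rough sketch follows, assuming a hypothetical phone book of 30 entries, a step of 4, and a random starting point; none of these values come from the slides.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Pick a random start among the first `step` entries, then every step-th entry."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return frame[start::step]

# Hypothetical sampling frame: only people listed in the phone book can be chosen.
phone_book = [f"resident_{i:02d}" for i in range(1, 31)]
chosen = systematic_sample(phone_book, step=4, seed=0)
print(chosen)
```

Anyone missing from `phone_book` has no chance of selection, which is the coverage problem the slide points to.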
The perfect sample? • There is no perfect sample • The goal in epidemiology is to understand limitations of sampling methods and account for them
Sampling Farrlandia • We want to collect our sample in such a way that the sample also has 50% exposed and 30% dotted.
Sampling Farrlandia • We can use a simple random sample • Sample ½ the population (25 people) • Each resident has a 1/50 (2%) chance of being picked on any single draw • Draws are made with a random number generator
Sampling Farrlandia • Figure: original population and sample
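To see how closely one random draw tracks the population, the sketch below builds a hypothetical 50-person population with 50% exposed and 30% dotted, then samples half of it. How the two traits overlap is not stated in the slides, so the dotted flags are simply shuffled here as an assumption.

```python
import random

rng = random.Random(42)

# Hypothetical population: 25 of 50 exposed, 15 of 50 dotted.
exposed_flags = [True] * 25 + [False] * 25
dotted_flags = [True] * 15 + [False] * 35
rng.shuffle(dotted_flags)  # assumed: dotted status unrelated to exposure
population = [
    {"id": i, "exposed": e, "dotted": d}
    for i, (e, d) in enumerate(zip(exposed_flags, dotted_flags))
]

def proportion(people, key):
    """Share of people for whom the given trait is True."""
    return sum(p[key] for p in people) / len(people)

sample = rng.sample(population, 25)  # simple random sample of half the population

for key in ("exposed", "dotted"):
    print(key, proportion(population, key), proportion(sample, key))
```

The sample proportions will usually be close to, but not exactly, 50% and 30%; that gap is the sampling variability discussed next.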
Quantifying sampling variability • A sample will not have exactly the same population parameters as a complete population census • The ‘truth’, i.e., the parameter in the original population, is called the true population parameter
Variations in possible samples • 38,760 different possible samples of 5
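The number of distinct samples that can be drawn without replacement is a binomial coefficient. The slide does not state the population size behind its count, so the sizes below are assumptions chosen only to show the calculation; the helper name is also hypothetical.

```python
import math

def count_possible_samples(population_size, sample_size):
    """Number of distinct unordered samples drawn without replacement."""
    return math.comb(population_size, sample_size)

# Assumed sizes, for illustration only.
print(count_possible_samples(20, 5))  # 15504
print(count_possible_samples(20, 6))  # 38760
```

Even for a population of 20, the count of possible samples runs into the tens of thousands, which is why any single sample can differ from the truth.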
Quantifying uncertainty, Central Limit Theorem (CLT) • Average proportion across all possible samples = true population proportion • Example: 50% of the true population has diabetes; Sample 1 has 100% diabetes; Sample 2 has 0% diabetes; averaged across all possible samples, the proportion with diabetes is 50%
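The claim that the average over all possible samples equals the true proportion can be checked by brute force on a small case. The sketch assumes a hypothetical population of 10 people, 5 of them with diabetes, and samples of size 4; exact fractions avoid rounding noise.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population: 1 = has diabetes, 0 = does not (true proportion 1/2).
population = [1] * 5 + [0] * 5
sample_size = 4

true_proportion = Fraction(sum(population), len(population))

# Proportion with diabetes in every possible sample of the chosen size.
sample_proportions = [
    Fraction(sum(s), sample_size) for s in combinations(population, sample_size)
]

mean_of_proportions = sum(sample_proportions) / len(sample_proportions)
print(len(sample_proportions), min(sample_proportions), max(sample_proportions))
print(mean_of_proportions == true_proportion)
```

Individual samples range from 0% to 100% diabetes, as in the slide's example, yet their average recovers the population value.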
Quantifying uncertainty, CLT • Spread of sample proportions around their average (standard error): $\mathrm{SE} = \sqrt{p(1-p)/n}$ • p = sample proportion • n = sample size
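As a worked illustration, assuming the usual standard error of a sample proportion and hypothetical values p = 0.5 and n = 25 (not taken from the slides):

```latex
\[
\mathrm{SE} = \sqrt{\frac{p(1-p)}{n}}
            = \sqrt{\frac{0.5 \times 0.5}{25}}
            = \sqrt{0.01}
            = 0.1
\]
```

So a sample proportion of 50% from 25 people would typically sit within about 10 percentage points of the average across samples.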
Quantifying uncertainty, CLT • With large samples, sample proportions are approximately normally distributed • Rule of thumb: more than 30 people • No group with fewer than 5 people
Quantifying uncertainty, CLT • Therefore the principal drivers of uncertainty are prevalence in the sample and sample size • The larger the sample size, the smaller the amount of uncertainty in the sample estimate
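Both drivers can be seen numerically. The sketch below assumes the standard error of a proportion, sqrt(p(1-p)/n), and uses hypothetical sample sizes and prevalences chosen only for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p from a sample of size n."""
    return math.sqrt(p * (1 - p) / n)

# Effect of sample size at a fixed (assumed) prevalence of 50%.
sizes = (25, 100, 400)
for n in sizes:
    print(n, round(standard_error(0.5, n), 3))

# Effect of prevalence at a fixed (assumed) sample size of 100.
for p in (0.1, 0.3, 0.5):
    print(p, round(standard_error(p, 100), 3))
```

Quadrupling the sample size halves the standard error, and for a given size the uncertainty is largest when prevalence is near 50%.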
Purposive sample • Eligibility criteria for the study are the central design element; entry is based on exposure status, or sometimes on health outcome status
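One way to picture entry based on exposure status is to recruit fixed numbers of exposed and unexposed people. This is a sketch under assumed values (100 people, 10% exposed, 10 recruited per group); the function name `purposive_sample` is hypothetical.

```python
import random

def purposive_sample(people, n_per_group, seed=None):
    """Recruit equal numbers of exposed and unexposed people (eligibility by exposure)."""
    rng = random.Random(seed)
    exposed = [p for p in people if p["exposed"]]
    unexposed = [p for p in people if not p["exposed"]]
    return rng.sample(exposed, n_per_group) + rng.sample(unexposed, n_per_group)

# Hypothetical population in which exposure is rare (10 of 100).
people = [{"id": i, "exposed": i < 10} for i in range(100)]
study = purposive_sample(people, n_per_group=10, seed=3)

share_exposed_population = sum(p["exposed"] for p in people) / len(people)
share_exposed_study = sum(p["exposed"] for p in study) / len(study)
print(share_exposed_population, share_exposed_study)
```

The study group is half exposed while the population is one tenth exposed, so the sample supports an exposed versus unexposed comparison without being representative.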
Study design • Study design considerations are similar for representative or purposive sample • Study design reflects decisions made at one time point or over time • Timing of disease process can inform the study design
Study design options • Sample one moment in time, irrespective of disease status, measure disease and potential cause simultaneously • Sample over time, start with disease free individuals only, measure disease over time • Sample one moment in time, based on disease status
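The three options differ mainly in who is eligible at the moment of sampling. The sketch below shows only that selection step on a hypothetical population of 200 people; the field names, group sizes, and prevalence are assumptions, and follow-up over time is not simulated.

```python
import random

rng = random.Random(7)

# Hypothetical population: 40 of 200 currently diseased, half exposed.
people = [
    {"id": i, "diseased_now": i < 40, "exposed": i % 2 == 0}
    for i in range(200)
]

# Option 1: one moment in time, irrespective of disease status.
cross_sectional = rng.sample(people, 50)

# Option 2: start with disease-free individuals only, then follow them over time.
disease_free = [p for p in people if not p["diseased_now"]]
cohort_at_baseline = rng.sample(disease_free, 50)

# Option 3: one moment in time, selecting on disease status.
cases = [p for p in people if p["diseased_now"]]
non_cases = [p for p in people if not p["diseased_now"]]
case_control = rng.sample(cases, 20) + rng.sample(non_cases, 20)

print(len(cross_sectional), len(cohort_at_baseline), len(case_control))
```

In the third design half of the study group is diseased by construction, so disease prevalence in the sample says nothing about prevalence in the population.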
Farrlandia population (sequence of nine figure slides)
Option 1, Cross-sectional
Option 2, Cohort
Option 3, Case-control