Study Designs in GWAS Jess Paulus, ScD January 30, 2013
Today’s topics • Case-control studies • Population based • Hospital based • Nested studies • Selection bias • Introduction to population stratification
Genetic Association Study Design • Case-Control: Dichotomous endpoints • Diabetes: yes versus no • Continuous or Quantitative traits • HbA1c • Family Studies
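Association study versus family study • Sample size: high for association studies, low for family studies • Heritability: low for association studies, high for family studies • Genetic complexity: high for association studies, low for family studies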
Hierarchy of Study Designs • Systematic Reviews & Meta Analysis • Randomized Controlled Trials • Cohort studies • Case-control studies • Cross-sectional studies • Ecologic studies • Case reports
Cohort Study: Selection into study on basis of exposure status • Diagram: groups are selected at the beginning of the study by EXPOSURE (present or absent); OUTCOME is unknown (?) at that point
Cohort studies in genetic epidemiology • Allows study of multiple disease endpoints – extends efficiency of effort to genotype • Selection bias is generally limited
Cohort study limitations for genetic epidemiology • Loss-to-follow-up bias • Need for repeated questionnaire assessments to keep covariate information up to date • Very costly and logistically challenging to genotype the entire cohort and survey for disease endpoints • For this reason, genetic epidemiologic studies of full cohorts are rare
Case-Control: Selection based on disease status • Diagram: groups are selected at the beginning of the study by disease status (case or control); exposure is then ascertained (Exposure?)
Case-control designs for genetic exposures • Appropriate for rare diseases, like cancer • Can be retrospective or prospective (nested case-control design) • Efficient sampling of an underlying cohort
Control selection • The biggest threat to most case-control studies • Controls must be drawn from the source population that gave rise to the cases • The ideal controls should: • Represent the exposure distribution in the source population that gave rise to the cases • Be those who, had they developed the case disease, would have been included in your study as a case • Failure to select appropriate controls generates selection bias • Selection bias arises when participants are selected based on the joint probability of exposure and outcome
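The last bullet can be made concrete with a small worked example. The sketch below assumes a hypothetical source-population 2x2 table and hypothetical selection probabilities for each exposure-by-disease cell; none of these numbers come from the lecture. The observed odds ratio equals the true odds ratio multiplied by the cross-product of the selection probabilities, so bias appears only when selection depends jointly on exposure and disease.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    return (a * d) / (b * c)


def observed_table(table, selection):
    """Expected cell counts after applying a selection probability to each cell."""
    return tuple(n * s for n, s in zip(table, selection))


# Hypothetical source population (assumed numbers, for illustration only).
# True OR = (200 * 8000) / (800 * 1000) = 2.0
source = (200, 800, 1000, 8000)

# Selection depends on disease status only: the odds ratio is preserved.
by_disease_only = (0.9, 0.9, 0.1, 0.1)

# Selection depends jointly on exposure and disease: exposed controls
# are twice as likely to be enrolled as unexposed controls.
joint = (0.9, 0.9, 0.2, 0.1)

true_or = odds_ratio(*source)
unbiased_or = odds_ratio(*observed_table(source, by_disease_only))
biased_or = odds_ratio(*observed_table(source, joint))

print(f"true OR: {true_or:.2f}")
print(f"selection by disease only: {unbiased_or:.2f}")
print(f"selection by exposure and disease: {biased_or:.2f}")
```

In this assumed scenario, over-sampling exposed controls by a factor of two halves the observed odds ratio, which is enough to hide a true doubling of the odds.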
Population case-control study • Cases arise from a given population, and controls are randomly sampled from that population (assuming population is enumerated) • Example: cases from CT state tumor registry, controls drawn from state census tract listings • Reduces potential for selection bias since source of controls is well-defined
Limitations of the population-based case-control study for genetic epidemiology • Lower participation rates than hospital-based studies, especially given need for biological samples • Implementation of specimen collection and processing protocols can be challenging outside a clinical setting • If there is interest in following participants for survival outcomes, tracing can be difficult
Hospital-based case-control study • Appropriate for genetic epidemiology studies: • Hospital setting facilitates subject enrollment and biological specimen collection and analysis • Recruitment by medical staff can aid enrollment • Smaller geographic area to cover than a population-based study – reduces processing/shipping time • Aids in collection of specimens in a timely fashion after disease diagnosis, limiting possibility for reverse causation • When cases are hospital-recruited, source population is the catchment population of the clinic • The collection of all the people who would have been notified as a case, had they developed disease
Hospital-based case-control study limitations • Retrospective nature opens door to: • Recall bias • Reverse causation • Selection bias • Selection bias in particular is a risk because it is difficult to identify the source population that gave rise to the cases • Ideal control: Who would have presented as a case to Hospital X had they in fact become ill? • Attempt to identify catchment population can be challenging • Sometimes, a control disease (sick controls) is chosen to limit potential for selection bias and differential recall of past exposure • Control illness must not be associated with the gene of interest
Nested case-control study • A type of population-based control sampling • Any case-control study can be conceived as resting within a cohort of exposed and unexposed individuals • When the cohort is very well defined, this is called a nested case-control study • Sampling from within the cohort (rather than doing full cohort analysis) is usually motivated by efficiency concerns • Important applications for genetic epidemiology where it would be too costly to genotype the full cohort
Nested case-control study design advantages • Limited potential for selection bias because the full cohort is enumerated and controls can be randomly sampled from the roster • Often prospective – limits potential for gene/biomarker to be affected by disease process
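Sampling controls from an enumerated roster can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical roster of (participant_id, is_case) records and a fixed number of controls per case; the function name and the cohort numbers are invented for the example. Many real nested case-control studies instead sample controls from the risk set at each case's event time (incidence density sampling), which this sketch does not attempt.

```python
import random


def sample_nested_case_control(roster, controls_per_case=2, seed=0):
    """Select all cases plus a simple random sample of non-cases.

    roster: list of (participant_id, is_case) tuples (hypothetical format).
    Returns (case_ids, control_ids).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    cases = [pid for pid, is_case in roster if is_case]
    non_cases = [pid for pid, is_case in roster if not is_case]
    n_controls = min(len(non_cases), controls_per_case * len(cases))
    controls = rng.sample(non_cases, n_controls)  # sampling without replacement
    return cases, controls


# Hypothetical cohort of 10,000 participants in which 100 developed disease.
roster = [(pid, pid < 100) for pid in range(10_000)]
cases, controls = sample_nested_case_control(roster, controls_per_case=2)

genotyped = len(cases) + len(controls)
print(f"Genotype {genotyped} samples instead of {len(roster)}")
```

With these assumed numbers, only 300 of 10,000 stored samples need to be genotyped, which is the efficiency argument made on the previous slide.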
Cohort sources of nested case-control studies • EPIC cohort: http://epic.iarc.fr/ • Nurses’ Health Study: http://www.channing.harvard.edu/nhs/ • NCI Breast and Prostate Cancer Cohort Consortium (BPC3): http://epi.grants.cancer.gov/BPC3/ • Multiethnic Cohort (MEC) study: http://www.uscnorris.com/mecgenetics/ • Alpha-Tocopherol, Beta-Carotene Cancer Prevention cohort: http://atbcstudy.cancer.gov/study_details.html • Framingham Heart Study: www.framinghamheartstudy.org
Analysis of case-control GWA studies • Univariate analysis: Pearson χ² or Fisher exact test, Cochran-Armitage trend test • Multivariate analysis: Logistic regression (if unmatched) or conditional logistic regression (if matched)
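The two single-SNP tests named above can be computed directly from a 2 x 3 table of genotype counts. The sketch below is a plain-Python illustration, not the lecture's own code: the genotype counts are hypothetical, the trend test assumes additive weights (0, 1, 2) for the number of risk alleles, and every genotype column is assumed to be non-empty. The p-values use the closed forms for the chi-square distribution with 2 and 1 degrees of freedom.

```python
import math


def pearson_chi2(cases, controls):
    """Pearson chi-square statistic for a 2 x k table of genotype counts."""
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    stat = 0.0
    for r, s in zip(cases, controls):
        column_total = r + s
        for observed, row_total in ((r, n_cases), (s, n_controls)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def armitage_trend_chi2(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) with the given genotype weights."""
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    columns = [r + s for r, s in zip(cases, controls)]
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * n for w, n in zip(weights, columns))
    sum_w2n = sum(w * w * n for w, n in zip(weights, columns))
    numerator = total * (total * sum_wr - n_cases * sum_wn) ** 2
    denominator = n_cases * n_controls * (total * sum_w2n - sum_wn ** 2)
    return numerator / denominator


# Hypothetical genotype counts (AA, Aa, aa), for illustration only.
cases = [20, 50, 30]
controls = [40, 45, 15]

pearson = pearson_chi2(cases, controls)
trend = armitage_trend_chi2(cases, controls)

# Upper-tail chi-square probabilities: exp(-x/2) for 2 df,
# erfc(sqrt(x/2)) for 1 df.
p_pearson = math.exp(-pearson / 2)
p_trend = math.erfc(math.sqrt(trend / 2))

print(f"Pearson chi-square (2 df): {pearson:.3f}, p = {p_pearson:.4g}")
print(f"Armitage trend (1 df): {trend:.3f}, p = {p_trend:.4g}")
```

The trend test spends a single degree of freedom on an additive allele effect, so it tends to be more powerful than the 2 df genotype test when risk rises with each copy of the allele, and less so when the effect is strongly non-additive.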