Epidemiological study design. Chakrarat Pittayawonganon, MD, MPH, FETP, Bureau of Epidemiology, Department of Disease Control, Ministry of Public Health. Review of the previous lesson: counts, rate, ratio, proportion.
Epidemiological study design Chakrarat Pittayawonganon, MD, MPH FETP, Bureau of Epidemiology Department of Disease Control Ministry of Public Health
Review of the previous lesson
• Counts, rate, ratio, proportion
  • Numerator and denominator: is one a subset of the other?
  • Rates: instantaneous rate (km/hr), average rate (30 deaths/year)
• Prevalence, incidence
  • Defined by time: a point in time or a period of time
  • Existing cases versus newly occurring cases
  • Incidence: new cases of a disease that develop over a period of time
  • Prevalence: existing cases of a disease at a particular point in time or over a period of time
• Cumulative incidence = individual risk (new cases / N disease-free at start of follow-up)
• Problems: dynamic cohorts, and deaths from causes other than the disease of interest (competing risk)
Review of the previous lesson
• Prevalence rate, attack rate, incidence rate
  • Defined over a period of time or at a point in time
  • Denominator: the population at risk of the disease, or the total population
  • Importance, interpretation, and use
  • Obtained in different ways, e.g. from disease surveillance or from surveys
• Attack rate = percentage of the susceptible population that becomes ill
• Relationship of incidence and prevalence: P = I x D
  • P = prevalence
  • I = incidence
  • D = duration of the disease
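The relationship P = I x D can be tried out with a few lines of Python. This is only a sketch: the incidence and duration values are hypothetical, and the approximation is usually taken to hold when the disease is in a steady state and prevalence is low.

```python
def steady_state_prevalence(incidence_rate, mean_duration):
    """Approximate prevalence as P = I x D.

    incidence_rate: new cases per person per unit of time
    mean_duration:  average duration of the disease, in the same time unit
    """
    return incidence_rate * mean_duration


# Hypothetical inputs (not from the slides):
# 2 new cases per 1,000 person-years, average duration 5 years.
incidence = 2 / 1000
duration_years = 5

prevalence = steady_state_prevalence(incidence, duration_years)
print(f"Approximate prevalence: {prevalence:.1%}")
```

Doubling the duration while holding incidence fixed doubles the prevalence, which is why long-lasting conditions can be common even when new cases are infrequent.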
Quiz Which ones of these “rates” are true rates? ____ Attack rate ____ Incidence rate ____ Five-year survival rate ____ Infant mortality rate ____ Prevalence rate ____ Age-specific incidence rate ____ Case-fatality rate ____ Cause-specific mortality rate Confusing Risk and rate
Quiz Which ones of these “rates” are true rates?
__F__ Attack rate. Proportion: Cases/Total N
__T__ Incidence rate (IR; 0 to infinity)
__F__ Five-year survival rate. Proportion: Survivors/Total cases
__F__ Infant mortality rate. Proportion: Infant deaths/Live births
__F__ Prevalence rate. Proportion: Existing cases/Total population
__T__ Age-specific incidence rate
__F__ Case-fatality rate. Proportion: Fatal cases/Total cases
__T__ Cause-specific mortality rate (deaths caused by a specific disease per 1,000 population per year)
Descriptive Studies Organize and summarize data according to time, place, and person. • Describe natural history of disease • Extent of public health problem • Identify populations at greatest risk • Allocation of health care resources • Suggest hypothesis about causation
[Figure: Study question -> Study design -> Results -> Answer, compared with the TRUTH]
Sources of ERROR
• Random
• Systematic
  • Selection bias
  • Information bias
Design tree: major epidemiologic study designs
[Figure: tree diagram; only the labels are recoverable]
• Descriptive: case report, case series, descriptive study based on rates
• Analytic: randomized, non-randomized, quasi-experiment, cohort study (prospective, retrospective), case-control study, cross-sectional study, longitudinal study, other
What is a cohort?
• Cohort: from the Latin word for one of the 10 divisions of a Roman legion
• A group of individuals
  • sharing the same experience
  • followed up for a specified period of time
• Examples
  • Birth cohort
  • Occupational cohort (e.g. chemical plant workers)
  • A Rapid Response Team
Application in real situations
• Cohort study
• Does it have to have a follow-up structure, with evidence that subjects were not yet ill and later became ill, especially in a retrospective cohort study?
  • Exception: infectious disease outbreak investigations, where a pre-illness status can be assumed (views differ between schools of teaching)
• Is it necessary to study the entire population of the area?
  • The cohort can be drawn from part of the population, provided its boundaries are clear, e.g. a group of people, a classroom, a dormitory building, a specific time period
• The association between exposure/risk and outcome/disease can be analyzed by dividing those not yet ill according to the presence or absence of the exposure/risk under study
Odds and ends
• Disease-free does not imply healthy: it is incorrect to conclude that the population at risk is healthy
• Population at risk and a cohort: closed and open (dynamic) cohorts
• Closed cohort: can estimate a risk or an incidence rate (little distortion), provided that
  • the period of follow-up is short enough
  • competing risks are small enough in relation to the disease under study
• Dynamic cohort: cannot directly estimate risk (new people are added during the follow-up period); however, an incidence rate is suitable when precise information on the amount of follow-up time is available
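The point about dynamic cohorts can be illustrated with a short Python sketch. The follow-up records below are hypothetical: each person contributes only the time actually observed, and the incidence rate divides new cases by the total person-time rather than by a head count.

```python
# Hypothetical follow-up records for a dynamic cohort:
# (years of follow-up contributed, became a case during follow-up?)
follow_up = [
    (2.0, False),  # followed for the full 2 years, stayed disease-free
    (0.5, True),   # became ill after 6 months
    (1.5, False),  # entered the cohort late
    (1.0, True),   # became ill after 1 year
    (2.0, False),
]

cases = sum(1 for _, became_ill in follow_up if became_ill)
person_years = sum(years for years, _ in follow_up)

# Incidence rate: new cases per unit of person-time at risk
incidence_rate = cases / person_years

print(f"{cases} cases over {person_years} person-years")
print(f"Incidence rate: {incidence_rate:.3f} per person-year")
```

Dividing the same two cases by the five people enrolled would treat everyone as if they had been followed for the same length of time, which is the assumption a dynamic cohort does not satisfy.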
Cohort studies Intuitive approach to studying disease incidence and risk factors: 1. Start with a population at risk 2. Measure characteristics at baseline 3. Follow-up the population over time with a) surveillance or b) re-examination 4. Compare event rates in people with and without characteristics of interest
Cohort studies Can be large or small Can be long or short Can be simple or elaborate Can be local or multinational For rare outcomes need many people and/or lengthy follow-up May have to decide what characteristics to measure long in advance
Prospective cohort study
[Figure: timeline with a 2x2 table of exposure (+/-) by illness (+/-)]
• Timeline: study starts -> exposure occurrence -> disease occurrence
• Selection of population
• Prospective assessment of exposure and disease
• Examples: growth-nutrition studies, folic acid and NT defects
Prospective cohort study
[Figure: timeline with a 2x2 table of exposure (+/-) by illness (+/-)]
• Timeline: exposure occurrence -> study starts -> disease occurrence
• Selection based on exposure
• Prospective assessment of disease
• Examples: Chernobyl, industrial accidents, flood victims
Retrospective cohort study (transversal studies)
[Figure: timeline with a 2x2 table of exposure (+/-) by illness (+/-)]
• Timeline: exposure occurrence -> disease occurrence -> study takes place (now, real time)
• Selection based on population
• Retrospective assessment of exposure and disease
• Examples: food-borne outbreaks, closed-environment outbreaks (schools, prisons, etc.)
Effect measures in cohort studies
• Hypothesis: is the incidence among the exposed higher than among the unexposed?
• Absolute measures
  • Risk difference (RD) = Ie+ - Ie-
• Relative measures
  • Relative risk (RR; rate ratio, risk ratio) = Ie+ / Ie- = [a/(a+b)] / [c/(c+d)]
Presentation of cohort data
Does HIV infection increase the risk of developing TB among a population of drug users?
Drug users (f/u 2 years)   Population at risk   TB cases   Incidence (%)
HIV +                      215                  8
HIV -                      289                  1
Source: Selwyn et al., New York, 1989
Presentation of cohort data
Does HIV infection increase the risk of developing TB among a population of drug users?
Drug users (f/u 2 years)   Population at risk   TB cases   Incidence (%)
HIV +                      215                  8          3.7 (8/215)
HIV -                      289                  1          0.3 (1/289)
Source: Selwyn et al., New York, 1989
Presentation of cohort data
Does HIV infection increase the risk of developing TB among a population of drug users?
Drug users (f/u 2 years)   Population at risk   TB cases   Incidence (%)   Relative risk
HIV +                      215                  8          3.7 (8/215)     12
HIV -                      289                  1          0.3 (1/289)
Source: Selwyn et al., New York, 1989
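As a worked check, the risk difference and relative risk can be recomputed from the raw counts in the table. The sketch below uses only the four numbers shown (215, 8, 289, 1); the function and variable names are illustrative. The relative risk of 12 on the slide corresponds to dividing the rounded incidences (3.7/0.3), while the unrounded counts give a somewhat lower value.

```python
def cumulative_incidence(cases, population_at_risk):
    """Risk over the follow-up period: new cases / people at risk at the start."""
    return cases / population_at_risk


# Counts from the table (drug users followed for 2 years)
risk_hiv_pos = cumulative_incidence(8, 215)
risk_hiv_neg = cumulative_incidence(1, 289)

risk_difference = risk_hiv_pos - risk_hiv_neg  # absolute measure: Ie+ - Ie-
relative_risk = risk_hiv_pos / risk_hiv_neg    # relative measure: Ie+ / Ie-

print(f"Incidence, HIV+: {risk_hiv_pos:.1%}")
print(f"Incidence, HIV-: {risk_hiv_neg:.1%}")
print(f"Risk difference: {risk_difference:.1%}")
print(f"Relative risk:   {relative_risk:.1f}")
```

With only one TB case in the unexposed group, the ratio is sensitive to rounding, so it is safer to compute it from counts than from displayed percentages.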
Advantages and disadvantages of cohort studies
Advantages
• Can measure incidence and risks
• Good for rare exposures
• Clear temporal relationship between exposure and outcome
• Less subject to selection bias
Disadvantages
• Requires a large sample size
• Latency period
• Lost to follow-up
• Ethical considerations
• Resource intensive
• High cost
• Time-consuming
Case-control study
[Figure: timeline with 2x2 tables of exposure (+/-) by illness (+/-)]
• Timeline: exposure occurred -> disease occurred -> study takes place (now, real time)
• Selection based on disease status
• Retrospective assessment of exposure
When is it desirable to conduct a case-control study? • When exposure data are expensive or difficult to obtain - Ex: Pesticide study described earlier • When disease has long induction and latent period - Ex: Cancer, cardiovascular disease
When is it desirable to conduct a case-control study? • When the disease is rare • Ex: Studying risk factors for birth defects • When little is known about the disease • Ex. Early studies of AIDS, H5 • When underlying population is dynamic • Ex: Studying breast cancer on Cape Cod
Advantages and disadvantages of case-control studies
Advantages
• Suitable for rare diseases
• Can explore several exposures
• Low cost
• Rapid
• Can cope with long latency
• Small sample size
• No ethical problems
Disadvantages
• Cannot calculate the risk
• Not suitable for rare exposures
• Temporal relationship difficult to establish
• Subject to bias
  • Selection of controls
  • Recall bias
  • …
Example: Is gastro-esophageal reflux a risk factor for esophagus cancer? • How were cases selected? • Were cases representative of patients with disease? • How were controls selected? • Were controls representative of patients from source population without disease? • How were risk factors measured? • How did they minimize measurement bias for risk factors? • How were outcomes measured? • How did they minimize measurement bias for outcomes?
Case-control studies FROM SOURCE POPULATION: • Select cases with outcome (representative of cases in source population) • Select controls without outcome (same exposure distribution to RF as source population) • Hospital, clinic, neighborhood, population • Can be > 1 control per case (Increases power and face validity, and decreases selection bias) • Outcome can be disease, disability or positive outcome • Measure strength of association of RF and outcome with OR (~RR)
Two Characteristics of Cases Representativeness: Ideally, cases are a random sample of all cases of interest in the source population (e.g. from vital data, registry data). More commonly they are a selection of available cases from a medical care facility. (e.g. from hospitals, clinics) Methods of selection: Selection may be from incident or prevalent cases Incident cases are those derived from ongoing ascertainment of cases over time Prevalent cases are derived from a cross-sectional survey
Selection of Cases Population-based cases: Include all subjects or a random sample of all subjects with the disease at a single point or during a given period of time in the defined population. Hospital-based cases: All patients in a hospital department at a given time
Controls • Definition: A sample of the source population that gave rise to the cases. • Purpose: To estimate the exposure distribution in the source population that produced the cases.
Characteristics of Controls
Who is the best control? Where should controls come from?
• If cases are a random sample of all cases in the population, then controls should be a random sample of all non-cases in the population sampled at the same time (i.e. from the same study base).
• But if study cases are not a random sample of the universe of all cases, it is not likely that a random sample of the population of non-cases will constitute a good control population.
Three Qualities Needed in Controls • Comparability is more important than representativeness in the selection of controls • The control should be at risk of the disease • The control should resemble the case in all respects except for the presence of disease
Comparability vs. Representativeness
• Usually, cases in a case-control study are not a random sample of all cases in the population. If so, the controls must be selected in the same way (and with the same biases) as the cases.
• It follows from the above that a pool of potential controls must be defined. This is a universe of people from whom controls may be selected (the study base).
Three Qualities Needed in Controls • Cases emerge within a study base. Controls should emerge from the same study base, except that they are not cases. For example, if cases are selected exclusively from hospitalized patients, controls must also be selected from hospitalized patients.
Three Qualities Needed in Controls
• If cases must have gone through a certain ascertainment process (e.g. screening), controls must have also (e.g. mammogram-detected breast cancer).
• If cases must have reached a certain age before they can become cases, so must controls (thus we always match on age).
• If the exposure of interest is cumulative over time, the controls and cases must each have the same opportunity to be exposed to that exposure (if the case has to work in a factory to be exposed to benzene, the control must also have worked where he/she could be exposed to benzene).
Sources of controls a) Population of defined area b) Hospital patients c) Probability sample of total population d) Neighbors (i) walk (door to door) (ii) phone (random digit dialing) (iii) letter carrier routes e) Friends or associates of cases f) Siblings, spouses or other relatives g) Other
Selection of Controls
General population controls: most often used when cases are selected from a defined geographic population
• Sources: registries, households, telephone sampling, drivers' license lists
• Advantages: assured that they come from the same base population as the cases
• Disadvantages: time-consuming, expensive, hard to contact and get cooperation (eventually a high non-response rate); may remember exposures differently than cases (recall bias)
Selecting Controls Hospital controls • Used most often when cases are selected from a hospital population • Easy to identify; less recall bias; higher response rate Example: Study of cigarette smoking and myocardial infarction among women. Cases identified from admissions to hospital coronary care units. Controls drawn from surgical, orthopedic, and medical unit of same hospital. Controls included patients with musculoskeletal and abdominal disease, trauma, and other non-coronary conditions.
Hospital controls
Advantages:
• Same selection factors that led cases to hospital led controls to hospital
• Easily identifiable and accessible (so less expensive than population-based controls)
• Accuracy of exposure recall comparable to that of cases, since controls are also sick
• More willing to participate than population-based controls
Disadvantages:
• Since hospital-based controls are ill, they may not accurately represent the exposure history in the population that produced the cases
• Hospital catchment areas may be different for different diseases
What illnesses make good hospital controls? Those illnesses that have no relation to the risk factor(s) under study Example: • Should respiratory diseases be used as controls for a study of smoking and myocardial infarction? • Do they represent the distribution of smoking in the entire population that gave rise to the cases of MI?
Selecting Controls
Special control groups, such as friends, spouses, siblings, and deceased individuals
• These special controls are rarely used.
• Some cases are not able to nominate controls because they have few appropriate friends, are widowed, or are only or adopted children.
• Dead controls are tricky to use because they are more likely than living controls to have smoked and drunk alcohol.
Misconception about Control Selection
• Representativeness
  • Wrong: that controls should be representative
    • of all persons with the disease
    • of the entire non-diseased population
  • Correct: the source population for the cases is the one that the controls should represent
• Exposure opportunity
  • Not needed, as in a real follow-up study
Basic Analysis
For one control: data are expressed in a four-fold table, and an odds ratio is calculated (relative risks have no meaning here. Why?)
OR = ad/bc
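One way to approach the "why?" is to see what happens when the investigator simply recruits more controls. The sketch below assumes the usual four-fold layout (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls), which is not spelled out on the slide, and uses hypothetical counts.

```python
def odds_ratio(a, b, c, d):
    """OR = ad/bc for a four-fold table.

    Assumed layout: a = exposed cases, b = exposed controls,
                    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


def naive_risk_ratio(a, b, c, d):
    """'Risk ratio' computed as if the table came from a cohort: not valid here."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical study: 100 cases and 100 controls
or_1 = odds_ratio(40, 20, 60, 80)
rr_1 = naive_risk_ratio(40, 20, 60, 80)

# Same cases, but twice as many controls with the same exposure distribution
or_2 = odds_ratio(40, 40, 60, 160)
rr_2 = naive_risk_ratio(40, 40, 60, 160)

print(f"1 control per case:  OR = {or_1:.2f}, naive 'RR' = {rr_1:.2f}")
print(f"2 controls per case: OR = {or_2:.2f}, naive 'RR' = {rr_2:.2f}")
```

The odds ratio is unchanged by the number of controls sampled, whereas the quantity a/(a+b) depends on how many controls the investigator chose to recruit, so it does not estimate a risk.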
Multiple Exposure Levels
Exposure level   Cases   Controls   OR
High             A1      B1         OR1
Medium           A2      B2         OR2
Low              A3      B3         OR3
Not exposed      C       D          Reference
Relation of hepatocellular adenoma to duration of oral contraceptive use in 79 cases and 220 controls
Months of OC use   Cases   Controls   Odds ratio
0-12               7       121
13-36              11      49
37-60              20      23
61-84              21      20
>= 85              20      7
Total              79      220
Source: Rooks & col. 1979
Relation of hepatocellular adenoma to duration of oral contraceptive use in 79 cases and 220 controls
Months of OC use   Cases   Controls   Odds ratio
0-12               7       121        Ref.
13-36              11      49         3.9
37-60              20      23         15.0
61-84              21      20         18.1
>= 85              20      7          49.7
Total              79      220
Source: Rooks & col. 1979
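The odds ratios in the table can be recomputed by comparing each exposure level with the reference category (0-12 months), in the same way as OR = ad/bc for a single four-fold table. The sketch below uses only the counts shown above; the variable names are illustrative. Note that the counts as printed give about 49.4 for the longest-use category, slightly different from the 49.7 shown, and the slide alone does not settle which figure contains a transcription slip.

```python
# (months of OC use, cases, controls) as shown in the table
strata = [
    ("0-12", 7, 121),   # reference category
    ("13-36", 11, 49),
    ("37-60", 20, 23),
    ("61-84", 21, 20),
    (">= 85", 20, 7),
]

_, ref_cases, ref_controls = strata[0]

odds_ratios = {}
for label, cases, controls in strata[1:]:
    # Odds of exposure level among cases vs. controls, relative to the reference level
    odds_ratios[label] = (cases * ref_controls) / (controls * ref_cases)

for label, value in odds_ratios.items():
    print(f"{label:>6} months: OR = {value:.1f}")
```

The steadily increasing odds ratios across categories are what the deck later refers to as a dose-response relationship.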
Do you believe their results? Selection bias? Cases, controls Measurement bias? Outcomes, Risk factors Causation? • Strength of association: between exposure and illnesses • Dose response • frequency, severity, duration of symptoms • Biological plausibility: too subjective, causal/non-causal
Case-Control Studies: Biases Bias in measurement of risk factors because: • Retrospective measurement • Differential recall bias Decrease measurement bias for outcomes and RF by: • Standardize definitions, instrument and process • Train assessors • Use data recorded before outcome is known • Blinding of subject and observer • Re-analyze data with more conservative definitions
Case-Control Studies: decrease biases Decrease selection bias by: • Population based sample • Cases - registry • Controls - from same population (random digit dialing) • Sample cases and controls in same way (same clinic) so risk factors/exposure is the same • Minimize non-participants • >1 control groups (increases power and generalizability) • Matching • Case and control comparable on RF that is not interesting, or not modifiable e.g. age, gender • Advantages: Increased precision, decreased confounding • Disadvantages: Loss of data, increased time, cost, complexity, irreversible.
SIX ISSUES IN MATCHING CONTROLS, CASE-CONTROL STUDY 1. Identify the pool from which controls may come. This pool is likely to reflect the way controls were ascertained (hospital, screening test, telephone survey). 2. Control selection is usually through matching. Matching variables (e.g. age), and matching criteria (e.g. control must be within the same 5 year age group) must be set up in advance. 3. Controls can be individually matched or frequency matched