Mapping Rates and Proportions
• Incidence rates
• Mortality rates
• Birth rates
• Prevalence
• Proportions
• Percentages
Sample Data: Breast Cancer Incidence in Iowa
• Years: 1993-1996
• 7813 Cases (including in-situ)
• For each case: Age, county
• Source: State Health Registry of Iowa
Geography and Population
• 99 counties
• Number of women: 1,061,096 (ages 20+)
• Population available for each county by age group.
• Age groups: 20-24, 25-29, …, 80-84, 85+
• Source: 1990 census
Poisson Data
Numerator: Number of events over time, such as incident cancer cases or cancer deaths.
Denominator: Population years at risk.
Rates and Relative Risks
c = # cancer cases in e.g. a county
n = county population
C = # cancer cases in e.g. a state
N = state population
Rate = c/n
$$\text{Relative Risk} = \frac{c/n}{C/N}$$
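As a minimal sketch of these definitions, the following computes a rate and a relative risk, taking the relative risk to be the area rate divided by the reference rate. The function names and all counts are hypothetical and chosen only for illustration; they are not the Iowa data.

```python
def rate(cases, population):
    """Crude rate: cases divided by the population at risk."""
    return cases / population

def relative_risk(c, n, C, N):
    """Area rate (c/n) divided by the reference rate (C/N)."""
    return rate(c, n) / rate(C, N)

# Hypothetical county and state counts, for illustration only.
county_cases, county_pop = 50, 10_000
state_cases, state_pop = 5_000, 2_000_000

print(f"county rate per 100,000: {rate(county_cases, county_pop) * 100_000:.1f}")
print(f"state rate per 100,000:  {rate(state_cases, state_pop) * 100_000:.1f}")
print(f"relative risk: {relative_risk(county_cases, county_pop, state_cases, state_pop):.2f}")
```

The same two functions apply to Bernoulli data if c and n are read as cases with the trait and all cases, respectively.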
Bernoulli Data (0/1 data)
Individual people with one of two traits, such as cancer vs. no cancer, late vs. early disease, or two different treatments.
Numerator: Number of individuals with the trait of interest.
Denominator: All individuals. The denominator may be a complete count or a random sample.
Proportions and Relative Risks
c = # late stage cancer cases in a county
n = total number of cases in the county
C = # late stage cancer cases in state
N = total cases in state
Crude Rate = c/n
$$\text{Crude Relative Risk} = \frac{c/n}{C/N}$$
The statistical methods used are slightly different for Poisson and Bernoulli data, but in terms of mapping, the principles are the same.
Age Adjustment
• Indirect vs. Direct Standardization
• Internal vs. External Standard
• Relative Risk vs. Rate
Age Adjustment Notation
Area to be mapped (e.g. Johnson County, Iowa):
• c_s = cancer cases in age group s
• n_s = population in age group s
Area used as the standard (e.g. State of Iowa):
• C_s = cancer cases in age group s
• N_s = population in age group s
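Indirect Standardization (relative risk)
$$RR_{\text{indirect}} = \frac{\sum_s c_s}{\sum_s n_s \, C_s / N_s}$$
The observed county cases divided by the cases expected if the county had the same age-specific rates as the standard.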
Direct Standardization (relative risk)
$$RR_{\text{direct}} = \frac{\sum_s N_s \, c_s / n_s}{\sum_s C_s}$$
Direct Standardization (rate)
$$\text{Rate}_{\text{direct}} = \frac{\sum_s N_s \, c_s / n_s}{\sum_s N_s}$$
The crude state rate, if the whole state had the same age-specific rates as the county.
Summary: Direct vs. Indirect Standardization, each expressed as a Relative Risk or as a Rate
Indirect vs. Direct Standardization

                          Population            Cases
                          county     state      state
Children, 0-19                 1   200,000        400
Young Adults, 20-69           19   600,000       2200
Old Adults, 70+               80   200,000       2400

Expected cases in county: 1.03

County Cases (one scenario per column)
Children, 0-19               0     0     1     0     0     1     0     0     1
Young Adults, 20-69          0     1     0     0     1     0     1     2     1
Old Adults, 70+              1     0     0     2     1     1     2     1     1
Direct Standardization     0.5   6.3  40.0   1.0   6.8  40.5   7.3  13.1  46.8
Indirect Standardization   1.0   1.0   1.0   1.9   1.9   1.9   2.9   2.9   2.9
Indirect vs. Direct Standardization

                          Population            Cases
                          county     state      state
Children, 0-19                20   200,000        400
Young Adults, 20-69           60   600,000       2200
Old Adults, 70+               20   200,000       2400

Expected cases in county: 0.5

County Cases (one scenario per column)
Children, 0-19               0     0     1     0     0     1     0     0     1
Young Adults, 20-69          0     1     0     0     1     0     1     2     1
Old Adults, 70+              1     0     0     2     1     1     2     1     1
Direct Standardization     2.0   2.0   2.0   4.0   4.0   4.0   6.0   6.0   6.0
Indirect Standardization   2.0   2.0   2.0   4.0   4.0   4.0   6.0   6.0   6.0
Indirect vs. Direct Standardization

                          Population            Cases
                          county     state      state
Children, 0-19                 1   200,000       2000
Young Adults, 20-69           19   600,000       6000
Old Adults, 70+               80   200,000       2000

Expected cases in county: 1.0

County Cases (one scenario per column)
Children, 0-19               0     0     1     0     0     1     0     0     1
Young Adults, 20-69          0     1     0     0     1     0     1     2     1
Old Adults, 70+              1     0     0     2     1     1     2     1     1
Direct Standardization    0.25   3.2  20.0   0.5   3.4  20.2   3.7   6.6  23.4
Indirect Standardization   1.0   1.0   1.0   2.0   2.0   2.0   3.0   3.0   3.0
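The comparison in these examples can be reproduced with a short sketch. It assumes the usual definitions: the indirect relative risk is observed cases over the cases expected under the state's age-specific rates, and the direct relative risk is the number of state cases implied by the county's age-specific rates over the actual state cases. Function and variable names are illustrative, not from the slides.

```python
def expected_cases(county_pop, state_pop, state_cases):
    """Cases expected in the county under the state's age-specific rates."""
    return sum(n * C / N for n, N, C in zip(county_pop, state_pop, state_cases))

def indirect_rr(county_cases, county_pop, state_pop, state_cases):
    """Observed total over expected total; only the county total matters."""
    return sum(county_cases) / expected_cases(county_pop, state_pop, state_cases)

def direct_rr(county_cases, county_pop, state_pop, state_cases):
    """State cases implied by the county's age-specific rates, over actual state cases."""
    implied = sum(N * c / n for c, n, N in zip(county_cases, county_pop, state_pop))
    return implied / sum(state_cases)

# Age groups: children 0-19, young adults 20-69, old adults 70+.
STATE_POP = [200_000, 600_000, 200_000]
STATE_CASES = [400, 2200, 2400]
OLD_COUNTY = [1, 19, 80]        # county with a very old population
BALANCED_COUNTY = [20, 60, 20]  # same age distribution as the state

for label, pop in [("old county", OLD_COUNTY), ("balanced county", BALANCED_COUNTY)]:
    print(label, "expected cases:", round(expected_cases(pop, STATE_POP, STATE_CASES), 2))
    # One single case, placed in each age group in turn.
    for cases in [(0, 0, 1), (0, 1, 0), (1, 0, 0)]:
        d = direct_rr(cases, pop, STATE_POP, STATE_CASES)
        i = indirect_rr(cases, pop, STATE_POP, STATE_CASES)
        print(f"  cases={cases}  direct={d:.1f}  indirect={i:.1f}")
```

With a single case, the direct estimate swings widely depending on which age group the case falls in when the county's age structure differs from the state's, while the indirect estimate does not change. When the age structures match, the two methods agree.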
Indirect Standardization
• With indirect standardization, estimates of rates and relative risks have lower variance than with direct standardization. This is especially important for small areas such as counties or census tracts.
• Method of choice for maps with estimates for multiple areas, showing geographical variation.
• Use internal standard.
Breast Cancer Incidence, Relative Risks Age-Adjusted, Indirect Standardization
Indirect Standardization (relative risk) No need to know age-specific case counts in the county, only the total.
Direct Standardization (rate) No need to know case counts for the reference area.
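A minimal sketch of a directly standardized rate, assuming it is the weighted average of the area's age-specific rates with the standard population as weights. The area's cases and population, and the second ("older") standard, are hypothetical values for illustration; only the first set of weights matches the state population used in the standardization examples.

```python
def direct_standardized_rate(cases_by_age, pop_by_age, standard_pop):
    """Weighted average of age-specific rates, weighted by the standard population."""
    weighted = sum(w * c / n for c, n, w in zip(cases_by_age, pop_by_age, standard_pop))
    return weighted / sum(standard_pop)

# Hypothetical area data in three age groups (illustration only).
cases = [2, 30, 60]
population = [5_000, 12_000, 3_000]

standard_a = [200_000, 600_000, 200_000]  # state population from the examples
standard_b = [150_000, 550_000, 300_000]  # hypothetical older standard

crude = sum(cases) / sum(population)
print(f"crude rate per 100,000:      {crude * 100_000:.1f}")
print(f"standard A rate per 100,000: {direct_standardized_rate(cases, population, standard_a) * 100_000:.1f}")
print(f"standard B rate per 100,000: {direct_standardized_rate(cases, population, standard_b) * 100_000:.1f}")
```

Only the standard's age distribution enters the calculation, so no case counts for the reference area are needed, and rescaling the weights leaves the rate unchanged. The choice of standard does change the resulting rate.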
Direct Standardization
• Very useful to compare rates for areas studied at different times, by different people, using different data sets.
• Use external standards:
  • 1970 United States Population Standard
  • 2000 United States Population Standard
  • European Standard
  • World Standard
U.S. 1970 and World Standards

Age      U.S. 1970    World
0-4          8,442   12,000
5-9          9,820   10,000
10-14       10,230    9,000
15-19        9,384    9,000
20-24        8,056    8,000
25-29        6,632    8,000
30-34        5,625    6,000
35-39        5,466    6,000
40-44        5,896    6,000
45-49        5,962    6,000
50-54        5,464    5,000
55-59        4,908    4,000
60-64        4,240    4,000
65-69        3,441    3,000
70-74        2,679    2,000
75-79        1,887    1,000
80-84        1,124      500
85+            743      500

World Standard from: Waterhouse et al., Cancer Incidence in Five Continents, 1976
Iowa Breast Cancer Incidence Rates 1993-1996
Crude Rate: 136.4 / 100,000 women
Age-Adjusted, Direct Standardization:
• U.S. 1970 Standard Population: 106.4 / 100,000
• U.S. 2000 Standard Population: 129.3 / 100,000
• World Standard Population: 91.0 / 100,000
Conclusions
• Use indirect standardization, with an internal standard, for mapping geographical variation.
• Use direct standardization, with a few different standards, to calculate the rate for the map as a whole.
Uncertainty of Rate Estimates In a regular map, a relative risk of 2 could mean that there are 2000 cases with 1000 expected in an urban county, or 2 cases with 1 expected in a rural county. For the urban county, the relative risk of 2 is a good estimate of the true relative risk, but not for the rural county.
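To make the contrast concrete, here is a rough sketch that attaches an approximate 95% interval to each relative risk. It assumes the observed count is Poisson, treats the expected count as fixed, and uses the large-sample approximation SE(log RR) ≈ 1/sqrt(observed). That approximation is an assumption added here, not a method from the slides, and it is crude for counts as small as 2.

```python
import math

def approx_rr_interval(observed, expected, z=1.96):
    """Approximate interval for observed/expected using SE(log RR) ~ 1/sqrt(observed)."""
    rr = observed / expected
    factor = math.exp(z / math.sqrt(observed))
    return rr / factor, rr * factor

for label, obs, exp in [("urban county", 2000, 1000), ("rural county", 2, 1)]:
    low, high = approx_rr_interval(obs, exp)
    print(f"{label}: RR = {obs / exp:.1f}, approximate 95% interval {low:.2f} to {high:.2f}")
```

The urban interval stays close to 2, while the rural interval spans values well below 1 and well above 2, even though both counties have the same point estimate.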
Probability Map
For a particular county, one can test whether the observed cases are significantly more than expected, providing a p-value for that county. A map of these p-values is called a ‘probability map’.
Reference: Choynowski M. Maps Based on Probabilities. Journal of the American Statistical Association, 54:385-388, 1959.
Probability Map (Poisson Data)
m = expected number of cases
c = observed number of cases
$$p = P(X \ge c) = \sum_{x=c}^{\infty} \frac{e^{-m} m^{x}}{x!}, \qquad X \sim \text{Poisson}(m)$$
Probability Map (map legend: p<0.05; 0.05<p<0.10)
County ‘p-values’

County    Obs   Exp    RR      p
Dubuque   275   235   1.17   0.004
Polk      892   817   1.09   0.004
Clayton    77    57   1.34   0.006
Mills      51    36   1.43   0.006
Scott     411   368   1.12   0.012
Linn      467   429   1.09   0.033
Marion     97    82   1.18   0.048
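A sketch of how such a p-value could be computed, assuming it is the upper-tail Poisson probability of seeing at least the observed number of cases when the expected number is the mean. The sum is done in log space because e^(-m) underflows for large m. The function name is illustrative, and results need not match the printed p-values exactly, since the Exp column is rounded.

```python
import math

def poisson_upper_tail(c, m):
    """P(X >= c) for X ~ Poisson(m), summing terms computed in log space."""
    if c <= 0:
        return 1.0
    log_m = math.log(m)
    total = 0.0
    x = c
    while True:
        term = math.exp(-m + x * log_m - math.lgamma(x + 1))
        total += term
        # Past the mean the terms shrink; stop once they no longer change the sum.
        if x > m and term < 1e-16 * max(total, 1e-300):
            break
        x += 1
    return min(total, 1.0)

# Observed and expected counts as listed in the table (expected values are rounded).
counties = {"Dubuque": (275, 235), "Polk": (892, 817), "Mills": (51, 36)}
for name, (obs, exp) in counties.items():
    print(f"{name}: RR = {obs / exp:.2f}, p = {poisson_upper_tail(obs, exp):.4f}")
```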
Regular vs. Probability Map (map legend: p<0.05; 0.05<p<0.10)
Warning
Even if no county has a truly elevated risk, about 5% of the counties are expected to have a ‘statistically significant’ p-value at the 0.05 level by chance alone. Need to adjust for multiple testing.
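One simple way to adjust is a Bonferroni correction, sketched below. The slides do not say which adjustment is intended, so this choice is an assumption made for illustration. The sketch uses the 99 Iowa counties and the seven county p-values listed in the table of county p-values.

```python
N_COUNTIES = 99
ALPHA = 0.05

# p-values for the seven counties listed in the table.
p_values = {
    "Dubuque": 0.004, "Polk": 0.004, "Clayton": 0.006, "Mills": 0.006,
    "Scott": 0.012, "Linn": 0.033, "Marion": 0.048,
}

# Counties expected to fall below ALPHA by chance alone if no county has excess risk.
expected_false_positives = ALPHA * N_COUNTIES

# Bonferroni: compare each p-value with ALPHA divided by the number of tests.
bonferroni_threshold = ALPHA / N_COUNTIES
significant = [name for name, p in p_values.items() if p < bonferroni_threshold]

print(f"expected chance findings at {ALPHA}: {expected_false_positives:.2f}")
print(f"Bonferroni threshold: {bonferroni_threshold:.5f}")
print(f"counties still significant: {significant}")
```

Bonferroni is conservative. Under this adjustment none of the seven listed counties remains significant, which is in line with the roughly five chance findings expected among 99 tests.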
Dilemma
- Too little aggregation: Unstable rates.
- Too much aggregation: Geographical variation in disease may not follow political boundaries.
Solution: Smoothed Maps