The Swiss National Cohort: The benefits and challenges of constructing a nationwide census-based longitudinal study without personal identifiers
Marcel Zwahlen
Institute of Social and Preventive Medicine, University of Bern
http://www.ispm.ch/ http://www.swissnationalcohort.ch/
Outline
• Anatomy of the Swiss National Cohort
• Linkage methods used
• How to deal with non-linked deaths
• Examples of published and ongoing work
• Outlook / Future developments
Standardized death rates in NUTS2 territorial units
• Standardized death rates per 100’000, 2002-2004
• Switzerland: 41,285 km², 7.9 million population
The SNC
• Cohort of all people living in Switzerland
• Record linkage study linking census and mortality data
• National project involving all ISPMs, Swiss TPH, and the Institut d’Etudes Démographiques et du Parcours de Vie (in Geneva)
• Funded by Swiss National Science Foundation, Cancer Research Switzerland, and Swiss School of Public Health
Core structure of SNC (Bopp et al. 2009 and Spoerri et al. 2010)
Data and variables
• Census
  • Individual-level variables
  • Household characteristics
  • Building characteristics (including geocodes)
• Mortality registry
  • Causes of death
• Registry of foreign nationals
  • Emigration, immigration, naturalization
Where do the data come from?
• All core data
  • are made available by the Federal Statistical Office (FSO)
  • through collaboration contracts between the Universities of Bern and Zürich and the FSO (≈ 9 months of negotiations)
  • in conformity with the Federal Law on Statistics
Methods
• No unique identifier is available in census and mortality data
• Linkage is performed using probabilistic record linkage methods on information common to both sources
  • Sex, date of birth, marital status, nationality, religion and place of residence
  • Information on spouse and family
• Automated using the Generalized Record Linkage System (GRLS)
Core results
• 6.87 million persons at 1990 census
  • 81.9% linked to 2000 census, 2.6% to an emigration record and 8.6% to a death record
  • 476’814 (6.9%): no link found
• 1’052’527 death records linked to either 1990 or 2000 census (93.8%)
• >100 million person-years of follow-up
The problem with the unlinked deaths
• Almost all deaths occurring between 1991 and 2008 should be attributable to persons in the 1990 or 2000 census
• Few exceptions, e.g. persons who
  • were born in 1992 and died in 1996
  • immigrated in 2002 and died in 2007
• Linkage is more likely to fail if
  • persons move across communities
  • persons live alone
  • household structure changes
Bopp et al. International Journal of Epidemiology 2009;38:379–384
Higher proportion of unlinked deaths in children and younger adults • In absolute numbers more unlinked deaths at higher age
Absolute mortality rates 1991–2007
• SNC rate: numerator = SNC death certificates linked to census 1990 and 2000; denominator = person-time at risk
• Reference rate: numerator = all death certificates; denominator = midyear reference population of Swiss Statistics
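The two rate definitions above can be sketched in a few lines. This is only an illustration: the death count and denominators are hypothetical, and setting both denominators equal is an assumption made to isolate the effect of the missing (unlinked) deaths in the numerator. The 93.8% linked share is the figure quoted under core results.

```python
import math

def rate_per_100k(deaths: float, denominator: float) -> float:
    """Crude mortality rate per 100'000 (person-years or persons)."""
    return 100_000 * deaths / denominator

# Hypothetical inputs, chosen only for illustration.
all_death_certificates = 60_000
linked_share = 0.938            # share of death records linked to a census
person_years_at_risk = 7_000_000
midyear_population = 7_000_000  # assumed equal to person-years for simplicity

# SNC rate: only linked deaths enter the numerator.
snc_rate = rate_per_100k(all_death_certificates * linked_share,
                         person_years_at_risk)
# Reference rate: all death certificates enter the numerator.
reference_rate = rate_per_100k(all_death_certificates, midyear_population)

# With equal denominators, the SNC rate is too low by exactly the
# unlinked share of deaths.
ratio = snc_rate / reference_rate
print(f"SNC rate / reference rate = {ratio:.3f}")
```

Under these assumptions the ratio equals the linked share, which is why unlinked deaths bias absolute rates downward even when relative comparisons are barely affected.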
Allocation of unlinked deaths
• Inputs
  • Unlinked death records: 05.12.1990 < dod < 31.12.2007 (dod = date of death)
  • Census records not linked to a death certificate (census 1990 / census 2000)
• Logical prerequisites
  • No information after dod (e.g. emigration)
  • Not assignable to census 1990 if the person is also in census 2000
• Matching criteria, round 1
  • Gender
  • Canton
  • Nationality (CH/non-CH)
  • Civil status
  • Same or very similar age
  • Linked pair: randomly selected within best matches; kept for round 2 if no match
• Matching criteria, round 2 (less strict)
  • Gender
  • "Grossregion" (7 regions)
  • Similar age
  • Linked pair: randomly selected within best matches
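The two-round allocation can be sketched as follows. This is a minimal illustration, not the SNC implementation: the field names, the age tolerances (1 and 5 years), the use of years instead of full dates, and the rule that each census record is used at most once are all assumptions.

```python
import random

# (exact-match fields, maximum birth-year gap): tolerances are assumptions.
ROUNDS = [
    (("sex", "canton", "swiss", "civil_status"), 1),  # round 1: strict
    (("sex", "region"), 5),                           # round 2: less strict
]

def best_matches(death, pool, keys, max_gap):
    """Eligible census records with the smallest birth-year gap."""
    def gap(p):
        return abs(p["birth_year"] - death["birth_year"])
    candidates = [
        p for p in pool
        if p["last_seen"] <= death["dod"]        # no information after death
        and all(p[k] == death[k] for k in keys)  # exact agreement on criteria
        and gap(p) <= max_gap
    ]
    if not candidates:
        return []
    smallest = min(gap(p) for p in candidates)
    return [p for p in candidates if gap(p) == smallest]

def allocate(deaths, census, seed=0):
    rng = random.Random(seed)
    pool, pending, links = list(census), list(deaths), {}
    for keys, max_gap in ROUNDS:
        unmatched = []
        for d in pending:
            best = best_matches(d, pool, keys, max_gap)
            if best:
                chosen = rng.choice(best)  # random pick within best matches
                links[d["id"]] = chosen["id"]
                pool.remove(chosen)        # a census record is used only once
            else:
                unmatched.append(d)        # keep for the next round
        pending = unmatched
    return links, [d["id"] for d in pending]

# Hypothetical records, for illustration only.
census = [
    dict(id="c1", sex="F", canton="BE", region="Mittelland", swiss=True,
         civil_status="married", birth_year=1930, last_seen=1990),
    dict(id="c2", sex="F", canton="BE", region="Mittelland", swiss=True,
         civil_status="married", birth_year=1930, last_seen=2000),
    dict(id="c3", sex="M", canton="ZH", region="Zurich", swiss=False,
         civil_status="married", birth_year=1953, last_seen=2000),
]
deaths = [
    dict(id="d1", sex="F", canton="BE", region="Mittelland", swiss=True,
         civil_status="married", birth_year=1930, dod=1995),
    dict(id="d2", sex="M", canton="ZH", region="Zurich", swiss=False,
         civil_status="single", birth_year=1950, dod=2003),
    dict(id="d3", sex="F", canton="TI", region="Ticino", swiss=True,
         civil_status="widowed", birth_year=1920, dod=1999),
]
links, unallocated = allocate(deaths, census)
print(links, unallocated)
```

In this toy data, c2 is excluded for d1 because it was still observed in 2000, after the death in 1995; this mirrors the "not assignable to census 1990 if also in 2000" prerequisite. d2 fails round 1 on civil status and is picked up in round 2.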
Mortality rates before and after allocation of unlinked deaths
[Figure: "initial" SNC rates compared with rates after allocation of unlinked deaths]
Are rate ratios affected by including or excluding the "SNC unlinked deaths"?
• Example: education and lung cancer mortality
• Only marginal changes in hazard ratios for other variables such as marital status and nationality (manuscript in preparation)
A few examples of completed and ongoing projects
A complete list of publications can be found at http://www.swissnationalcohort.ch/
Religion and suicide
Spoerri A et al. Int J Epidemiol 2010; 39: 1486-94
Cox regression adjusted for age, education, marital status and type of household.
Mortality from coronary heart disease (CHD) and stroke decreased significantly with increasing altitude.
• By 22% (95% CI: 18%-27%) per 1000 m altitude for CHD
• By 12% (95% CI: 5%-19%) per 1000 m altitude for stroke
Faeh et al. Circulation. 2009;120:495-501
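To read the "per 1000 m" figures at other altitude differences, the percentage decrease can be converted into a rate ratio. The sketch below assumes a log-linear (multiplicative) relation between altitude and mortality; that assumption and the function name are illustrative and not taken from the cited paper.

```python
import math

def rate_ratio(percent_decrease_per_1000m: float,
               altitude_difference_m: float) -> float:
    """Rate ratio for a given altitude difference, assuming the effect
    multiplies per 1000 m (log-linear model; an assumption)."""
    ratio_per_1000m = 1 - percent_decrease_per_1000m / 100
    return ratio_per_1000m ** (altitude_difference_m / 1000)

chd_2000m = rate_ratio(22, 2000)     # 0.78 squared
stroke_500m = rate_ratio(12, 500)    # square root of 0.88
print(f"CHD, +2000 m: rate ratio {chd_2000m:.4f}")
print(f"Stroke, +500 m: rate ratio {stroke_500m:.4f}")
```

Under this assumption a 22% decrease per 1000 m corresponds to a rate ratio of 0.78, so 2000 m gives 0.78 × 0.78 = 0.6084, not a 44% decrease.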
Mortality from myocardial infarction and estimated exposure to aircraft noise, Switzerland, 2000–2005. Huss et al. Epidemiology 2010; 21(6): 829-36.
Study of childhood cancer and nuclear power plants in Switzerland
This study
• uses the SNC for the person-time at risk by distance to nuclear power plants (NPPs)
• uses childhood cancer cases registered in the Swiss Childhood Cancer Registry (years 1985 to 2009)
• was conducted over the last 3 years and funded by a cancer charity (Swiss Cancer League) and a government agency (Swiss Federal Office of Public Health)
• would not have been possible in such a short time had the SNC and the childhood cancer registry not already been operating for many years
Nuclear Power Plants in Switzerland
• Leibstadt (since 1984)
• Gösgen (since 1979)
• Beznau I and II (since 1969/1971)
• Mühleberg (since 1972)
Results are in press in International Journal of Epidemiology (epub expected in the next 2-3 weeks)
[Map legend: 5 km radius, 15 km radius]
New census 2010
• Registry based (“no door to door”)
• Annually updated personal information
• Includes newly created personal ID
Potential for linking additional data
Will we be able to use this new ID?
Privacy-preserving linkage
• Collaboration with University of Duisburg
• Fully safe encryption of names or unique identifiers
• Small differences in records result in similar encryption strings
• Suitable for probabilistic linkage
• Creation of a Swiss Linkage Centre
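One published family of techniques with the properties listed above encodes the character bigrams of a name into a Bloom filter using keyed hashes, so that similar names yield similar bit patterns. Whether this is the scheme intended here is an assumption; the sketch below, including the key, filter size and number of hash functions, is purely illustrative.

```python
import hashlib
import hmac

def bigrams(name: str) -> set[str]:
    """Character bigrams of a padded, lower-cased name."""
    padded = f" {name.lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def encode(name: str, key: bytes, size: int = 1024, num_hashes: int = 4) -> set[int]:
    """Positions of the bits set in a Bloom filter of `size` bits.

    Each bigram is hashed `num_hashes` times with a secret key (HMAC-SHA256),
    so the encoding cannot be reproduced without the key.
    """
    bits = set()
    for gram in bigrams(name):
        for i in range(num_hashes):
            digest = hmac.new(key, f"{i}:{gram}".encode("utf-8"),
                              hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits

def dice(a: set[int], b: set[int]) -> float:
    """Dice similarity of two encoded names (1.0 = identical bit sets)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

key = b"hypothetical-shared-secret"  # assumption: agreed between data holders
similar = dice(encode("Meier", key), encode("Meyer", key))
different = dice(encode("Meier", key), encode("Zwahlen", key))
print(f"Meier vs Meyer: {similar:.2f}, Meier vs Zwahlen: {different:.2f}")
```

Because the similarity score is graded rather than all-or-nothing, it can be used as one comparison field in a probabilistic linkage without revealing the underlying names.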
Summary / Conclusion
• Lack of a unique personal identifier complicated the creation of this large population-based cohort → probabilistic record linkage
• ≈ 6% unlinked deaths resulted in incorrect absolute rates → perform a second linkage round (allocation of unlinked deaths)
• Future: use an encrypted version of the personal ID to improve linkage while complying with data protection requirements
Acknowledgments
• Adrian Spoerri, Kurt Schmidlin, Kerri Clough-Gorr, Radoslav Panczak, Matthias Egger (Institute of Social and Preventive Medicine in Bern)
• Nino Künzli, Charlotte Braun-Fahrländer, Katharina Stähelin, Christian Schindler (Basel)
• Matthias Bopp, Felix Gutzwiller, David Fäh (Zürich)
• Fred Paccaud (Lausanne)
• Michel Oris (Geneva)
• Stephan Cottier (Swiss Federal Statistical Office)
• Swiss National Science Foundation, Cancer Research Switzerland, Swiss School of Public Health
Funding of SNC core activities
• Through specific research grants from the Swiss National Science Foundation
• Part of a special funding scheme for cohort studies
• A first 5-year funding period from July 2005 to June 2010
• An extension for July 2010 to June 2013 (with possible prolongation for an additional 2 years)
Bayes’ theorem
$$\text{posterior odds} = \text{likelihood ratio} \times \text{prior odds}$$
$$\text{weight}(\text{var1}) \sim \ln(\text{likelihood ratio})$$
• Literature:
  • Fellegi, Ivan P. and Sunter, Allan B. A theory of record linkage. Journal of the American Statistical Association, 1969
  • Newcombe, H. B., Kennedy, J. M., Axford, S. J., and James, A. P. Automatic linkage of vital records. Science 130, 1959
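The relation between likelihood ratios and linkage weights can be made concrete with a small sketch in the spirit of Fellegi and Sunter. For each variable, m is the probability of agreement among true matches and u the probability of agreement among non-matches; the m and u values and the field names below are hypothetical, and treating the fields as independent is an assumption.

```python
import math

# Hypothetical (m, u) probabilities per comparison variable.
M_U = {
    "sex": (0.99, 0.50),
    "birth_date": (0.97, 0.003),
    "municipality": (0.90, 0.01),
}

def field_weight(field: str, agrees: bool) -> float:
    """ln(likelihood ratio) for agreement or disagreement on one variable."""
    m, u = M_U[field]
    return math.log(m / u) if agrees else math.log((1 - m) / (1 - u))

def total_weight(agreement: dict[str, bool]) -> float:
    """Sum of field weights (assumes independent comparison variables)."""
    return sum(field_weight(f, a) for f, a in agreement.items())

def posterior_odds(prior_odds: float, weight: float) -> float:
    """Bayes' theorem in odds form: posterior = likelihood ratio x prior."""
    return prior_odds * math.exp(weight)

all_agree = total_weight({"sex": True, "birth_date": True, "municipality": True})
moved = total_weight({"sex": True, "birth_date": True, "municipality": False})
print(f"all agree: {all_agree:.2f}, municipality differs: {moved:.2f}")
```

Agreement on a rare value (small u) adds a large positive weight, while disagreement on a usually reliable variable (m close to 1) subtracts weight; a change of residence therefore lowers the total weight of a true pair.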
Selection of linked pairs
• All potential pairs are sorted by total weight
• Thresholds are defined by manual review
[Figure: total linkage weight scale. Definite pairs lie above the upper threshold, possible pairs between the lower and upper thresholds, rejected pairs below the lower threshold.]
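The threshold rule above can be sketched as a simple classifier over sorted candidate pairs. The threshold values, the record identifiers and the handling of weights exactly on a threshold are assumptions made for illustration.

```python
LOWER, UPPER = 5.0, 12.0  # hypothetical thresholds, set by manual review

def classify(weight: float, lower: float = LOWER, upper: float = UPPER) -> str:
    """Assign a candidate pair to one of the three zones."""
    if weight >= upper:
        return "definite"
    if weight >= lower:
        return "possible"   # would go to manual review
    return "rejected"

# Hypothetical candidate pairs: (census id, death record id, total weight).
pairs = [
    ("c17", "d03", 3.1),
    ("c02", "d11", 14.2),
    ("c09", "d11", 7.8),
]

# Sort by total weight, highest first, then label each pair.
ranked = sorted(pairs, key=lambda p: p[2], reverse=True)
labelled = [(c, d, w, classify(w)) for c, d, w in ranked]
for row in labelled:
    print(row)
```

Sorting first makes manual review easier: reviewers only need to inspect the band of "possible" pairs between the two thresholds.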