NET ACCESS: Highlights from British Columbia Michael Regier BAR, BSc, MSc PhD Student UBC, Department of Statistics PhD Student BCCA, Research Student/NET Trainee
Overview • Understanding the logic in ecological variables • Trends in the place of death of BC cancer patients (1999-2003): An alternate perspective • Multilevel analysis using health and census data
An observation and a question • Prior research uses a mixed-level ecological model. • Are mixed-level ecological studies necessary and appropriate, under an ecosocial research schema, for understanding the social determinants of health of British Columbia patients whose primary and/or secondary cause of death was cancer?
What is a mixed model? • Let X represent an aggregate (group) level explanatory variable • Let x represent an individual (patient) level explanatory variable • Let Y represent an aggregate (group) level response variable • Let y represent an individual (patient) level response variable • Individual-level model: • x → y • $y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + e_i$ • Aggregate-level model: • X → Y • $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + e_i$ • Mixed model: • {X, x} → y • $y_i = b_0 + b_1 x_i + b_2 X_i + e_i$
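As a rough illustration of the mixed model {X, x} → y, the sketch below simulates patients nested in groups and fits y on one individual-level and one aggregate-level covariate by ordinary least squares. The data, the coefficient values, and the helper name `fit_ols` are all assumptions made for this example; nothing here comes from the NET ACCESS data.

```python
import numpy as np

def fit_ols(design, response):
    """Return least-squares coefficients for response ~ design."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

rng = np.random.default_rng(0)
n_groups, n_per_group = 50, 40
# Group membership of each simulated patient (hypothetical grouping).
group = np.repeat(np.arange(n_groups), n_per_group)

X_group = rng.normal(size=n_groups)   # aggregate-level variable, one value per group
X = X_group[group]                    # the group value attached to each patient
x = rng.normal(size=group.size)       # individual-level variable
e = rng.normal(scale=0.5, size=group.size)

# Assumed truth for the simulation: b0 = 1.0, b1 = 2.0 (individual), b2 = -1.5 (aggregate).
y = 1.0 + 2.0 * x - 1.5 * X + e

# Mixed-level design matrix: intercept, individual-level x, aggregate-level X.
design = np.column_stack([np.ones_like(x), x, X])
b0, b1, b2 = fit_ols(design, y)
print(b0, b1, b2)
```

Every patient in a group shares the same value of X, which is what makes the model mixed-level. This simulation has no group-level error term, so plain least squares is adequate here; with real clustered data that assumption would need checking.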
What is an ecological variable? • An ecological variable is any measure taken at an aggregated level with respect to the unit of interest. • For the NET ACCESS research the aggregate level is taken as a collection of people (dissemination area, neighbourhood, local health authority, legally incorporated boundaries – CSD, etc.)
Problems with ecological variables • It is suggested that any study where the parsimonious model maps {x, X} to y contains cross-level bias. • Cross-level bias is the sum of the aggregation bias due to groupings • Morgenstern and Susser claim that any study which has cross-level bias commits the ecological fallacy. • Ecological fallacy • Drawing individual-level conclusions from analysis of aggregate data (e.g. neighbourhood, Local Health Region) • Individualistic fallacy • Assuming that individual-level data are sufficient for epidemiological studies • Geronimus, A. T., Bound, J., Neidert, L. J. 1996. On the validity of using census geocode characteristics to proxy individual socioeconomic characteristics. Journal of the American Statistical Association 91(434): 529-537. • Morgenstern, H. 1982. Uses of ecologic analysis in epidemiologic research. American Journal of Public Health 72(12): 1336-1344. • Susser, M. 1994. The logic in ecological: I. The logic of analysis. American Journal of Public Health 84(5): 825-829.
What is ecosocial research? • Ecosocial: The consideration of how population health is generated by social conditions necessarily engaging with biological processes at every spatial-temporal scale (embodiment) • Embodiment: The biological incorporation of the material and social world in which we live, from conception to death. No aspect of our biology can be understood in isolation from the knowledge of history and individual and societal ways of living • Krieger N. Theories for social epidemiology in the 21st century: an ecosocial perspective. International Journal of Epidemiology. 2001; 30:668-677.
Are mixed-level ecological studies necessary and appropriate? • Evolving conclusions ... • 2005 Conclusion: Yes! • Specification of Variables • Understanding the conceptual aspects of the ecological variables • Recognition of potential bias in mixed-level models • 2006 Conclusion: Maybe? • Disregards natural structure of the data • Unresolved biases • Do we really have the best unit of observation?
Trends in the place of death of BC cancer patients (1999-2003): An alternate perspective
Background • Health Background • If given a choice, up to 88% of patients dying of cancer would choose to die at home, yet only a small proportion realize this preference • In Nova Scotia (NS), Canada, out-of-hospital deaths were 25.4% in the period from 1992 to 1997 • Out-of-hospital deaths in Ontario have ranged from 31% in 1980 to 21.5% in 1994 to 34% in 1997 • In Manitoba, 57.4% of cancer patients died in hospital during the 2000/01 fiscal year • Modelling Background • Logistic regression is used almost exclusively in end-of-life (EOL) research • Popularity ≠ Appropriateness
Objectives • To compare the performance of five classifiers • To comment upon the clinical utility of the classifiers • To understand which factors are predictive of dying out of hospital
Subjects • The subjects were all persons who • were aged 20 years or older, • lived in BC as identified from the death certificate and the BC Cancer Registry, • died in BC as identified from the death certificate and the BC Cancer Registry, • died of a malignant cancer (primary cause of death or secondary cause of death) in the period 1997 to 2003, and • had a record of prior knowledge of cancer. • N = 36,432
Variables • Dependent Variable • Location of death (in/out of hospital) • Independent Variables • Year of death • Sex • Age (continuous) • Neighbourhood Income Per Person Equivalent (IPPE) quintile • Cause of death (coded according to the International Classification of Diseases for Oncology, third edition), grouped according to the Canadian Cancer Society groupings • City size (categorical) • Ethnic Origin context • Mother Tongue context • Religious context • Survival (continuous)
Indicators of Culture • Ethnic Origin context • Heterogeneous, • Aboriginal Presence, • Chinese Presence, • British Isles Presence, • East Indian Presence, and • Americas and Europe. • Mother Tongue context • Punjabi Presence, • Dominant English, • Heterogeneous with Dominant English, • Chinese Presence, and • Strong English Presence. • Religious context • Heterogeneous with Protestant and No Religion Presence • Catholic Presence, • Protestant Presence, • Sikh Presence, • No Religion Presence, and • Heterogeneous.
Classifiers Used • Linear Discriminant Analysis • Logistic Regression • Neural Networks • Classification Trees • Nearest Neighbours
Performance Criteria • Criteria Used • Misclassification rate • Receiver Operating Characteristic (ROC) curve • Area Under the ROC Curve (AUC) • Hit curve (lift curve) • Estimating the Performance Criterion • Discussed the bias-variance trade-off • complex models reduce bias but risk over-fitting • simple models reduce variance but risk under-fitting • Used cross-validation (deterministically) to guard against overly optimistic estimates, over-fitting, and under-fitting • Adapted cross-validation to accommodate “fine-tuning”
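Two of the criteria above, along with a deterministic fold assignment, can be sketched in a few lines. This is only an illustration under assumptions: the function names are made up for this example, the toy labels and scores are invented, and the `i % k` fold rule is one possible reading of "deterministic" cross-validation rather than the scheme used in the study.

```python
import numpy as np

def misclassification_rate(y_true, y_pred):
    """Fraction of observations whose predicted class differs from the truth."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]          # all positive-negative pairs
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

def kfold_indices(n, k):
    """Deterministic folds: observation i is held out in fold i % k."""
    idx = np.arange(n)
    return [(idx[idx % k != f], idx[idx % k == f]) for f in range(k)]

# Toy example: 1 = died out of hospital, 0 = died in hospital (invented data).
y = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
predicted = [int(s >= 0.5) for s in scores]

print(misclassification_rate(y, predicted))
print(auc(y, scores))
for train, test in kfold_indices(6, 3):
    print(train, test)
```

Because the folds depend only on the row order, every classifier is scored on exactly the same held-out sets, which keeps the comparison between classifiers fair.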
Classification Results • [Figure: performance groups for the classifiers NN, LOG (F), KNN, CT, LOG (P), and LDA]
Health Results (Logistic Regression) • 64.2% died in hospital • Patients who were more likely to die out of hospital were • females, • older patients, • those with longer survival times, • those living in neighbourhoods in a higher income per person equivalent (IPPE) quintile, • those living in either the Interior or Northern Health Authorities, • those living in communities with fewer than 500,000 people, and • those who died from breast, colorectal, pancreatic or prostate cancer. • Ethnically, patients from an area with an • Aboriginal construct, • British Isles construct, or • Americas and Europe construct were less likely to die out of hospital than those from an area with the Heterogeneous construct. • Linguistically, patients from an area with a • Chinese construct or • Punjabi construct were less likely to die out of hospital than those from an area with the Dominant English construct.
Multilevel analysis using health and census data: A preliminary investigation
Context • This investigation was to determine the feasibility and utility of multilevel modelling for palliative care and end-of-life research
Subjects and Variables • Subjects • All British Columbia females who died in BC between 1999 and 2003 of either breast, lung or colorectal cancer. • 9,104 individuals as identified in the BCCA cancer registry • Two patients were removed as they had no survival information • Breast, colorectal and lung cancer were considered as they represent the top three age-standardized mortality rates for women in BC in 2002 (Canadian Cancer Society 2002) • Canadian Cancer Society/National Cancer Institute of Canada: Canadian Cancer Statistics 2006, Toronto, Canada, 2006 • Variables • Response • Location of death (In/out of hospital) • Patient level • age, survival, cause of death, year of death • Dissemination Area level • Income
Geographical Support for a Multilevel model: An Illustration • Red = Number of deaths in hospital greater than number of deaths outside of the hospital • Even with these crude measures, geographic variation can be seen
Graphical support for the development of the multilevel model • Mosaic plot comparing income, place of death and cause of death by year of death. Hospital deaths (vertical axis) are denoted by 1 with non-hospital deaths denoted by 0. Highest income quintile (horizontal axis) is 1 with lowest denoted by 5 and unknown by 9.
Measures used to construct the generalized multilevel model • Extra-binomial variation • Intraclass correlation coefficient (ICC) • Explained variance • Deviance
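For a binary response, one commonly used definition of the ICC treats the outcome as a thresholded latent logistic variable, whose level-1 variance is fixed at π²/3. The sketch below assumes that definition; the slides do not say which ICC formula was used, and the function name `latent_icc` is made up for this example.

```python
import math

def latent_icc(sigma2_u):
    """ICC on the latent scale for a random-intercept logistic model.

    sigma2_u is the between-group (level-2) intercept variance.
    The level-1 variance of the standard logistic distribution is pi^2 / 3.
    """
    if sigma2_u < 0:
        raise ValueError("a variance cannot be negative")
    level1_variance = math.pi ** 2 / 3
    return sigma2_u / (sigma2_u + level1_variance)

# With no between-area variance the ICC is 0, and it rises towards 1
# as the between-area variance grows.
for variance in (0.0, 0.5, 1.0, 5.0):
    print(variance, round(latent_icc(variance), 3))
```

An ICC near zero would suggest that a single-level model loses little, while a larger value supports modelling the dissemination-area level explicitly.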
Models • Empty Model (Intercept only) • Random Intercept with level-1 covariates • Age • Age + Survival • Age + Survival + Cause of Death • Age + Survival + Cause of Death + Year of Death • Random Slope for age covariate • Level-2 Covariate (Income)
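The model sequence above can be written out as a two-level logistic model. This is a sketch under assumed notation, not the presenter's own specification: $y_{ij}$ is 1 if patient $i$ in dissemination area $j$ died in hospital, $\pi_{ij} = P(y_{ij} = 1)$, the $\gamma$ terms are fixed effects, and the $u$ terms are area-level random effects. Only the age covariate is shown among the level-1 covariates; survival, cause of death, and year of death would enter in the same way.

```latex
% Empty model (intercept only)
\operatorname{logit}(\pi_{ij}) = \gamma_{00} + u_{0j},
\qquad u_{0j} \sim N(0, \sigma^2_{u0})

% Random intercept with a level-1 covariate
\operatorname{logit}(\pi_{ij}) = \gamma_{00} + \gamma_{10}\,\mathrm{age}_{ij} + u_{0j}

% Random slope for the age covariate
\operatorname{logit}(\pi_{ij}) = \gamma_{00} + (\gamma_{10} + u_{1j})\,\mathrm{age}_{ij} + u_{0j}

% Adding the level-2 covariate (income of dissemination area j)
\operatorname{logit}(\pi_{ij}) = \gamma_{00} + \gamma_{01}\,\mathrm{income}_{j}
  + (\gamma_{10} + u_{1j})\,\mathrm{age}_{ij} + u_{0j}
```

Each step adds terms to the previous model, so successive fits are nested and can be compared through the change in deviance.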
Results and Conclusions • Many estimation problems identified (non-convergence) • Multilevel modelling will have high utility for EOLDB based research
Thank You