Geographic variation of mortality with different socioeconomic indicators using a multivariate multiple regression model
Jurairat Ardkaew
BOD - International Health Policy Program - IHPP
Objective
To examine mortality patterns by age, sex and socioeconomic indicators across administrative superdistricts in Thailand during the latest census period (1999-2001).
Data source
• The data on mortality cases are available from vital registration, Ministry of Public Health.
• Population counts by region were obtained from the Population and Household Census 2000.
• The socioeconomic indicators were obtained from the 100% Population and Household Census 2000 and the 20% Population and Household Census 2000.
Multivariate Regression
When there are several (i > 1) criterion variables, we could just fit i separate models. But this:
• Does not give simultaneous tests for all regressions.
• Does not take correlation among the y's into account.
Why do multivariate tests?
• They avoid multiplying error rates, as in ANOVA.
• They give an overall test for multiple responses, similar to the overall test for many groups.
• Multivariate tests are often more powerful when the responses are correlated.
• Multivariate tests provide a way to understand the structure of relations across separate response measures.
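Multivariate Multiple Regression Model
• The multivariate multiple regression model is
$$[\,y_1 \;\cdots\; y_i\,] = [\,x_1 \; x_2 \;\cdots\; x_j\,]\,[\,\beta_1 \;\cdots\; \beta_i\,] + E_{n \times i}$$
• It may be expressed simply in matrix form as
$$Y_{n \times i} = X_{n \times j}\, B_{j \times i} + E_{n \times i}$$
• The LS solution, $\hat{B} = (X^{T}X)^{-1}X^{T}Y$, gives the same coefficients as fitting i models separately.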
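
The claim that the joint least-squares solution matches i separate fits can be checked numerically. The following is a minimal sketch in Python with NumPy, using simulated data; the sizes `n`, `j`, `i` and all values are hypothetical and are not taken from the study.

```python
import numpy as np

# Hypothetical sizes: n observations, j design columns, i response variables.
rng = np.random.default_rng(0)
n, j, i = 50, 3, 4

# Design matrix with an intercept column followed by simulated predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, j - 1))])
B_true = rng.normal(size=(j, i))
Y = X @ B_true + 0.1 * rng.normal(size=(n, i))

# Joint fit: solve the normal equations (X'X) B = X'Y for all responses at once.
B_joint = np.linalg.solve(X.T @ X, X.T @ Y)

# Separate fits: one ordinary least-squares regression per response column.
B_sep = np.column_stack(
    [np.linalg.lstsq(X, Y[:, k], rcond=None)[0] for k in range(i)]
)

# The two coefficient matrices agree up to floating-point error.
same_coefficients = np.allclose(B_joint, B_sep)
print(same_coefficients)
```

The coefficients coincide; what the multivariate formulation adds is the joint treatment of the errors across responses, which is what the simultaneous tests rely on.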
Application to this study
It would be surprising if there were no correlations between successive age groups. To incorporate these correlations in a quite general way, we can use a matrix formulation of the model.
• Outcome variable ($Y_{rx}$): mortality rate
• Explanatory variables ($X_{rj}$): observed socioeconomic indicators
Suppose that $Y$ is the matrix of outcome variables $f(m_{rx}) = \log(m_{rx})$, where the columns correspond to $n_A$ age groups (0, 1-4, …, 80-84) and the rows correspond to $n_R$ regions (235 superdistricts). $X$ is the matrix with rows also corresponding to regions and $p+2$ columns, where the first column contains 1s, the next $p$ columns contain the observed socioeconomic predictors, and the last column contains the unobserved explanatory variable (obtained from the least-squares fit). Here $r$ denotes the region (such as a "superdistrict": a district or group of contiguous districts within the same province with a population of approximately 200,000 persons).
Then the model, where $g_{r,p+1}$ is an explanatory variable encapsulating the unobserved information on how mortality varies with region, may be expressed simply in matrix form as
$$Y = XB$$
where $B$ is the $(p+2) \times n_A$ matrix of parameters $(a_x, b_{jx})$. This model is easily fitted using multivariate multiple regression analysis.
Multivariate Multiple Regression Analysis: Example
This model allows correlations between errors corresponding to different outcomes but assumes independent errors within each outcome variable. The model is fitted separately to all-cause male and female mortality rates in the 235 superdistricts ($n_R$ = 235) of Thailand for the period 1999-2001.
The 6 selected socioeconomic indicators (p = 6):
• Population density (in 1000s of persons per square km)
• Proportion of the population in agriculture
• Proportion of the population living outside municipal areas
• Proportion aged 15+ who graduated from at least Secondary 1 school
• Proportion of households with no toilet
• Proportion of households with piped water supply inside the house
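Distribution of socioeconomic indicators in each region
[Figure: two distribution plots] First panel: max = 32.83, min = 0.02, mean = 1.49. Second panel: max = 0.96, min = 0.00, mean = 0.69.
Distribution of socioeconomic indicators in each region
[Figure: two distribution plots] First panel: max = 0.91, min = 0.0007, mean = 0.52. Second panel: max = 0.73, min = 0.16, mean = 0.34.
Distribution of socioeconomic indicators in each region
[Figure: two distribution plots] First panel: max = 0.16, min = 0.0002, mean = 0.02. Second panel: max = 0.96, min = 0.05, mean = 0.41.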
Male: coefficient (standard error)
[Table of coefficients not recovered] Significance codes: a = 0.001, b = 0.01, c = 0.05, d = 0.1
The model gives an r-squared for each age group. These values are high when mortality is high (in age group 5-9 and age groups 15-19, 20-24, 25-29, …, 65-69).
Female: coefficient (standard error)
[Table of coefficients not recovered] Significance codes: a = 0.001, b = 0.01, c = 0.05, d = 0.1
The model gives an r-squared for each age group. These values are high when mortality is high (in age group 5-9 and age groups 15-19, 20-24, 25-29, …, 65-69).
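
One way to obtain an r-squared for each age group is to fit all outcome columns at once and compare residual and total sums of squares column by column. The sketch below is in Python with NumPy on simulated data. The function name `r_squared_by_column`, the number of age groups and every value are hypothetical. The unobserved explanatory variable is left out, since the slides do not say how it is derived.

```python
import numpy as np

def r_squared_by_column(X, Y):
    """Return one R-squared per column of Y for the least-squares fit Y ~ X B."""
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B_hat
    ss_res = (resid ** 2).sum(axis=0)              # residual sum of squares
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Simulated stand-in data: 235 regions, 6 indicators, 18 age groups (assumed count).
rng = np.random.default_rng(2)
n_regions, n_pred, n_age = 235, 6, 18
X = np.column_stack([np.ones(n_regions), rng.normal(size=(n_regions, n_pred))])
B = rng.normal(size=(n_pred + 1, n_age))
log_rates = X @ B + rng.normal(size=(n_regions, n_age))  # plays the role of log(m_rx)

r2 = r_squared_by_column(X, log_rates)
print(np.round(r2, 2))
```

Because the design matrix includes an intercept column, each in-sample value lies between 0 and 1.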
Unobserved mortality in each region
For males, unobserved mortality is generally low in superdistricts of the southern region and high in most superdistricts of the northern region, particularly Chiang Rai, Chiang Mai, Phayao and Phrae, and in some superdistricts in Buriram. For females, low and high unobserved mortality occur in similar areas.
Correlations between residuals in age groups
[Figure: two panels, male and female]
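
A correlation matrix of residuals across age groups can be computed from the residual matrix of the fitted model. This is a minimal sketch in Python with NumPy, using simulated data in which a shared regional component induces positive correlation between outcome columns. The sizes and the error structure are assumptions for illustration, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_regions, n_pred, n_age = 235, 6, 5  # n_age is a hypothetical, reduced count

# Simulated design matrix (intercept + indicators) and coefficients.
X = np.column_stack([np.ones(n_regions), rng.normal(size=(n_regions, n_pred))])
B = rng.normal(size=(n_pred + 1, n_age))

# Errors share a regional component, so they are correlated across age groups
# but independent across regions.
shared = rng.normal(size=(n_regions, 1))
E = shared + rng.normal(size=(n_regions, n_age))
Y = X @ B + E

# Fit all outcome columns at once and form the residual matrix.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ B_hat

# Columns are age groups, so rowvar=False gives an n_age x n_age correlation matrix.
resid_corr = np.corrcoef(resid, rowvar=False)
print(np.round(resid_corr, 2))
```

Large off-diagonal entries between neighbouring age groups would indicate that fitting each age group in isolation ignores shared regional variation.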
R Mapping
Drawing maps using the R program
• Thematic map
  • Thematic maps are data maps of a specific subject or for a specific purpose.
  • Display data according to a reference base (such as comparing the mean with the tails of the 95% CIs of the subject).
• Range map
  • Display data according to ranges set by the user.
  • The ranges are shaded using color.
Example: Childhood diarrhea incidence in 5 border provinces of Northeast Thailand, 1999-2004
Data structure
[Table not recovered]
Example: All-cause deaths, ages 0-84, in Thailand (1999-2001)
Data structure
• MortM: mortality per 1000, male
• QM: quintile of mortality per 1000, male
• MortF: mortality per 1000, female
• QF: quintile of mortality per 1000, female
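
A quintile column such as QM could be derived from MortM as sketched below, in Python with NumPy rather than the R used for the maps. The function name `quintile`, the cut-point convention (a value equal to a cut point falls in the lower class) and the sample rates are all assumptions, since the slides do not state how the quintiles were computed.

```python
import numpy as np

def quintile(values):
    """Assign each value a quintile class from 1 (lowest) to 5 (highest)."""
    values = np.asarray(values, dtype=float)
    # Cut points at the 20th, 40th, 60th and 80th percentiles.
    cuts = np.quantile(values, [0.2, 0.4, 0.6, 0.8])
    # side="left" places a value equal to a cut point in the lower class.
    return np.searchsorted(cuts, values, side="left") + 1

# Hypothetical male mortality rates per 1000 for ten regions.
mort_m = np.array([4.0, 5.5, 6.1, 7.3, 8.0, 9.2, 10.4, 11.0, 12.5, 15.0])
qm = quintile(mort_m)
print(qm)
```

The resulting classes can then be used as the shading ranges of a range map.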