CMGPD-LN Methodological Lecture
Day 7: Health and Mortality
Mortality outcomes
• Until age 75, recording of mortality appears plausible
  • Age patterns resemble those of other historical populations and model life tables
• After age 75, the mortality record is problematic
  • Many immortals were taoding at some point, so for mortality analysis it is perhaps safest to throw out all records of anyone who was ever taoding
• Rates below age 5 appear normal, but the representativeness of registered children is unclear
• Large numbers of deaths allow for fine-grained analysis of mortality determinants
. use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
(China Multi-Generational Panel Dataset, Liaoning (CMGPD-LN), 1749-1909, Liaoning)

. recode AGE_IN_SUI min/0=. 1/15=1 16/55=16 56/max=56
(AGE_IN_SUI: 1478270 changes made)

. keep if NEXT_DIE >= 0 & NEXT_3 & PRESENT
(653682 observations deleted)

. keep if SEX >= 1
(1 observation deleted)

. tab AGE_IN_SUI SEX if NEXT_DIE

           |          Sex
Age in Sui |    Female       Male |     Total
-----------+----------------------+----------
         1 |     1,189      5,132 |     6,321
        16 |    11,160     10,721 |    21,881
        56 |    11,342     11,923 |    23,265
-----------+----------------------+----------
     Total |    23,691     27,776 |    51,467
Analyzing mortality
• Life tables
  • Remember, ages are in sui
  • Probability of death in next three years (3qx)
    • Needs to be converted to mx to put into a life table
    • One crude conversion: mx = -ln(1 - 3qx)/3
    • More sophisticated conversions are appropriate at early ages, when rates are changing fast
• Discrete-time event-history analysis
  • Logistic regression
  • Complementary log-log regression
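The crude conversion follows from assuming a constant death rate within the three-year interval. A short derivation is sketched below; the value 0.09 for 3qx is an illustrative number chosen for the example, not a figure from the data.

```latex
\begin{align*}
1 - {}_{3}q_x &= e^{-3 m_x}
  && \text{(survival over three years at a constant rate } m_x\text{)} \\
m_x &= -\frac{\ln\left(1 - {}_{3}q_x\right)}{3} \\
{}_{3}q_x = 0.09 \;\Rightarrow\; m_x &= -\frac{\ln(0.91)}{3} \approx 0.0314
\end{align*}
```

The constant-rate assumption is what makes the conversion crude: at early ages, where mortality falls quickly within an interval, it is a poor approximation.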
Life tables: A crude approach
keep if AGE_IN_SUI > 0 & AGE_IN_SUI <= 75 & NEXT_3 & PRESENT & SEX > 0
* Divide into five-year age groups
replace AGE_IN_SUI = 5*int((AGE_IN_SUI-1)/5)+1
tab AGE_IN_SUI SEX
* Replace NEXT_DIE with its mean within each age group and sex
collapse NEXT_DIE, by(AGE_IN_SUI SEX)
sort SEX AGE_IN_SUI
. tab AGE_IN_SUI SEX

           |          Sex
Age in Sui |    Female       Male |     Total
-----------+----------------------+----------
         1 |     5,026     37,223 |    42,249
         6 |     7,881     53,337 |    61,218
        11 |     8,334     51,932 |    60,266
        16 |    20,835     47,582 |    68,417
        21 |    35,747     46,067 |    81,814
        26 |    37,344     44,648 |    81,992
        31 |    34,870     40,533 |    75,403
        36 |    32,342     37,912 |    70,254
        41 |    30,347     35,131 |    65,478
        46 |    27,330     30,170 |    57,500
        51 |    24,282     26,714 |    50,996
        56 |    20,898     22,568 |    43,466
        61 |    16,949     17,566 |    34,515
        66 |    13,143     12,664 |    25,807
        71 |     9,014      8,072 |    17,086
-----------+----------------------+----------
     Total |   324,342    512,119 |   836,461
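Once the file has been collapsed to one row per age group and sex, NEXT_DIE holds the group mean, which is the three-year death probability. A minimal sketch of turning that into annual rates and a survivorship column might look like the following. The three input rows are invented toy values, and the names mx, p5 and lx are introduced here for illustration; the five-year survival step assumes a constant rate within each age group.

```stata
* Toy stand-in for the collapsed file (values invented, not CMGPD-LN results)
clear
input AGE_IN_SUI SEX NEXT_DIE
 1 2 0.09
 6 2 0.03
11 2 0.02
end

* Crude conversion of the three-year probability to an annual rate
generate mx = -ln(1 - NEXT_DIE)/3

* Probability of surviving a five-year age group at that constant rate
generate p5 = exp(-5*mx)

* Survivorship to the start of each age group, with a radix of 1
bysort SEX (AGE_IN_SUI): generate lx = 1 if _n == 1
bysort SEX (AGE_IN_SUI): replace lx = lx[_n-1]*p5[_n-1] if _n > 1

list AGE_IN_SUI SEX NEXT_DIE mx p5 lx
```

Because replace works through the observations in order, each lx is built from the lx already filled in on the previous row.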
Event-history analysis
keep if AGE_IN_SUI > 0 & AGE_IN_SUI <= 75 & NEXT_3 & PRESENT & SEX > 0
replace AGE_IN_SUI = 5*int((AGE_IN_SUI-1)/5)+1
xi: logit NEXT_DIE i.AGE_IN_SUI i.SEX i.REGION
------------------------------------------------------------------------------
    NEXT_DIE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_IAGE_IN_~_6 |  -1.077269   .0301595   -35.72   0.000    -1.136381   -1.018158
_IAGE_IN_~11 |  -1.453427   .0342583   -42.43   0.000    -1.520572   -1.386282
_IAGE_IN_~16 |  -1.275694   .0307841   -41.44   0.000     -1.33603   -1.215358
_IAGE_IN_~21 |  -1.134171   .0279815   -40.53   0.000    -1.189014   -1.079328
_IAGE_IN_~26 |  -1.068992   .0274992   -38.87   0.000    -1.122889   -1.015094
_IAGE_IN_~31 |  -.9322853   .0271684   -34.32   0.000    -.9855344   -.8790363
_IAGE_IN_~36 |  -.7535797   .0264842   -28.45   0.000    -.8054878   -.7016715
_IAGE_IN_~41 |  -.5966655   .0259978   -22.95   0.000    -.6476202   -.5457108
_IAGE_IN_~46 |  -.4034962   .0257241   -15.69   0.000    -.4539145    -.353078
_IAGE_IN_~51 |  -.1480721   .0250983    -5.90   0.000    -.1972639   -.0988803
_IAGE_IN_~56 |    .194831   .0244138     7.98   0.000     .1469809    .2426811
_IAGE_IN_~61 |   .5058013    .024371    20.75   0.000     .4580351    .5535676
_IAGE_IN_~66 |   .9441143    .024353    38.77   0.000     .8963834    .9918453
_IAGE_IN_~71 |   1.246485   .0257523    48.40   0.000     1.196011    1.296958
     _ISEX_2 |   -.107873   .0102132   -10.56   0.000    -.1278905   -.0878555
  _IREGION_2 |   .0075932   .0117758     0.64   0.519     -.015487    .0306734
  _IREGION_3 |  -.1400285   .0138099   -10.14   0.000    -.1670953   -.1129616
  _IREGION_4 |  -.2427861    .017067   -14.23   0.000    -.2762367   -.2093354
       _cons |  -2.300452   .0209234  -109.95   0.000    -2.341461   -2.259443
------------------------------------------------------------------------------
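Logit coefficients are on the log-odds scale, so it can help to translate a few of them into probabilities and odds ratios. The sketch below copies three coefficients from the output by hand; the scalar names are introduced here for illustration. It relies on xi omitting the lowest category of each variable, so that the constant refers to age group 1 (1 to 5 sui), SEX == 1 and REGION == 1.

```stata
* Coefficients copied by hand from the logit output
scalar b_cons  = -2.300452
scalar b_age71 = 1.246485
scalar b_sex2  = -.107873

* Predicted probability of dying within three years, reference categories
scalar p_ref = invlogit(b_cons)

* Same, but for age group 71 (71 to 75 sui), other variables at reference
scalar p_age71 = invlogit(b_cons + b_age71)

* Odds ratio for SEX == 2 relative to SEX == 1
scalar or_sex2 = exp(b_sex2)

display "P(die), reference group:  " p_ref
display "P(die), age group 71:     " p_age71
display "Odds ratio, SEX == 2:     " or_sex2
```

Immediately after estimation, typing logit, or redisplays the whole table as odds ratios without copying anything by hand.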
Accounting for age and sex
• We generally analyze childhood, working ages, and old age separately
  • The relevant variables differ across these stages, as do their effects
• We often, but not always, analyze males and females separately
  • Effects of key variables may vary by sex
• Categorical variable for age group
  • See previous example
• Polynomial
  * age is not defined elsewhere in these slides; AGE_IN_SUI is assumed here
  generate age = AGE_IN_SUI
  generate age2 = age^2
  generate age3 = age^3
  logit NEXT_DIE age age2 age3
• Hybrid
  • Include age group categories and a linear term for age
  • To capture variation in risks within age groups
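The polynomial and hybrid specifications can be tried side by side on simulated data before running them on the full file. The sketch below is only an illustration: the data are simulated, the death probabilities are invented, and taking age to be AGE_IN_SUI is an assumption, since the slides do not define age.

```stata
* Simulated stand-in for person-period records (not real CMGPD-LN data)
clear
set seed 12345
set obs 5000
generate AGE_IN_SUI = 1 + floor(75*runiform())
generate NEXT_DIE = runiform() < invlogit(-3 + 0.03*AGE_IN_SUI)

* Polynomial: a smooth age curve with three terms
generate age  = AGE_IN_SUI
generate age2 = age^2
generate age3 = age^3
logit NEXT_DIE age age2 age3

* Hybrid: five-year age group dummies plus a linear term for age,
* so the linear term picks up variation in risk within each group
generate age_group = 5*int((AGE_IN_SUI-1)/5)+1
xi: logit NEXT_DIE i.age_group age
```

In the hybrid model the age group dummies absorb the overall shape of the age curve, so the coefficient on age reflects only the slope within five-year groups.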
Other notes on mortality analysis
• Since many of the ‘immortals’ were tao at some point in their life, it may be worthwhile to throw out observations of anyone who was ever tao, even if they aren’t tao right now.
• Regional differences in mortality rates suggest including REGION as a basic control variable.
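Dropping everyone who was ever tao means flagging the person, not just the record. A minimal sketch on toy data follows. The variable is_tao is hypothetical: the slides do not name the CMGPD-LN variable that marks tao status, so substitute whatever indicator you construct for it.

```stata
* Toy person-period records; is_tao is a hypothetical 0/1 flag
clear
input str3 PERSON_ID YEAR is_tao
"A" 1800 0
"A" 1803 1
"A" 1806 0
"B" 1800 0
"B" 1803 0
end

* Flag every record of anyone who was tao in any register
bysort PERSON_ID: egen ever_tao = max(is_tao)

* Drop all of that person's records, including those where is_tao == 0
drop if ever_tao == 1
list
```

Person A is removed entirely, including the 1800 and 1806 records in which A was not tao at the time.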
Using the disability variables
• Basic contents
• Time trends
• Age patterns
• Working with the original disabilities
• And positions…
Working with the original disabilities
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0003\27063-0003-Data.dta"
merge m:1 DATASET DISABILITY_CODE using "C:\Users\Cameron Campbe\Documents\Baqi\extracts\CMGPD-LN Disability for SJTU class", keep(match master)
tab CONDITION_PINYIN, sort
run "C:\Users\Cameron Campbe\Documents\Dropbox\Lee-Campbell group (Dropbox shares)\SJTU DongbeiZhongxin\SJTU Summer Class\strip_disability.do"
tab new_CONDITION_PINYIN, sort
* Flag any condition whose cleaned pinyin contains "laozheng"
generate byte lao_zheng = index(new_CONDITION_PINYIN,"laozheng") > 0
tab lao_zheng
.do file to clean up
generate new_CONDITION_PINYIN = CONDITION_PINYIN
* Strip the tone digits from the pinyin
local for_removal "1 2 3 4 5 6 7 8 9"
foreach x of local for_removal {
    replace new_CONDITION_PINYIN = subinstr(new_CONDITION_PINYIN,"`x'","",.)
}
. tab CONDITION_PINYIN, sort

                              Disease |      Freq.     Percent        Cum.
--------------------------------------+-----------------------------------
                           chen2 tao2 |      1,238       10.93       10.93
                          lao2 zheng4 |        741        6.54       17.48
                    chen2 lao2 zheng4 |        574        5.07       22.55
                            yan3 xia1 |        462        4.08       26.62
                           chen2 xia1 |        388        3.43       30.05
                  chen2 tao2 you3 an4 |        300        2.65       32.70
                             can2 ji2 |        297        2.62       35.32
                             tu3 xie3 |        267        2.36       37.68
                             xia1 zi5 |        259        2.29       39.97
                            tui3 que2 |        234        2.07       42.03
                           tui3 tong4 |        190        1.68       43.71
                      chen2 tui3 que2 |        178        1.57       45.28
                           tui3 huai4 |        167        1.47       46.76
                            er3 long2 |        166        1.47       48.23
                  lao2 bing4 tu3 xie3 |        159        1.40       49.63
                             yan3 ji2 |        154        1.36       50.99
                           yao1 huai4 |        148        1.31       52.30
                         lou4 chuang1 |        121        1.07       53.36
                            lao3 tui4 |        108        0.95       54.32
                       chen2 tu3 xie3 |        107        0.94       55.26
                   xia1 yan3 yan3 ji2 |        107        0.94       56.21
                     yang2 gao1 feng1 |        107        0.94       57.15
. tab new_CONDITION_PINYIN, sort

                 new_CONDITION_PINYIN |      Freq.     Percent        Cum.
--------------------------------------+-----------------------------------
                              chentao |      1,238       10.93       10.93
                             laozheng |        741        6.54       17.48
                         chenlaozheng |        574        5.07       22.55
                               yanxia |        462        4.08       26.62
                              chenxia |        388        3.43       30.05
                               can ji |        307        2.71       32.76
                       chentao you an |        300        2.65       35.41
                                tuxie |        272        2.40       37.81
                                xiazi |        260        2.30       40.11
                               tuique |        234        2.07       42.18
                             tui tong |        190        1.68       43.85
                           chentuique |        178        1.57       45.43
                              tuihuai |        167        1.47       46.90
                              er long |        166        1.47       48.37
                         laobingtuxie |        159        1.40       49.77
                                yanji |        154        1.36       51.13
                              yaohuai |        148        1.31       52.44
                            louchuang |        121        1.07       53.51
                             gebohuai |        113        1.00       54.50
                               laotui |        108        0.95       55.46
. generate byte lao_zheng = index(new_CONDITION_PINYIN,"laozheng") > 0

. tab lao_zheng

  lao_zheng |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  1,511,910       99.90       99.90
          1 |      1,447        0.10      100.00
------------+-----------------------------------
      Total |  1,513,357      100.00
Preceding birth interval
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
drop if MOTHER_ID == "-99" | BIRTHYEAR < 0 | (SEX == 1 & MARITAL_STATUS != 2)
* Keep one record per person
bysort PERSON_ID: keep if _n == 1
bysort MOTHER_ID (BIRTHYEAR): generate pbi = BIRTHYEAR - BIRTHYEAR[_n-1]
bysort MOTHER_ID (BIRTHYEAR): generate firstborn = _n == 1
* Basically force firstborn and twin into separate categories represented by the dummy variables
bysort MOTHER_ID (BIRTHYEAR): replace pbi = 0 if firstborn
recode pbi 15/max=15
tab pbi
keep PERSON_ID pbi firstborn
save pbi
        pbi |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     76,026       51.57       51.57
          1 |      4,385        2.97       54.54
          2 |     10,152        6.89       61.43
          3 |      9,843        6.68       68.11
          4 |      7,615        5.17       73.27
          5 |      6,238        4.23       77.50
          6 |      5,478        3.72       81.22
          7 |      4,339        2.94       84.16
          8 |      3,569        2.42       86.58
          9 |      3,138        2.13       88.71
         10 |      2,669        1.81       90.52
         11 |      2,063        1.40       91.92
         12 |      1,945        1.32       93.24
         13 |      1,456        0.99       94.23
         14 |      1,292        0.88       95.11
         15 |      7,215        4.89      100.00
------------+-----------------------------------
      Total |    147,423      100.00
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge m:1 PERSON_ID using pbi, keep(match master)
keep if SEX == 2
* Keep people first observed between ages 1 and 10 sui
bysort PERSON_ID (YEAR): keep if AGE_IN_SUI[1] > 0 & AGE_IN_SUI[1] <= 10
keep if AT_RISK_DIE == 1 & NEXT_3 == 1 & PRESENT == 1
generate short_pbi = firstborn == 0 & (pbi == 0 | pbi == 1 | pbi == 2)
generate age_group = 1+5*int((AGE_IN_SUI-1)/5)
xi: clogit NEXT_DIE i.age_group firstborn short_pbi if age_group >= 56 & age_group <= 75, group(MOTHER_ID)
. xi: clogit NEXT_DIE i.age_group firstborn short_pbi if age_group >= 56 & age_group <= 75, group(MOTHER_ID)
i.age_group       _Iage_group_1-166   (naturally coded; _Iage_group_1 omitted)
note: multiple positive outcomes within groups encountered.
note: 8860 groups (19131 obs) dropped because of all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression   Number of obs   =       9902
                                                  LR chi2(5)      =    2041.71
                                                  Prob > chi2     =     0.0000
Log likelihood = -2389.4246                       Pseudo R2       =     0.2993

------------------------------------------------------------------------------
    NEXT_DIE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iage_gro~_6 |  (omitted)
…
_Iage_gro~51 |  (omitted)
_Iage_gr~_56 |  -4.255857   .1234024   -34.49   0.000    -4.497721   -4.013993
_Iage_gro~61 |  -2.764736   .1071336   -25.81   0.000    -2.974714   -2.554758
_Iage_gr~_66 |  -1.425874   .0934471   -15.26   0.000    -1.609027   -1.242721
_Iage_gro~71 |  (omitted)
…
_Iage_gr~166 |  (omitted)
   firstborn |  -.2756034   .1105682    -2.49   0.013    -.4923131   -.0588938
   short_pbi |    .300082   .1539756     1.95   0.051    -.0017047    .6018687
------------------------------------------------------------------------------
Age at which father last seen alive
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0003\27063-0003-Data.dta"
keep if SEX == 2
bysort PERSON_ID (YEAR): keep if AGE_IN_SUI[1] <= 10 & AGE_IN_SUI[1] >= 1
drop if FATHER_ALIVE < 0
drop if AGE_IN_SUI < 0
* Sorting on FATHER_ALIVE then YEAR puts the last record with the father alive at the end
bysort PERSON_ID (FATHER_ALIVE YEAR): generate father_last_alive = AGE_IN_SUI[_N]
* Set to 0 if the father was never observed alive
bysort PERSON_ID (FATHER_ALIVE YEAR): replace father_last_alive = 0 if FATHER_ALIVE[_N] == 0
recode father_last_alive 1/5=1 6/10=6 11/15=11 16/max=16
generate ever_married = MARITAL_STATUS != 2
tab father_last_alive if SEX == 2 & AGE_IN_SUI >= 26 & AGE_IN_SUI <= 30, sum(ever_married)
tab father_last_alive if SEX == 2 & AGE_IN_SUI >= 26 & AGE_IN_SUI <= 30 & HAS_POSITION >= 0, sum(HAS_POSITION)
. tab father_last_alive if SEX == 2 & AGE_IN_SUI >= 26 & AGE_IN_SUI <= 30, sum(ever_married)

father_last |      Summary of ever_married
     _alive |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          0 |   .72305186    .4475514        3683
          1 |    .6979405   .45925601        2185
          6 |   .72417511   .44697719        4637
         11 |   .71541591   .45126712        4424
         16 |   .74787225   .43424031       35601
------------+------------------------------------
      Total |   .73888779   .43924531       50530
. tab father_last_alive if SEX == 2 & AGE_IN_SUI >= 26 & AGE_IN_SUI <= 30 & HAS_POSITION >= 0, sum(HAS_POSITION)

father_last |  Summary of Has Official Position
     _alive |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          0 |   .01466196   .12021194        3683
          1 |   .01464531   .12015586        2185
          6 |   .00841061   .09133275        4637
         11 |    .0187613   .13569627        4424
         16 |   .01949383    .1382547       35601
------------+------------------------------------
      Total |   .01785078   .13241027       50530
Another approach to identifying age at last time father was observed
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & PRESENT == 1 & AGE_IN_SUI > 0
* Keep each man's last record, then save its year keyed by FATHER_ID
bysort PERSON_ID (YEAR): keep if _n == _N
keep PERSON_ID YEAR
rename PERSON_ID FATHER_ID
rename YEAR father_last_year
save father_last_year, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if FATHER_ID != "-99"
keep if SEX == 2
merge m:1 FATHER_ID using father_last_year, keep(match master)
drop if father_last_year == .
keep if BIRTHYEAR > 0
generate age_at_father_last_year = father_last_year - BIRTHYEAR
recode age_at_father_last_year min/-11=-99 -10/0=0 1/5=1 6/10=6 11/15=11 16/max=16
tab age_at_father_last_year if HAS_POSITION >= 0 & AGE_IN_SUI >= 31 & AGE_IN_SUI <= 35, sum(HAS_POSITION)
generate ever_married = MARITAL_STATUS != 2
tab age_at_father_last_year if MARITAL_STATUS >= 1 & AGE_IN_SUI >= 31 & AGE_IN_SUI <= 35, sum(ever_married)
. tab age_at_father_last_year

age_at_fath |
er_last_yea |
          r |      Freq.     Percent        Cum.
------------+-----------------------------------
        -99 |     26,808        3.24        3.24
          0 |     37,958        4.58        7.82
          1 |     53,882        6.51       14.32
          6 |     70,491        8.51       22.83
         11 |     82,367        9.94       32.78
         16 |    556,793       67.22      100.00
------------+-----------------------------------
      Total |    828,299      100.00
. tab age_at_father_last_year if HAS_POSITION >= 0 & AGE_IN_SUI >= 31 & AGE_IN_SUI <= 35, sum(HAS_POSITION)

age_at_fath |
er_last_yea |  Summary of Has Official Position
          r |        Mean   Std. Dev.       Freq.
------------+------------------------------------
        -99 |   .01860465   .13520271         860
          0 |   .01583435   .12485973        2463
          1 |   .01697793   .12921048        2945
          6 |   .01438987   .11910299        5212
         11 |   .01950475   .13830241        5896
         16 |    .0251456   .15656902       43785
------------+------------------------------------
      Total |     .022825   .14934653       61161
. tab age_at_father_last_year if MARITAL_STATUS >= 1 & AGE_IN_SUI >= 31 & AGE_IN_SUI <= 35, sum(ever_married)

age_at_fath |
er_last_yea |      Summary of ever_married
          r |        Mean   Std. Dev.       Freq.
------------+------------------------------------
        -99 |   .80913349   .39321436         854
          0 |   .78486708   .41099859        2445
          1 |   .77785396   .41576007        2917
          6 |   .78262556   .41249944        5157
         11 |   .78500514   .41085395        5842
         16 |   .81671433   .38690501       43364
------------+------------------------------------
      Total |   .80749104   .39427379       60579
Prices around time of birth
use "C:\Users\Cameron Campbe\Documents\Baqi\prices\Annual logged low sorghum.dta"
rename YEAR BIRTHYEAR
sort BIRTHYEAR
* Sum of logged prices over the five rows centered on the birth year
generate allosorg5 = allosorg[_n-2]+allosorg[_n-1]+allosorg+allosorg[_n+1]+allosorg[_n+2]
save "Logged low sorghum prices around time of birthyear"
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge m:1 BIRTHYEAR using "C:\Users\Cameron Campbe\Documents\Baqi\prices\Logged low sorghum prices around time of birthyear", keep(match master)
generate age_group = 5*int((AGE_IN_SUI-1)/5)+1
keep if PRESENT == 1 & NEXT_3 == 1 & AT_RISK_DIE == 1 & AGE_IN_SUI >= 1
xi: logit NEXT_DIE i.age_group allosorg5 if SEX == 2 & AGE_IN_SUI >= 56 & AGE_IN_SUI <= 75
xi: logit NEXT_DIE i.age_group allosorg5 if SEX == 1 & AGE_IN_SUI >= 56 & AGE_IN_SUI <= 75
. xi: logit NEXT_DIE i.age_group allosorg5 if SEX == 1 & AGE_IN_SUI >= 56 & AGE_IN_SUI <= 75
i.age_group       _Iage_group_1-201   (naturally coded; _Iage_group_1 omitted)

Iteration 0:   log likelihood = -16182.212
Iteration 1:   log likelihood = -15874.613
Iteration 2:   log likelihood = -15864.592
Iteration 3:   log likelihood = -15864.584
Iteration 4:   log likelihood = -15864.584

Logistic regression                               Number of obs   =      41779
                                                  LR chi2(4)      =     635.26
                                                  Prob > chi2     =     0.0000
Log likelihood = -15864.584                       Pseudo R2       =     0.0196

------------------------------------------------------------------------------
    NEXT_DIE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iage_gro~_6 |  (omitted)
_Iage_gr~_51 |  (omitted)
_Iage_gr~_56 |  -.9623698   .0432435   -22.25   0.000    -1.047125   -.8776141
_Iage_gr~_61 |  -.6752047   .0438673   -15.39   0.000     -.761183   -.5892265
_Iage_gr~_66 |  -.2201522   .0438849    -5.02   0.000    -.3061651   -.1341393
_Iage_gro~71 |  (omitted)
_Iage_gr~201 |  (omitted)
   allosorg5 |  -.0234833   .0087169    -2.69   0.007    -.0405681   -.0063984
       _cons |  -1.422218   .0437954   -32.47   0.000    -1.508055    -1.33638
------------------------------------------------------------------------------
. xi: logit NEXT_DIE i.age_group allosorg5 if SEX == 2 & AGE_IN_SUI >= 56 & AGE_IN_SUI <= 75
i.age_group       _Iage_group_1-201   (naturally coded; _Iage_group_1 omitted)

Iteration 0:   log likelihood = -18151.951
Iteration 1:   log likelihood = -17845.276
Iteration 2:   log likelihood = -17836.418
Iteration 3:   log likelihood = -17836.412
Iteration 4:   log likelihood = -17836.412

Logistic regression                               Number of obs   =      43633
                                                  LR chi2(4)      =     631.08
                                                  Prob > chi2     =     0.0000
Log likelihood = -17836.412                       Pseudo R2       =     0.0174

------------------------------------------------------------------------------
    NEXT_DIE |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iage_gr~_56 |  -.9278634   .0410099   -22.63   0.000    -1.008241   -.8474854
_Iage_gr~_61 |  -.6300167    .041833   -15.06   0.000    -.7120079   -.5480255
_Iage_gr~_66 |  -.2499452   .0431567    -5.79   0.000    -.3345308   -.1653597
   allosorg5 |  -.0302716   .0082515    -3.67   0.000    -.0464442   -.0140989
       _cons |  -1.305654   .0422039   -30.94   0.000    -1.388373   -1.222936
------------------------------------------------------------------------------
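The five-year price sum built with [_n-2] through [_n+2] works on row positions, so it silently mixes the wrong years if the price series skips a year. One alternative, sketched here on invented prices with a deliberate gap at 1805, is to declare the year as a time variable and use lag and lead operators, which return missing when a year is absent. Only the variable names come from the slides; the price values are made up.

```stata
* Toy price series with a gap at 1805 (values invented)
clear
input BIRTHYEAR allosorg
1800 1.0
1801 1.2
1802 1.1
1803 1.3
1804 1.4
1806 1.5
end

* Declare the time variable so L. and F. refer to calendar years, not rows
tsset BIRTHYEAR

* Five-year centered sum; missing whenever any of the five years is absent
generate allosorg5 = L2.allosorg + L1.allosorg + allosorg + F1.allosorg + F2.allosorg

list
```

Only 1802 has all five surrounding years present, so it is the only row with a non-missing allosorg5. Note that allosorg5 is a sum rather than a mean, so its coefficient applies to the five-year total of logged prices.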