Sequential Logistic Regression: Modeling Risk Factors and Child Outcomes Presented to NIC Chapter of ASA October 21, 2005
Logistic Regression Model • Statistical method for relating explanatory variable(s) to the log odds of a binary outcome measure. • Dependent variable is always a binary outcome. • Independent variables may be categorical or quantitative.
Logistic Regression Model • Log of the odds: $\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1$ • $p$ is the probability associated with the binary outcome measure. • $e^{\beta_1}$ is the odds ratio for independent variable $x_1$. • The odds ratio $e^{\beta_1}$ is the factor by which the odds are multiplied for a one-unit increase in $x_1$.
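The relationship between probability, log odds, and the odds ratio can be sketched in a few lines of Python. This is only an illustration; the function names `logit`, `inv_logit`, and `odds_ratio` are chosen here and do not come from the presentation.

```python
import math

def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Probability that corresponds to a log-odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds_ratio(beta1):
    """Multiplicative change in the odds for a one-unit increase in x1."""
    return math.exp(beta1)
```

For any intercept and slope, the odds at `x1 + 1` divided by the odds at `x1` equal `odds_ratio(beta1)`, which is why the slope is usually reported on the exponentiated scale.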
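Statistical Inference for Logistic Regression • The confidence interval for the slope $\beta_1$ is $b_1 \pm z^* SE_{b_1}$. • The confidence interval for the odds ratio $e^{\beta_1}$ is $\left(e^{b_1 - z^* SE_{b_1}},\ e^{b_1 + z^* SE_{b_1}}\right)$. • Here $b_1$ is the estimate of $\beta_1$, $SE_{b_1}$ is its standard error, and $z^*$ is the critical value from the standard normal distribution for the chosen confidence level.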
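Statistical Inference for Logistic Regression • To test the hypothesis $H_0: \beta_1 = 0$ we compute the test statistic $X^2 = \left(\frac{b_1}{SE_{b_1}}\right)^2$, where $b_1$ is the estimated slope and $SE_{b_1}$ is its standard error. • Under $H_0$ this statistic has approximately a chi-square distribution with 1 df.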
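A minimal sketch of the usual Wald-type interval and test for a single slope follows. The estimate and standard error in the usage note are hypothetical, and the function name `wald_inference` is an assumption made for illustration.

```python
import math

def wald_inference(b1, se_b1, z_star=1.96):
    """Wald interval and test for one logistic regression slope.

    b1 and se_b1 are the estimated slope and its standard error;
    z_star=1.96 corresponds to a 95% confidence level.
    """
    lower = b1 - z_star * se_b1
    upper = b1 + z_star * se_b1
    x2 = (b1 / se_b1) ** 2
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(x2 / 2.0))
    return {
        "slope_ci": (lower, upper),
        "odds_ratio_ci": (math.exp(lower), math.exp(upper)),
        "chi_square": x2,
        "p_value": p_value,
    }
```

With the hypothetical inputs `wald_inference(0.5, 0.2)`, the chi-square statistic is 6.25 and the interval for the odds ratio is obtained by exponentiating the endpoints of the interval for the slope.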
Logistic Regression with One Predictor • Assume that in a large sample of college students, the proportion who frequently engage in binge drinking is $\hat{p} = 3{,}314/17{,}096 = 0.1938$. • The odds for this outcome are thus $\hat{p}/(1-\hat{p}) = 0.1938/0.8062 \approx 0.24$. • This example is borrowed from Introduction to the Practice of Statistics by Moore and McCabe (2006).
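The arithmetic on this slide can be reproduced directly from the two counts given. The variable names are illustrative only.

```python
# Counts quoted on the slide: frequent binge drinkers and total sample size.
binge_drinkers = 3314
sample_size = 17096

p_hat = binge_drinkers / sample_size   # sample proportion
odds = p_hat / (1 - p_hat)             # odds of the outcome

print(f"proportion = {p_hat:.4f}, odds = {odds:.2f}")
```

Note that the odds equal the number with the outcome divided by the number without it, here 3,314 / 13,782.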
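Is Gender a Predictor? • Odds for males: about 0.29; log odds: about -1.23. • Odds for females: about 0.20; log odds: about -1.59. • These values are the ones implied by the fitted model log(ODDS) = -1.59 + 0.36x, with x = 1 for males and x = 0 for females.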
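Interpreting the LogReg Model • The model for this example is $\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1$, where $x_1 = 1$ for males and $x_1 = 0$ for females. • For females ($x_1 = 0$) we have $\log\left(\frac{p_{female}}{1-p_{female}}\right) = \beta_0$. • Thus the estimate of the intercept $\beta_0$ is the log odds for females.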
Interpreting the LogReg Model • The estimate of the slope is the difference between the log odds for males and the log odds for females: $b_1 = -1.23 - (-1.59) = 0.36$. • The fitted model is $\log(\text{ODDS}) = -1.59 + 0.36x$.
Meaning of the Odds Ratio • The odds ratio is $e^{b_1} = e^{0.36} = 1.43$. • Interpretation: the odds of being a frequent binge drinker for males are 1.43 times the odds for females.
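As a quick check, the fitted model log(ODDS) = -1.59 + 0.36x can be evaluated for both groups. The coding x = 1 for males and x = 0 for females follows the example; the function name is an assumption.

```python
import math

B0, B1 = -1.59, 0.36  # fitted intercept and slope from the example

def predicted_probability(x):
    """Estimated probability of frequent binge drinking for x (1 = male, 0 = female)."""
    log_odds = B0 + B1 * x
    return 1.0 / (1.0 + math.exp(-log_odds))

odds_ratio = math.exp(B1)
p_female = predicted_probability(0)
p_male = predicted_probability(1)

print(f"odds ratio = {odds_ratio:.2f}")
print(f"P(female) = {p_female:.3f}, P(male) = {p_male:.3f}")
```

The odds ratio of about 1.43 applies to odds, not probabilities: the ratio of the two predicted probabilities is smaller than 1.43.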
Multivariate Logistic Regression • The multivariate case has the same statistical concepts but the computations are more difficult because of the potential correlation among multiple predictors. • It is easy to conduct the analysis using a statistical software package.
Overview of Study • Children grow up within the context of personality, family, neighborhood, and society. • They grow up with both disadvantages and opportunities, problems and strengths, referred to here as risk and protective factors. • Examples of commonly understood risk factors include low birth weight, child maltreatment, illness, neighborhood violence. • Examples of commonly understood protective factors include individual verbal communication skills, the capacity for empathy, problem solving skills, frustration tolerance, the presence of multiple and consistent caregivers, access to health care and social services, and the concrete, social, and affective support of family and friends. • The aim of this study was to empirically measure risk and protective factors at the individual, family, and neighborhood level and to relate them to poor short- and longer-term outcomes such as health problems, behavioral and cognitive development, and maltreatment.
Methods -- Subjects • The 219 mother-infant dyads recruited for this study were part of a larger cohort recruited in waves over four years, beginning in 1990, as part of the Capella Project, a twenty-year longitudinal study funded by NIH. • Data used in the current analysis were collected over a period of approximately 4-5 years. • Infants in the study were all under 18 months of age when they entered the study.
Methods -- Instruments • Extensive information was collected during the primary maternal interview. • The main tools were the interview and self-report inventories. • Combination of study-developed and standardized instruments. • Maternal Information • Use of alcohol and drugs. • Physical and psychological health. • Personal history of physical, sexual and emotional abuse. • Family functioning and daily life stressors. • Neighborhood conditions. • Child Information • Behavior. • Health, accidents, hospitalizations. • Cognitive and emotional development. • Child maltreatment • Abuse or neglect in the child’s first year of life, obtained from an annual review of hotline records of reports, and supplemented by case record review
Caregiver Intra-Personal Functioning • CAGE—4 item rapid alcoholism screening scale. Subjects were classified as having a possible alcohol problem if they endorsed 2 or more items. • Center for Epidemiologic Studies Depression Scale—20-item scale to measure depressive symptoms. Clinical cut-off score of 16 used here. • Health Opinion Survey—20 item scale to assess neurotic or psychosomatic symptoms. Higher scores indicate more symptoms. A binary measure was computed using a median split, to reflect above-average psychosomatic symptoms. • Service Utilization – report of a psychiatric or substance use hospitalization.
Caregiver Inter-Personal Functioning • Family and Neighborhood—The family APGAR is a 5-item inventory of family function and satisfaction. The Neighborhood Satisfaction Index is a 9-item inventory of neighborhood characteristics. • Domestic Violence was defined by self-report in conjunction with questions regarding childhood physical, sexual and emotional abuse, and was further confirmed as current by interviewer in the site-specific Trauma and Violence scale. • Lifetime Stressors – An inventory of common stressors such as marriage, divorce, death in the family, moving, experiencing violence, etc.
Child Short-Term Outcomes • Child Health Status— items to assess general health, specific conditions applying to child and other illness or problems. • Service Utilization Measures—to assess accidents and hospitalizations of the child. • Child Abuse Neglect Tracking System—abuse or neglect in the child’s first year of life, obtained from an annual review of hotline records of reports, and supplemented by case record review. • Battelle Developmental Inventory Screening Test—96 items (out of 341 in complete battery) to assess five domains: personal-social skills, adaptive behavior, psychomotor ability, communication and cognitive. Child considered to have delayed development if (standardized) Battelle total score more than 1 standard deviation from the mean.
Child Long-Term Outcomes • Child Health – items assessing general health, specific conditions applying to child and other illness or problems through caregiver report. • Child Behavior Checklist – 5 scale scores assessing a child’s behavioral and social development. • PRESS – A measure of intelligence for pre-school children.
Hypotheses • The theoretical model guiding the analyses posited a sequential model of the effect of certain risk factors on child developmental outcomes. • These risk factors were: • Maternal history of loss and/or victimization. • Maternal compromised emotional status. • Domestic violence. • Family and/or neighborhood problems.
Hypotheses • Maternal history of loss/victimization would be associated with maternal compromised emotional status. • Maternal compromised emotional status would be associated with problems in the family and neighborhood and/or domestic violence. • Problems in the family and neighborhood and domestic violence would be associated with poor short-term child outcomes. • Poor short-term outcomes would be associated with poor longer-term child outcomes.
Visual Model of the Hypotheses • [Path diagram] Boxes and their indicators: • Maternal History (Victim of Child Abuse; Lost a Parent) • Compromised Emotional Status (CageA/CES-D; Health Opinion Survey; Residential Treatment) • Family & Neighborhood (FAPGAR; Life Experiences; Neighborhood Short Form) • Domestic Violence • Short-Term Outcomes (Child Abuse or Neglect; AOD/Battelle; Child Health) • Long-Term Outcomes (Press/CBCL; Battelle; Child Health)
Measures used in Analyses • Maternal loss/victimization history was coded yes (1) if the mother reported either a personal history of abuse or losing a parent before the age of 18, and no (0) otherwise. • Maternal compromised emotional status was coded yes (1) if the mother had any of the following: • Score of 2 or higher on a 4-item rapid alcoholism screening inventory (CAGE). • Score above cutoff of 16 on the depression inventory (CESD). • Score on inventory of psychosomatic symptoms above the median. • Report of a substance or psychiatric hospitalization.
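The "any of the following" rule for maternal compromised emotional status could be coded roughly as below. This is a sketch, not the study's actual scoring code: the argument names are hypothetical, and the strict "above 16" comparison for the depression score follows the wording on this slide.

```python
def compromised_emotional_status(cage_items_endorsed, cesd_score,
                                 hos_score, hos_median, hospitalized):
    """Return 1 if any of the four indicators is present, else 0."""
    indicators = [
        cage_items_endorsed >= 2,   # CAGE: 2 or more of 4 items endorsed
        cesd_score > 16,            # assumption: strictly above the cutoff of 16
        hos_score > hos_median,     # psychosomatic symptoms above the sample median
        bool(hospitalized),         # substance or psychiatric hospitalization
    ]
    return int(any(indicators))
```

For example, `compromised_emotional_status(0, 20, 5, 10, False)` returns 1 because the depression score alone exceeds the cutoff.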
Measures used in Analyses • Problems in the family or neighborhood was coded yes (1) if the mother scored above the median on two or more of the following inventories: • Family function and satisfaction. • Neighborhood characteristics. • Lifetime stressors. • Domestic violence coded yes (1) if the mother reported domestic violence.
Measures used in Analyses • Poor short-term (1-2 Year) child outcomes was coded yes (1) if the child had any two of the following: • Health Problem(s), accident or hospitalization. • Delayed Development (BATTELLE). • Presence of Alcohol or Drugs at birth. • OR there was a report of abuse or neglect.
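The short-term outcome rule combines a count with an override: at least two of three indicators, or any report of abuse or neglect. A hedged sketch, with hypothetical argument names and "any two" read as "two or more":

```python
def poor_short_term_outcome(health_problem, delayed_development,
                            aod_at_birth, abuse_or_neglect_report):
    """Return 1 for a poor short-term (1-2 year) outcome, else 0."""
    # Count how many of the three listed indicators are present.
    count = sum(bool(flag) for flag in
                (health_problem, delayed_development, aod_at_birth))
    # A report of abuse or neglect qualifies on its own.
    return int(count >= 2 or bool(abuse_or_neglect_report))
```

A child with only a health problem is coded 0, while a child with no listed indicators but an abuse or neglect report is coded 1.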
Measures used in Analyses • Poor long-term (3-4 Year) child outcomes was coded yes (1) if the child had any two of the following: • Health Problem(s), accident or hospitalization. • Delayed Development (PRESS). • Behavioral Problems (CBCL)
Logistic Regression #1 • Maternal loss/victimization history was entered as a single predictor for maternal compromised emotional status. • This analysis was statistically significant (Chi-Square = 13.94, p < .001), and resulted in correct classification of 47% of cases without impaired caregiver status, 77% of cases with caregiver status problems, and 68% of cases overall. • The odds ratio for the predictor (maternal victimization history) was 3.1, with a 95% CI of 1.7 to 5.6.
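Because a Wald interval for an odds ratio is symmetric on the log scale, the slope and an approximate standard error can be backed out of the reported odds ratio and 95% CI. The sketch below assumes the interval was computed as exp(b ± 1.96·SE); the reported values are rounded, so the results are approximate.

```python
import math

# Reported for Logistic Regression #1.
odds_ratio, ci_lower, ci_upper = 3.1, 1.7, 5.6

b = math.log(odds_ratio)                                      # slope on the log-odds scale
se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)   # assumed Wald interval
wald_z = b / se

# On the log scale the estimate should sit near the midpoint of the interval.
midpoint_or = math.exp((math.log(ci_lower) + math.log(ci_upper)) / 2)

print(f"b = {b:.3f}, SE = {se:.3f}, z = {wald_z:.2f}, midpoint OR = {midpoint_or:.2f}")
```

The midpoint check is a quick way to spot a reported interval that does not match its odds ratio.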
The Model so Far • [Path diagram with the same boxes as the Visual Model of the Hypotheses; odds ratio shown on the supported path] • Maternal History → Compromised Emotional Status: 3.1
Logistic Regression #2 • Maternal loss/victimization history and maternal compromised emotional status were entered together as predictors for family/neighborhood problems. • This analysis was also statistically significant (Chi-Square = 16.17, p < .001), and resulted in correct classification of 60% of cases without family/neighborhood problems, 65% of cases with family/neighborhood problems, and 63% of cases overall. • The odds ratio for maternal compromised emotional status as a predictor (of family/neighborhood problems) was 2.5, with a 95% CI of 1.4 to 4.6. • The odds ratio for maternal victimization history was not statistically significant.
The Model so Far • [Path diagram with the same boxes as the Visual Model of the Hypotheses; odds ratios shown on the supported paths] • Maternal History → Compromised Emotional Status: 3.1 • Compromised Emotional Status → Family & Neighborhood: 2.5
Logistic Regression #3 • Maternal loss/victimization history, caregiver status, and family/neighborhood problems were entered in one step to predict presence of domestic violence in the home. • This regression was statistically significant (Chi-Square = 16.36, p < .001), and resulted in correct classification of 71% of cases without domestic violence in the home, 51% of cases with domestic violence in the home, and 62% of cases overall. • The odds ratio for maternal compromised emotional status as a predictor (of domestic violence) was 2.1, with a 95% CI of 1.4 to 4.6. • The odds ratio for family/neighborhood problems as a predictor (of domestic violence) was 1.8, with a 95% CI of >1.0 to 3.2. • The odds ratio for maternal victimization history was not statistically significant.
The Model so Far • [Path diagram with the same boxes as the Visual Model of the Hypotheses; odds ratios shown on the supported paths] • Maternal History → Compromised Emotional Status: 3.1 • Compromised Emotional Status → Family & Neighborhood: 2.5 • Compromised Emotional Status → Domestic Violence: 2.1 • Family & Neighborhood → Domestic Violence: 1.8
Logistic Regression #4 • Maternal loss/victimization history, caregiver status, family/neighborhood problems, and domestic violence were entered in one step to predict presence of poor short-term child outcomes. • The overall regression was not statistically significant (Chi-Square = 8.98, p = .062), and classification was less effective. Under this model, all cases were classified into the poor short-term child outcome group, correctly classifying only those subjects who did in fact have poor short-term child outcomes (66%), and misclassifying all the rest. • The odds ratio for domestic violence as a predictor (of poor short-term child outcomes) was 2.1, with a 95% CI of 1.2 to 3.9. This was statistically significant. • The odds ratios for the other predictors were not statistically significant.
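The p-value for the overall model chi-square can be checked by hand. The sketch below assumes 4 degrees of freedom (one per binary predictor, which the slide does not state) and uses the closed-form upper tail of the chi-square distribution for even df; the function name is chosen here for illustration.

```python
import math

def chi_square_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p_value = chi_square_upper_tail_even_df(8.98, 4)  # df = 4 is an assumption
print(f"p = {p_value:.4f}")
```

Under that assumption the result is about .062, consistent with the value reported on the slide and just above the conventional .05 threshold.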
The Model so Far • [Path diagram with the same boxes as the Visual Model of the Hypotheses; odds ratios shown on the supported paths] • Maternal History → Compromised Emotional Status: 3.1 • Compromised Emotional Status → Family & Neighborhood: 2.5 • Compromised Emotional Status → Domestic Violence: 2.1 • Family & Neighborhood → Domestic Violence: 1.8 • Domestic Violence → Short-Term Outcomes: 2.1
Logistic Regression #5 • Maternal loss/victimization history, caregiver status, family/neighborhood problems, domestic violence, and poor short-term child outcomes were entered in one step to predict presence of poor longer-term child outcomes. • The overall regression was statistically significant (Chi-Square = 16.67, p < .005), and resulted in correct classification of 39% of cases without poor long-term child outcomes, 85% of cases having poor long-term child outcomes, and 68% of cases overall. • The odds ratio for family/neighborhood problems as a predictor (of poor long-term child outcomes) was 2.6, with a 95% CI of 1.1 to 6.1. • The odds ratio for poor short-term outcomes as a predictor (of poor long-term child outcomes) was 3.2, with a 95% CI of 1.4 to 7.6. • The odds ratios for the other predictors were not statistically significant.
The Final Model • [Path diagram with the same boxes as the Visual Model of the Hypotheses, with the domestic violence box labeled Victim of Domestic Violence; odds ratios shown on the supported paths] • Maternal History → Compromised Emotional Status: 3.1 • Compromised Emotional Status → Family & Neighborhood: 2.5 • Compromised Emotional Status → Domestic Violence: 2.1 • Family & Neighborhood → Domestic Violence: 1.8 • Domestic Violence → Short-Term Outcomes: 2.1 • Family & Neighborhood → Long-Term Outcomes: 2.6 • Short-Term Outcomes → Long-Term Outcomes: 3.2
Goodness of Fit • -2LL (LL = log likelihood) is 0 if the model fits perfectly. • The chi-square statistic tests the change in -2LL from the constant-only model to the model with the set of predictors.
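To make the -2LL idea concrete, the sketch below computes -2LL for a set of predicted probabilities and the model chi-square as the drop in -2LL relative to a constant-only model. The toy data in the usage lines are hypothetical, and the function names are assumptions.

```python
import math

def minus_2_log_likelihood(outcomes, probabilities):
    """-2 times the Bernoulli log likelihood of 0/1 outcomes."""
    log_lik = sum(math.log(p) if y == 1 else math.log(1.0 - p)
                  for y, p in zip(outcomes, probabilities))
    return -2.0 * log_lik

def model_chi_square(outcomes, model_probabilities):
    """Change in -2LL from the constant-only model to the fitted model."""
    base_rate = sum(outcomes) / len(outcomes)
    null_fit = minus_2_log_likelihood(outcomes, [base_rate] * len(outcomes))
    return null_fit - minus_2_log_likelihood(outcomes, model_probabilities)

# Hypothetical toy data: four cases and their predicted probabilities.
y = [0, 0, 1, 1]
p = [0.2, 0.3, 0.7, 0.8]
print(minus_2_log_likelihood(y, p), model_chi_square(y, p))
```

The constant-only model predicts the overall event rate for every case, so its -2LL depends only on how many cases had the event.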
Goodness of Fit • Quantification of the proportion of explained variance. • Cox & Snell $R^2$ and Nagelkerke $R^2$. • These are similar in intent to $R^2$ in multiple linear regression. • For the current model, about 19.5%.
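Both pseudo-R² measures can be written in terms of the model chi-square, the constant-only -2LL, and the sample size. The sketch below uses the standard definitions; the input numbers in the usage lines are hypothetical and are not the study's values.

```python
import math

def cox_snell_r2(model_chi_square, n):
    """Cox & Snell R^2 = 1 - exp(-chi_square / n)."""
    return 1.0 - math.exp(-model_chi_square / n)

def nagelkerke_r2(model_chi_square, null_minus_2ll, n):
    """Cox & Snell R^2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(-null_minus_2ll / n)
    return cox_snell_r2(model_chi_square, n) / max_r2

# Hypothetical inputs: chi-square of 20, constant-only -2LL of 130, n = 100.
print(cox_snell_r2(20.0, 100), nagelkerke_r2(20.0, 130.0, 100))
```

Cox & Snell R² cannot reach 1 for a binary outcome, which is why the Nagelkerke version divides by the maximum and is always the larger of the two.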
Discrimination and Calibration • Model Discrimination • Ability of the model to discriminate observations in the two groups. • Model Calibration • How closely the observed and predicted probabilities match.
Model Discrimination • SPSS provides a classification table. • Shown earlier. • SPSS also provides a histogram of estimated probabilities. • Positive cases should be on the right and negative cases on the left.
Model Discrimination • Discrimination was not very good. • One serious problem is that the sample itself was heavily skewed toward poor outcomes because of poverty, etc.
Calibration • Hosmer-Lemeshow goodness-of-fit • Cases are divided into deciles based on estimated probabilities. • Observed numbers are compared with expected numbers (contingency table). • The null hypothesis is that there is no difference between the observed and predicted values. • This statistic should be interpreted cautiously because its value depends on the number of groups.
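A rough sketch of the Hosmer-Lemeshow statistic is given below, assuming equal-sized groups formed by sorting on the estimated probability. It is not SPSS's implementation, which may handle ties and group boundaries differently. The statistic is usually compared with a chi-square distribution on (number of groups - 2) df.

```python
def hosmer_lemeshow(outcomes, probabilities, groups=10):
    """Hosmer-Lemeshow chi-square statistic for 0/1 outcomes."""
    # Sort cases by estimated probability only (stable for ties).
    pairs = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        mean_p = expected / size
        # Assumes 0 < mean_p < 1 in every group; otherwise this divides by zero.
        statistic += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return statistic
```

Changing `groups` changes which cases are pooled together, which is one concrete reason the statistic depends on the number of groups.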
Hosmer and Lemeshow for Final Model • The null hypothesis is not rejected, suggesting the model is OK.
The c-Statistic • Interpreted as the proportion of pairs of cases with different observed outcomes in which the model assigns a higher probability to the case with the event than to the case without the event. • Ranges in value from 0.5 (no better than chance) to 1.0, where 1.0 means the model always assigns higher probability to cases with the event than to those without the event. • To get this in SPSS, first save the predicted probabilities along with the actual outcome measure into a new file, then group them into a reasonably large number of distinct categories using an equation like this: • probcat = trunc(prob_1/.00005) • Next, cross-tabulate probcat with the outcome measure and calculate Somers’ d.
c-Statistic • The c-statistic is interpreted as the % of possible pairs of cases in which one is positive on the outcome and the other is negative, that the logistic model assigns a higher probability to the positive case.
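Outside SPSS, the c-statistic can be computed directly from its definition by comparing every positive case with every negative case. The sketch below counts tied pairs as one half, which is a common convention and an assumption here. When Somers' d is computed over pairs with different outcomes, the two are related by c = (d + 1) / 2.

```python
def c_statistic(outcomes, probabilities):
    """Share of positive/negative pairs ranked correctly (ties count 1/2)."""
    positives = [p for y, p in zip(outcomes, probabilities) if y == 1]
    negatives = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case in each outcome group")
    concordant = sum(1 for a in positives for b in negatives if a > b)
    tied = sum(1 for a in positives for b in negatives if a == b)
    return (concordant + 0.5 * tied) / (len(positives) * len(negatives))

# Hypothetical toy data: three of the four positive/negative pairs are concordant.
print(c_statistic([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of cases, which is fine for a sample of a few hundred dyads but would need a rank-based formula for large data sets.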