A Survivor's Guide to Survival Analysis. Dr. Beckie Hermansen Presentation given October 2007 (NV) Rocky Mountain Association for Institutional Research (RMAIR). Theoretical Base. EXCHANGE Of student’s time, efforts, knowledge for education offered by the institution. Student. Institution.
A Survivor's Guide to Survival Analysis. Dr. Beckie Hermansen. Presentation given October 2007 (NV), Rocky Mountain Association for Institutional Research (RMAIR)
Theoretical Base • EXCHANGE of student’s time, efforts, and knowledge for education offered by the institution (Student / Institution) • Explicit Contracts and Implicit Contracts • Little or no guarantee = uncertainty
[Figure: Uncertainty, Anticipatory Socialization, and Persistence and Graduation]
[Figure: two Registration-to-Graduation timelines. Socialization for students participating in an orientation: Anticipatory Socialization (Orientation) followed by Postsecondary Socialization, a process marked by lower levels of uncertainty and lowered risk of premature departure. Socialization for students not participating in an orientation: a process marked by high levels of uncertainty and increased risk of exit from the institution.]
Persistence Study Model [Figure: College Persistence linked to First Semester GPA, First Year GPA, Graduation Rates, Transfer Rates, Survival over time, and Departure over time]
Research Question (Survival over time, Departure over time) • Do orientation non-participants experience greater withdrawal at the end of the first year than participants?
Descriptive Statistics (N = 1143) • 587 Start Smart; 556 non-Start Smart • 731 (64%) female; 408 (36%) male • Average age = 19 • White = 93.7% • Average family contribution = $6,447 • Average High school GPA = 3.4 • Average ACT score = 20.65 • 50.1% declared a major at matriculation • 52.6% received a degree • 34% transferred to a higher educational institution
Withdrawal over time • Hypothesis: Withdrawal over time for orientation non-participants = withdrawal over time for orientation participants (no difference). • Survival Analysis • Dependent Variable = Time and Status. For this cohort there were 12 time intervals (semesters), excluding summer terms; status was either censored (no event) or uncensored (terminating event). • Independent Variables = Age, Gender, Ethnicity, Income Level, High School GPA, ACT Score, Start Smart Participation
Survival-Time Analysis ~ Logistic regression does not deal well with sample attrition ~ Unique characteristic of “stop-out” from college/university (Mission, marriage, maternity, money, mobility, mental health, miscellaneous). ~ Examine distributions given a time period between two events (matriculation and graduation) ~ Life-Tables, Kaplan-Meier, and Cox Regression analysis
Survival Steps • Establish the main cohort • Determine the variables/coding • Determine Censored/Uncensored • Determine time to departure • Establish the vector cohort • Run the Analysis • Interpret the results
Establish the main cohort • First-time freshman students (full-time and part-time) • Include a summer and fall start • Characteristics
Determine the Coding • Variables = 0 or 1 depending on type or range. • Gender (0 for female, 1 for male) • Ethnicity (0 for minority, 1 for non-minority) • High School GPA (0 for below the cohort average, 1 for above) • Major (0 for non-declared, 1 for declared) • Income (0 for below the cohort average, 1 for above) • ACT (0 for below the cohort average, 1 for above) • Age (0 for below the cohort average, 1 for above) • Orientation Enrollment (0 for not enrolled, 1 for enrolled)
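The coding rules on this slide can be sketched in a few lines of Python. This is only an illustration: the GPA values and gender labels are made up, the helper name `code_above_average` is hypothetical, and treating a value exactly equal to the average as 0 is an assumption (the slide only says "below" and "above").

```python
from statistics import mean


def code_above_average(values):
    """Code each value 1 if it is above the cohort average, else 0.

    Assumption: a value exactly equal to the average is coded 0.
    """
    avg = mean(values)
    return [1 if v > avg else 0 for v in values]


# Hypothetical high school GPAs for five students (not the study's data).
hs_gpa = [2.9, 3.4, 3.8, 3.1, 3.6]
hs_gpa01 = code_above_average(hs_gpa)

# Categorical variables map directly onto 0/1 (0 = female, 1 = male).
gender = ["F", "M", "F", "F", "M"]
gender01 = [{"F": 0, "M": 1}[g] for g in gender]

print(hs_gpa01, gender01)
```

The same helper would apply to income, ACT score, and age, since all four are split at the cohort average.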
Determine Censored/Uncensored • Uncensored = experienced a terminal event (1): stop-out/drop-out, transfer, or graduation. • Censored = continuing in the study/still “alive” (0). • Eventually (or by the 12th semester) all students experience a terminal event and become uncensored. • Censored and uncensored can be a bit confusing. It helps to keep in mind the main intent of the study or treatment: you are looking at departure or termination, so those students who experienced that “desired” event are UNCENSORED, compared to those students who never experienced the terminal event.
Determine Time to Departure • This is measured by term or semester, with the first semester being 1 and proceeding through the 12th semester (12). • If a student enrolled and completed one year of college (two full semesters), their time to departure was 2: they departed at the end of the 2nd semester. • If a student attended only fall semester, the time to departure was recorded as 1. • No students remained in the study until the 12th semester (2-year school vs. 4-year program). • Uncensored students (departing): time was completed first from graduation or transfer information. • Censored students (continuing): time was completed based on the last term of credit hours attempted, credit hours earned, and term GPA.
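As a rough sketch of the two steps above (status and time to departure), the following Python derives both values from a per-semester record. The student records, the outcome labels, and the function names are all hypothetical; the study itself worked from graduation, transfer, and credit-hour files whose layout the slides do not describe.

```python
def time_to_departure(credit_hours_by_semester):
    """Return the 1-based number of the last semester with attempted credit hours."""
    last = 0
    for semester, hours in enumerate(credit_hours_by_semester, start=1):
        if hours > 0:
            last = semester
    return last


def status_code(outcome):
    """Return 1 (uncensored) for a terminal event, 0 (censored) otherwise."""
    terminal_events = {"stop-out", "transfer", "graduation"}
    return 1 if outcome in terminal_events else 0


# Hypothetical students: attempted credit hours per fall/spring semester.
students = {
    "A": ([15, 14], "transfer"),            # one full year, then transferred
    "B": ([12], "stop-out"),                # attended fall semester only
    "C": ([15, 15, 16, 15], "graduation"),  # graduated after four semesters
    "D": ([13, 12, 14], "continuing"),      # no terminal event observed
}

coded = {
    sid: (time_to_departure(hours), status_code(outcome))
    for sid, (hours, outcome) in students.items()
}
print(coded)
```

Each student ends up with a (time, status) pair, which is the dependent variable the survival procedures expect.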
Establish the Vector cohort • This is a file that contains only the coded information (0 or 1) on the data you wish to analyze.
Run the analysis • Kaplan-Meier survival probabilities: • Survival and hazard rates: the proportion of the initial cohort surviving through each time period of the study, and the risk of departure for the initial cohort at each time period. • Median life statistic: the time at which 50% of the initial cohort had experienced a terminating event. • Log-rank statistic: a chi-square value that tests the similarity of the survival curves (significant or non-significant). If the log-rank statistic detects a difference between the two curves, the null hypothesis that the two curves are similar is rejected.
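To make the Kaplan-Meier quantities concrete, here is a minimal pure-Python sketch of the standard product-limit estimator. It is not the software used in the study, and the ten (time, status) pairs are invented for illustration. The median is taken as the first time at which estimated survival falls to 0.5 or below, which is a common convention assumed here.

```python
def kaplan_meier(times, events):
    """Return rows (time, at_risk, departures, hazard, survival).

    times  : semester of departure or last observation for each student
    events : 1 = uncensored (terminal event), 0 = censored
    """
    rows = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        departures = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if departures > 0:
            hazard = departures / at_risk   # risk of departure in this period
            survival *= 1 - hazard          # product-limit update
            rows.append((t, at_risk, departures, hazard, survival))
    return rows


def median_survival(rows):
    """First time at which estimated survival is 0.5 or lower, else None."""
    for t, _, _, _, survival in rows:
        if survival <= 0.5:
            return t
    return None


# Hypothetical cohort of ten students (not the study's data).
times = [1, 2, 2, 3, 4, 4, 4, 5, 6, 6]
events = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]

table = kaplan_meier(times, events)
for row in table:
    print(row)
print("median survival:", median_survival(table))
```

Censored students stay in the at-risk count up to their last observed semester but never count as departures, which is the feature that lets survival analysis cope with sample attrition.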
Table notes: (a) Continuing students are censored students who did not have a terminal event (i.e., transfer or graduation). (b) Terminal events are marked by students who were uncensored by transfer or graduation. (c) Median survival time = 4.0 semesters (enrolled); 4.0 semesters (not enrolled).
Log-Rank Statistic: For this cohort the log-rank value was .628 (p = .428). It is read the same way as an F-value: with a p-value this large, there is no significantly detectable difference between the Orientation and non-Orientation curves. Note: It is difficult for the log-rank test to detect a difference when the survival curves cross.
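The log-rank value and its p-value can be reproduced with a short sketch. The function below is the textbook two-group log-rank test, written from scratch as an illustration rather than taken from the study's software, and the small data set at the end is hypothetical. The p-value helper assumes a chi-square distribution with 1 degree of freedom (two groups), for which the upper tail equals erfc(sqrt(x / 2)).

```python
import math


def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a + times_b, events_a + events_b) if e == 1}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0   # no information to separate the curves
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, chi2_1df_pvalue(chi2)


# The reported log-rank value of .628 corresponds to p of about .428.
print(round(chi2_1df_pvalue(0.628), 3))

# Hypothetical extreme case: group A departs at semester 1, group B at semester 5.
chi2, p = logrank([1, 1, 1], [1, 1, 1], [5, 5, 5], [1, 1, 1])
print(chi2, p)
```

Running the helper on the reported value recovers the reported p-value, which is consistent with the statistic being a 1-degree-of-freedom chi-square.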
Run the analysis • Cox-Regression: • Performs the same functions as Kaplan-Meier but also allows for the presence of other predictor variables (covariates). • It handles all censored cases and provides a regression coefficient (β) for each of the covariates in the study. • Cox-Regression also allows for the analysis of multiple covariates acting together (interaction terms) by producing a coefficient for each combination.
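For reference, the standard textbook form of the Cox proportional hazards model is shown below. The slides do not display the equation, so the notation is an assumption: $h_0(t)$ is the baseline hazard, $x_1, \dots, x_p$ are the 0/1 covariates, and $\beta_1, \dots, \beta_p$ are the coefficients the procedure estimates.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_k = \exp(\beta_k)
```

Here $\mathrm{HR}_k$ is the hazard ratio for the group coded 1 on covariate $k$ relative to the group coded 0, with the other covariates held fixed.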
Ho: Non-participant withdrawal = participant withdrawal • Cox Regression with Variables (table: Variables in the Equation) • Reference groups for the Cox-Regression analysis adhere to those parts of the sample group coded “1”. For example, in this study, significance was found for GENDER01. Since males were coded as “1”, they form the reference group. The β of -.393 indicates a negative impact or an inverse relationship: males were less likely to persist through college compared to females. The same applies to the INCOME (TINC01) category, with above-average income being coded as 1. Lower-income students were more likely to persist.
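A coefficient is often easier to read once it is converted to a hazard ratio, exp(β). The sketch below does only that arithmetic for the β of -.393 quoted above; the function name is hypothetical. Which direction counts as "persisting" depends on how the software codes the event and the reference category, so treat the sign convention as an assumption to check against your own output rather than something the slides confirm.

```python
import math


def hazard_ratio(beta):
    """Convert a Cox regression coefficient into a hazard ratio, exp(beta)."""
    return math.exp(beta)


# Coefficient quoted on the slide for GENDER01 (group coded 1 vs. group coded 0).
beta_gender = -0.393
hr_gender = hazard_ratio(beta_gender)

# Size of the difference in hazard between the two groups, as a percentage.
percent_difference = abs(1 - hr_gender) * 100

print(round(hr_gender, 3), round(percent_difference, 1))
```

A hazard ratio near 0.675 means the two groups' hazards of the terminal event differ by roughly a third, which gives a sense of effect size that the raw β does not.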
Ho: Non-participant withdrawal = participant withdrawal • Cox Regression with Interaction Terms (table: Variables in the Equation) • The ability to combine covariates is a distinctive feature of Cox-Regression analysis. Here the combination of Orientation with Gender, as well as Orientation with Income, proved not to be significant. This suggests that the Orientation experience does not have much effect on persistence over time compared to Gender (alone) and Income (alone).
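One way to write an interaction term, using the standard Cox form, is sketched below. The symbols are assumptions for illustration: $O$ is the 0/1 Orientation code, $G$ is the 0/1 Gender code, and $\beta_{OG}$ is the coefficient on their product.

```latex
h(t \mid O, G) = h_0(t)\,\exp\!\left(\beta_O\,O + \beta_G\,G + \beta_{OG}\,O \cdot G\right)
```

Under this form the hazard ratio for Orientation is $\exp(\beta_O)$ when $G = 0$ and $\exp(\beta_O + \beta_{OG})$ when $G = 1$, so a non-significant $\beta_{OG}$ means there is no detectable evidence that the Orientation effect differs between the two gender groups.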
Interpret the Results • Kaplan-Meier survival probabilities (see handout) • Mean life statistic: Start Smart = 4.4 semesters; non-Start Smart = 4.2 semesters • Hazard probabilities (see handout) • Log-rank statistic: log-rank value = .628 (p = .428), not significant. It is difficult for the log-rank test to find a difference when survival curve lines cross, as was the case in this study. In the absence of a significant log-rank statistic, reliance on graphical representation of survival curves and associated survival probabilities is paramount.
Interpret the Results • With betas of -.346 for gender and -.393 for income (p = .000), persistence significance was found for female students with lower than average family income contributions. • No significance was found for high school GPA, gender, or Start Smart participation, even with interaction terms. • In fact, high school GPA did not have a significant influence on persistence beyond the first year of college. • This confirms the Kaplan-Meier findings (similar curves). • Start Smart was not a factor in long-term student persistence: participants and non-participants experienced equal or close to equal termination and persistence rates over time.
Interpret the Results • Ho: There is no difference in persistence over time between non-orientation and orientation participants; the program has no effect (fail to reject Ho). • Since the Kaplan-Meier survival curves and the log-rank statistic show no detectable difference between the two groups, AND given the regression coefficients from the Cox-Regression analysis, we fail to reject the null hypothesis. • Orientation participation did not have an effect on persistence over time.
Interpret the Results • Correlation of Graduation Rate and Group (r = .185, α = .01) • Start Smart students graduated at a rate of almost 2 to 1 (1.7:1) compared to non-Start Smart students at the 4th semester.
Repetitive Findings (N = 4,536) • Survival Analysis • Non-significant log-rank values (survival curves are similar) • 2003 females = higher persistence (more in the study) • 2001 was significant (log-rank = 16.007, α = .001). Median survival for Start Smart = 4 semesters; non-Start Smart = 3 semesters. • Gender was the greatest predictor of persistence (females). • Graduation Rates • Same observed pattern • Start Smart average = 263 compared to 164 (1.7:1) • Combined correlation 2.4% (r² = .0243) toward Start Smart and graduation
Conclusions • No significant relationship existed between Start Smart participation and long-term survival or persistence. However, in terms of student success, orientation students did graduate at a greater rate than non-orientation students.
Recommendations • Survival analysis applied to other intervention/retention programs • Remedial education programs • Upward Bound-type programs • Early college programs • Sports/Intramurals/Student Leadership • Interactions between predictors and time: • Look at each predictor over time • Determine transient or permanent effects • Re-fine the time variable (Time < 5 semesters, graduation effect) • Missionary effect: • Allow for re-entry either with original cohort or existing cohort • Allow for part-time student analysis/study • Survival analysis in terms of student decision-making • Variables affecting decisions to withdraw or persist over time
So Why All the Trouble? • Survival analysis provides a picture of persistence over time (logistic regression is limited in this respect). • Survival analysis run alongside other analyses provides greater insight into student departure characteristics (future study about decision-making). • The survival and hazard probabilities from survival analysis are great for cohort tracking, graduation rates, and transfer rates. • Survival analysis is a model that is easily duplicated (switch Orientation to Intramurals or Remedial Math, etc.). • Survival analysis is do-able!