Applying Bayesian Belief Networks to the Examination of Student Outcomes
Xiaohong Li, Graduate Research Asst.
Rita Caso, Director
Sam Houston State University, Office of Institutional Research & Assessment
Outline
• Purpose of the Study
• Why Study Freshman Outcomes?
• Why Bayesian Networks?
• Method
• Example Inferences
• Conclusions
TX Association of Institutional Research (TAIR) 2008 Conference, 2/5-7/08
Purpose
• Apply Bayesian Belief Network (BBN) techniques to examine student outcomes and identify families of factors associated with students’ college success at Sam Houston State University (SHSU)
• Identify the factors that affect retention and graduation for First Time Freshmen (FTF)
• Retention and graduation rates are key performance indicators
• Provide management information by analyzing and interpreting these data for use in planning and policy decisions
Why Study Freshman Outcomes?
• To determine if we are providing the best environment & experiences to promote success for our diverse freshman population
• To make tailored improvements in the learning environment and the learning experiences we offer, in order to maximize successful outcomes for all students across preparation backgrounds, needs, learning styles and life-styles
• To satisfy external accountability requirements
Why Study Freshman Outcomes?
• University stakeholders who need detailed insights into the conditions and combinations of factors that influence new student success:
  • Enrollment Management
  • Enrichment and Support Programs
  • Student Services
  • Academic Departments
Why Bayesian Networks?
• Graphical model with an associated set of probability tables
• Make it easy to learn about causal relationships
• Help us better understand the problem domain and predict consequences
• Support flexible and robust recommendation strategies
About Bayesian Networks
• Definitions of basic terms:
  • Independence
    • Event A does not affect the probability of event B occurring: P(A, B) = P(A) * P(B)
  • Conditional probability
    • The probability of event C occurring, given that event A has already occurred: P(C|A)
  • Conditional independence
    • E is independent of A and B, given D
    • E and F are conditionally independent of each other, given D
  • Causal theory
    • A or B can cause D to occur
  • Node: a variable
  • Leaf node: no other node depends on it (E, F)
  • Root node: does not depend on any other node (A, B)
[Figure: example network with nodes A, B, D, E, F]
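The definitions of independence and conditional probability above can be checked numerically on a small joint table. The sketch below is only an illustration: the two joint distributions and the helper names (`marginal`, `is_independent`, `conditional_b_given_a`) are hypothetical and do not come from the presentation.

```python
from itertools import product

# Hypothetical joint distribution built so that A and B are independent:
# P(A, B) = P(A) * P(B) for every outcome.
p_a = {True: 0.3, False: 0.7}
p_b = {True: 0.6, False: 0.4}
independent_joint = {
    (a, b): p_a[a] * p_b[b] for a, b in product([True, False], repeat=2)
}

# Hypothetical joint distribution in which A and B are clearly dependent.
dependent_joint = {
    (True, True): 0.4, (True, False): 0.1,
    (False, True): 0.1, (False, False): 0.4,
}


def marginal(joint, index):
    """Sum the joint table over the other variable (index 0 = A, 1 = B)."""
    totals = {}
    for outcome, p in joint.items():
        totals[outcome[index]] = totals.get(outcome[index], 0.0) + p
    return totals


def is_independent(joint, tol=1e-9):
    """True when P(A, B) equals P(A) * P(B) for every outcome."""
    m_a, m_b = marginal(joint, 0), marginal(joint, 1)
    return all(abs(p - m_a[a] * m_b[b]) <= tol for (a, b), p in joint.items())


def conditional_b_given_a(joint, a):
    """P(B = True | A = a), computed as P(A = a, B = True) / P(A = a)."""
    return joint[(a, True)] / (joint[(a, True)] + joint[(a, False)])


print(is_independent(independent_joint))
print(is_independent(dependent_joint))
print(conditional_b_given_a(dependent_joint, True))
```

In the dependent table, knowing that A occurred changes the probability of B, which is exactly what the independence definition rules out.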
About Bayesian Networks
• “A graphical model that encodes probabilistic relationships among variables of interest”
• Named “Bayes” after Reverend Thomas Bayes, a British theologian and mathematician who wrote down a basic law of probability
• Bayes Rule
[Figure: example with nodes Smoking and Cancer]
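The formula shown on this slide did not survive in the transcript. For reference, the standard statement of Bayes' rule for two events A and B with P(B) > 0 is given below, followed by the same rule written for the slide's Smoking and Cancer example; the second line is an illustration and not necessarily the form used in the talk.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

P(\text{Cancer} \mid \text{Smoking}) =
  \frac{P(\text{Smoking} \mid \text{Cancer})\, P(\text{Cancer})}{P(\text{Smoking})}
```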
About Bayesian Networks
• Bayesian networks contain:
  • A network structure:
    • Directed, acyclic (non-circular) graph
    • Encodes a set of conditional independence and dependence information about variables
  • Probability:
    • Probability distributions associated with each variable
    • Represented in the data and computed from the data
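Taken together, the structure and the per-variable distributions define a joint distribution. The standard factorization is shown below, where \mathrm{Pa}(X_i) denotes the parents of X_i in the graph (notation assumed here, not taken from the slides). The second line applies it to a network over A, B, D, E, F, assuming the edges A → D, B → D, D → E and D → F suggested by the definitions slide.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

P(A, B, D, E, F) = P(A)\, P(B)\, P(D \mid A, B)\, P(E \mid D)\, P(F \mid D)
```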
About Bayesian Networks
• Example of a Bayesian network
• The example data are invented
[Figure: example network with nodes Full/Part, FAID, and Retention]
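The figure and its numbers are not preserved, so the following is a minimal sketch of how such an example could be computed by hand. It assumes that Full/Part status and financial aid (FAID) are both parents of Retention, and every probability in it is hypothetical; none of these values come from the slide or from SHSU data.

```python
from itertools import product

# Hypothetical prior probabilities for the two assumed root nodes.
p_load = {"full": 0.9, "part": 0.1}   # Full/Part enrollment
p_aid = {"aid": 0.6, "none": 0.4}     # FAID (financial aid)

# Hypothetical conditional probability table: P(retained | load, aid).
p_retained_given = {
    ("full", "aid"): 0.8,
    ("full", "none"): 0.7,
    ("part", "aid"): 0.6,
    ("part", "none"): 0.5,
}


def joint(load, aid, retained):
    """P(load, aid, retained) under the assumed structure."""
    p = p_retained_given[(load, aid)]
    return p_load[load] * p_aid[aid] * (p if retained else 1.0 - p)


def prob_retained():
    """Marginal P(retained), summing the joint over both parents."""
    return sum(joint(load, aid, True) for load, aid in product(p_load, p_aid))


def prob_load_given_retained(load):
    """Posterior P(load | retained) obtained with Bayes' rule."""
    return sum(joint(load, aid, True) for aid in p_aid) / prob_retained()


print(round(prob_retained(), 4))
print(round(prob_load_given_retained("full"), 4))
```

Enumerating the joint like this is only practical for very small networks; tools such as Netica perform the same kind of belief updating on larger ones.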
Method
• Data processing
  • Data source:
    • Institutional Research & Assessment Office data files, from which the Fall FTF cohorts for 2000 through 2006 were extracted
  • Working data file:
    • Extracted FTF cohort data merged into one aggregated data file
    • Records = 13,542; variables = 216
    • Dependent variables: retention rate and graduation rate, computed from the enrollment and graduation variables in the working data file
  • Discretization: transform continuous variables into categorical variables
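The discretization step can be sketched as follows. The cut points and labels are assumptions chosen to resemble the GPA ranges mentioned in the results (below 2, 2 to 2.5, above 3.75); the bins actually used in the study, and how values falling exactly on a boundary were handled, are not given in the slides.

```python
from bisect import bisect_right

# Hypothetical cut points and category labels for GPA.
GPA_CUTS = [2.0, 2.5, 3.0, 3.75]
GPA_LABELS = [
    "below 2.0",
    "2.0 to 2.5",
    "2.5 to 3.0",
    "3.0 to 3.75",
    "3.75 and above",
]


def discretize(value, cuts=GPA_CUTS, labels=GPA_LABELS):
    """Map a continuous value to the label of the bin it falls in.

    A value equal to a cut point is placed in the higher bin.
    """
    return labels[bisect_right(cuts, value)]


for gpa in (1.5, 2.0, 3.2, 3.9):
    print(gpa, "->", discretize(gpa))
```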
Method
• Developing the Bayesian Belief Network (BBN) model with a computer application called Netica™ 3.25
• Selection of variables
  • Input variables selected from those commonly used in SHSU IRA Office studies of freshman outcomes
  • Variable selection reinforced by the variables used in ‘Data Mining with Bayesian Belief Networks to Examine Retention and Graduation at a Public University’ by P. Edamatsu, D. Jankovic and Pokrajac, presented at AIR 2007 Forum
Data Description [slide content not preserved]
Method
• Assumptions in the model structure
  • Graduation and Retention (dependent variables) are “leaf nodes”
  • Gender, Ethnicity, Full/Part, and Probation & Suspension (PBSP) are “root nodes” and are independent of each other
Method
• Building the model structure
  • To specify the relationships among the selected variables from prior information, I drew on:
    • The structure used by Edamatsu, D. Jankovic and Pokrajac in their study
    • Knowledge about variables related to the dependent outcome variables from other SHSU IRA Office studies
    • Knowledge about relationships between pairs of variables from correlation matrices that included all selected variables
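A pairwise correlation matrix of the kind mentioned above could be computed as in the sketch below. It uses the Pearson coefficient on numerically coded columns, which is an assumption: the slides do not say which correlation measure was used or how categorical variables were coded. The variable names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every ordered pair of columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical records: high school rank percentile (lower is better),
# first-year GPA, and a 0/1 retention flag.
data = {
    "hs_rank_pct": [10, 25, 40, 60, 80],
    "gpa": [3.8, 3.2, 3.0, 2.4, 2.0],
    "retained": [1, 1, 1, 0, 0],
}

matrix = correlation_matrix(data)
for pair, r in matrix.items():
    print(pair, round(r, 3))
```

Strongly correlated pairs are candidates for a direct link in the network; correlation alone does not say which way the arrow should point, so direction still has to come from prior knowledge.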
Structure Encoded with Data Probability [figure not preserved]
Main Results
• Posterior analysis
  • Students’ gender influences their college choice and high school rank
  • Ethnicity influences students’ college choice
  • 1-year retention rate and 6-year graduation rate depend directly on GPA and on students’ probation or suspension status
  • Students’ in-state or out-of-state status and ethnicity are related to how many years after high school graduation they applied to the university
  • Students living on campus perform slightly better than those living off campus
Results Pertaining to Gender
• There is no significant difference in graduation rate or retention rate between males and females
• More females than males have high school ranks in the 1st quartile (from the top)
• Females
  • Tend to study majors in the Colleges of Arts & Sciences and Humanities & Social Sciences
• Males
  • Tend to study majors in the Colleges of Arts & Sciences and Business Administration
Results Pertaining to Ethnicity
• No significant difference in graduation rate or retention rate among ethnicities
• Native Americans are less likely (86.7%) to attend university within 1 year after high school than other ethnicities (around 95%); 91% of them are in-state students, while 99% of other ethnicities are in-state
• 46.6% of White Americans enrolled in the College of Arts and Sciences, compared with 39% of other ethnicities
• 94% of African Americans live on campus, compared with 75% to 86% of other ethnicities
Results Pertaining to GPA
• Bearkat Learning Community students have a higher probability of having a higher GPA
• Students with low GPA (below 2)
  • Have only a 27% graduation rate and a 55% 1-year retention rate
• Students with higher GPA (2 to 2.5)
  • Have a 43% graduation rate and a 75% retention rate
• Students with highest GPA (above 3.75)
  • Have a 70% graduation rate and an 85% retention rate
[Slides with figures only, not preserved: Results Pertaining to Probation and Suspension; In-State/Out-of-State Status; On/Off Campus Living]
Results Pertaining to Probation and Suspension, In-State or Out-of-State Status, and Living On or Off Campus
• Students on probation or suspended in the first year
  • Have only a 22% graduation rate and a 45% retention rate
• Students in good standing
  • Have a 53% graduation rate and a 76% retention rate
• Out-of-state students are less likely (87%) to attend university within 1 year after high school than in-state students (95%)
• There are no differences in GPA distribution between in-state and out-of-state students
• Students living on campus have a slightly higher GPA, retention rate and graduation rate
Conclusion
• Bayesian Belief Networks are good tools for analyzing institutional research data
• BBN is a powerful methodology for presenting probabilistic relationships graphically and can provide useful reference information for university administration
• Users may have difficulty with BBN if they lack sufficient data or a theory base to supply prior probabilities. This is particularly problematic when exploring a previously unknown network
• The validity and reliability of the prior beliefs used in Bayesian inference are critical. If this prior knowledge is not reliable, the Bayesian network is not useful
Bibliography
• P. Edamatsu, D. Jankovic and Pokrajac, Data Mining with Bayesian Belief Networks to Examine Retention and Graduation at a Public University, presented at AIR 2007 Forum
• David Heckerman, A Tutorial on Learning with Bayesian Networks, 1997
• Bruce G. Marcot, What Are “Bayesian Belief Network Models?”, 2005
• E. Castillo, J.M. Gutierrez and A.S. Hadi, Expert Systems and Probabilistic Network Models, Springer Verlag, 1997
• Jie Cheng and Russell Greiner, Learning Bayesian Belief Network Classifiers: Algorithms and System, 1995