Predictive Model for Enhancing Students' Self-driven Engagement

Building a predictive model to enhance students' self-driven engagement
Moletsane Moletsane
T: +27(0)51 401 9111 | info@ufs.ac.za | www.ufs.ac.za

Overview
• Introduction
  • Introduction and motivation for a sensitivity tool
• Data
  • Criteria for inclusion of variables
  • Variables used
• Modelling
  • Random forest modelling process
  • Evaluation of the model
• The “what-if” tool

What is student engagement?
Student engagement measures provide information about:
• What students do: the time and energy devoted to educationally purposeful activities
• What institutions do: using effective educational practices to induce students to do the right things
With the aim of channelling student energy towards activities that matter.

What do we learn from SE surveys?
• In the absence of reliable indicators of actual student learning, SE surveys are “process indicators or proxies for student learning outcomes” (Banta, Pike & Hansen, 2009; Kuh, 2009)

Is SE data shared with students?
• Students make little use of student engagement data.
• Use is similarly low among technology committees/groups in the institutions (NSSE, 2014)

How can we best share SE data with students?
• In a manner that:
  • Guides students’ effective educational behaviours and encourages students to make more informed decisions regarding their learning
  • Reflects the students’ interests
  • Does not violate students’ privacy
  • Is user friendly

How can we best share SE data?
• Possible methods include:
  • Creating an annual report for students
  • Releasing snippets of data at certain time intervals (social media, posters, email, SMSs)
  • Publishing SE articles in varsity magazines
  • Using SE data during the advising process, or
  • Providing students with aggregated data
  • Through a web-based prediction tool that implements a model based on SE data

What is the prediction tool?
• A prediction model (we use a machine learning technique for the prediction modelling)
• That is implemented in a web interface (built in the R environment)
• To make reactive predictions in response to students’ inputs on the tool
• That allows students to:
  • Explore which educational behaviours lead to a higher chance of success, thus encouraging them to make more informed decisions regarding their learning
  • Ask what-if questions and then find answers
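The what-if idea can be sketched in a few lines: re-run a prediction with one or more inputs changed and compare it with the baseline. This is a minimal Python illustration only (the tool itself is built in R). The function `what_if`, the stand-in `toy_model`, and the input names `behaviour_a` and `behaviour_b` are all hypothetical and do not come from the presentation.

```python
def toy_model(inputs):
    """Placeholder predictor, NOT the presentation's model.

    Returns the share of inputs rated 3 or higher, purely so that the
    sketch has something to call.
    """
    return sum(value >= 3 for value in inputs.values()) / len(inputs)


def what_if(predict, baseline, **changes):
    """Compare the prediction for `baseline` with one for a changed scenario."""
    unknown = set(changes) - set(baseline)
    if unknown:
        raise KeyError(f"unknown inputs: {sorted(unknown)}")
    scenario = {**baseline, **changes}  # baseline with the changed inputs applied
    before, after = predict(baseline), predict(scenario)
    return {"baseline": before, "scenario": after, "change": after - before}


# Hypothetical inputs on a 1-4 response scale.
baseline = {"behaviour_a": 2, "behaviour_b": 4}
result = what_if(toy_model, baseline, behaviour_a=3)
print(result)
```

Passing the predictor in as an argument keeps the comparison logic independent of whichever model sits behind the interface.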

What data do we have?
• Student engagement data
  • UFS data from 2013 to 2016
  • Biographical data
• Institutional data
  • Student outcomes (e.g. we use the proportion of modules passed)
  • Students’ credit and module load
  • Biographical data

Should we include all the data?
• Biographical data
  • Since we intend to share the tool with students, we believe that biographical data (e.g. race, disability, or gender) may be interpreted in a prejudiced manner.
• Non-actionable data
  • For the purpose of the tool, some non-actionable data (e.g. faculty, residence status) was not included in the prediction model despite being modest predictors.

SASSE data
• UFS data from 2013 to 2016 has 6213 respondents.
• Only 4602 of the observations are matched to the institutional data.
• 190 variables

How do we choose which variables to use?
• Variable importance
  • The machine learning technique we use has a built-in variable selection method.
  • Following cross-validation principles, the method ranks each variable by the loss of accuracy the model suffers when it is fitted without that variable.
  • From the top-ranking variables, we select the 8 most predictive variables for our model.

How do we choose which variables to use? (continued)
• From the top-ranking variables, we select the best 5 variables for our interface.
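The ranking idea, scoring a variable by the accuracy lost when the model is fitted without it, can be illustrated with a small Python sketch. It is not the built-in importance measure of the random forest used in the presentation: it assumes a toy nearest-centroid classifier and made-up data, and all function names are hypothetical.

```python
from statistics import mean


def fit_centroids(rows, labels, features):
    """Return one mean vector per class, using only the given feature indices."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [mean(r[f] for r in members) for f in features]
    return centroids


def predict(centroids, row, features):
    """Assign the class with the closest centroid (ties go to the smaller label)."""
    def dist(cls):
        return sum((row[f] - c) ** 2 for f, c in zip(features, centroids[cls]))
    return min(sorted(centroids), key=dist)


def accuracy(train, test, features):
    """Fit on `train` and return the share of `test` rows classified correctly."""
    centroids = fit_centroids(train[0], train[1], features)
    hits = sum(predict(centroids, r, features) == y for r, y in zip(*test))
    return hits / len(test[1])


def drop_importance(train, test, n_features):
    """Accuracy lost when each feature in turn is left out of the model."""
    everything = list(range(n_features))
    base = accuracy(train, test, everything)
    return {j: base - accuracy(train, test, [f for f in everything if f != j])
            for j in everything}


# Made-up data: feature 0 separates the classes, feature 1 does not.
train = ([(1, 5), (2, 3), (1, 4), (8, 4), (9, 5), (8, 3)], [0, 0, 0, 1, 1, 1])
test = ([(2, 4), (9, 4), (1, 3), (8, 5)], [0, 1, 0, 1])
importance = drop_importance(train, test, n_features=2)
ranking = sorted(importance, key=importance.get, reverse=True)
print(importance, ranking)
```

Sorting the variables by this score and keeping the top few mirrors the selection step described above.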

Which variables are most important?

Algorithm
• For k = 1 to K:
  • Draw a bootstrap sample of size n from the data
  • Grow a random forest tree on the bootstrapped data by repeating the following at each node:
    • Select m variables at random from the p variables
    • Pick the best variable split among the m variables
    • Split the node into two daughter nodes
• Output the ensemble of trees
• Make a final prediction based on the majority vote of the ensemble
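The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation behind the tool: it assumes numeric features, uses Gini impurity to pick the best split (the slides do not name a criterion), and runs on a made-up data set. All function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, m, rng):
    """Grow one tree; leaves are class labels, inner nodes are 4-tuples."""
    if len(set(labels)) == 1:
        return labels[0]
    features = rng.sample(range(len(rows[0])), m)  # m of the p variables
    split = best_split(rows, labels, features)
    if split is None:  # no split improves impurity: majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left], m, rng),
            grow_tree([r for r, _ in right], [y for _, y in right], m, rng))


def random_forest(rows, labels, K, m, seed=0):
    """Grow K trees, each on a bootstrap sample of size n."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(K):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], m, rng))
    return forest


def predict_tree(node, row):
    """Follow the splits down to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def predict_forest(forest, row):
    """Majority vote over the ensemble of trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


# Made-up, clearly separated data with p = 2 variables.
rows = [(1, 2), (2, 1), (1, 1), (3, 2), (2, 3),
        (7, 8), (8, 7), (9, 9), (7, 7), (8, 9)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
forest = random_forest(rows, labels, K=25, m=1)
print(predict_forest(forest, (0, 0)), predict_forest(forest, (10, 10)))
```

Drawing the m candidate variables afresh at every node, rather than once per tree, is what decorrelates the trees in the ensemble.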

Overview of the random forest model
[Diagram: the training data is drawn into samples 1 to k; a learning algorithm fits one classifier to each sample (classifiers 1 to k); the combined classifiers are applied to new data to produce a prediction.]

Model Results
• Prediction with all (177) variables, sample (20.97%)
  • False positive rate = 20.8%
  • False negative rate = 21.08%
• Prediction with the selected (8) variables, sample (23.64%)
  • False positive rate = 24.3%
  • False negative rate = 23.5%
[Tables: predicted vs. actual outcomes for each model]
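False positive and false negative rates of this kind are computed from the counts in a predicted-vs-actual table. The Python sketch below shows the arithmetic on made-up labels. Which outcome the presentation treats as "positive" is not stated, so the `positive=1` default is an assumption, and the function name `error_rates` is hypothetical.

```python
def error_rates(actual, predicted, positive=1):
    """Return overall error, false positive and false negative rates."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {
        "error_rate": (fp + fn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (fp + tn),  # share of actual negatives missed
        "false_negative_rate": fn / (fn + tp),  # share of actual positives missed
    }


# Made-up labels, not the presentation's data.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
rates = error_rates(actual, predicted)
print(rates)
```

The sketch assumes both outcomes occur in `actual`; otherwise one of the denominators would be zero.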

The tool (Part 1 of 2)

The tool (Part 2 of 2)

Thank you
