Predictive Analytics Proof of Concept (POC) September 2014

This document provides additional information on Sacramento State University's predictive analytics Proof of Concept (POC), presented at the EDUCAUSE 2014 Poster Session. The POC aims to provide predictive insights for strategic issues, demonstrate the capability of predictive analytics, develop expertise with IBM SPSS Modeler, and identify data gaps and next steps for deploying a predictive analytics solution.

rsanchez

Presentation Transcript


  1. Predictive Analytics Proof of Concept (POC), September 2014

  2. Additional Information on the Sac State Predictive Analytics POC, EDUCAUSE 2014 Poster Session

  3. Proof of Concept (POC) Objectives
     • Provide predictive insights for a university-wide strategic issue/program (e.g. student success and student retention)
     • Demonstrate the capability of predictive analytics for broader application
     • Develop expertise with IBM SPSS Modeler in partnership with the vendor and key campus leaders
     • Identify gaps in the data and next steps for architecting and deploying a predictive analytics solution

  4. Predictive Analytics Journey
     Indicators:
     • Enroll full time
     • Earn summer credits
     • Complete a college success course or first-year experience program
     Milestones:
     • First semester grade point average
     • Second semester grade point average
     Subset of Indicators and Milestones identified by the Institute for Higher Education Leadership and Policy (IHELP): “Student Flow Analysis: CSU Student Progress Toward Graduation” *

  5. SPSS Modeler “Stream” Using Factors from Published Study

  6. Inside the “Super Node” Additional Data Prep

  7. “Auto Prep” Option in SPSS Modeler Choose Speed, Accuracy, or Manual

  8. How Good was the POC Model?

  9. SPSS Modeler Predictions for Each Student in Cohort

  10. Predictive Analytics POC Lessons Learned
     • Learned how to use IBM SPSS Modeler
     • Data Prep is key – and time consuming!
     • Consider moving some of the Data Prep to the ETL layer (i.e. model the data so it can easily be used as input for analytics)
     • You must “know your data”
     • You must be familiar with statistical methods to prep the data properly and to understand the results
     • Optimal Predictive Analytics Project Team: Data Modeler, BI Analyst, Subject Matter Expert from functional area, and Data Scientist
     • Correlation vs. Cause
     • The output may be one step in developing advising programs, identifying advising cohorts, or advising individuals; however, caution should be taken in directly advising a student based on one predictive model looking 5 years out
     • Predictive analytics is an ongoing, iterative process
     • There is an opportunity to write the predicted outcomes to the data warehouse and use them to track the usefulness of the model and to create dashboards to track the success of resulting programs
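
The last lesson above, writing predicted outcomes to the data warehouse and later checking them against what actually happened, could look roughly like the sketch below. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the table names, column names, the `poc-v1` version label, and the sample rows are all hypothetical and do not come from the POC.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE predicted_outcome (
        student_id         INTEGER,
        model_version      TEXT,
        predicted_graduate INTEGER,
        PRIMARY KEY (student_id, model_version)
    );
    CREATE TABLE actual_outcome (
        student_id INTEGER PRIMARY KEY,
        graduated  INTEGER
    );
""")


def save_predictions(conn, model_version, predictions):
    """Store one predicted flag per student for a given model version."""
    rows = [(sid, model_version, int(flag)) for sid, flag in predictions.items()]
    conn.executemany("INSERT INTO predicted_outcome VALUES (?, ?, ?)", rows)
    conn.commit()


def model_accuracy(conn, model_version):
    """Share of stored predictions that match the recorded actual outcome.

    Returns None when no prediction has a matching actual outcome yet.
    """
    row = conn.execute(
        """
        SELECT AVG(p.predicted_graduate = a.graduated)
        FROM predicted_outcome AS p
        JOIN actual_outcome AS a ON a.student_id = p.student_id
        WHERE p.model_version = ?
        """,
        (model_version,),
    ).fetchone()
    return row[0]


# Hypothetical sample data: four students, one wrong prediction (student 3).
save_predictions(conn, "poc-v1", {1: True, 2: False, 3: True, 4: True})
conn.executemany(
    "INSERT INTO actual_outcome VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 0), (4, 1)],
)
conn.commit()

tracked_accuracy = model_accuracy(conn, "poc-v1")
```

Keeping a model version on each stored prediction is one way to compare successive iterations of a model on a dashboard; whether that suits a given warehouse design is a separate question.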

  11. Additional POC Work
     • In addition to focusing on the IHELP indicators and milestones, several models using a broader set of data from the data warehouse were developed
     • Experimented with different cohort years and different targets
     • Used IBM SPSS Modeler to develop a basic POC Faculty Retention Model
     • Used IBM SPSS Modeler for descriptive analytics for AD ASTRA event and scheduling data

  12. Predictive Analytics Next Steps
     • Link to campus strategic plan, identify an opportunity for predictive analytics to contribute to its success, and build target models with the “optimal team” as described previously (tight collaboration with campus functional areas)
     • Continue to develop models focused on student success, but explore other areas such as university advancement, scheduling, etc.
     • Move from using flat file extracts to connecting IBM SPSS Modeler directly to the data warehouse
     • Develop data models and ETLs to better prep the data for predictive analytics and data mining
     • Identify missing data or data gaps and close the gaps if possible with the data that we have
     • Partition data to develop the model on a subset of data and then test its predictive power on the remaining set
     • Continue to learn and build expertise with SPSS Modeler and its capabilities as well as continue to build expertise in statistical methods in general
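
The partitioning step listed above (develop the model on a subset, then test its predictive power on the remainder) can be sketched in plain Python. This is a minimal illustration, not the SPSS Modeler workflow used in the POC: the 70/30 split, the toy cohort, the field names `gpa_sem1` and `graduated`, and the GPA-threshold stand-in "model" are all assumptions made for the example.

```python
import random


def partition(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into (train, test)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train_rows):
    """Pick the GPA cut-off that classifies the training rows best."""
    candidates = sorted({row["gpa_sem1"] for row in train_rows})
    return max(
        candidates,
        key=lambda t: sum(
            (row["gpa_sem1"] >= t) == row["graduated"] for row in train_rows
        ),
    )


def accuracy(predict, rows):
    """Share of rows where predict(row) matches the recorded outcome."""
    if not rows:
        raise ValueError("cannot score an empty set of rows")
    hits = sum(1 for row in rows if predict(row) == row["graduated"])
    return hits / len(rows)


# Hypothetical toy cohort; real input would come from the data warehouse.
cohort = [
    {"student_id": i, "gpa_sem1": 2.0 + (i % 5) * 0.5, "graduated": (i % 5) >= 2}
    for i in range(100)
]

train, test = partition(cohort)
threshold = fit_threshold(train)  # fitted on the training subset only
holdout_accuracy = accuracy(lambda row: row["gpa_sem1"] >= threshold, test)
```

Scoring only on the held-out rows is what makes `holdout_accuracy` an estimate of predictive power rather than a measure of how well the model memorized the rows it was built on.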
