
‘Hark Who Goes There?’: Developing a Predictive Model of Student Enrolment

Dr Helena Lim, Dr Rhod Davies and Dr Steve Jackson, June 2008.


Presentation Transcript


  1. ‘Hark Who Goes There?’: Developing a Predictive Model of Student Enrolment. Dr Helena Lim, Dr Rhod Davies and Dr Steve Jackson, June 2008. IR Conference, 25 June 2008.

  2. Session overview
     • Background
     • Developing a predictive model at SSU
     • Identifying the variables
     • Getting the data
     • Making sense of the data
     • Explaining logistic regression
     • Modelling strategy
     • Variables for consideration in the model
     • Selecting a model
     • Running the model/s
     • Interpreting the model
     • Lessons learnt and next steps

  3. Background
     • Introduction of variable fees in the HE sector in 2006
     • Access agreements: bursaries
     • UK HEIs beginning to grapple with and understand the implications of a ‘fees market’ in HE
     • Summer 2006/7: bursaries research commissioned by VCO
     • Aim: to find out about student perceptions and understanding of the bursary packages on offer

  4. Predictive models on student price sensitivity
     • Siefert and Galloway (2006) developed an institutional probability model that they claim predicts an individual student’s financial ‘tipping point’ based on a number of variables
     • 4 years of institutional admissions and financial aid data (1998-2001, n=13,308 admitted students)
     • Used logistic regression to calculate the actual amount of award that may positively influence the student’s decision to enrol

  5. UK research on bursaries and student success
     • Hatt, Hannan and Baxter (2005) compared the performance of 2 groups of students (bursary/non-bursary) in 2 post-92 institutions (n=6,201)
     • Findings:
       • Bursaries can build a positive relationship between the individual and the HEI
       • Bursary holders felt that the money made a difference and demonstrated attitudes to study similar to mature students
       • Higher continuation rate beyond first year of study for bursary students compared to non-bursary students
     • Bursaries can have positive effects in relation to learner achievement and continuation

  6. Developing a predictive model at SSU
     • Predictive modelling: using past performance data to predict future results
     • The key to the success of a predictive model is good quality data… If no data are available, then the predictive modelling process can’t be undertaken (Parrott, 2007)

  7. Identifying the variables

  8. Getting the data…

  9. …still getting the data… (diagram labels: Student Record System, Finance system, Our database)

  10. …and still getting the data

  11. Making sense of the data
     • Existing data:
       • Collected for a different purpose
       • Gaps (e.g. ethnicity)
       • Cleaning (e.g. age: apparently students aged 1 and -37!)
       • Coding/recoding (e.g. postcodes)
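The cleaning step on slide 11 can be illustrated with a minimal sketch. It is not the authors' procedure: the field name `age_on_entry` and the plausible range of 16 to 100 are assumptions chosen only for illustration, and the records are made up (the ages 1 and -37 echo the examples on the slide).

```python
# Hypothetical plausibility bounds for age on entry; not taken from the slides.
MIN_AGE, MAX_AGE = 16, 100


def split_by_plausible_age(records, min_age=MIN_AGE, max_age=MAX_AGE):
    """Separate records with a plausible age from those needing manual review."""
    clean, suspect = [], []
    for record in records:
        age = record.get("age_on_entry")
        if isinstance(age, int) and min_age <= age <= max_age:
            clean.append(record)
        else:
            # Missing, non-integer, or out-of-range ages are set aside, not guessed.
            suspect.append(record)
    return clean, suspect


# Illustrative records only.
applicants = [
    {"id": "A1", "age_on_entry": 18},
    {"id": "A2", "age_on_entry": 1},
    {"id": "A3", "age_on_entry": -37},
    {"id": "A4", "age_on_entry": None},
]
clean, suspect = split_by_plausible_age(applicants)
```

Setting suspect records aside, rather than deleting or correcting them automatically, keeps the decision about each case with whoever knows the source system.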

  12. Explaining logistic regression
     • A method for predicting the outcome of a dichotomous dependent variable based on a series of independent variables (which may be nominal, ordinal or scale)
     • Diagram: Variables (e.g. age, gender, location, UCAS offer, etc.), then Black Box, then Probability of enrolling
     • Based on Chan (2004)
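What happens inside the "black box" on slide 12 can be sketched in a few lines. In a logistic regression the predicted probability is p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk))). The variable names, intercept and coefficients below are hypothetical and are not estimates from the SSU data.

```python
import math


def enrolment_probability(intercept, coefficients, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, for illustration only.
coefficients = {"age_on_entry": 0.05, "clearing_application": 0.8}
applicant = {"age_on_entry": 19, "clearing_application": 1}
p = enrolment_probability(-2.0, coefficients, applicant)
```

The output is always between 0 and 1, which is why the method suits a dichotomous outcome such as enrolled / not enrolled.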


  14. Modelling strategy (Hosmer & Lemeshow, 2000)
     A] Univariate analysis
     B] Variables where p-value < 0.25
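The screening idea on slide 14 (test each variable on its own, then keep those with p < 0.25) might look like the sketch below. It assumes binary predictors and uses a Pearson chi-square test on a 2x2 table as the univariate test. The slides do not say which univariate test was used, and the variable names and counts are invented for illustration.

```python
import math


def chi2_p_value_2x2(a, b, c, d):
    """Pearson chi-square p-value (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def screen_variables(tables, threshold=0.25):
    """Keep variables whose univariate p-value falls below the threshold."""
    return [name for name, cells in tables.items()
            if chi2_p_value_2x2(*cells) < threshold]


# Made-up counts: (enrolled & yes, enrolled & no, not enrolled & yes, not enrolled & no).
tables = {
    "local_postcode": (30, 10, 10, 30),
    "hypothetical_flag": (10, 10, 10, 10),
}
selected = screen_variables(tables)
```

The deliberately generous 0.25 threshold is meant to avoid discarding variables that only show their importance once other variables are in the model.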


  17. Variables for consideration in the model
     • Socio-demographic
       • Age on entry
       • Gender
       • Postcode
       • Previous institution type
       • Previous institution locality
     • Institutional structure
       • Programme group
       • Faculty
     • Application process
       • Timing of application
       • SSU decision turnaround time
       • Applicant decision turnaround time
     • Inducements
       • UCAS tariff offer level
     • ALL (except gender) were significantly related (p<0.001) to enrolment status

  18. Selecting a model – Considerations
     • Importance of variables in the analysis
     • Number of applicants in the analysis (data loss)
     • Representativeness of applicants in the analysis (generalisability of results)

  19. Selecting a model – Number of applicants in the analysis

  20-24. Selecting a model – Characteristics of variables in the analysis

  25. Selecting a model – Conclusion
     • Importance of variables in the analysis
       • Wanted to look at the effect of inducements (i.e. UCAS tariff offer level) on likelihood of enrolment
       • Run Model B
     • Number of applicants in the analysis (data loss)
       • Far fewer applicants lost in the analysis when UCAS tariff offer level not used
       • Run Model A
     • Representativeness of applicants in the analysis (generalisability of results)
       • Greater similarity in variable characteristics between data used in Model A and the total dataset
       • Run Model A
     • Overall conclusion: run BOTH Model A and Model B

  26. Running the models – Steps
     • Identify variables in the model
     • Run logistic regression using SPSS v15.0
     • Identify and remove variables that are collinear
     • Identify and remove applicants who have unusual values
     • Identify and remove applicants who unduly influence the regression model
     • Run logistic regression using SPSS v15.0 again (with reduced dataset)
     • Make baseline categories equivalent to institutional average
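As one illustration of the collinearity step on slide 26, the sketch below flags pairs of numeric predictors whose Pearson correlation is very high. This is a simple stand-in, not the diagnostic the authors ran in SPSS (the slides do not say which one they used). The 0.8 threshold, the column names and the values are assumptions for illustration, and the approach applies to numeric predictors only.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up values; the "weeks" column is just the "days" column rescaled.
columns = {
    "ssu_turnaround_days": [3, 5, 8, 13, 21],
    "applicant_turnaround_days": [10, 4, 30, 7, 15],
    "ssu_turnaround_weeks": [3 / 7, 5 / 7, 8 / 7, 13 / 7, 21 / 7],
}
flagged = collinear_pairs(columns)
```

When a pair is flagged, only one of the two variables would normally be kept, since both carry essentially the same information.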

  27. Running the models – Number of applicants in the analysis

  28. Interpreting the models – Variables important in predicting enrolment

  29. Interpreting the models – Age on entry (Model A; Model B)
     • The odds of enrolling increase significantly with age
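Statements about odds, such as the one on slide 29, come from exponentiating logistic regression coefficients: a coefficient b for age means the odds of enrolling are multiplied by exp(b) for each extra year. A minimal sketch follows, using a hypothetical coefficient that is not taken from the SSU models.

```python
import math


def odds_ratio(coefficient, units=1.0):
    """Multiplicative change in the odds for a `units` increase in the predictor."""
    return math.exp(coefficient * units)


def odds_to_probability(odds):
    """Convert odds (p / (1 - p)) back to a probability."""
    return odds / (1.0 + odds)


age_coefficient = 0.05  # hypothetical value, for illustration only
per_year = odds_ratio(age_coefficient)
per_decade = odds_ratio(age_coefficient, units=10)
```

An odds ratio above 1 means the odds rise with the predictor and one below 1 means they fall. The effect compounds, so ten years multiplies the odds by the one-year ratio raised to the tenth power.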

  30. Interpreting the models – Postcode (Model A; Model B)
     • Compared to Southampton, the odds of enrolling are significantly lower for applicants in all other postcode areas (apart from Basingstoke)
     • Compared to Southampton, the odds of enrolling are significantly lower for applicants in Portsmouth and the rest of the UK

  31. Interpreting the models – SSU Programme Group (Model A; Model B)
     • Compared to the institutional average, the odds of enrolling are significantly higher in the HSW programme group and significantly lower in the ACT and BMT programme groups
     • Compared to the institutional average, the odds of enrolling are significantly higher in the BF, FAV, MEW and FTP programme groups and significantly lower in the LEI, ECO, ACT, BMT and BGE programme groups

  32. Interpreting the models – Timing of application (Model A; Model B)
     • Compared to On time applications, the odds of enrolling are significantly higher for Clearing and Deferred applications
     • Compared to On time applications, the odds of enrolling are significantly higher for Deferred applications

  33. Interpreting the models – Applicant decision turnaround time (Model A; Model B)
     • For BOTH models the odds of enrolling decrease significantly as it takes applicants longer to make a decision

  34. Interpreting the models – UCAS tariff offer level (Model A; Model B)
     • Compared to applicants offered 40pts, the odds of enrolling are significantly lower for applicants offered 100pts

  35. Interpreting the models – How accurately is enrolment status predicted? (Model A; Model B)
     • 98.2% of non-enrolled applicants were accurately identified
     • 10.3% of enrolled applicants were accurately identified
     • 95.6% of non-enrolled applicants were accurately identified
     • 20.2% of enrolled applicants were accurately identified
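Percentages like those on slide 35 are obtained by turning each predicted probability into a yes/no prediction and comparing it with what actually happened. The sketch below assumes a probability cutoff of 0.5; the slides do not state the cutoff used, and the outcomes and probabilities are made up for illustration.

```python
def classify(probabilities, cutoff=0.5):
    """Label an applicant as enrolled (1) when the predicted probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def classification_rates(actual, predicted):
    """Return the share of enrolled and of non-enrolled applicants identified correctly."""
    enrolled = [p for a, p in zip(actual, predicted) if a == 1]
    not_enrolled = [p for a, p in zip(actual, predicted) if a == 0]
    enrolled_rate = sum(1 for p in enrolled if p == 1) / len(enrolled)
    not_enrolled_rate = sum(1 for p in not_enrolled if p == 0) / len(not_enrolled)
    return enrolled_rate, not_enrolled_rate


# Made-up outcomes (1 = enrolled) and predicted probabilities.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probabilities = [0.7, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.1, 0.4, 0.6]
enrolled_rate, not_enrolled_rate = classification_rates(actual, classify(probabilities))
lower_cutoff_rates = classification_rates(actual, classify(probabilities, cutoff=0.3))
```

In this toy data, lowering the cutoff identifies more of the enrolled applicants at the cost of misclassifying more of the non-enrolled ones, which is the trade-off behind any pair of figures of this kind.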

  36. Lessons learnt and (feasible) next steps… …where thither?
     • Initial conclusions
       • Approach appropriate but limited because of ‘quality’ of data
       • Model explains enrolment patterns (to an extent)
     • Run on second year of data
       • To test against current model
       • Separate years: compare years
       • Together: larger sample
     • Progression modelling
       • More data available (e.g. ethnicity, socio-economic status, etc.)

  37. References
     • Chan, Y. H. (2004) ‘Biostatistics 202: logistic regression analysis’ Singapore Med J 45 (4), 149-153.
     • Hatt, S., Hannan, A. and Baxter, A. (2005) ‘Bursaries and Student Success: a Study of Students from Low-Income Groups at Two Institutions in the South West’ Higher Education Quarterly 59 (2), 111-126.
     • Hosmer, D.W. and Lemeshow, S. (2000) Applied logistic regression. 2nd ed. New York: Wiley.
     • Parrott, S. (2007) ‘Tuition Discounting Goes Global’ The Maguire Network, Winter 2007, accessed on 5 June 2008 at http://www.maguireassoc.com/resource/maguire_network_winter2007/newsletter.html
     • Siefert, L. and Galloway, F. (2006) ‘A new look at solving the undergraduate yield problem: the importance of estimating individual price sensitivities’ College and University Journal 81, 11-17.

  38. Please direct further discussion & questions to:
     • Dr Helena Lim (helena.lim@solent.ac.uk), Dr Rhod Davies (rhodri.davies@solent.ac.uk) and Dr Steve Jackson (steven.jackson@solent.ac.uk)
     • Southampton Solent University, East Park Terrace, Southampton SO14 0YN
