Modeling and Money: The Two DO Mix • TAIR • February 1, 2006
Baylor University • Located in Waco, Texas • Affiliated with the Baptist General Convention of Texas • Bachelor's/Master's/Doctoral degrees • Seminary (MDiv and DMin) • Law • Fall enrollment approximately 14,000
Nuggets “If you’ve got terabytes of data, and you’re relying on data mining to find interesting things in there for you, you’ve lost before you’ve even begun.” — Herb Edelstein
Predictive Modeling at BU • Enrollment Management • Inquiry to Net Deposit • Accept to Enroll • Applications of model • Moving from one stage to another • Classification of students: new freshmen, new transfers, graduate, etc. • Texas and non-Texas students
Enrollment Management Stages • Inquiry • Applied • Accepted • Deposit • Net Deposit • Enroll • Retention • Graduation
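One way to picture "moving from one stage to another" is as a set of stage-to-stage yield rates. The sketch below is only an illustration: the function name `stage_yields` and the counts in the usage note are hypothetical, not Baylor figures, and the deck's actual models were built in SAS Enterprise Miner rather than Python.

```python
# Stages taken from the slide, up to enrollment (retention and graduation
# are tracked over later terms, so they are left out of this simple funnel).
STAGES = ["Inquiry", "Applied", "Accepted", "Deposit", "Net Deposit", "Enroll"]


def stage_yields(counts):
    """Return the share of students at each stage who reach the next stage.

    `counts` maps a stage name to a head count. All counts are assumed
    to be positive; the numbers used with this function are illustrative.
    """
    yields = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        yields[f"{prev} -> {nxt}"] = counts[nxt] / counts[prev]
    return yields
```

For example, with made-up counts of 1,000 inquiries and 400 applications, `stage_yields` reports an "Inquiry -> Applied" yield of 0.4. A predictive model for a given transition would aim to estimate that rate per student rather than for the whole pool.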
Student Retention • Applications: • Fall to Spring Retention • Fall to Fall Retention • Enroll to Graduation
Donor Management • Annual Gift • Major Donor • Planned Gift • Retention/Upgrade • New Donors
Business Questions • How can we identify potential major donors? • How can we predict propensity of a donor to make an annual gift? • How can we identify potential planned giving donors? • How can we identify current donors that can move to next level of giving?
Business Questions (continued) • How can we identify non-donor constituents with characteristics of a donor? • How can we predict expected value of a gift?
Required Expertise • Domain • Data • Analytical Methods
Project Team • Representatives from University Development • Representatives from Institutional Research • SAS Consultants
Process/Steps • Explore Development data • Build datasets for descriptive models • Validate datasets • Create profiles for analysis • Build datasets for predictive modeling/mining • Mine the data • Create predictive models • Apply the models • Test the models
Data Exploration • New database for IR • Learn and learn more! • Edit reports and data cleansing
Profiles • Donor • Non-donor • Alumnae donor • Hispanic donor • African-American donor • More data cleansing!
Indicator Score • Creation of indicator variables with yes/no (1/0) values • For Single households -- 18 indicators • For Two-person households -- 25 indicators (7 indicators could be duplicated)
Indicator Variables • DOB_50_ind – over 50 years of age? • Married-Widowed_ind – married or widowed? • Children_ind – any info on children? • Alumni_ind – an alumnus? • Contact_ind – any contact info for donor? • Executive_ind – executive job code?
Indicator Variables (continued) • Leader_ind – Baylor relationship? • gift count – has donor made 15 gifts over lifetime? • gift_5k – total cumulative gifts >= $5,000? • gift_25k – total cumulative gifts >= $25,000? • gift_100k – total cumulative gifts >= $100,000? • year5_ind – has donor made $250 gift in EACH of last 5 years?
Indicator Variables (continued) • year2_ind – has donor made ANY gift in EITHER of last 2 years? • Rating_ind – does donor have Echelon rating? • Athletic_gift_ind – has donor made gift to Athletic Department? • Alumn_assoc_ind – has donor made gift to Alumni Association? • Spouse_alum_ind – is spouse coded an alum?
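The yes/no indicators above can be sketched in a few lines of code. This is a minimal illustration, not the project's implementation: the record fields (`birth_year`, `is_alum`, `cumulative_gifts`, `gifts_last_2_years`) are hypothetical names, only a subset of the listed indicators is shown, and treating the indicator score as a simple sum of the flags is an assumption the slides do not confirm.

```python
from datetime import date


def indicator_flags(household, as_of):
    """Compute a few of the 1/0 indicators for one household record.

    `household` is a plain dict with hypothetical field names; the real
    Development database schema is not described in the slides.
    """
    age = as_of.year - household["birth_year"]  # approximate age in years
    total = household["cumulative_gifts"]
    return {
        "DOB_50_ind": int(age > 50),
        "Alumni_ind": int(household["is_alum"]),
        "gift_5k": int(total >= 5_000),
        "gift_25k": int(total >= 25_000),
        "gift_100k": int(total >= 100_000),
        # ANY gift in EITHER of the last two years
        "year2_ind": int(any(g > 0 for g in household["gifts_last_2_years"])),
    }


def indicator_score(flags):
    """Assumed scoring rule: the score is the number of indicators set to 1."""
    return sum(flags.values())
```

Under these assumptions, a household whose record has a 1950 birth year, alumni status, $30,000 in cumulative gifts, and one gift in the last two years would score 5 out of the 6 indicators shown when evaluated as of mid-2005.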
Donor Household Profile • 64,000+ Households • 72% One donor in household • 50% Alumni • 60% Males • 57% Married • 19% indicate Baptist religion • 58% indicate Texas residences
Non-Donor Household Profile • 77,000+ Households • Most data fields have a large percent of missing values
Donor Model for 2004 • Use donors for previous 10 years • Create target variable • Identify predictor variables • Build model • Apply to 2005 donors
Categories of Predictors • Biographical/demographic - 20 • Contact information - 12 • Degree data – 9 • Activities - 15 • Gift information - 31 • External rating information - 5 • Research data - 4
Building Model • Target variable – gift in 2004 • 1 for household with 2004 donation • 0 for household with no donation in 2004 • Predictors constructed from donors in 1994-2003 time period • Tools -- SAS Enterprise Miner • Used to build, validate, and score
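The target and predictor windows described above might be set up as follows. This is a hedged sketch in Python, not the SAS Enterprise Miner workflow the team used; the `(household_id, year, amount)` gift tuple layout and the function names are assumptions made for illustration.

```python
def build_target(household_ids, gifts, target_year=2004):
    """Return {household_id: 1 or 0}: 1 if the household gave in target_year.

    `gifts` is an iterable of (household_id, year, amount) tuples,
    a hypothetical layout chosen for this example.
    """
    donors = {hid for hid, year, amount in gifts
              if year == target_year and amount > 0}
    return {hid: int(hid in donors) for hid in household_ids}


def gift_counts_in_window(household_ids, gifts, start=1994, end=2003):
    """Example predictor: number of gifts per household in the 1994-2003 window.

    Gifts from the target year are deliberately excluded so that the
    predictors do not leak information about the target.
    """
    counts = {hid: 0 for hid in household_ids}
    for hid, year, amount in gifts:
        if hid in counts and start <= year <= end and amount > 0:
            counts[hid] += 1
    return counts
```

Keeping the predictor window (1994-2003) strictly before the target year (2004) is the key design point: the model only sees information that would have been available before the gifts it is trying to predict.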
Model Comparisons • ROC curves and Lift charts indicate all models are performing well • Misclassification rates for the models are all close to 16% • Very little difference between average profit for the models • Logistic regression was chosen as the model to employ
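For readers unfamiliar with the comparison measures named above, here is a small sketch of how a misclassification rate and a lift value can be computed from model scores. It assumes binary labels (1 = gave, 0 = did not), probability-like scores, and a 0.5 cutoff; none of these details, nor the function names, come from the slides.

```python
def misclassification_rate(y_true, scores, threshold=0.5):
    """Share of households whose predicted class differs from the actual one."""
    wrong = sum(int(s >= threshold) != y for y, s in zip(y_true, scores))
    return wrong / len(y_true)


def lift_at(y_true, scores, fraction):
    """Lift for the top `fraction` of households ranked by score.

    Lift = response rate among the top-scored group divided by the
    overall response rate. Assumes at least one positive label.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate
```

A lift above 1 for the top-scored group means the model concentrates actual donors near the top of the ranking, which is what a lift chart displays across all cutoffs.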
Model Application • Analyze 2004 donors at the end of June 2005 • Determine those who have not made a donation • Use probability scores to target those most likely to make a gift
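The targeting step above amounts to filtering out households that have already given and ranking the rest by score. A minimal sketch, assuming the scores are available as a `{household_id: probability}` mapping (a hypothetical structure, as is the function name `target_list`):

```python
def target_list(scores, already_gave, top_n):
    """Return the top_n (household_id, score) pairs among non-givers.

    `scores` maps household id to the model's predicted probability of a
    gift; `already_gave` is the set of ids with a donation so far this year.
    """
    candidates = [(hid, p) for hid, p in scores.items()
                  if hid not in already_gave]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]
```

How many households to contact (`top_n`) is a business decision, typically driven by solicitation budget rather than by the model itself.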
Future Work • Application of general model • Annual gifts • Major gifts • Planned gifts • Non-donor model • Gift amount model • Lifetime value model
Thanks! Questions or comments?