Mailing Campaign Model

Nan Yang, University of Central Florida, 04/11/2008

Presentation Transcript


  1. Mailing Campaign Model • Nan Yang • University of Central Florida • 04/11/2008

  2. Overview • Data Visualization • Data Preparation • Model Building • Variable Selection • Interaction • Model Assessment • ROC

  3. Data Visualization • 63 variables • Target is binary, with 1 indicating that the person responded to the mailing campaign • Target is very unbalanced • Target rate is 1.13% for the training set
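
To make the imbalance concrete, the sketch below works out how accurate a trivial "nobody responds" model would be at a 1.13% target rate. The training-set size of 100,000 is a hypothetical figure chosen for round numbers; only the rate comes from the slide.

```python
# Hypothetical training-set size; only the 1.13% rate is from the slides.
n_records = 100_000
target_rate = 0.0113

n_responders = round(n_records * target_rate)
n_non_responders = n_records - n_responders

# A model that always predicts 0 is correct on every non-responder.
baseline_accuracy = n_non_responders / n_records

print(f"responders: {n_responders}")
print(f"always-0 accuracy: {baseline_accuracy:.4f}")
```

Accuracy is therefore uninformative at this target rate, and a ranking measure such as the ROC curve listed in the overview is a more useful yardstick.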

  4. Data Visualization • Categorical variables • Variables with many levels: x2 ~ 57 levels; DATE variables (x10 & x11) ~ over 100 levels • Missing values: DATE variables ~ 30%-70% missing; some variables code missing values as “Unknown” or “Uncoded”, e.g., x20

  5. Data Visualization • Interval Variable • Skewness

  6. Data Preparation • Missing Value Indicator (MVI) • Created for variables with > 5% missing values • Binary • Captures the missing-value information
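
A minimal sketch of the MVI step, assuming the data is a list of records with `None` marking a missing value. The function name, the `_mvi` suffix, the variable `x3`, and the toy values are all illustrative assumptions; only the 5% threshold and the binary coding come from the slide.

```python
def add_missing_indicators(rows, threshold=0.05):
    """Add a binary <name>_mvi column for each variable whose share of
    missing values (None) exceeds the threshold."""
    n = len(rows)
    names = list(rows[0].keys())
    flagged = [
        name for name in names
        if sum(row[name] is None for row in rows) / n > threshold
    ]
    out = []
    for row in rows:
        new_row = dict(row)
        for name in flagged:
            # 1 when the original value was missing, 0 otherwise.
            new_row[f"{name}_mvi"] = 1 if row[name] is None else 0
        out.append(new_row)
    return out, flagged


# Toy data (made up): x10 is missing in 2 of 4 rows, x3 is never missing.
rows = [
    {"x3": 1.0, "x10": None},
    {"x3": 2.5, "x10": "2007-01"},
    {"x3": 0.7, "x10": None},
    {"x3": 4.2, "x10": "2006-11"},
]
with_mvi, flagged = add_missing_indicators(rows)
print(flagged)
print([r["x10_mvi"] for r in with_mvi])
```

The indicator is built before any imputation, so the fact that a value was missing stays available to the model even after the gap is filled.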

  7. Data Preparation • Imputation • Unconditional imputation • Categorical variable • Tree/Tree Surrogate • Interval variable • Cluster
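
As an illustration of the simplest option on this slide, unconditional imputation fills every gap in a variable with one summary value computed from the observed entries. The slide does not say which statistic was used, so the median (interval) and mode (categorical) below are assumptions, as are the variable names and toy values. Tree-surrogate and cluster-based imputation are not shown.

```python
from statistics import median, mode


def impute_unconditional(values, kind):
    """Replace None with one summary value from the observed entries:
    the median for interval data, the mode for categorical data."""
    observed = [v for v in values if v is not None]
    fill = median(observed) if kind == "interval" else mode(observed)
    return [fill if v is None else v for v in values]


# Toy columns (made up for illustration).
income = [30.0, None, 50.0, 40.0, None]   # interval variable
region = ["N", "S", None, "S", "E"]       # categorical variable

print(impute_unconditional(income, "interval"))
print(impute_unconditional(region, "categorical"))
```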

  8. Data Preparation • Transformation • Right skewed • Log or Square Root transformation • Left skewed • Square transformation
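
The effect of these transformations can be checked by comparing skewness before and after. The sketch below uses the population skewness (third central moment divided by the cubed standard deviation) on two made-up samples; the data and function name are assumptions for illustration only.

```python
import math


def skewness(xs):
    """Population skewness: third central moment over std cubed."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Made-up samples: a long right tail and a long left tail.
right_skewed = [1, 2, 2, 3, 3, 4, 5, 8, 20, 60]
left_skewed = [1, 5, 7, 8, 8, 9, 9, 9, 10, 10]

logged = [math.log(x) for x in right_skewed]   # pulls in the right tail
squared = [x ** 2 for x in left_skewed]        # stretches the upper end

print(skewness(right_skewed), skewness(logged))
print(skewness(left_skewed), skewness(squared))
```

On these samples both transformations move the skewness toward zero without fully removing it. The log requires strictly positive values, so a variable containing zeros would need an offset or the square root instead.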

  9. Model Building • Variable selection • Individual predictive power • Logistic backward elimination • Keep the potential interaction terms • Logistic stepwise selection • Tree • Different criteria • 21 variables selected

  10. Model Building • Interactions • SAS Enterprise Miner (EMiner) Regression node • 11 interaction terms selected • Model • Ensemble of different logistic models
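
A rough sketch of the two ideas on this slide: an interaction term enters a logistic model as the product of two inputs, and an ensemble combines the predicted probabilities of several models. The slides do not say how the models were combined, so simple averaging is an assumption here, and every coefficient and variable name is hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def model_a(x1, x2):
    # Main effects only (hypothetical coefficients).
    return logistic(-4.0 + 0.8 * x1 + 0.3 * x2)


def model_b(x1, x2):
    # Main effects plus the x1*x2 interaction term (hypothetical coefficients).
    return logistic(-4.2 + 0.6 * x1 + 0.2 * x2 + 0.5 * x1 * x2)


def ensemble(x1, x2, models=(model_a, model_b)):
    """Average the predicted response probabilities of the models."""
    return sum(m(x1, x2) for m in models) / len(models)


p = ensemble(1.0, 2.0)
print(round(p, 4))
```

With the interaction term, the effect of `x1` on the log-odds depends on the value of `x2`, which a main-effects model cannot express.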

  11. Model Assessment • AUC = 0.66
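
For reference, AUC is the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder, so 0.66 means this ordering holds about 66% of the time, against 0.5 for random scoring. A minimal sketch of that definition follows; the labels and scores are toy values, not results from the model.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the
    positive has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 2 responders, 4 non-responders.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]
print(auc(labels, scores))
```

The pairwise loop is quadratic in the number of records, which is fine for a small illustration; a rank-based computation is the usual choice for a full data set.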

  12. Acknowledgement • UCF Statistics Dept • BlueCross BlueShield of FL
