
Analytical Model Development & Implementation Experience from the Field

Analytical Model Development & Implementation Experience from the Field. Bhavani Raskutti. Topics to be covered. Model development & implementation process Case Study 1: Corporate Customer Modelling at Telcos Case Study 2: Sales Opportunities for wholesalers Take-Home Points.


Presentation Transcript


  1. Analytical Model Development & Implementation: Experience from the Field • Bhavani Raskutti

  2. Topics to be covered • Model development & implementation process • Case Study 1: Corporate Customer Modelling at Telcos • Case Study 2: Sales Opportunities for wholesalers • Take-Home Points

  3. Model Development & Implementation Process • Starts from a business problem and ends with a solution enabling business to make strategic & operational decisions • Model development is iterative; 90% of it is DAP • Stages: • Analytical Problem Definition (APD) • Data Acquisition & Preparation (DAP), producing the data matrix • Mathematical Modelling (MM): algorithms • Model Validation (MV) • Presentation (P): decision-making by users, insights via GUI • Deployment (D): automation, training, documentation, IT support

  4. Topics to be covered • Model development & implementation process • Case Study 1: Corporate Customer Modelling at Telcos • Case Study 2: Sales Opportunities for wholesalers • Take-Home Points

  5. Business Problem • Large drops in margins & revenue in corporate customer base • Partial churn of some corporate customers to other telcos • Lack of understanding of customers’ needs • Verbatim from client’s presentation to stakeholders: “Project will target revenue improvement opportunities with an indicative $15 million in sales by: undertaking a rapid analysis of Customer data from core systems, including front of house, customer satisfaction and marketing for customers with a spend greater than $100k, excluding state and local government; outcomes are to be validated using artificial intelligence tools and rigorous methodology by …” • In short: using data analysis, increase revenue from corporate customers whose spend is > $100k

  6. 1. Analytical Problem Definition • Using data analysis, increase revenue from corporate customers whose spend is > $100k • Increase revenue from corporate customers by: • Win-back (database look-up)? • Churn reduction? • Up-sell/cross-sell to an existing customer? • Customer data: • Relationship with customer: customer satisfaction survey data; service assurance data (customer complaints) • Demographic information about business customer: industry segment information; number of sites • Revenue from customer: quarterly revenue from different products • Create models to predict up-sell based on revenue data

  7. 2. Data Acquisition & Processing • Using revenue data, create models to predict customers likely to take up a specific product • Population: customers in a segment who currently do not have the product being modelled • Target or positive case definition: customers in the segment who take up the product within a time period • Predictors for modelling

  8. Population and Target Definition • Let $r_{i,P}$ be the revenue from a customer on product $P$ in billing period $i$ • Population in period $i$ includes all customers with $r_{i-1,P} = 0$ • Target or product take-up in period $i$ iff $r_{i-1,P} = 0$ and $r_{i,P} > TU_{MIN}$ • $TU_{MIN} > 0$ is the minimum take-up amount determined by the business • [Timeline: train on customers with $r_{i-1,P} = 0$, with predictors from period i-1 and labels from period i; predict for customers with $r_{i,P} = 0$, with predictors from period i and predictions for period i+1]
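
The population and target rules on this slide can be sketched in a few lines of Python. This is only an illustration: the nested-dictionary data layout, the function name `population_and_targets` and the sample figures are assumptions, not part of the original project (which was implemented in Matlab & C).

```python
def population_and_targets(revenue, product, i, tu_min):
    """Apply the slide's definitions for billing period i.

    revenue: hypothetical layout, customer -> product -> list of revenue per period.
    Population: customers with zero revenue on `product` in period i-1.
    Target: population members whose revenue in period i exceeds tu_min.
    """
    population, targets = [], []
    for customer, by_product in revenue.items():
        series = by_product[product]
        if series[i - 1] == 0:
            population.append(customer)
            if series[i] > tu_min:
                targets.append(customer)
    return population, targets


# Made-up revenue figures for three customers over periods 0..2.
revenue = {
    "A": {"P": [0, 0, 500]},   # no product in period 1, large take-up in period 2
    "B": {"P": [0, 0, 20]},    # take-up below the minimum amount
    "C": {"P": [0, 300, 400]}, # already has the product, so not in the population
}
population, targets = population_and_targets(revenue, "P", i=2, tu_min=100)
print(population, targets)
```

Customers that already have the product never enter the population, so they can be neither positive nor negative examples.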

  9. Low take-up rates: not enough targets • Average number of take-ups for any product in any period is small • Large businesses: • Fewer than 20 take-ups in a period for 70 of the 100+ products • Fewer than 10 take-ups for 45 products • Medium businesses: • Fewer than 20 take-ups for 71 products • Fewer than 10 take-ups for 60 products • Reasons: • “Niche” products • Saturated products

  10. Low take-up rates (Cont’d) • Aggregate data over multiple billing periods $k$ • Product take-up in periods $i$ to $i+k-1$ iff $r_{i-j,P} = 0$ for $j = 1..k$ and $\sum_{j=0}^{k-1} r_{i+j,P} > k \cdot TU_{MIN}$ • [Figure: impact of data aggregation; axis: minimum take-ups (n) for modelling. Timeline for k = 2: training takes predictors from periods i-3, i-2 and labels from periods i-1, i (annotated “TRAIN target: $r_{i-j,P} = 0$, $j = 0..1$”); prediction takes predictors from periods i-1, i for periods i+1, i+2 (annotated “Predict if $r_{i+j,P}$ = 0 or 1; $j = 1..2$”)] • k = 2 is useful
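
A minimal sketch of the aggregated take-up rule, assuming a customer's revenue on product P is held as a plain list indexed by billing period. The function name `aggregated_take_up` and the sample figures are hypothetical.

```python
def aggregated_take_up(series, i, k, tu_min):
    """True if the customer takes up the product in periods i..i+k-1.

    series[t] is the customer's revenue on product P in billing period t.
    Condition 1: no revenue in the k periods before i (j = 1..k).
    Condition 2: total revenue over periods i..i+k-1 exceeds k * tu_min.
    """
    had_no_revenue = all(series[i - j] == 0 for j in range(1, k + 1))
    total_after = sum(series[i + j] for j in range(k))
    return had_no_revenue and total_after > k * tu_min


# Made-up series over periods 0..4, with i = 3, k = 2 and TU_MIN = 100.
print(aggregated_take_up([0, 0, 0, 150, 120], i=3, k=2, tu_min=100))  # 270 > 200
print(aggregated_take_up([0, 50, 0, 150, 120], i=3, k=2, tu_min=100)) # revenue in period 1
print(aggregated_take_up([0, 0, 0, 150, 20], i=3, k=2, tu_min=100))   # 170 <= 200
```

With k = 1 the rule reduces to the single-period definition on the previous slide.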

  11. Low take-up rates (Cont’d) • Use of time interleaving: • Aggregate data with k = 2 • Generate 3 sets of data, each moved forward by a period • Concatenate the 3 sets to get 3 times as much training data as for data aggregation with k = 2 • [Figure: impact of time interleaving. Training windows: periods i-5..i-2, i-4..i-1 and i-3..i, each split into predictors and labels; prediction window: periods i-1..i+2] • Time interleaving enormously enhances modellability
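
Time interleaving can be sketched as building the same aggregated training set for several overlapping windows and concatenating them. The data layout (a `"P"` series for the modelled product and an `"other"` series standing in for the revenue predictors), the function names and the numbers are all assumptions made for illustration.

```python
def windowed_examples(customers, end, k, tu_min):
    """Build one aggregated training set whose label window ends at period `end`.

    Predictor window: the k periods before the label window.
    Label window: the last k periods up to and including `end`.
    """
    first_label = end - k + 1
    first_pred = first_label - k
    rows = []
    for name, rev in customers.items():
        # Population: no revenue on product P during the predictor window.
        if any(r != 0 for r in rev["P"][first_pred:first_label]):
            continue
        features = rev["other"][first_pred:first_label]
        label = sum(rev["P"][first_label:end + 1]) > k * tu_min
        rows.append((name, end, features, label))
    return rows


def time_interleaved(customers, i, k, tu_min, shifts=3):
    """Concatenate `shifts` training sets, each moved by one period."""
    rows = []
    for s in range(shifts):
        rows.extend(windowed_examples(customers, i - s, k, tu_min))
    return rows


# Made-up revenue for periods 0..5; period 5 plays the role of period i.
customers = {
    "A": {"P": [0, 0, 0, 0, 150, 120], "other": [10, 12, 11, 13, 15, 14]},
    "B": {"P": [0, 0, 0, 0, 0, 0], "other": [5, 5, 5, 5, 5, 5]},
}
single = windowed_examples(customers, end=5, k=2, tu_min=100)
interleaved = time_interleaved(customers, i=5, k=2, tu_min=100)
print(len(single), len(interleaved))
```

With k = 2 and three shifts the windows cover periods i-3..i, i-4..i-1 and i-5..i-2, matching the training windows in the figure.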

  12. Predictors for Modelling • [Timeline: training predictors from periods i-3, i-2; labels from periods i-1, i; annotated “TRAIN target: $r_{i-j,P} = 0$, $j = 0..1$”] • Revenue predictors used: • $r_{i-3,Q}$: revenue for all products in billing period i-3 • Change in revenue from period i-3 to i-2, $r_{i-3,Q} - r_{i-2,Q}$ • Projected revenue for period i-1, $2r_{i-3,Q} - r_{i-2,Q}$ • All revenue predictors used both as raw values, and normalised by total customer revenue • Binary predictors indicating churn/take-up in period i-2 • All continuous predictors converted to binary using 10 equisize bins: • Overcomes the negative impact of large variance in revenues • Allows generation of non-linear models using linear techniques
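
The binning step can be illustrated as follows. The slide does not say whether "equisize" means equal-width or equal-frequency bins; this sketch assumes equal-frequency bins, and the helper names are hypothetical.

```python
def equal_frequency_edges(values, n_bins=10):
    """Cut points that split the sorted values into n_bins equally populated bins.

    Assumption: "equisize" is read here as equal-frequency (quantile) binning.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(b * n) // n_bins] for b in range(1, n_bins)]


def to_binary(value, edges):
    """One-hot encode a continuous value by the bin it falls into."""
    bin_index = sum(value >= edge for edge in edges)
    return [1 if b == bin_index else 0 for b in range(len(edges) + 1)]


# Made-up revenue values 0..99 give cut points 10, 20, ..., 90.
edges = equal_frequency_edges(list(range(100)))
print(to_binary(37, edges))
# An extreme revenue lands in the same top bin as any other large value.
print(to_binary(1_000_000, edges) == to_binary(95, edges))
```

Because each bin becomes its own binary predictor, a linear model can assign an independent weight per bin, which is how a non-linear response to revenue can be captured with linear techniques.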

  13. 3. Mathematical Modelling • Imbalance in class sizes: • Large businesses: 51 products with < 0.5% take-up on average; 25 products with < 0.1% take-up • Medium businesses: 74 products with < 0.5% take-up on average; 54 products with < 0.1% take-up • Maximisation of total take-up revenue: • Identifying new high value customers is a priority • Extent of variance: • Take-up amounts range from $TU_{MIN}$ to over a million dollars • Take-up amounts are not correlated with total revenue in previous billing periods

  14. Imbalance in class sizes • $m_+$ and $m_-$: number of +ve and -ve examples • $C_+$ and $C_-$: weight of +ve and -ve examples • Use of Support Vector Machines (SVMs) instead of decision trees, neural nets or logistic regression: • Based on Vapnik’s statistical learning theory • Maximises the margin of separation between two classes • Two different SVM implementations: • SVMstd: equal weight to all training examples • SVMbal: class dependent weights so all take-ups have a higher weight than all non-take-ups

  15. Identifying high value take-up • $m_+$ and $m_-$: number of +ve and -ve examples • $C_-$: weight of -ve examples • $TU(i)$: take-up amount of the ith +ve example • $C_+(i)$: weight of the ith +ve example • SVMval: SVM with different weights for different positive (take-up) training examples: • All take-up examples have a higher weight than all the non-take-up examples (as for SVMbal) • Each take-up training example has a weight proportional to the amount of take-up
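
The transcript does not preserve the exact weight formulas for SVMbal and SVMval, so the following is only a sketch under stated assumptions: the balanced weight is taken as $C_+ = C_- \cdot m_- / m_+$, and the value-based weight as that balanced weight scaled by $TU(i) / TU_{MIN}$. Both choices satisfy the two properties stated on the slides (positives outweigh negatives when positives are the rarer class; positive weights are proportional to take-up amount), but they are not claimed to be the formulas actually used.

```python
def svm_bal_weights(labels, c_neg=1.0):
    """Class-dependent weights: every take-up outweighs every non-take-up.

    Assumption: C+ = C- * m- / m+, so both classes carry equal total weight.
    """
    m_pos = sum(labels)
    m_neg = len(labels) - m_pos
    c_pos = c_neg * m_neg / m_pos
    return [c_pos if y == 1 else c_neg for y in labels]


def svm_val_weights(labels, take_up, tu_min, c_neg=1.0):
    """Per-example weights proportional to the take-up amount.

    Assumption: C+(i) = C+ * TU(i) / TU_MIN. Since TU(i) > TU_MIN for every
    positive example, each positive weight exceeds the balanced weight C+.
    """
    m_pos = sum(labels)
    m_neg = len(labels) - m_pos
    c_pos = c_neg * m_neg / m_pos
    return [c_pos * tu / tu_min if y == 1 else c_neg
            for y, tu in zip(labels, take_up)]


# Made-up data: 2 take-ups among 10 customers, TU_MIN = 100.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
take_up = [200, 0, 0, 0, 1000, 0, 0, 0, 0, 0]
bal = svm_bal_weights(labels)
val = svm_val_weights(labels, take_up, tu_min=100)
print(bal[:5])
print(val[:5])
```

Weights like these would typically be handed to an SVM trainer that accepts per-example weights; which trainer and interface the project used is not stated beyond "Matlab & C".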

  16. 4. Model Validation • Model assessment: • Two tests for assessing quality of models (~4,000 models): • 10-fold cross validation tests to determine the best of the 3 SVMs • Tests in production setting to evaluate time interleaving • All tests on 30 product take-up prediction problems in 4 segments • Performance measures on unseen test set: • Area under receiver operating characteristic curve (AUC): measures quality of sorting; decision threshold independent metric • Value weighted AUC (VAUC): indicates potential revenue from the sorting • SVMval with time interleaved data is used for generating models: • SVMval significantly more accurate than the other two • Time interleaving produces more stable models
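
AUC can be computed directly as the fraction of (take-up, non-take-up) pairs that the model orders correctly. The slide does not define VAUC, so the value-weighted variant below is an assumption: each correctly ordered pair is counted with the take-up amount of its positive example. Function names and data are illustrative.

```python
def auc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def value_weighted_auc(scores, labels, values):
    """Assumed VAUC: pairwise AUC with each positive weighted by its take-up value."""
    pos = [(s, v) for s, y, v in zip(scores, labels, values) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(v * (1.0 if p > n else 0.5 if p == n else 0.0)
               for p, v in pos for n in neg)
    return wins / (sum(v for _, v in pos) * len(neg))


# Made-up scores: the high-value take-up is ranked first, the small one third.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
values = [1000, 0, 10, 0]
print(auc(scores, labels))
print(round(value_weighted_auc(scores, labels, values), 4))
```

In this toy ranking the plain AUC is penalised equally for any misplaced take-up, while the value-weighted variant barely moves because the misplaced customer carries little revenue.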

  17. Model Validation by Business • Predictive models identify more sales opportunities than those identified manually: • 3 times as many in large businesses segment • 5 times as many in medium businesses segment • Results for 2 different regions in medium businesses: • Region 1: Predictions for just 5 products generated 9 new opportunities with an increase in revenue of ~A$400K • Region 2: Predictions identified opportunities that were already being processed by sales consultants • Predictive modelling spreads the techniques of good sales teams across the whole organisation

  18. 5. Presentation • Output in automatically generated Excel spreadsheet • One customer list per segment with: • Take-up likelihood for all modelled products • Last quarter revenue for all products

  19. 6. Deployment • Implementation in Matlab & C with output in Excel • Automatic quarterly updates of model after consolidated revenue figures are available • Models for ~50 products for each of the 4 business segments • Output delivered to business analytics group: • Different cut-offs for different products/regions • Superimposition of other data for filtering/sorting • Use of output by sales consultants for renegotiating contracts with customers

  20. Project Timeline • Initial approach to data availability for pilot: 12 weeks • Data to pilot: 6 weeks • Model validation by business: 12 weeks • Pilot deployment (5 products, 1 segment): 6 weeks • Acceptance by business teams: over 9 months • Final deployment: 4 weeks • In operation for more than 8 years!

  21. Key Success Factors • Willingness of stakeholders to try non-standard solutions • Innovative solution (paper published in KDD 2005): • Target definition using multiple overlapping time periods to boost the number of rare events for modelling • Use of support vector machines for customer analytics • Being lazy: • Scope change from 4 to 50 products • Scope change from 2 to 4 segments • Development of ~200 predictive models in one shot • No stale models in production • Working with business analysts to instigate change: • Product-centric modelling to customer-centric product packaging

  22. Topics to be covered • Model development & implementation process • Case Study 1: Corporate Customer Modelling at Telcos • Case Study 2: Sales Opportunities for wholesalers • Take-Home Points

  23. Case Study: Wholesale Sales • Business problem: increase wholesale sales into major retailers • Analytical problem: perform comparisons & find anomalies with stock issues • Data: weekly SOH & sales for each store & SKU; SKU master; store master • Quantify demand; define normalised sell-rate; define a long term in-stock measure; define products & outlets that are similar • Sales  demand; similar products @ similar outlets have similar demand to sales relationship; anomaly may be due to lack of stock • Modelling: simple univariate regression in SQL • Validation: absolute error; validate with retail • Presentation: self-serve report for each sales rep; presents list of products with sales opportunities; click thru’ to detailed graphs • Stages shown: APD, DAP, MM, MV, P, D

  24. Case Study: Wholesale Sales (Cont’d) • [Figure: sell rate vs consumer demand plot, with in-stock % also shown] • Each point is a store • R1 & R2 are comparable retailers • Values for the same product • Possible reasons for difference between R1 and R2: • Competing product at R2 • Pricing at R2 vs R1 • Lack of stock at R2
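
The comparison between retailers can be sketched as a univariate least-squares fit of sell rate on demand at one retailer, followed by a check of how far stores at a comparable retailer fall below that line. The case study did this in SQL; the Python below is a stand-in, and the field names, thresholds (`min_gap`, `max_in_stock`) and store figures are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_opportunities(reference, candidates, min_gap, max_in_stock):
    """Flag stores selling well below the reference line while poorly stocked."""
    slope, intercept = fit_line([s["demand"] for s in reference],
                                [s["sell_rate"] for s in reference])
    flagged = []
    for store in candidates:
        expected = slope * store["demand"] + intercept
        gap = expected - store["sell_rate"]
        if gap > min_gap and store["in_stock"] < max_in_stock:
            flagged.append((store["id"], round(gap, 2)))
    return flagged


# Made-up stores for one product at two comparable retailers.
r1 = [{"demand": d, "sell_rate": d / 2} for d in (10, 20, 30, 40)]
r2 = [
    {"id": "R2-1", "demand": 30, "sell_rate": 14.5, "in_stock": 0.98},
    {"id": "R2-2", "demand": 40, "sell_rate": 12.0, "in_stock": 0.70},
]
flagged = flag_opportunities(r1, r2, min_gap=2, max_in_stock=0.9)
print(flagged)
```

Filtering on the in-stock measure separates gaps that may be due to lack of stock from gaps with other possible causes, such as pricing or a competing product.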

  25. Case Study: Wholesale Sales (Cont’d) • Business problem: increase wholesale sales into major retailers • Analytical problem: perform comparisons & find anomalies with stock issues • Data: weekly SOH & sales for each store & SKU; SKU master; store master • Quantify demand; define normalised sell-rate; define a long term in-stock measure; define products & outlets that are similar • Sales  demand; similar products @ similar outlets have similar demand to sales relationship; anomaly may be due to lack of stock • Modelling: simple univariate regression in SQL • Validation: absolute error; validate with retail • Presentation: self-serve report for each sales rep; presents list of products with sales opportunities; click thru’ to detailed graphs • Deployment: SQL & Cognos; automatic weekly updates; training by corporate training team; support from IT helpdesk • Stages shown: APD, DAP, MM, MV, P, D

  26. Topics to be covered • Model development & implementation process • Case Study 1: Corporate Customer Modelling at Telcos • Case Study 2: Sales Opportunities for wholesalers • Take-Home Points

  27. Take-home points • Data acquisition & processing phase forms 80-90% of any analytics project • Business users are tool agnostic • R, SAS, Matlab, SPSS, … for statistical analysis • Tableau, Cognos, Excel, VB, … for presentation • Business adoption of analytics driven by • Utility of application • Validation of results by using real-life cases • Ease of decision-making from insights • Ability to explain insights
