Partnership between research and industry for developing innovative data mining applications. Bhavani Raskutti Analytic CRM Westpac. Tenet. Need strong and equal partnership between industry and research to develop and implement innovative data mining solutions to solve real world problems
Partnership between research and industry for developing innovative data mining applications
Bhavani Raskutti
Analytic CRM, Westpac
Tenet
• Need strong and equal partnership between industry and research to develop and implement innovative data mining solutions to solve real-world problems
• Research: data mining solution provider
  • Research arm of the business
  • External research provider
    • university, consultancy, software vendor
• Industry: data mining solution user
  • Business unit using analytics solutions
  • University departments
    • biology departments using analytics solutions

Argument Pathway
• Two “successful” projects
  • Text mining
  • CRM for business customers
• What does “success” mean?
  • Innovative solution
  • Implementation in business
• Discussion around
  • How the project started
  • The evolution of the project
  • Key success factors
• Role of industry & research for successful collaborations
Text Mining September 2000 - August 2005
Conception: Text mining
• A file of in-bound SMS (from customers to Telstra) received in response to an out-bound SMS from Telstra
  • Over 700 entries in one day
  • Lots of abbreviations, acronyms and phonetic spelling
  • Haphazard use of spaces and punctuation
• Why research laboratories?
  • Failure of standard techniques such as keyword matching in Excel
  • Ongoing relationship for customer retention modelling
  • Analysis was not high priority, so the business did not want to spend money on external manual analysis
• Open-ended brief to identify “actionable” themes
  • No timeframes
  • No direction on what is meant by actionable themes

Evolution: Text mining
• Demonstrated feasibility with output created by adapting in-house text clustering and summarisation tools to deal with SMS data, plus some manual processing in Excel
  • 30 themes, including a catch-all containing around 50 entries
• Feedback from client after seeing the output
  • Clusters need to refer to actionable themes
  • Want to do this cluster creation once only
• Demonstrated prototype of text analysis tool on Unix, with all text analysis executables and a cluster editor in Java
  • Positive feedback from business
• Research group invested in development of a stand-alone PC tool
Evolution (Cont’d): Text mining
• Business decided to restrict SMS to compliance messages only
  • No need for automated inbound SMS analysis tool
• Exploration of other applications for the tool
  • Identification of major issues from customer complaints data
  • Analysis of verbatim comments in customer satisfaction surveys
  • Analysis of verbatim comments in employee opinion surveys
  • Use of text classification part in KDD Cup win
  • Part of demonstration package showcasing research credentials of Telstra
• Final application implemented in business in August 2005 for customer relationship management for SME customers
  • Business invested time in usability analysis of the tool, and in training their business analysts
  • Research invested in changes identified by the usability test
• Text mining intellectual property (patents, code, etc.) licensed to an external company for commercialisation
Summary: Text mining
Innovation
• Feature extraction to counter “dirty” text
• Automatic selection of number of clusters based on cohesion
• Use of patented support vector machine for text classification
• Interactive editing to fine-tune automatically generated groups
[Process diagram: Open Ended Brief, Feasibility / Vision, Feedback, Prototype, Feedback, System, Deployment, Hand-over, Updated System, Find Another Partner, Usability Analysis]
Implementation
• Ad-hoc analysis of customer complaints
• Ad-hoc analysis of verbatim comments in surveys
• Business analysts using the tool for CRM
• IP licensed externally
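The slide does not say how features were extracted from the “dirty” SMS text. One widely used option, shown here purely as an illustrative assumption and not as the in-house or patented method, is to compare messages on character n-grams after stripping case, spacing and punctuation. The helper names `char_ngrams` and `cosine` and the sample messages are hypothetical.

```python
import math
import re
from collections import Counter


def char_ngrams(text, n=3):
    """Lower-case the text, keep only letters and digits, and count character n-grams."""
    cleaned = re.sub(r"[^a-z0-9]", "", text.lower())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical messages: the first two differ only in case, spacing and punctuation.
msgs = ["pls stop msgs", "Pls  STOP msgs!!", "thx gr8 service"]
vecs = [char_ngrams(m) for m in msgs]
sim_same_theme = cosine(vecs[0], vecs[1])
sim_other_theme = cosine(vecs[0], vecs[2])
print(sim_same_theme, sim_other_theme)
```

Because spacing and punctuation are removed before the n-grams are counted, the first two messages map to the same vector, while the third shares no trigram with them. A clustering step could then group messages on such similarities.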
Success Factors: Text mining
• Both parties wanted the project to happen
  • Industry: Needed to process the text data quickly
  • Research: An interesting problem to apply text mining techniques
• Collaboration process
  • Open-ended brief
  • Regular sharing of knowledge and ideas
  • Time for research ideas to be developed, piloted and implemented
  • Multi-disciplinary team with researchers, programmers, psychologists and business users
• Contributions from both parties
  • Industry: Data and interesting problem; Implementation support
  • Research: Innovative solution; Investment; Selling idea into business; Continuity
CRM for business customers March 2003 - October 2004
Conception: CRM for business customers
• Strong competitive pressures in the corporate customer segment
  • Large drops in margins for the first time
  • Needed to look for innovative methods to change the trend
  • Belief in the utility of customer analytics
• Why research laboratories?
  • Failure of traditional analysis approaches
  • Availability of internationally recognised expertise
  • Ability to provide objective solution
  • Willingness to work with analysts in business
• Open-ended brief to increase overall value across the corporate customer base by using data mining
  • Aggressive targets
  • Short timeframes

Evolution: CRM for business customers
• Discussions with stakeholders to determine:
  • Upsell, win-back or customer retention? Upsell
  • What data should be used for mining? Quarterly revenue for 100 products
  • The segments to focus on: Medium & large businesses
  • The specific products to focus on
• Brief was to create upsell models using revenue data. A straightforward CRM problem?
  • High imbalance in class sizes: the average number of take-ups for any product in a quarter is small
  • Raw accuracy is not most important: need to identify high-value take-ups even at the cost of missing many low-value take-ups
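To make the point about raw accuracy concrete, the sketch below scores a ranked customer list by the share of take-up value it captures rather than by the number of correct predictions. The function `value_captured_at_k` and all numbers are hypothetical illustrations, not the metric used in the project.

```python
def value_captured_at_k(scores, values, k):
    """Fraction of total take-up value found among the k highest-scored customers.

    scores: model scores, one per customer.
    values: revenue from the take-up (0.0 if the customer did not take up).
    """
    total = sum(values)
    if total == 0:
        return 0.0
    ranked = sorted(zip(scores, values), key=lambda pair: pair[0], reverse=True)
    return sum(value for _, value in ranked[:k]) / total


# Hypothetical data: each model places exactly one of the two take-ups in its top 2,
# so a count-based metric cannot tell them apart.
values = [0.0, 900.0, 0.0, 100.0, 0.0]
model_a = [0.1, 0.9, 0.8, 0.2, 0.3]  # ranks the high-value take-up first
model_b = [0.1, 0.2, 0.8, 0.9, 0.3]  # ranks the low-value take-up first
captured_a = value_captured_at_k(model_a, values, 2)
captured_b = value_captured_at_k(model_b, values, 2)
print(captured_a, captured_b)
```

Both models find one take-up in their top two, yet model A captures most of the available value and model B very little, which is the distinction a value-weighted metric is meant to expose.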
Evolution (Cont’d): CRM for business customers
• A prototype to produce 5 prioritised customer lists for each segment
  • Generalised so it could be used to model any product
  • Models could be re-built every quarter
  • Models tested on time-independent hold-out set before release
• Testing of methodology and algorithms on over 30 products and 2 segments
• Validation of models by business analysts
  • Comparison of sales opportunities identified manually vs by models
• Pilot for medium businesses: 2 non-metro regions in different states
  • Region 1: Predictions identified opportunities that were already being processed by sales consultants
  • Region 2: Predictions for just 5 products generated 9 new opportunities with an increase in revenue of ~A$400K
• Predictive modelling spreads the techniques of good sales teams across the whole organisation
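The slide does not define the “time-independent hold-out set”. One plausible reading, assumed here, is an out-of-time split: models are trained on earlier quarters and tested on a later quarter they have never seen. The function name, record layout and data below are hypothetical.

```python
def out_of_time_split(rows, holdout_quarter):
    """Train on quarters strictly before holdout_quarter; hold out that quarter for testing.

    Quarters are (year, quarter_number) tuples, which compare chronologically.
    """
    train = [r for r in rows if r["quarter"] < holdout_quarter]
    holdout = [r for r in rows if r["quarter"] == holdout_quarter]
    return train, holdout


# Hypothetical customer-quarter records.
rows = [
    {"customer": "c1", "quarter": (2003, 2), "took_up": 0},
    {"customer": "c2", "quarter": (2003, 3), "took_up": 1},
    {"customer": "c1", "quarter": (2003, 4), "took_up": 1},
    {"customer": "c3", "quarter": (2003, 4), "took_up": 0},
]
train, holdout = out_of_time_split(rows, (2003, 4))
print(len(train), len(holdout))
```

Testing on a later period mirrors how a quarterly rebuilt model is actually used: it is always scored on a quarter that follows its training data.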
Evolution (Cont’d): CRM for business customers
• Business expanded the scope of application to include
  • More segments
    • Rural corporate customer segment
    • All other corporate customers in 3 segments: Large, Medium & Small
  • More products
    • Chosen by business analytics group
    • Different for different segments
• Rebuild of models every quarter to avoid staleness
• Mechanism of delivery of outputs to sales consultants
  • Automated delivery of leads directly into front-end CRM systems, along with supporting data to facilitate the sales
  • Delivery of prioritised customer lists to business analysts, who then superimpose other business rules and create a set of leads

Evolution (Cont’d): CRM for business customers
• Initial implementation used for 4 quarters
  • 4 segments covering all corporate customers
  • ~50 products for each segment
  • Quarterly re-build of models and generation of scores by research
  • Prioritised customer lists, one per product per segment, delivered to business analysts
  • Model explanation provided through weighted rules
  • Sales consultants receive a list of leads for products
• Final implementation in business
  • System handed over to business analysts for model maintenance and scoring
  • No model explanation generated
  • Customer-centric lists suggesting a list of products per customer
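The move from one prioritised list per product to customer-centric lists can be pictured as a pivot of the same scores. The sketch below is an assumption about the shape of that transformation, with hypothetical product and customer names; it also assumes scores from different product models are comparable, which a real system would need to verify or calibrate.

```python
from collections import defaultdict


def customer_centric(product_lists, top_n=3):
    """Turn {product: [(customer, score), ...]} into {customer: [products, best score first]}."""
    per_customer = defaultdict(list)
    for product, ranked_customers in product_lists.items():
        for customer, score in ranked_customers:
            per_customer[customer].append((score, product))
    return {
        customer: [product for _, product in sorted(items, reverse=True)[:top_n]]
        for customer, items in per_customer.items()
    }


# Hypothetical per-product prioritised lists.
product_lists = {
    "broadband": [("acme", 0.9), ("bolt", 0.4)],
    "mobile": [("bolt", 0.7), ("acme", 0.6)],
}
suggestions = customer_centric(product_lists)
print(suggestions)
```

Each customer now carries a short list of suggested products, so a sales consultant sees one entry per customer instead of meeting the same customer on several product lists.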
Summary: CRM for business customers
Innovation
• Techniques to boost the number of rare events for modelling
• Use of support vector machines for learning from unbalanced data
• Techniques to boost influence of high-value take-ups in training
• Use of value-weighted metrics to choose correct algorithm
[Process diagram: Open Ended Brief, Prototype, Feedback, Pilot, Validation, Score Generation, Deployment, Hand-over To BA]
Implementation
• Research solution implemented in much wider context
  • Scope change from 4 to 50 products
  • Scope change from 2 to 4 segments
• Models updated quarterly, so no stale models in production
• Customer-centric lists were beyond the original brief
Note: Published in “Predicting Product Purchase Patterns for Corporate Customers” by Bhavani Raskutti & Alan Herschtal in Proceedings of KDD’05, Chicago, Illinois, USA
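The slide does not name the techniques used to boost rare events. As a generic illustration only, the sketch below uses random oversampling with replacement, a standard way to raise the share of a rare class in training data. The function `oversample_positives`, its parameters and the data are hypothetical.

```python
import random


def oversample_positives(rows, label_key="took_up", ratio=1.0, seed=0):
    """Resample positives with replacement until there are about ratio * (number of negatives)."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    if not positives:
        return list(rows)
    wanted = max(len(positives), int(ratio * len(negatives)))
    extra = [rng.choice(positives) for _ in range(wanted - len(positives))]
    resampled = negatives + positives + extra
    rng.shuffle(resampled)
    return resampled


# Hypothetical training data: 2 take-ups among 10 customers.
rows = [{"customer": i, "took_up": 1 if i < 2 else 0} for i in range(10)]
balanced = oversample_positives(rows)
n_pos = sum(r["took_up"] for r in balanced)
print(len(balanced), n_pos)
```

Oversampling of this kind should be applied to the training data only; the hold-out set keeps its natural class balance so that test results reflect production conditions.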
Success Factors: CRM for business customers
• Both parties wanted the project to happen
  • Industry: Increase in sales
  • Research: Funding and relationship with that part of business
• Collaboration process
  • Open-ended brief
  • Regular sharing of knowledge and ideas
  • Multi-disciplinary team with researchers and business analysts
  • Willingness of stakeholders to try non-standard solutions and instigate change in process
• Contributions from both parties
  • CRM for business customers
    • Industry: Problem & access to data; Implementation support; Investment; Selling idea into business; Continuity
    • Research: Innovative solution
  • Text Mining
    • Industry: Data and interesting problem; Implementation support
    • Research: Innovative solution; Investment; Selling idea into business; Continuity
Summary
• Business
  • Approach research only if problem is not amenable to traditional solutions
  • Support research group with all necessary resources
    • Essential: Data, Feedback, Usability testing, …
  • Provide a free hand; however, be involved at all times
• Research
  • Begin collaboration only if you need more than just money from industry
  • Set up a collaborative process to ensure time commitment from business
  • Set your own research agenda; however, keep communication lines open
  • Think beyond the current project and build relationships
• Both
  • Both parties should want the project
  • Strong collaboration: respect and sharing
  • Dynamic multi-disciplinary team
[Diagram labels: Continuity, Investment, Selling Idea into Business, Data and Problem, Implementation Support, Innovative Solution]
Thank You
Any Questions?