600 likes | 749 Views
Database Marketing. Dr. Ron Rymon Marketing Communications Program IDC, Herzliya. Overview. Goal: describe the framework, and touch on the current trends and buzzwords Outline: Uses of the marketing database The Data Implementation technologies Analysis techniques Modeling techniques.
E N D
Database Marketing Dr. Ron Rymon Marketing Communications Program IDC, Herzliya
Overview • Goal: describe the framework, and touch on the current trends and buzzwords • Outline: • Uses of the marketing database • The Data • Implementation technologies • Analysis techniques • Modeling techniques
The Marketing Database • Comprehensive collection of interrelated data ... • Arranged around each customer ... • Allow timely and accurate retrieval ... • Support analytical, predictive, operational needs ... • Serving multiple applications …
Active Database:An Integrated Business Resource Marketing Finance Customer Service Database Research Sales Distribution
Information is Power: Active databases drive the business… • Identify your best customers • profitability analysis, clustering • Develop new customers and cross-sell • similar to current, identify competitors’ customers • Improve delivery of sales promotion • response modeling / targeting • Personalize message • based on purchase patterns, volume • Use as a research tool across the organization • customer, product, market research
Key Building Blocks • Data • a database is only as powerful as its data • Implementation technologies • hardware, networking, warehouses • Analysis techniques • RFM, LTV, OLAP, Segmentation, Visualization • Modeling • Regressions, Artificial Intelligence, Data Mining
The Data A database is only as powerful as the data it houses.
Customer-centric Database • Every database is a collection of records • Each record is a collection of fields • Here, one record per customer and/or prospect • unique identifier • general customer/account information • including demographic, psychographic, socio-economic • our offers+communications to the customer • customer’s actions: response, purchase, payment
What Data To Hold • Too often, data is collected based on availability, and not based on projected need • Should accumulate internally • data that can be used to support current and future strategies (mktg and otherwise, e.g., operations) • …. data that may be valuable to other organizations • Should source external data • unavailable internally • too expensive to maintain/update
Data Sources : Internal • Operations / Sales • past usage/purchase, e.g., amount, variability • Finance • payments, e.g., timeliness, amounts • Customer service • e.g., inquiries, complaints • Other data collection methods: • sales/orders, promotions/drawings, inquiries, surveys, warranty cards, research panels...
Data Sources: Distribution Channels • Many companies use distributors, retailers • Problem: lack of direct communications with end-customer, no “relationship” • Part-solution 1: keep tabs on channel + aggregate statistics on customers • More aggressive solution: special marketing programs to reach customers
Data Sources : External Lists • 50% of U.S. DMers sell their lists • Use to enlarge universe : new names • can buy segments by specific features (model) • Enhance data : cross information • U.S. census data • Credit bureau • Various marketers of related products • List compilers / maintainers / sellers enhance data enlarge universe
Other Data Sources • Mass-advertised offers • TV call-ins, direct response • Joint offers with other merchants • take-one brochures in banks, restaurants • drawings • Trade shows, happenings, community activity • Referrals!
Data Management • Many sources: • conversions, transformations, cleaning, merge-purge • Many “clients” • marketing, sales, product managers, operations • Temporal issues • updates, audits, archives/deltas • Quantities: huge databases • Storage, access, processing, communications • Resolution, Enhancement
Merge-Purge : Example • Palmer, Robert and Mary, 123 Sun Avenue, Apt 7, Key West, FL 31250 • Dr. Robert C. Palmer, Custom Engineers LLC, 123 Sun Avenue, 7th Floor, Key West, FL 31250 • Rob Charles Palmer, CE Inc., 123 Sun Ave #7, Key West FL 31250 • Bob Palmer Jr., 123 Sun #7, Key West, FL 31252 • Maria Palmer, 123 Sun Avenue, Suite 7, Key West, FL 31250
Other Issues • Legal • Privacy Act • Anti-discrimination • Advertising Code • Telephone Consumer Protection Act • Consumer groups • Right to be omitted (just write to DMA) • Environmental issues • DMA invests in education: • Dmers: best practice • Customers: better image
Computing Platforms • Issues / Needs: • Information sources + integration • Storage/access, maintenance, completeness, update • Computation: process queries, algorithms • Analyses and reports, feed to operations, customer/user interaction • Trends: • Traditionally, all DMers used mainframes • Today, some migration to mid-range (UNIX) • PC-based computers gaining power (NT) • Client/Server architectures • Everything networked
Database Management • Database is a foundational software • Must support variety of applications: • transaction processing • analyses • on-line interaction • Trends: • Relational databases • Data warehouses • Data redundancy/multiplicity Applications Database O.S.
Relational Database Tables ID Cust Name Address … 1234 John Brown 123 Main St …. ID Date Cust Product Quant. Price 98765 3.5.98 1234 A703 5 150.00 98766 4.5.98 1234 A707 2 240.00 98767 4.5.98 1235 A703 1 30.00 ID Product Supplier ….. A703 Levis Jeans S7003 …. Reports (SQL Queries) Transactions for John Brown 3.5.98 5 Levis Jeans $150 4.5.98 2 CK Jacket $120 Purchases of Levis Jeans 3.5.98 John Brown 5 $150 4.5.98 Jane Doe 1 $30
Data Warehouse • Stores data for informational and analytical processing • Separate from operations • Subject-oriented • Integrated • Historical Operational Data Warehouse savings customer product loans investments credit card
Example: Computer-by-Mail Inc. House Files operations Mktg Executive mail Data Warehouse Analyst Client --- Server Telemarketers
Data Limitations • Important: The data is a limited encoding of reality • Many potholes: • Omission • Errors, noise • Representation • Sampling bias • Cannot be too careful !
Exploratory data analysis :Single-variable • Descriptive statistics • Mean, Median • Variance • Histograms • Shows distribution
Exploratory data analysis:Multi-variable • Examine relationship between two or more variables • Cross tabs • Correlation • Scatter plots • Clustered histograms
RFM Analysis • RFM score • Recency: how close is the last purchase • Frequency: number of recent purchases • Monetary: dollars spent recently • Example: • Recency: 10 pts if within 3 mos, 1 pt lower per additional month, to 1 • Frequency: 1 pt for each purchase within 12 months • Monetary: 1pt for each $100 in past year, to 10 • Score=R*F*M, the higher the better
Life-Time Value of Customers • LTV Goal: • recognize each customer’s contribution • Method: • calculate the “expected” net revenue • discounted: • risk of attrition • probability of sales • rate of money • Typically computed per 1000, if possible by segment
OLAP Tools • OLAP : On-Line Analytical Processing • Goal: A database-driven system that provides • Fast • Analysis • common business reports, statistics • of Shared • same information available to many users • Multi-dimensional • every piece of information is multiply categorized • Information “The OLAP Report”
OLAP Tools • Data represented internally as multi-dimensional cube • e.g., customer’s attributes, purchases, payments, etc. • User chooses presenting two dimensions at a time • e.g., show $-sales, by geographic region and income • Heavy use of hierarchical variables, with drilling capabilities: • time: year, quarter, month, week, day, hour • product: hardware, printers, small printers, PX-1000 • dollars: by ranges 0-1000, 1000-5000, 5000-25000, etc. • Analyses, highlights of interesting cases, etc.
Data Visualization Tools • Many relationships are best communicated visually: • histograms, pie-charts, scatter plots, graphs • use color/texture, shapes • temporal animations • Visualization software allows • single-variable over time • one variable as a function of another • interaction detection • segmentation
Modeling Behavior • Target variable • a.k.a. dependent/modeled/explained variable • typically, whether bought/responded or not • Goal: • Use other variables in a model to classify/predict • other variables: a.k.a. independent, observable, explaining • model: formula, algorithm • Success criterion: future performance
Modeling and Validation Framework • Data flow: • Historical data • Modeling software • Constructs model • Tested on more historical data • Repeat until satisfied • Use model to predict Training Set Model Production or Test Set Output
Critical Success Factors • Choice of data • scope: same/similar period, audience, offer, communication • explaining variables: available, useful, well-represented • Choice of modeling technique • appropriate for the goal • powerful: good fitting power • Careful and “pessimistic” testing and validation
Validation • NEVER test on same data set • avoid “memorizing” the data, overfitting • Out-of-sample methods • separate training set and test set • cross-validation, a.k.a. jack-knifing • remember temporal aspect • Evaluate the model’s robustness • estimate chance probability, bootstrapping
Classification v. Prediction Systems • Classification systems: • distinguish few types of customers, e.g., responded or not • technically, target variable is discrete/categorical • validation through “hit rate” • Prediction systems • predict probability of purchase, or purchase dollars • technically, target variable is continuous • validation through “closeness” measures
Linear Scoring Systems • Use linear regression • Coefficients evaluated using historical data • Higher score interpreted as greater likelihood of responding • Every coefficient measures “independent” contribution • Classification variant: discriminant analysis • e.g., predict response if score is above 0.3
Logistic Regressions • Logistic regression (logit) • Target variable • historical data: 0 or 1 • future application: used as probability • Independent variables: continuous or categorical • Probit: variation that relies on normal distribution
Presenting and Evaluating Results • Lift table
Presenting and Evaluating Results • Lift curve (a.k.a. Receiver Operating Characteristic curve)
Presenting and Evaluating Results • Confusion Matrix (given a specific threshold) • Accuracy=(Pr,r+Pn,n)/Total • Detection=Pr,r/(Pr,r+Pr,n) • Two error types: Pr,n and Pn,r
Break-even Analysis • Issue: how much to mail? • Solution: find break-even point • e.g.: • Caution: use held-out data!
Non-Linear Systems • In regressions, a change in one independent variables always affects in same direction • e.g., if age affects positively, then the older the better, always • One solution: transformations • e.g., if U-shaped relation, use quadratic form • Or, use non-linear techniques: • Neural networks • Decision trees • Other: rule-based systems, genetic algorithms, Bayesian nets
Neural Networks • Motivated by biological nervous system • Perceptron = a model of a neuron
Classical Neural Net • Multi-layer network of perceptrons • Proper weights are “discovered” from random • forward propagation of training set • compare output to actual target variable • back propagation of error to adapt weights
Decision Trees • Partition the data based on one attribute...