Building the CRS Online Community. “Beyond RFM” February 2005 DMFA Roundtable Kevin Whorton, Direct Response Fundraising Consultant Catholic Relief Services kwhorton@comcast.net. Test #1 Email Campaign. February 25, 2005. Modeling: Theory and Reality. Theory: RFM Has Weaknesses
Building the CRS Online Community • “Beyond RFM” • February 2005 DMFA Roundtable • Kevin Whorton, Direct Response Fundraising Consultant, Catholic Relief Services • kwhorton@comcast.net Test #1 Email Campaign February 25, 2005
Modeling: Theory and Reality • Theory: RFM Has Weaknesses • Limited use of information: gift history only • Omits demographics, psychographics • Mostly provides decision support for marginal audiences • No prioritization: R<F<M? … M>R=F? … M=R=F? • Uses language of discrete, not continuous variables • Reality: RFM Works Well Enough Most Times • House file mailings—very strong, long histories • House file telemarketing • Could be improved but little incentive to do so: • Can only be so efficient on mailings • Beyond some point minimizing cost may minimize revenue
Applying Techniques at CRS • House File Model Use • Target Analysis Group: affinity/other gift behavior • Powerful for screening out the 50% waste • Including lapsed donors in acquisition now outperforms a dedicated lapsed campaign • Genalytics: full-file scoring by half-decile • Full house file, by future probability of giving • Acquisition Model • Selection criteria used during list selection • Zip models and “Catholic Finder” • Full acquisition model • Created household database from 45 million past contacts • File scoring after merge purge: typical 20% suppression
Expanding Demographic Data • Distinguishing between donors: marketing vs. DM • Profiling new donors: 62 years avg vs. “youth movement” • Drawing linkage between awareness and donation • Understanding relationship: first gift to ongoing behavior • We now use data to categorize donors • By appeal: emergency, region, program area • By vehicle: catalog, calendar, newsletter, TM, e- • By timing: seasonality • By preferences: limited mailing, no mail, no TM • Especially critical, post-Tsunami • Data used to drive frequency • Segmenting beyond RFM, going deeper into files • Often based on Interest Codes (next slide)
Example: Interest Codes Used for Inclusions/Exclusions • Entire file • Coded with a mix of Donor Service & DM codes • Simplify our house file selection • Behavior captured to: simplify ad hoc analysis; extend RFM; develop profiles; crosstab “donor types”
Other (Non-Modeling) Data • Simulations: gift arrays • Demographic overlays beyond DM: mid-level PG, MG • Age & wealth trump typical RFM giving behavior • Mail sensitivity analysis • Finding correlation between total mailings, gifts per donor • Goal: maximize satisfaction without sacrificing revenue • Maintaining "interest codes" library of preferences • Merge-purge with greater control • Moved internally, staff analyst & FirstLogic software • Conversion analysis • List life-cycle: tables showing LTV (2-year) by acq. list • Target Analysis: benchmarking/comparisons
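The list life-cycle table mentioned above (2-year LTV by acquisition list) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input shapes, and the choice to count gifts in months 0 through 23 after acquisition are hypothetical, not the CRS implementation.

```python
from collections import defaultdict

def two_year_ltv_by_list(donors, gifts):
    """Average 2-year giving per acquired donor, grouped by acquisition list.

    donors: {donor_id: acquisition_list_name}          (hypothetical layout)
    gifts:  [(donor_id, months_after_acquisition, amount), ...]
    """
    donor_counts = defaultdict(int)
    revenue = defaultdict(float)
    for acq_list in donors.values():
        donor_counts[acq_list] += 1
    for donor_id, months_after, amount in gifts:
        # Assumption: "2-year" means gifts made in months 0-23 after acquisition.
        if 0 <= months_after < 24:
            revenue[donors[donor_id]] += amount
    return {name: revenue[name] / donor_counts[name] for name in donor_counts}
```

Dividing by all donors acquired from a list, not only those who gave again, keeps lists with poor second-gift conversion from looking better than they are.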
Other Data: Research • Donor research • Analyzing share of market/share of wallet • Knowing what else donors give to • Qualitative/focus groups • Package/teaser/copy testing • Underlying motivations/drivers/perceptions • Market research • Measuring aided/unaided recall, aficionados • Cluster models (segmentation studies) • Positioning studies (branding, relative message) • Competitive intelligence
Limitations: Analyzing Results • Most segmentation built to drive reporting • Pledgemaker report writer • Occasional use of Business Objects/SAS for ad hoc • Most segmentation is by discrete RFM buckets • Segmentation continues in the "normal way": • $25-$49, 0-12 months, F1+ • $50-$99, 0-12 months, F1+ • $100-$249, 0-12 months, F1+ • Extending universe based on interest codes • Applying excludes • Record types (PG, Corp, Spanish-language, Religious Orders) • Individual preferences (1, 2, 6, 12x preferred mail schedules) • Mutual omits from overlapping campaigns
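A minimal Python sketch of the discrete bucket logic on this slide, with excludes applied first. The record layout (`last_gift_amount`, `last_gift_date`, `gift_count`, `exclude_codes`) is hypothetical, and so is the assumption that the dollar band refers to the most recent gift; the slide does not say which gift amount drives the band.

```python
from datetime import date

# Dollar bands taken from the slide; upper bounds padded to catch cents.
BANDS = [(25, 49.99, "$25-$49"), (50, 99.99, "$50-$99"), (100, 249.99, "$100-$249")]

def months_between(earlier, later):
    """Whole calendar months from one date to another."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def rfm_segment(donor, as_of):
    """Return a label such as '$50-$99, 0-12 months, F1+', or None if not selected."""
    if donor["exclude_codes"]:      # any exclude (record type, no mail, etc.) wins
        return None
    if donor["gift_count"] < 1:     # F1+ means at least one gift
        return None
    if months_between(donor["last_gift_date"], as_of) > 12:
        return None
    for low, high, label in BANDS:
        if low <= donor["last_gift_amount"] <= high:
            return f"{label}, 0-12 months, F1+"
    return None                     # outside the three bands shown on the slide
```

For example, a donor whose last gift was $60 in June 2004, with three gifts on file and no exclude codes, lands in "$50-$99, 0-12 months, F1+" as of February 25, 2005.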
Best Intentions: Other Applications • Original goal in 2003: "family of models" • Telemarketing • Early warnings of defection • Lapsed donors • Upgrade potential: mid-level program • Reasons for using: • High cost per contact/good stewardship • Sensitivity to complaints • Predict positive and negative outcomes • Complaints seen as proxy for reduced lifetime value • Reasons not pursued • Not a $$ limitation, but rather management time
Goal/Vision • Want to be more "donor focused" • Finding constructive ways to avoid treating all donors the same • RFM often treats as identical: • $500 donor, every year, 1 gift very end of year • $500 cumulative donor, monthly frequency • $500 first-time donor • Goal: sufficiently flexible systems to tailor contact sequence • Hard to implement CRM systems to reduce costs/maximize efficiency & donor satisfaction
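To make the point about the three $500 donors concrete, here is a small Python sketch. The `Donor` fields, the "$500+" bucket, the 10-gift threshold for "monthly", and the prior-year counts are all illustrative assumptions, not values from CRS data.

```python
from dataclasses import dataclass

@dataclass
class Donor:
    label: str
    gifts_last_12_months: int
    total_last_12_months: float
    prior_years_giving: int  # hypothetical field: years with a gift before this one

def monetary_bucket(donor):
    """A value-only view: all three donors look the same here."""
    return "$500+" if donor.total_last_12_months >= 500 else "under $500"

def contact_track(donor):
    """A donor-focused view: pattern of giving drives the contact sequence."""
    if donor.gifts_last_12_months >= 10:
        return "monthly"
    if donor.prior_years_giving == 0:
        return "first-time"
    return "annual"

DONORS = [
    Donor("year-end annual donor", 1, 500.0, 6),
    Donor("monthly donor", 12, 500.0, 3),
    Donor("first-time donor", 1, 500.0, 0),
]
```

All three records share one monetary bucket, while `contact_track` separates them into annual, monthly, and first-time tracks that could each receive a different contact sequence.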
Sample: Donor-focused Grid • Use the gift they give to this appeal • Consider lifetime seasonal giving activity
Sample Analysis: Years on File • Graphing non-linear relationships: finding “sweet spots”
Analysis: Lifetime Avg. Gift • And knowing when the relationships really are linear/predictive.
Guide to Models • Three major families: • Parametric Methods • Linear regression, logistic regressions • Recursive Partitioning methods (e.g., CHAID) • Tree diagrams: easier to see interaction between variables. Most time consuming. • Non-parametric methods • Neural networks, genetic/natural selection algorithms • Artificial intelligence: "learning models" used at CRS • Results are far more important • Results: more a function of data quality than technique • Source: Target Analysis Group: Jason Robbins, statisticians
Sophisticated Techniques, Simple Answers • Cross-tabulations • Shows simple relationships between variables, typically percentages • "Grids" allow easy audience selection, but complex to review • Correlation: relationships between two variables • Regression: • Y=f(x1, x2, x3) or Membership=function of dues level, presence of competition, penetration, service mix • R² “explains” relationship between one variable and everything driving it • Projections and forecast models • Logistic regressions: “yes/no” predictions • Logarithmic: coefficients=percentage contribution • Dummy variables: use to measure seasonality, time trends, effects of one-time shifts
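As a small illustration of the cross-tabulation idea, the Python sketch below computes response rate (as a percentage) for each cell of a two-way grid. The record layout and the key names (`recency`, `amount`, `responded`) are assumptions made for the example.

```python
from collections import defaultdict

def crosstab_response(records, row_key, col_key):
    """Response rate in percent for every (row, column) cell that was mailed."""
    mailed = defaultdict(int)
    responded = defaultdict(int)
    for rec in records:
        cell = (rec[row_key], rec[col_key])
        mailed[cell] += 1
        if rec["responded"]:
            responded[cell] += 1
    return {cell: 100.0 * responded[cell] / mailed[cell] for cell in mailed}
```

Called with `row_key="recency"` and `col_key="amount"`, the result is the kind of grid the slide describes: easy to select audiences from, though it grows quickly as more variables are crossed.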
Introducing Linear Regression • Linear regression defined • PR = aR + bF + cM + dO • In English, “predicted revenue is a function of donor’s recency of giving, frequency, aggregate value, other stuff” • Model for a renewal program: with avg response rate 4.25%, avg gift $36.25, revenue/name mailed of $1.54: • 1.54 = -0.068(6.5) + 0.215(2.4) + 0.00465(156) + 0.0087(85) • Confusing, but potential "Holy Grail" tool for your house file program
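The renewal-program equation can be checked by hand or with a few lines of Python. The coefficients and average values come from the slide; the variable names, and the reading of 6.5 as months since last gift, 2.4 as number of gifts, 156 as aggregate dollars, and 85 as an "other" index, are assumptions based on the order R, F, M, O.

```python
# Coefficients a, b, c, d from the slide's renewal model.
COEFFS = {
    "recency_months": -0.068,
    "frequency": 0.215,
    "aggregate_value": 0.00465,
    "other": 0.0087,
}

# Average donor values from the slide (interpretation of each is assumed).
AVERAGE_DONOR = {
    "recency_months": 6.5,
    "frequency": 2.4,
    "aggregate_value": 156,
    "other": 85,
}

def predicted_revenue(donor):
    """PR = aR + bF + cM + dO, in dollars per name mailed."""
    return sum(COEFFS[name] * donor[name] for name in COEFFS)

if __name__ == "__main__":
    print(round(predicted_revenue(AVERAGE_DONOR), 2))  # about 1.54
    print(round(0.0425 * 36.25, 2))                    # response rate x avg gift
```

The four terms are -0.442, 0.516, 0.7254, and 0.7395, which sum to about 1.54, matching the 4.25% response rate times the $36.25 average gift. Scoring any other donor means substituting that donor's own R, F, M, and O values.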
More Sense from Regressions • Confusing exposition: briefly assume you know what this means! • Alternative functional forms tell you more • For example: logarithmic transformations of each independent variable (R, F, M, Wealth) put them on equal "dimensions" • Average values will no longer make sense, but coefficients will! • In last equation: 0.182 Months Since; 0.215 Total Gifts; 0.300 Aggregate Gifts; 0.305 Indexed Wealth • Means each value represents percentage contribution to results!! • Note on last slide, many combinations of specific values would add to the average revenue per donor • The formula "predicts" it, because it represents the "best fit" expressing relationship between the dependent and independent variables • This is an overly simple equation: it assumes only RFM plus wealth • Often there are other hidden values that also influence • Equation level metrics (R-squared) and variable-level (t ratios) tell you the degree of prediction and statistical significance
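A short Python sketch of how the logged coefficients can be read. The four values are from the slide; the dictionary keys and both helper functions are hypothetical. The second helper additionally assumes the dependent variable is logged too (a log-log form), which the slide does not state, so treat it as one possible reading.

```python
COEFFS = {
    "months_since": 0.182,
    "total_gifts": 0.215,
    "aggregate_gifts": 0.300,
    "indexed_wealth": 0.305,
}

def contribution_shares(coeffs):
    """Each coefficient as a percentage of the coefficient total."""
    total = sum(coeffs.values())
    return {name: 100.0 * value / total for name, value in coeffs.items()}

def revenue_multiplier(variable, pct_change):
    """Under an assumed log-log model, the factor by which predicted revenue
    changes when one variable rises by pct_change percent, others held fixed."""
    return (1.0 + pct_change / 100.0) ** COEFFS[variable]
```

The coefficients sum to 1.002, so the shares are close to the raw values: wealth and aggregate gifts each account for roughly 30%, total gifts about 21%, and months since last gift about 18%.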
What You Should Know as a User • When these techniques are used … • Generally statistical software runs these: SAS at CRS • Fast process: takes less time to run than to explain • Key: some staff need to understand what the results mean • Younger staff are better, esp. if exposed to it in college—"data kids" • Once a formula is derived, the real output is a scored file • "Plotting the residuals" means taking best fit, multiplying through • Output can be indexed/scored according to predicted Rev/M etc. This typically falls on a curve, with an index ranging from 0-99th percentile of predicted revenue per name mailed
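The last bullet describes indexing a scored file into a 0-99 percentile range. A minimal rank-based sketch in Python follows; the function name is hypothetical and tied scores are ranked in file order, which is a simplification a production scoring job would likely handle differently.

```python
def percentile_index(scores):
    """Map each predicted revenue-per-name score to a 0-99 index by rank.

    The lowest score gets 0; with enough records the highest gets 99.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # positions, lowest score first
    index = [0] * n
    for rank, position in enumerate(order):
        index[position] = (rank * 100) // n
    return index
```

With four scores `[0.5, 2.0, 1.0, 3.0]` the indexes are `[0, 50, 25, 75]`; on a full house file every value from 0 to 99 would be populated, and selection becomes a cutoff on the index.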
Before: List Effectiveness • Targeting based on list effectiveness • Focused on “finding more lists like these” [Diagram labels: Campaign 1; Campaign 2]
New Approach • New analytic system to drive programs • Build prospect universe of likely responders • Overlay with demographic and census data • Catalog interaction over time by person • Develop insights over time with modeling • Select/suppress based on predicted behavior
After: Prospect Behavior • Targeting based on prospect behavior • Focus on “finding more people like this” [Diagram: Census & Specialty Demographics + List & Campaign Attributes + Marketing History]
Preparation • Develop infrastructure • Collect and organize data • Response behavior retained • Other available information added [Diagram labels: External Demographics Data; Campaign Data; Prospect Universe; Focused Lists; Matchcode and Geography; Prospect Lists]
Applying Analytics to Discover Patterns [Diagram labels: Structured Data; Model Ready Data; Proliferation of Models; Equation ƒ(x); Actionable Results; Prospect Universe; Suppression List]
The Final Solution [Diagram labels: Acquisition Promotions; Donations; Census Demographics; Catholic Demographics; Data Mart; Sample Scoring Equation ƒ(x); Mailing Universe; Suppress; Suppressed Mailing Universe; To Mail Production]
Results/Benefits • Focused models on top segments rather than entire universe • Suppressed mailing to bottom of prospect universe • Discovered significant numbers of new prospects similar to existing donors • Savings more than paid for entire analytics program by: • Removing bottom portion of prospect universe that provides negative ROI • Providing greater understanding of and insight into characteristics of prospects and donors
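The suppression step can be sketched as a simple break-even test in Python. Everything here is an assumption for illustration: the field names, the use of response probability times expected gift as the score, and the rule of comparing first-gift revenue to cost per piece. Acquisition programs often accept a first-gift loss and judge ROI on longer-term value, so a real cutoff would likely differ.

```python
def split_universe(prospects, cost_per_piece):
    """Split prospect ids into (to_mail, suppressed) by expected revenue per piece."""
    to_mail, suppressed = [], []
    for prospect in prospects:
        expected_revenue = prospect["p_response"] * prospect["expected_gift"]
        if expected_revenue >= cost_per_piece:
            to_mail.append(prospect["id"])
        else:
            suppressed.append(prospect["id"])  # predicted negative ROI
    return to_mail, suppressed
```

A prospect with a 2% predicted response and a $30 expected gift is worth $0.60 per piece and clears a $0.40 cost; one at 0.5% is worth $0.15 and is suppressed.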