Applying Data Mining Technique to Direct Marketing

Advisor: Dr. Hsu • Student: Sheng-Hsuan Wang • Department of Information Management, National Yunlin University of Science and Technology

Outline • Motivation • Objective • Introduction • Background • The Generalized SOM • Experiments • Conclusions

Motivation • Firms hold huge amounts of complex marketing data and need to analyze it further in order to make more profit. • Clustering, a data mining technique, is especially suitable for segmenting data. • However, a firm's database usually consists of mixed data (numeric and categorical).

Objective • We use a new visualized clustering algorithm, the generalized self-organizing map (GSOM), to segment customer data for direct marketing. • Unlike the conventional SOM, the GSOM can reasonably express the relative distances between categorical values. • We then show that applying the GSOM to direct marketing can generate more profit.

Introduction (1/5) • Marketing practices have shifted from traditional mass marketing to customer-oriented marketing. • Firms usually perform market segmentation and devise different marketing strategies for different segments.

Introduction (2/5) • Data mining means a process of nontrivial extraction of implicit, previously unknown and potentially useful information from a huge amount of data. • Cluster analysis can assist marketers in identifying clusters of customers with similar characteristics.

Introduction (3/5) • The self-organizing map (SOM) network, proposed by Kohonen, is a useful visualization tool in data mining. • Dimensionality reduction and information visualization • Preserves the original topological relationships

Introduction (4/5) • How the conventional SOM handles categorical data • It uses binary encoding, which transforms each categorical value into a set of binary values.

Introduction (5/5) • In this paper, we propose an extended SOM, named the generalized SOM (GSOM), to overcome this drawback in handling categorical data. • We construct a concept hierarchy for each categorical attribute.

Background (1/2) • Self-organizing map (SOM) • Find the winner (best matching unit, BMU) by Eq. (1) • Update the winner and its neighborhood by Eq. (2) [Equations (1)-(3) appeared as images on the slide]
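The two steps above can be sketched as follows. This is a minimal illustration of the textbook online SOM step, not the slide's own equations: the Gaussian neighborhood kernel, the function names `find_bmu` and `som_step`, and the parameters `alpha` (learning rate) and `sigma` (neighborhood radius) are all assumptions made for the example.

```python
import math

def find_bmu(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def som_step(weights, grid, x, alpha, sigma):
    """Apply one online SOM update in place and return the BMU index."""
    c = find_bmu(weights, x)
    for i, w in enumerate(weights):
        # Gaussian neighborhood measured on the map grid (assumed kernel).
        h = math.exp(-math.dist(grid[c], grid[i]) ** 2 / (2 * sigma ** 2))
        # Move each unit toward x, scaled by learning rate and neighborhood.
        weights[i] = [wj + alpha * h * (xj - wj) for wj, xj in zip(w, x)]
    return c

# A tiny 1x3 map with 2-D weight vectors (hypothetical values).
grid = [(0, 0), (0, 1), (0, 2)]
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
bmu = som_step(weights, grid, x=[0.9, 1.0], alpha=0.5, sigma=1.0)
```

The winner moves the most toward the input, and units farther away on the grid move less, which is what lets the map preserve topological relationships.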

Background (2/2) • Problem of the conventional SOM: with binary encoding, every pair of distinct categorical values is equally distant • D(Coke, Pepsi) = D(Coke, Mocca) = D(Pepsi, Mocca)
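The equality above can be checked with a short sketch. It assumes a one-hot form of binary encoding and Euclidean distance; the helper name `one_hot` is hypothetical.

```python
import math

def one_hot(value, categories):
    """Encode a categorical value as a binary vector, one position per category."""
    return [1.0 if value == c else 0.0 for c in categories]

drinks = ["Coke", "Pepsi", "Mocca"]
codes = {d: one_hot(d, drinks) for d in drinks}

# Euclidean distance between each pair of encoded values.
d_coke_pepsi = math.dist(codes["Coke"], codes["Pepsi"])
d_coke_mocca = math.dist(codes["Coke"], codes["Mocca"])
d_pepsi_mocca = math.dist(codes["Pepsi"], codes["Mocca"])
```

All three distances come out identical, so the encoding cannot express that Coke is more similar to Pepsi than to Mocca.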

The Generalized SOM • We use concept hierarchies to help calculate the distances between categorical values. • An input pattern and the GSOM vector are mapped to their associated concept hierarchies. • The distance between the input pattern and the GSOM vector is calculated by measuring the aggregated distance between their mapping points in the hierarchies. [Figure: input patterns (ID, Drink): (1, Coke), (2, Pepsi), (3, Mocca); the input pattern x and the SOM unit mq are mapped onto the Drink concept hierarchy, in which Any has the children Juice (Orange, Apple), Coffee (Latte, Mocca), and Carbonated (Coke, Pepsi)]

Concept hierarchies (1/3) • General concepts are near the root (level 0) and specific concepts are at the leaves (level 2); every link has weight 1. • D(Coke, Pepsi) < D(Coke, Mocca) = D(Pepsi, Mocca)

Concept hierarchies (2/3) • A point X = (N_X, d_X) • N_X: the anchor (a leaf node) of point X • d_X: a positive offset, the distance from the root to X • Example: x = (Coke, 2.0); mq = (Pepsi, 1.7) [Figure: the input pattern x and the SOM unit mq located on the Drink concept hierarchy]

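Concept hierarchies (3/3) • The distance between two points is given by Eqs. (4) and (5) [shown as images on the slide] • Example: x = (Coke, 2.0); mq = (Pepsi, 1.7) • |x − mq| = 2 + 1.7 − 2×1 = 1.7, where 1 is the length of the path from the root that both points share (the duplication), here the link from Any to Carbonated [Figure: the Drink concept hierarchy with levels 0, 1, 2; d_x drawn in red and d_mq in blue]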
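The worked example above can be reproduced with a small sketch. The general rule used here, offset of x plus offset of y minus twice the shared path length, is inferred from the arithmetic on the slide rather than copied from Eqs. (4) and (5). The Drink hierarchy layout, the helper names, and the handling of points that share an anchor or sit above the common ancestor are assumptions.

```python
# Drink hierarchy as child -> parent; every link has weight 1.
PARENT = {
    "Juice": "Any", "Coffee": "Any", "Carbonated": "Any",
    "Orange": "Juice", "Apple": "Juice",
    "Latte": "Coffee", "Mocca": "Coffee",
    "Coke": "Carbonated", "Pepsi": "Carbonated",
}

def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def depth(node):
    """Number of unit-weight links between the root and `node`."""
    return len(path_to_root(node)) - 1

def shared_length(x, y):
    """Length of the root path shared by points x and y (assumed rule)."""
    (nx, dx), (ny, dy) = x, y
    if nx == ny:
        return min(dx, dy)  # same anchor: the shallower point bounds the overlap
    ancestors = set(path_to_root(nx))
    lca = next(n for n in path_to_root(ny) if n in ancestors)
    # A point lying above the common ancestor also bounds the overlap.
    return min(depth(lca), dx, dy)

def distance(x, y):
    """Distance between two points given as (anchor, offset) pairs."""
    return x[1] + y[1] - 2 * shared_length(x, y)

d_example = distance(("Coke", 2.0), ("Pepsi", 1.7))
d_coke_pepsi = distance(("Coke", 2.0), ("Pepsi", 2.0))
d_coke_mocca = distance(("Coke", 2.0), ("Mocca", 2.0))
```

Under these assumptions `d_example` matches the 1.7 computed on the slide, and the leaf-to-leaf distances satisfy D(Coke, Pepsi) < D(Coke, Mocca).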

Experiments • Experiment datasets • A synthetic dataset consisting of 6 groups over two categorical attributes, Department and Drink. • The real dataset Adult from the UCI repository • 48,842 patterns with 15 attributes. • 8 categorical attributes, 6 numerical attributes, and 1 class attribute, Salary. • 76% of the patterns have a Salary value of ≤50K.

Experiments • Parameters were set according to the suggestions in the software package SOM_PAK. • Categorical values are transformed to binary values when we train the SOM. • Mixed data are used directly when we train the GSOM. Each link weight in the concept hierarchies is set to 1.

Synthetic dataset (1/2) [Figure: the Department and Drink attributes]

Synthetic dataset (2/2) • An 8×8 SOM network is used for training. After 900 training iterations, the trained maps of the SOM and the GSOM under the same parameters are shown below. [Figure panels: Binary SOM; GSOM]

Real dataset (1/3) • We randomly draw 10,000 patterns, 75.76% of which have Salary ≤50K, similar to the Salary distribution of the original Adult dataset. • Three categorical attributes: Marital-status, Relationship, and Education. • Four numeric attributes: Capital-gain, Capital-loss, Age, and Hours-per-week.

Real dataset (2/3) • Concept hierarchies for the categorical attributes are constructed as shown below. [Figure panels: Relationship; Marital-status; Education]

Real dataset (3/3) • A 15×15 SOM network is used for training. After 50,000 iterations, the trained maps of the SOM and the GSOM under the same parameters are shown below. [Figure panels: Binary SOM; GSOM]

Distributions of Salary attribute in each cluster

Application to Direct Marketing (1/2) • After we use the GSOM to cluster the data, the segmented dataset can be applied to catalog marketing. • Suppose that • The cost of mailing a catalog is $2. • For customers whose salaries are over 50K, we make an average profit of $10 per person. • Otherwise, we make an average profit of $1 per person.
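The cost and profit assumptions above can be turned into a small calculation. This is only a sketch: it assumes the $10 and $1 profits are gross figures from which the $2 mailing cost is subtracted, that a segment is mailed only when its expected net profit per customer is positive, and it uses two hypothetical segments rather than the clusters from the experiment.

```python
MAIL_COST = 2.0     # cost of mailing one catalog
PROFIT_HIGH = 10.0  # average profit from a customer with salary over 50K
PROFIT_LOW = 1.0    # average profit from any other customer

def net_profit_per_customer(frac_high):
    """Expected net profit of mailing one customer in a segment."""
    gross = frac_high * PROFIT_HIGH + (1 - frac_high) * PROFIT_LOW
    return gross - MAIL_COST

def campaign_profit(segments):
    """Mail only the segments whose expected net profit is positive."""
    total = 0.0
    for size, frac_high in segments:
        per_customer = net_profit_per_customer(frac_high)
        if per_customer > 0:
            total += size * per_customer
    return total

# Hypothetical segments as (size, fraction of customers with salary > 50K).
segments = [(1000, 0.60), (1000, 0.05)]
targeted = campaign_profit(segments)
untargeted = sum(size * net_profit_per_customer(f) for size, f in segments)
```

Under these assumptions a segment is worth mailing only when more than 1/9 of its customers earn over 50K, which is why skipping low-salary clusters can raise total profit.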

Application to Direct Marketing (2/2) [Figure: profit comparison; values shown: $14,344 and $7,505]

Conclusions • In this paper, we propose a data clustering method, the GSOM. • The GSOM extends the conventional SOM and overcomes its drawback in handling categorical data by using concept hierarchies. • The experimental results confirm that the GSOM reveals the cluster structure of the data better than the conventional SOM does. • Marketing based on the segmentation results of the GSOM yields more profit than marketing to customers drawn at random from the customer database.

Q & A
