Data Mining Approach to Subscription Fraud Detection for AT&T Cards
Hyunsook Lee, Summer Intern
Risk and Revenue Modeling Group, AT&T Labs
Supervised by Colin Goodall
AT&T Proprietary
Objective: finding patterns in subscription fraud
Contents
• Background
• Graphics
• Association Rules
• Discussion
Data mining
• My definition:
  • finding patterns or systematic relationships
  • exploring data and TRANSFORMING them into indicators of interest
• Graphical analysis
• Using data mining tools: SAS Enterprise Miner
Subscription Fraud Detection Analysis
• What is subscription fraud?
• Why is the analysis needed?
• How to do it?
  • Detecting subscription fraud from patterns of usage
    • High usage: thresholding, but not only that…
    • Other peculiar usage patterns, such as…
  • Understanding calling cards
    • Factors are possibly correlated
  • Designing and creating new signatures
• Graphics and association rules will help
Data Sets & Properties
Data sets: FASC, CARM
• FRAT: contains fraud information
• FPD: 1st default payment data
( ): indicates business
Focus on FRAT data to find patterns specific to fraudsters
Graphics: number of cards per (Paccount or BTN)
Association Rules
• What are association rules?
  • customers’ item buying patterns
  • for a rule B ⇒ A: support P(A ∩ B), confidence P(A | B)
• How do we apply them?
  • analyze the calls of each card and generate variables
  • variable generation is based on ideas from the graphics and on thresholding
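As an illustration of the two measures, here is a minimal Python sketch that estimates support and confidence from a list of cards, each described by a set of items. The item names and the four example cards are made up for illustration, and the function names are hypothetical; the actual analysis in this talk was done with SAS Enterprise Miner, not with this code.

```python
from typing import Iterable, List, Set


def support(cards: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of cards that contain every item in `items`."""
    wanted = set(items)
    if not cards:
        return 0.0
    hits = sum(1 for card in cards if wanted <= card)
    return hits / len(cards)


def confidence(cards: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimate P(consequent | antecedent) for the rule antecedent => consequent."""
    antecedent = set(antecedent)
    base = support(cards, antecedent)
    if base == 0.0:
        return 0.0  # rule never fires, so confidence is undefined; report 0
    return support(cards, antecedent | set(consequent)) / base


# Hypothetical data: one set of items per card (not taken from the study).
example_cards = [
    {"many_intl_calls", "long_duration", "fraud"},
    {"many_intl_calls", "fraud"},
    {"many_intl_calls", "zero_length_calls"},
    {"long_duration"},
]

rule_support = support(example_cards, {"many_intl_calls", "fraud"})
rule_confidence = confidence(example_cards, {"many_intl_calls"}, {"fraud"})
```

With these four cards, the rule many_intl_calls ⇒ fraud holds on two of the four cards, and on two of the three cards that contain the antecedent.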
Variable generation & logic
• Possible characteristics of fraudulent cards
  • Many international calls
  • Zero-length calls, no recorded calls
  • Many calls
  • Long duration, high rate
  • Peculiar usage after a certain period (such as 1 month)
  • Satisfies a $-based threshold, etc.
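The characteristics listed above can be turned into binary items per card, which is the form association-rule tools expect. The sketch below is one possible way to do that in Python. The `Call` record layout, the item names, and every threshold value are assumptions chosen for illustration, not values from the study; "no recorded calls" is read here as a card with no call records at all, and the "peculiar usage after a certain period" and "high rate" items are left out because the slides do not define them.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Call:
    duration_min: float   # call length in minutes; 0.0 means a zero-length call
    international: bool   # True for an international call
    charge: float         # charge in dollars


# Illustrative thresholds only (assumed, not from the original analysis).
MANY_CALLS = 50
MANY_INTL_CALLS = 10
LONG_DURATION_MIN = 60.0
DOLLAR_THRESHOLD = 100.0


def card_items(calls: List[Call]) -> Set[str]:
    """Map one card's call records to a set of binary usage items."""
    if not calls:
        return {"no_recorded_calls"}
    items: Set[str] = set()
    if len(calls) >= MANY_CALLS:
        items.add("many_calls")
    if sum(1 for c in calls if c.international) >= MANY_INTL_CALLS:
        items.add("many_intl_calls")
    if any(c.duration_min == 0 for c in calls):
        items.add("zero_length_calls")
    if max(c.duration_min for c in calls) >= LONG_DURATION_MIN:
        items.add("long_duration")
    if sum(c.charge for c in calls) >= DOLLAR_THRESHOLD:
        items.add("dollar_threshold")
    return items


# Hypothetical card: twelve zero-length international calls and one long call.
sample_calls = [Call(0.0, True, 0.0)] * 12 + [Call(75.0, False, 120.0)]
sample_items = card_items(sample_calls)
```

Each card's item set can then be treated as one transaction when mining rules.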
Results from SAS Enterprise Miner
Frequency of items
Items generated by usage patterns, 60% confidence
Future work
• Various approaches to generating variables and association rules
• Classification methods are a challenge: TREE, Random Forest…