Adaptive Fraud Detection

Adaptive Fraud Detection by Tom Fawcett and Foster Provost Presented by: Yunfei Zhao (updated from last year’s presentation by Adam Boyer)

Outline • Problem Description • Cellular cloning fraud problem • Why it is important • Current strategies • Construction of Fraud Detector • Framework • Rule learning, Monitor construction, Evidence combination • Experiments and Evaluation • Data used in this study • Data preprocessing • Comparative results • Conclusion • Exam Questions

The Problem

Cellular Fraud - Cloning • Cloning Fraud • A kind of superimposition fraud. • Fraudulent usage is superimposed upon (added to) the legitimate usage of an account. • Causes inconvenience to customers and great expense to cellular service providers. • Other examples: credit card fraud, calling card fraud, some types of computer intrusion

Cellular communications and Cloning Fraud • Mobile Identification Number (MIN) and Electronic Serial Number (ESN) • Identify a specific account • Periodically transmitted unencrypted whenever phone is on • Cloning occurs when a customer’s MIN and ESN are programmed into a cellular phone not belonging to the customer.

Interest in reducing Cloning Fraud • Fraud is detrimental in several ways: • Fraudulent usage congests cell sites • Fraud incurs land-line usage charges • Cellular carriers must pay costs to other carriers for usage outside the home territory • Crediting process is costly to carrier and inconvenient to the customer

Strategies for dealing with cloning fraud • Pre-call Methods • Identify and block fraudulent calls as they are made • Validate the phone or its user when a call is placed • Post-call Methods • Identify fraud that has already occurred on an account so that further fraudulent usage can be blocked • Periodically analyze call data on each account to determine whether fraud has occurred.

Pre-call Methods • Personal Identification Number (PIN) • PIN cracking is possible with more sophisticated equipment. • RF Fingerprinting • Method of identifying phones by their unique transmission characteristics • Authentication • Reliable and secure private key encryption method. • Requires special hardware capability • An estimated 30 million non-authenticatable phones are in use in the US alone (in 1997)

Post-call Methods • Collision Detection • Analyze call data for temporally overlapping calls • Velocity Checking • Analyze the locations and times of consecutive calls • Disadvantage of the above methods • Usefulness depends upon a moderate level of legitimate activity
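The two checks above can be sketched in a few lines. This is only an illustration, not the carriers' actual implementation: the `Call` record, its `lat`/`lon` cell-site fields, and the 150 km/h speed limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Call:
    start: float  # call start, seconds since an arbitrary epoch
    end: float    # call end, same clock
    lat: float    # cell-site latitude in degrees (hypothetical field)
    lon: float    # cell-site longitude in degrees (hypothetical field)


def has_collision(calls):
    """Collision detection: do any two calls on the account overlap in time?"""
    latest_end = float("-inf")
    for call in sorted(calls, key=lambda c: c.start):
        if call.start < latest_end:
            return True
        latest_end = max(latest_end, call.end)
    return False


def distance_km(a, b):
    """Great-circle (haversine) distance between two cell sites."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def has_velocity_violation(calls, max_kmh=150.0):
    """Velocity checking: would consecutive calls require implausible travel speed?"""
    ordered = sorted(calls, key=lambda c: c.start)
    for prev, cur in zip(ordered, ordered[1:]):
        hours = (cur.start - prev.end) / 3600.0
        if hours <= 0:
            continue  # overlapping calls are the collision check's job
        if distance_km(prev, cur) / hours > max_kmh:
            return True
    return False
```

Both checks need at least two calls close together in time, which is why they depend on a moderate level of legitimate activity on the account.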

Another Post-call Method (main focus of this paper) • User Profiling • Analyze calling behavior to detect usage anomalies suggestive of fraud • Works well with low-usage customers • Good complement to collision and velocity checking because it covers cases the others might miss

Sample Frauded Account

The Need to be Adaptive • Patterns of fraud are dynamic – bandits constantly change their strategies in response to new detection techniques • Levels of fraud can change dramatically from month-to-month • Cost of missing fraud and dealing with false alarms change with inter-carrier contracts

Automatic Construction of Profiling Fraud Detectors

One Approach • Build a fraud detection system by classifying calls as being fraudulent or legitimate • However there are two problems that make simple classification techniques infeasible.

Problems with simple classification • Context • A call that would be unusual for one customer may be typical for another customer (for example, a call placed from Brooklyn is not unusual for a subscriber who lives there, but might be very strange for a Boston subscriber). • Granularity • At the level of the individual call, the variation in calling behavior is large, even for a particular user.

The Learning Problem • Which call features are important? • How should profiles be created? • When should alarms be raised?

Detector Constructor Framework

Use of a detector (DC-1) [Figure: profiling monitors (… "# calls from BRONX at night exceeds daily threshold", "SUNDAY airtime exceeds daily threshold", "Airtime from BRONX at night"; sample outputs 1, 27, 0) → value normalization and weighting → evidence combining (Σ ≥ θ) → Yes → FRAUD ALARM]

Rule Learning – the 1st stage • Rule Generation • Rules are generated locally based on differences between fraudulent and normal behavior for each account • Rule Selection • Then they are combined in a rule selection step

Rule Generation • DC-1 uses the RL program to generate rules with certainty factors above a user-defined threshold • For each account, RL generates a “local” set of rules describing the fraud on that account. • Example: (Time-of-Day = Night) AND (Location = Bronx) ==> FRAUD, Certainty Factor = 0.89

Rule Selection • The rule generation step typically yields tens of thousands of rules • If a rule is found in (covers) many accounts, then it is probably worth using • The selection algorithm identifies a small set of general rules that cover the accounts • The resulting set of rules is used to construct specific monitors
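One way to picture the selection step is as a greedy covering procedure. The sketch below is an assumption for illustration and is not claimed to be DC-1's exact selection algorithm: the function name `select_rules`, the `min_accounts` parameter, and the string rule labels are all hypothetical.

```python
def select_rules(rules_by_account, min_accounts=2):
    """Greedily pick a small set of rules that together cover the accounts.

    rules_by_account maps an account id to the set of "local" rules
    generated for that account.
    """
    # Invert the mapping: rule -> set of accounts it was found in.
    accounts_by_rule = {}
    for account, rules in rules_by_account.items():
        for rule in rules:
            accounts_by_rule.setdefault(rule, set()).add(account)

    # Keep only rules general enough to appear in several accounts.
    candidates = {r: a for r, a in accounts_by_rule.items() if len(a) >= min_accounts}

    uncovered = set().union(*candidates.values())
    selected = []
    while uncovered:
        # Pick the rule covering the most still-uncovered accounts;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda r: len(candidates[r] & uncovered))
        selected.append(best)
        uncovered -= candidates[best]
    return selected


local_rules = {
    "A": {"night&bronx", "payphone"},
    "B": {"night&bronx"},
    "C": {"night&bronx", "evening"},
    "D": {"evening", "payphone"},
}
print(select_rules(local_rules))
```

Rules that appear in only one account are dropped before covering starts, which mirrors the idea that a rule found in many accounts is probably worth using.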

Profiling Monitors – the 2nd stage • Each monitor has 2 distinct steps: • Profiling step: • Monitor is applied to an account’s non-fraud usage to measure the account’s normal activity. • Statistics are saved with the account. • Use step: • Monitor processes a single account-day, references the normality measure from profiling and generates a numeric value describing how abnormal the current account-day is.

Most Common Monitor Templates • Threshold • Standard Deviation

Threshold Monitors

Standard Deviation Monitors

Example for Standard Deviation • Rule: (TIMEOFDAY = NIGHT) AND (LOCATION = BRONX) ==> FRAUD • Profiling step: the subscriber called from the Bronx an average of 5 minutes per night with a standard deviation of 2 minutes. At the end of the profiling step, the monitor would store the values (5, 2) with that account. • Use step: if the monitor processed a day containing 3 minutes of airtime from the Bronx at night, the monitor would emit a zero, because usage is below the profiled average; if the monitor saw 15 minutes, it would emit (15 - 5)/2 = 5. This value denotes that the account is five standard deviations above its average (profiled) usage level.
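The worked example above maps directly onto the two monitor steps. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the population standard deviation is used, and a zero standard deviation is treated as "emit zero"; the slides do not say how DC-1 handles either detail.

```python
from statistics import mean, pstdev


def profile_step(nightly_bronx_minutes):
    """Profiling step: summarize fraud-free usage as (average, std deviation)."""
    return mean(nightly_bronx_minutes), pstdev(nightly_bronx_minutes)


def use_step(stats, todays_minutes):
    """Use step: how many standard deviations above average is today?"""
    average, std_dev = stats
    if std_dev == 0 or todays_minutes <= average:
        return 0.0  # at or below normal usage: nothing abnormal to report
    return (todays_minutes - average) / std_dev


stored = (5.0, 2.0)          # the (5, 2) pair saved with the account
print(use_step(stored, 3))   # 3 minutes: below average
print(use_step(stored, 15))  # 15 minutes: (15 - 5) / 2
```

The stored pair is all the monitor needs at use time, so the profiling data itself does not have to be kept.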

Combining Evidence from the Monitors – the 3rd stage • Train a classifier with • attributes (monitor outputs) • class label (fraudulent or legitimate) • The classifier weights the monitor outputs and learns a threshold on the sum to produce high-confidence alarms • DC-1 uses a Linear Threshold Unit (LTU) • Simple and fast • Feature selection • Choose a small set of useful monitors in the final detector
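A linear threshold unit raises an alarm when the weighted sum of monitor outputs reaches a threshold. The sketch below trains one with the classic perceptron update rule, which is an assumption chosen for brevity; the slides do not say which training procedure DC-1 uses, and the toy account-days are invented for the example.

```python
def ltu_alarm(weights, threshold, monitor_outputs):
    """Fire an alarm if the weighted sum of monitor outputs reaches the threshold."""
    score = sum(w * x for w, x in zip(weights, monitor_outputs))
    return score >= threshold


def train_ltu(rows, labels, epochs=50, learning_rate=1.0):
    """Perceptron-style training (assumed here, not taken from the paper).

    rows:   one tuple of monitor outputs per account-day
    labels: 1 for a fraudulent account-day, 0 for a legitimate one
    """
    weights = [0.0] * len(rows[0])
    threshold = 0.0
    for _ in range(epochs):
        mistakes = 0
        for outputs, label in zip(rows, labels):
            error = label - int(ltu_alarm(weights, threshold, outputs))
            if error:
                mistakes += 1
                weights = [w + learning_rate * error * x for w, x in zip(weights, outputs)]
                threshold -= learning_rate * error
        if mistakes == 0:
            break  # every training account-day is classified correctly
    return weights, threshold


# Invented account-days: two monitor outputs each, label 1 = fraud.
rows = [(0, 0), (1, 0), (0, 1), (5, 1), (4, 1), (6, 0)]
labels = [0, 0, 0, 1, 1, 1]
weights, threshold = train_ltu(rows, labels)
print([int(ltu_alarm(weights, threshold, r)) for r in rows])
```

Feature selection is not shown here; it would amount to choosing which monitor outputs are included in each row before training.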

Data used in the study

Data Information • 4 months of call records from the New York City area. • Each call is described by 31 original attributes • Some derived attributes are added • Time-Of-Day • To-Payphone • Each call is given a class label of fraudulent or legitimate.

Data Cleaning • Eliminated credited calls made to numbers that are not in the credited block • The destination number must be only called by the legitimate user. • Days with 1-4 minutes of fraudulent usage were discarded. • Call times were normalized to Greenwich Mean Time for chronological sorting

Data Selection • Once the monitors are created and accounts profiled, the system transforms raw call data into a series of account-days using the monitor outputs as features • Rule learning and selection • 879 accounts comprising over 500,000 calls • Profiling, training and testing • 3,600 accounts that have at least 30 fraud-free days of usage before any fraudulent usage. • Initial 30 days of each account were used for profiling. • Remaining days were used to generate 96,000 account-days. • Distinct training and testing accounts: 10,000 account-days for training; 5,000 for testing • 20% fraud days and 80% non-fraud days

Experiments and Evaluation

Output of DC-1 components • Rule learning: 3630 rules • Each covering at least two accounts • Rule selection: 99 rules • 2 monitor templates yielding 198 monitors • Final feature selection: 11 monitors

The Importance Of Error Cost • Classification accuracy is not sufficient to evaluate performance • Evaluation should take misclassification costs into account • Estimated Error Costs: • False positive (false alarm): $5 • False negative (letting a fraudulent account-day go undetected): $0.40 per minute of fraudulent air-time • Factoring in error costs requires a second training pass by the LTU
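The cost model above can be turned into a small scoring function. This is a sketch for illustration only: the two cost constants come from the slide, while the function name, the tuple layout, and the sample account-days are assumptions.

```python
FALSE_ALARM_COST = 5.00            # dollars per false alarm
MISSED_FRAUD_COST_PER_MIN = 0.40   # dollars per minute of undetected fraud


def total_cost(account_days):
    """Sum misclassification costs over (alarm_raised, fraud_minutes) pairs."""
    cost = 0.0
    for alarm_raised, fraud_minutes in account_days:
        if alarm_raised and fraud_minutes == 0:
            cost += FALSE_ALARM_COST                            # false positive
        elif not alarm_raised and fraud_minutes > 0:
            cost += MISSED_FRAUD_COST_PER_MIN * fraud_minutes   # false negative
    return cost


# Invented account-days: (alarm raised?, minutes of fraudulent airtime)
sample = [(True, 0), (False, 20), (True, 30), (False, 0)]
print(total_cost(sample))  # one false alarm plus 20 missed minutes
```

Under this model a detector with higher accuracy can still cost more, for example if its few misses fall on days with heavy fraudulent airtime.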

Alternative Detection Methods • Collisions + Velocities • Errors almost entirely due to false positives • High Usage – detect sudden large jump in account usage • Best Individual DC-1 Monitor • (Time-of-day = Evening) ==> Fraud • SOTA - State Of The Art • Incorporates 13 hand-crafted profiling methods • Best detectors identified in a previous study

DC-1 Vs. Alternatives

Shifting Fraud Distributions • A fraud detection system should adapt to shifting fraud distributions • To illustrate this point: • One non-adaptive DC-1 detector was trained on a fixed distribution (80% non-fraud) and tested against a range of 75-99% non-fraud • Another DC-1 detector was allowed to adapt (re-train its LTU threshold) for each fraud distribution • The second detector was more cost effective than the first
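Re-training the threshold for a new fraud distribution can be pictured as re-picking the cut-off that minimizes cost on a sample drawn from that distribution. The sketch below is an assumption about how such an adaptation could look, not DC-1's procedure; the function names and the two invented samples are hypothetical, and the costs reuse the $5 and $0.40 per minute figures.

```python
def cost_at(threshold, days, false_alarm=5.0, per_minute=0.40):
    """Cost of alarming on every account-day whose score reaches the threshold.

    days holds (detector_score, fraud_minutes) pairs.
    """
    cost = 0.0
    for score, fraud_minutes in days:
        alarmed = score >= threshold
        if alarmed and fraud_minutes == 0:
            cost += false_alarm
        elif not alarmed and fraud_minutes > 0:
            cost += per_minute * fraud_minutes
    return cost


def best_threshold(days):
    """Try each observed score (and 'never alarm') and keep the cheapest."""
    candidates = sorted({score for score, _ in days}) + [float("inf")]
    return min(candidates, key=lambda t: cost_at(t, days))


# Invented samples: the second has more legitimate days with fairly high scores.
fraud_heavy = [(1.0, 0), (2.0, 0), (3.0, 30), (4.0, 40)]
mostly_legit = [(1.0, 0), (2.0, 0), (3.2, 0), (3.4, 0), (3.0, 5), (4.0, 40)]
print(best_threshold(fraud_heavy), best_threshold(mostly_legit))
```

A detector that kept the first threshold on the second sample would pay for two avoidable false alarms, which is the kind of loss the adaptive detector avoids.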

DC-1 Component Contributions (1) • High Usage Detector • Profiles with respect to undifferentiated account usage • Comparison with DC-1 demonstrates the benefit of using rule learning • Best Individual DC-1 Monitor • Demonstrates the benefit of combining evidence from multiple monitors

DC-1 Component Contributions (2) • Call Classifier Detectors • Represent rule learning without the benefit of account context • Demonstrate the value of DC-1’s rule generation step, which preserves account context • Shifting Fraud Distributions • Shows the benefit of making evidence combination sensitive to fraud distribution

Conclusion • DC-1 uses a rule-learning program to uncover indicators of fraudulent behavior from a large database of customer transactions. • Then the indicators are used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies. • Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high-confidence alarms.

Conclusion • Adaptability to dynamic patterns of fraud can be achieved by generating fraud detection systems automatically from data, using data mining techniques • DC-1 can adapt to the changing conditions typical of fraud detection environments • Experiments indicate that DC-1 performs better than other methods for detecting fraud

Exam Questions

Question 1 • What are the two major categories of fraud detection methods, how do they differ, and which one does DC-1 fall under? • Pre-call Methods • Involve validating the phone or its user when a call is placed. • Post-call Methods (DC-1 falls here) • Analyze call data on each account to determine whether cloning fraud has occurred.

Question 2 • What are the three stages of DC-1? • Rule learning and selection • Profiling monitors • Combining evidence from the monitors

Question 3 • Profiling monitors have two distinct stages associated with them. Describe them. • Profiling step: • The monitor is applied to a segment of an account’s typical (non-fraud) usage in order to measure the account’s normal activity. • Use step: • The monitor processes a single account-day at a time, references the normalcy measure from the profiling step and generates a numeric value describing how abnormal the current account-day is.

The End. Questions?
