Data mining for credit card fraud: A comparative study

Data mining for credit card fraud: A comparative study. Presenter: Cheng-Hui Chen. Authors: Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, J. Christopher Westland. DSS 2010.


Presentation Transcript


  1. Data mining for credit card fraud: A comparative study

    Presenter: Cheng-Hui Chen. Authors: Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, J. Christopher Westland. DSS 2010.
  2. Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments
  3. Motivation: Over the years, along with the evolution of fraud detection methods, perpetrators of fraud have also been evolving their fraud practices to avoid detection. While predictive models for credit card fraud detection are in active use in practice, reported studies on the use of data mining approaches for credit card fraud detection are relatively few.
  4. Objectives: Credit card fraud detection methods need constant innovation. We evaluate two advanced data mining approaches, support vector machines and random forests, together with the well-known logistic regression, as part of an attempt to better detect credit card fraud.
  5. Methodology: Detection methods: SVM, random forests, logistic regression. Attributes: primary attributes, derived attributes. Challenges: unbalanced classes; undetected fraud transactions, leading to mislabeled cases.
  6. Methodology: Credit card fraud is of two types. Application fraud: fraudsters obtain new cards from issuing companies using false information or other people's information. Behavioral fraud: mail theft, stolen/lost card, counterfeit card, and 'card holder not present' fraud.
  7. Logistic regression: Qualitative response models are appropriate when the dependent variable is categorical. Our dependent variable, fraud, is binary, and logistic regression is a widely used technique for such problems. For example, binary choice models have been used in the case of insurance fraud to predict the likelihood of a claim being fraudulent.
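As a rough illustration of how a logistic regression model turns transaction attributes into a fraud probability, the sketch below applies the logistic function to a linear score. The two features, the weights, and the bias are made-up values chosen for illustration; they are not taken from the paper or fitted to any data.

```python
import math

def fraud_probability(features, weights, bias):
    """Logistic model: P(fraud = 1 | x) = 1 / (1 + exp(-(w . x + b)))."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical, hand-picked coefficients (assumption: not fitted to real data).
# Feature 0: scaled transaction amount; feature 1: transactions in the last hour.
weights = [0.8, 1.5]
bias = -3.0

low_risk = fraud_probability([0.2, 0.0], weights, bias)   # small, isolated purchase
high_risk = fraud_probability([2.0, 3.0], weights, bias)  # large purchase in a burst
```

In practice the weights would be estimated from labeled transactions, and the output probability would be compared against a threshold or used to rank transactions for review.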
  8. Support vector machines: SVMs are linear classifiers that work in a high-dimensional feature space that is a non-linear mapping of the input space of the problem at hand. Properties: Margin optimization: SVMs minimize the risk of overfitting the training data by determining the classification function (a hyperplane) with maximal margin of separation between the two classes. Kernel trick: using a kernel function, an SVM can represent the dot product of the projections of two data points in a high-dimensional feature space. [Figure: fraud and non-fraud points, with the mapping K(X) = X + X^2]
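The kernel trick on slide 8 can be checked numerically. The sketch below uses the standard homogeneous polynomial kernel of degree 2 as an assumed example (it is not the mapping shown in the slide's figure) and confirms that the kernel value computed in the 2-D input space equals the dot product of the explicitly projected points in a 3-D feature space. The helper names `phi`, `dot`, and `poly_kernel` are illustrative.

```python
import math

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (x1, x2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def poly_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: K(x, y) = (x . y)^2."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))  # dot product computed in the 3-D feature space
implicit = poly_kernel(x, y)    # same quantity computed in the 2-D input space
```

The two values agree up to floating-point rounding, which is why an SVM can work in the high-dimensional space while only ever evaluating the kernel on the original inputs.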
  9. Random forests: A random forest model is an ensemble of classification (or regression) trees. [Figure: Gini index]
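Slide 9 names the Gini index without defining it. Assuming it refers to the usual Gini impurity that classification trees use to score candidate splits, a minimal sketch looks like this; the labels and the example split are invented for illustration.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a set of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left, right):
    """Size-weighted Gini impurity of the two children of a split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n

# Toy node with 2 fraud and 6 legitimate transactions (made-up data).
parent = ["fraud"] * 2 + ["legit"] * 6
left = ["fraud", "fraud", "legit"]
right = ["legit"] * 5

# Impurity reduction achieved by this candidate split.
gain = gini_impurity(parent) - split_gini(left, right)
```

A tree-growing procedure would evaluate many candidate splits this way and keep the one with the largest impurity reduction; a random forest repeats this over many trees built on resampled data and random attribute subsets.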
  10. Experiments: Datasets, primary attributes, derived attributes
  11. Experiments
  12. Experiments
  13. Experiments
  14. Conclusions: A factor contributing to the performance of logistic regression is possibly the carefully derived attributes used. SVM performance at the upper file depths tended to increase with a lower proportion of fraud in the training data. Random forests demonstrated overall better performance across performance measures.
  15. Comments: Advantages: It is written in great detail. Drawback: … Applications: Credit card fraud