
Credit Card Fraudulent Transaction Detection

Credit Card Fraudulent Transaction Detection. CSC 219 final project presentation. Team members (Group #10): Darshit Pandya, Sreeteja K. Guided by: Dr. Meiliu Lu.


Presentation Transcript


  1. Credit Card Fraudulent Transaction Detection As a part of CSC 219: Final Project Presentation Team Members: (Group #10) - Darshit Pandya - Sreeteja K. Guided By: - Dr. Meiliu Lu

  2. Abstract • Financial fraud is a growing threat with serious consequences for the finance industry, corporations, and government organizations. • Of the many criminal activities in the financial industry, credit card fraud is the most prevalent. • Credit card companies need to detect fraudulent transactions so that customers are not charged for items they did not purchase.

  3. Why important? • Credit card fraud detection is challenging for the following reasons: • The profiles of genuine users and fraudulent behaviors change constantly. • The rate of online transactions has grown exponentially. • Credit card fraud data sets are highly skewed. • Detecting fraudulent transactions by traditional manual review is time-consuming and inefficient. Hence, it is necessary to develop a credit card fraud detection technique as a countermeasure against illegal activity.

  4. What will we be doing? • In this term project, we analyze 280K transactions with different attributes. (The attribute names are kept secret to protect the privacy of user data.) • Itinerary: • Analyze the correlation between attributes • Analyze the effect of attribute values on the target • Feature engineering • Balancing/sampling the skewed dataset • Applying machine learning algorithms • Trying deep neural nets • Comparing the resulting models • Improvement techniques

  5. What does the data look like? • The original data has 280K instances and 33 attributes. • The Class attribute identifies a transaction as Fraud [1] or Normal [0]. • The class distribution is HIGHLY SKEWED - WE KNOW!

  6. Step 1: Data Visualization • The original data has 280K instances and 33 attributes. • In this step, we visualize the data by examining: • Correlation • Target value impact • Distribution of the attribute values • Density distribution • Outliers
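
The correlation check in Step 1 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code; the `pearson` helper and the toy `feature` and `label` values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; this sketch reports 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: one anonymized attribute and the Class label
# (1 = Fraud, 0 = Normal). These values are made up for illustration.
feature = [0.1, 0.4, 0.35, 2.9, 3.1, 0.2]
label = [0, 0, 0, 1, 1, 0]

r = pearson(feature, label)
print(f"correlation with Class: {r:.3f}")
```

Attributes whose correlation with Class is close to +1 or -1 are the ones most worth inspecting for their impact on the target.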

  7. Step 2: Data Preprocessing • For the data preprocessing step, we: • Check for missing values and remove any found • Remove unnecessary features • Remove outliers • Scale the values of attributes such as Time and Amount
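
The preprocessing steps above might look like the sketch below. The slides do not say which outlier rule or scaler was used, so the interquartile-range rule and z-score standardization here are assumptions, and the `amounts` values are hypothetical.

```python
import statistics

def standardize(values):
    """Scale values to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def iqr_bounds(values, k=1.5):
    """Return (low, high) cut-offs from the interquartile-range rule (assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical transaction amounts; None marks a missing value.
amounts = [12.5, 3.0, 48.0, None, 7.25, 15.0, 9.99, 2500.0, 22.0]

present = [a for a in amounts if a is not None]   # 1. drop missing values
low, high = iqr_bounds(present)
kept = [a for a in present if low <= a <= high]   # 2. drop outliers
scaled = standardize(kept)                        # 3. scale what remains
```

Outliers are removed before scaling so that one extreme amount does not dominate the mean and standard deviation.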

  8. Step 3: Application of Naïve ML Algorithms • Working with the unbalanced dataset, we apply naïve machine learning algorithms: • Logistic Regression • K-Nearest Neighbors • Support Vector Machine • Decision Tree • Random Forest • GridSearchCV (for hyperparameter tuning) For model evaluation, we use the confusion matrix and the F1-score.
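
To show why the confusion matrix and F1-score suit a skewed dataset better than plain accuracy, here is a small standard-library sketch. The label lists are hypothetical, and the helper functions are illustrative stand-ins rather than the project's code.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, with 1 = Fraud and 0 = Normal."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    _, fp, fn, tp = confusion_matrix(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical skewed sample: 98 normal transactions and 2 frauds.
y_true = [0] * 98 + [1] * 2

# A model that always predicts "Normal" looks accurate but finds no fraud.
always_normal = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, always_normal)) / len(y_true)
f1_trivial = f1_score(y_true, always_normal)

# A model that catches one fraud and raises one false alarm.
one_hit = [0] * 97 + [1] + [1, 0]
f1_one_hit = f1_score(y_true, one_hit)
```

Here the trivial model reaches 98% accuracy with an F1-score of 0, while the second model scores an F1 of 0.5.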

  9. Step 4: Deep Neural Networks • We use a dense neural network, also known as an 'Artificial Neural Network', built with the Keras framework. • For this approach, we use only the unbalanced dataset. • We use only Dense layers, trying several optimizers such as Adam and varying the number of neurons per layer.

  10. Step 5: Data Balancing • Because the data is unbalanced, predictions tend to be biased. • Naïve machine learning algorithms tend to be affected by skewed data. • For random sampling, the following equation was implemented: • value_count = minimum_dist_value + ((total_count_cat / minimum_dist_value * 2) - 2)
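
A possible reading of the sampling equation is sketched below. Two assumptions are made here that the slide does not confirm: the expression is evaluated with ordinary left-to-right precedence for `/` and `*`, and it is applied per class to decide how many rows to keep. The `undersample` helper and the toy `rows` are hypothetical.

```python
import random

def value_count(minimum_dist_value, total_count_cat):
    """Rows to keep for a class, following the slide's equation as written.

    Assumption: '/' and '*' are evaluated left to right, i.e. (total / min) * 2.
    """
    return minimum_dist_value + ((total_count_cat / minimum_dist_value * 2) - 2)

def undersample(rows, label_of, seed=0):
    """Randomly sample each class down to at most value_count rows (assumed usage)."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    minimum = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        target = min(len(group), int(value_count(minimum, len(group))))
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced

# Hypothetical data: 1000 normal rows (label 0) and 10 fraud rows (label 1).
rows = [(i, 0) for i in range(1000)] + [(i, 1) for i in range(10)]
balanced = undersample(rows, label_of=lambda row: row[1])

n_fraud = sum(1 for row in balanced if row[1] == 1)
n_normal = sum(1 for row in balanced if row[1] == 0)
```

Under this reading the smallest class keeps all of its rows (the equation reduces to minimum_dist_value when both counts are equal), and larger classes are cut down sharply but remain somewhat larger than the minority class.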

  11. Step 5 (contd.) • [Figure: the balanced dataset]

  12. Project Demo • We will demo a notebook created on Google Colab with all minute details implemented in the project. • Let’s GO!

  13. Conclusion • Data imbalance can cause bias in the predictions. • SVM, K-NN and Random Forest perform comparatively better. • The best F1-score (our evaluation metric), 0.99, was obtained with Random Forest. • Applying a dense neural network also helps on the unbalanced dataset. • Applying data sampling techniques can help remove the bias in the predictions.

  14. THANK YOU!
