
Detection of fraudulent credit card transactions: A comparative analysis of data sampling and classification techniques
Author(s) -
Konduri Praveen Mahesh,
Shaik Ashar Afrouz,
Anu Shaju Areeckal
Publication year - 2022
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2161/1/012072
Subject(s) - credit card fraud , oversampling , computer science , credit card , support vector machine , random forest , machine learning , artificial intelligence , sampling (signal processing) , data mining , logistic regression , world wide web , bandwidth (computing) , payment , computer vision , computer network , filter (signal processing)
Every year, an increasing amount of money is lost to fraudulent credit card transactions. Recently, there has been a focus on using machine learning algorithms to identify fraudulent transactions. The ratio of fraud cases to non-fraud transactions is very low. This creates skewed, or unbalanced, data, which poses a challenge for training machine learning models. Public datasets for this research problem are scarce. The dataset used for this work is obtained from Kaggle. In this paper, we explore different sampling techniques, such as under-sampling, Synthetic Minority Oversampling Technique (SMOTE) and SMOTE-Tomek, to handle the unbalanced data. Classification models, such as k-Nearest Neighbour (KNN), logistic regression, random forest and Support Vector Machine (SVM), are trained on the sampled data to detect fraudulent credit card transactions. The performance of the various machine learning approaches is evaluated in terms of precision, recall and F1-score. The classification results obtained are promising, and the approach can be used for credit card fraud detection.
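The sampling and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`random_undersample`, `smote_like_oversample`, `precision_recall_f1`) are hypothetical, the toy data is randomly generated rather than taken from the Kaggle dataset, the over-sampler is a simplified SMOTE-style interpolation, and the Tomek-link cleaning step of SMOTE-Tomek is omitted. Label 1 is assumed to mark fraud and label 0 non-fraud.

```python
import math
import random


def random_undersample(X, y, seed=0):
    """Keep every minority (label 1) row plus an equal-sized random subset of majority rows."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [X[i] for i in kept], [y[i] for i in kept]


def smote_like_oversample(X, y, k=3, seed=0):
    """Simplified SMOTE-style over-sampling (an illustrative assumption, not a library API).

    Each synthetic row lies on the segment between a random minority row and
    one of its k nearest minority neighbours. Rows are added until both
    classes have the same size. Needs at least two minority rows.
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    n_needed = sum(1 for label in y if label == 0) - len(minority)
    synthetic = []
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority)
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return X + synthetic, y + [1] * len(synthetic)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1-score for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy unbalanced data: 95 non-fraud rows and 5 fraud rows with two features each.
_rng = random.Random(42)
X = [[_rng.gauss(0, 1), _rng.gauss(0, 1)] for _ in range(95)]
X += [[_rng.gauss(3, 1), _rng.gauss(3, 1)] for _ in range(5)]
y = [0] * 95 + [1] * 5

X_under, y_under = random_undersample(X, y)
X_over, y_over = smote_like_oversample(X, y)
```

In this sketch, under-sampling shrinks the training set to twice the number of fraud rows, while the SMOTE-style step grows it until both classes match the majority count. A classifier such as KNN or a random forest would then be trained on either resampled set and scored with `precision_recall_f1` on untouched, still unbalanced test data.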