Generating Synthetic Data for Credit Card Fraud Detection Using GANs
Authors: Strelcenia, E. and Prakoonwit, S.
Journal: 2022 International Conference on Computers and Artificial Intelligence Technologies, CAIT 2022
Pages: 42-47
DOI: 10.1109/CAIT56099.2022.10072179
Abstract: Deep learning-based classifiers for object classification and recognition have been utilized in various sectors. However, according to research papers, deep neural networks achieve better performance on balanced datasets than on imbalanced ones. It has been observed that datasets are often imbalanced because fraud cases are rare in production environments. Deep generative approaches, such as GANs, have been applied as an efficient method to augment high-dimensional data. In this research study, classifiers based on Random Forest, Nearest Neighbor, Logistic Regression, MLP, and AdaBoost were trained using our novel K-CGAN approach and compared with other oversampling approaches, achieving higher F1 scores. Experiments demonstrate that the classifiers trained on the augmented set achieved far better performance than the same classifiers trained on the original data, producing an effective fraud detection mechanism. Furthermore, this research demonstrates the problem with data imbalance and introduces a novel model that is able to generate high-quality synthetic data.
https://eprints.bournemouth.ac.uk/38332/
Source: Scopus