Machine Learning Approaches for Credit Card Fraud Detection in Severely Imbalanced Datasets: A Comparative Analysis of Classification and Anomaly Detection Methods

  • Authors

    • Sezai Tunca, Ph.D., MBA, Faculty of Economics, Administrative, and Social Sciences, Alanya University, 07400, Alanya, Antalya, Turkiye
    • Yavuz Selim Balcioglu, Ph.D., Management Information System Department, Faculty of Economics and Administrative Sciences, Dogus University, 34775, Dudullu, Istanbul, Turkiye
    • Ceren Cubukcu Cerasi, Ph.D., Management Information System Department, Faculty of Business, Gebze Technical University, Gebze, Kocaeli, Turkiye
    • Umit Bayraktar, Ph. Dc., Department of Business Administration, Faculty of Business, Gebze Technical University, Gebze, Kocaeli, Turkiye
    https://doi.org/10.14419/m6x6fn74

    Received date: July 31, 2025

    Accepted date: August 31, 2025

    Published date: September 16, 2025

  • Keywords: Credit Card Fraud Detection; Machine Learning; Class Imbalance; Feature Importance; Threshold Optimization
  • Abstract

    Credit card fraud presents a persistent threat to financial institutions, exacerbated by the rise of digital payments and the growing complexity of fraudulent schemes. This study investigates machine learning (ML) approaches for fraud detection in severely imbalanced datasets, with three objectives: comparing classification and anomaly detection models under extreme class imbalance, identifying the transaction features with the highest discriminative power, and optimizing decision thresholds using cost-sensitive evaluation to minimize business impact. Using a dataset of 999 transactions with a fraud rate of 0.2% (498.5:1 imbalance), we implemented supervised methods (logistic regression, random forest, gradient boosting) and unsupervised anomaly detection methods (Isolation Forest, One-Class SVM, Local Outlier Factor). Results show that ensemble-based models, particularly Gradient Boosting, achieved superior performance (AUC-ROC = 0.956; AUC-PR = 0.378) with perfect recall and improved precision relative to other methods. Feature analysis identified anonymized PCA-derived variables (V14, V10, V12) as the most discriminative indicators of fraudulent activity. Setting the decision threshold to 0.9 minimized operational costs ($2,985) while maintaining full recall, yielding an estimated annual net benefit of $68,985 and a return on investment of 186.7%. This study contributes to the literature by integrating algorithm benchmarking, feature importance evaluation, and cost-sensitive threshold optimization in an end-to-end fraud detection framework. The findings underscore the importance of ensemble learning, imbalance-aware evaluation metrics (AUC-PR, precision, recall), and business-driven threshold calibration for developing effective and economically viable fraud prevention systems. Future research should explore larger datasets, adaptive learning to address concept drift, and explainable AI techniques to enhance interpretability and regulatory compliance.
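
The imbalance ratio and the cost-sensitive threshold search mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy labels and scores, and the per-error costs (500 for a missed fraud, 5 for a false alarm) are hypothetical values chosen only to show the mechanics, and the abstract does not state the cost figures the study used.

```python
from typing import List, Sequence, Tuple


def imbalance_ratio(n_total: int, n_positive: int) -> float:
    """Majority-to-minority ratio, e.g. 997 legitimate : 2 fraud -> 498.5."""
    return (n_total - n_positive) / n_positive


def total_cost(y_true: Sequence[int], scores: Sequence[float],
               threshold: float, cost_fn: float, cost_fp: float) -> float:
    """Cost of flagging every transaction whose score is >= threshold."""
    # Missed frauds: true label 1 but score below the threshold.
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    # False alarms: true label 0 but score at or above the threshold.
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return fn * cost_fn + fp * cost_fp


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   thresholds: Sequence[float],
                   cost_fn: float, cost_fp: float) -> Tuple[float, float]:
    """Return (threshold, cost) with the lowest cost; ties go to the lowest threshold."""
    cost, threshold = min(
        (total_cost(y_true, scores, t, cost_fn, cost_fp), t) for t in thresholds
    )
    return threshold, cost


# Toy data (hypothetical): 8 legitimate transactions and 2 frauds.
y_true: List[int] = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores: List[float] = [0.05, 0.10, 0.20, 0.35, 0.55, 0.70, 0.85, 0.93, 0.40, 0.97]
thresholds: List[float] = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# 999 transactions at a 0.2% fraud rate is about 2 frauds and 997 legitimate ones.
print(imbalance_ratio(999, 2))
print(best_threshold(y_true, scores, thresholds, cost_fn=500, cost_fp=5))
```

On this toy data the highest candidate threshold still catches both frauds while raising no false alarms, which mirrors the abstract's point that a high threshold can keep full recall at lower operational cost. With real model scores the cost curve would need to be estimated on held-out data.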

  • How to Cite

    Tunca, S., Balcioglu, Y. S., Cerasi, C. C., & Bayraktar, U. (2025). Machine Learning Approaches for Credit Card Fraud Detection in Severely Imbalanced Datasets: A Comparative Analysis of Classification and Anomaly Detection Methods. International Journal of Basic and Applied Sciences, 14(5), 593-602. https://doi.org/10.14419/m6x6fn74