Advancements in Voice Spoofing Detection: A Comprehensive Review

  • Authors

    • Rekha Rani, Research Scholar, Department of Computer Science & Applications, Maharshi Dayanand University, Rohtak, Haryana, India
    • Bal Kishan, Associate Professor, Department of Computer Science & Applications, Maharshi Dayanand University, Rohtak, Haryana, India
    https://doi.org/10.14419/w36qrh81

    Received date: July 13, 2025

    Accepted date: August 27, 2025

    Published date: September 7, 2025

  • Keywords: ASV; Spoofing Attacks; Adversarial Attack; Machine Learning; Countermeasures
  • Abstract

    The need for reliable voice-based authentication has grown with the adoption of voice-controlled technologies and digital transactions. Automatic Speaker Verification (ASV) offers a dependable approach because it confirms identity from speech. ASV is used mainly in telecommunications, banking, law enforcement, and smart assistants to improve security and user convenience. However, spoofing attacks such as voice conversion and speech synthesis increasingly target these systems and make them less reliable. This review discusses the importance of strengthening ASV systems in three ways: data augmentation to address new kinds of attacks, transfer learning to improve detection when data is scarce, and adaptive models to keep pace with advancing spoofing attacks.

  • How to Cite

    Rani, R., & Kishan, B. (2025). Advancements in Voice Spoofing Detection: A Comprehensive Review. International Journal of Basic and Applied Sciences, 14(5), 224-235. https://doi.org/10.14419/w36qrh81