Quantumboost: Leveraging Parameterized Quantum Circuits for Imbalanced Dataset Oversampling

  • Authors

    • D. Gowtham Chakravarthy, Assistant Professor, Sri Eshwar College of Engineering, Coimbatore 641202
    • S. Kannimuthu, Professor, Karpagam College of Engineering, Coimbatore 641032
    • P. M. Arunkumar, Associate Professor, Karpagam College of Engineering, Coimbatore 641032
    https://doi.org/10.14419/4qgehh16

    Received date: August 26, 2025

    Accepted date: September 30, 2025

    Published date: October 14, 2025

  • Keywords

    Imbalanced Datasets; Machine Learning; Oversampling; Parameterized Quantum Circuits; Quantum Computing; SMOTE.

  • Abstract

    Imbalanced datasets often yield biased machine learning models, particularly in classification problems where some classes are under-represented. Traditional oversampling techniques such as the Synthetic Minority Over-sampling Technique (SMOTE), although popular, struggle to produce high-quality synthetic samples in high-dimensional or non-linear feature spaces. They do not fully account for the non-linear structure of the minority class distribution, which can lead to biased classifiers that generalize poorly and risk overfitting to the minority class. Their reliance on linear interpolation further limits their ability to generate realistic, diverse, and non-redundant synthetic samples. The proposed technique, Quantum Generative Oversampling using Parameterized Quantum Circuits (QGOPQC), uses a novel implementation of parameterized quantum circuits (PQCs) to address this non-linearity. The high-fidelity synthetic samples generated by the PQCs are intended to better reflect the true data manifold of the oversampled class. The method combines quantum state encoding, quantum circuit training, and classical optimization to produce diverse synthetic samples while preserving feature correlations. In the reported experiments, the method achieved consistently higher AUC scores than SMOTE, reaching up to 0.92 compared with SMOTE's 0.53, especially in high-dimensional scenarios. QGOPQC improved the performance of classifiers trained on imbalanced data by generating non-redundant, high-quality samples through quantum sampling.
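
    The linear-interpolation limitation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code and does not implement QGOPQC. The function name `smote_like_oversample`, the parameters `k` and `seed`, and the circular toy data are assumptions made purely for illustration. The sketch places minority points on a unit circle (a simple non-linear manifold) and shows that SMOTE-style synthetic points fall on chords inside the circle rather than on the manifold itself.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """SMOTE-style oversampling (illustrative, hypothetical helper).

    Each synthetic point is a random convex combination of a minority
    sample and one of its k nearest minority-class neighbours.
    Requires k <= len(X_min) - 1.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest neighbours of every minority sample.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Linear interpolation: x_new = x_i + lam * (x_nn - x_i), lam in [0, 1).
    lam = rng.random((n_new, 1))
    return X_min[base] + lam * (X_min[picked] - X_min[base])


# Toy minority class: 12 points on the unit circle (assumed data).
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
X_min = np.column_stack([np.cos(angles), np.sin(angles)])

X_syn = smote_like_oversample(X_min, n_new=200)
radii = np.linalg.norm(X_syn, axis=1)

# Every original point has radius 1; interpolated points never exceed it
# and generally fall below it, i.e. off the circular manifold.
print("max radius of synthetic points:", radii.max())
print("min radius of synthetic points:", radii.min())
```

    Because every synthetic point is a convex combination of two points on the circle, its radius cannot exceed 1 and is usually smaller, which is the off-manifold behaviour that a generative approach such as the one proposed here aims to avoid.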

  • How to Cite

    Chakravarthy, D. G., Kannimuthu, S., & Arunkumar, P. M. (2025). Quantumboost: Leveraging Parameterized Quantum Circuits for Imbalanced Dataset Oversampling. International Journal of Basic and Applied Sciences, 14(6), 282-288. https://doi.org/10.14419/4qgehh16