Quantumboost: Leveraging Parameterized Quantum Circuits for Imbalanced Dataset Oversampling
-
https://doi.org/10.14419/4qgehh16
Received date: August 26, 2025
Accepted date: September 30, 2025
Published date: October 14, 2025
-
Keywords: Imbalanced Datasets; Machine Learning; Oversampling; Parameterized Quantum Circuits; Quantum Computing; SMOTE.
Abstract
Imbalanced datasets often yield biased machine learning models, particularly in classification problems where some classes are under-represented. Traditional oversampling techniques such as the Synthetic Minority Over-sampling Technique (SMOTE), although popular, struggle to produce high-quality synthetic samples in high-dimensional or non-linear feature spaces. Because they rely on linear interpolation between existing minority samples, they do not fully capture the non-linear structure of the minority class distribution. The resulting synthetic samples tend to be unrealistic or redundant, and classifiers trained on them can be biased, generalize poorly, and overfit the minority class. The proposed technique, QGOPQC (Quantum Generative Oversampling using Parameterized Quantum Circuits), uses a novel implementation of parameterized quantum circuits (PQCs) to address this non-linearity. It combines quantum state encoding, quantum circuit training, and classical optimization to generate diverse, high-fidelity synthetic samples that better reflect the true data manifold of the minority class while preserving feature correlations. Across a variety of experiments, the method achieved consistently higher AUC scores than SMOTE, reaching up to 0.92 compared to SMOTE's 0.53, especially in high-dimensional scenarios. Training on the non-redundant, high-quality samples generated by QGOPQC improved classifier performance on imbalanced data.
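The abstract does not specify the circuit, encoding, or optimizer, so the following is only a minimal sketch of the general idea of a PQC-based generator, not the authors' QGOPQC implementation. It assumes a hypothetical two-qubit ansatz (RY layer, CNOT, RY layer) simulated classically, a squared-error loss against a target distribution over the four bitstrings, and finite-difference gradient descent as the classical optimizer. All function names and the target distribution are illustrative assumptions, and the step that maps sampled bitstrings back to feature vectors is left out.

```python
import math
import random

def apply_ry(state, qubit, theta):
    """Apply an RY(theta) rotation to one qubit of a real statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << qubit
    out = state[:]
    for i in range(len(state)):
        if not i & bit:  # visit each (|..0..>, |..1..>) amplitude pair once
            a0, a1 = state[i], state[i | bit]
            out[i] = c * a0 - s * a1
            out[i | bit] = s * a0 + c * a1
    return out

def apply_cnot(state, control, target):
    """Swap target-qubit amplitudes wherever the control qubit is 1."""
    cbit, tbit = 1 << control, 1 << target
    out = state[:]
    for i in range(len(state)):
        if i & cbit and not i & tbit:
            out[i], out[i | tbit] = state[i | tbit], state[i]
    return out

def pqc_probs(thetas):
    """Hypothetical ansatz: RY layer, CNOT, RY layer. Returns Born probabilities."""
    state = [1.0, 0.0, 0.0, 0.0]  # start in |00>
    state = apply_ry(state, 0, thetas[0])
    state = apply_ry(state, 1, thetas[1])
    state = apply_cnot(state, 0, 1)
    state = apply_ry(state, 0, thetas[2])
    state = apply_ry(state, 1, thetas[3])
    return [a * a for a in state]

def loss(thetas, target):
    """Squared error between the circuit's distribution and the target."""
    return sum((p - t) ** 2 for p, t in zip(pqc_probs(thetas), target))

def train(target, init, steps=200, lr=0.2, eps=1e-4):
    """Finite-difference gradient descent; returns the best parameters seen."""
    thetas = list(init)
    best, best_loss = list(init), loss(init, target)
    for _ in range(steps):
        grads = []
        for k in range(len(thetas)):
            up, down = thetas[:], thetas[:]
            up[k] += eps
            down[k] -= eps
            grads.append((loss(up, target) - loss(down, target)) / (2 * eps))
        thetas = [t - lr * g for t, g in zip(thetas, grads)]
        current = loss(thetas, target)
        if current < best_loss:
            best, best_loss = thetas[:], current
    return best

def sample_bitstrings(thetas, n, seed=0):
    """Draw n measurement outcomes (integers 0..3) from the trained circuit."""
    rng = random.Random(seed)
    return rng.choices(range(4), weights=pqc_probs(thetas), k=n)

# Illustrative target: an assumed minority-class histogram over two binary features.
target = [0.1, 0.4, 0.4, 0.1]
init = [0.3, 0.7, 1.1, 0.5]
fitted = train(target, init)
synthetic = sample_bitstrings(fitted, n=20)
```

In this sketch the entangling CNOT is what lets the circuit represent correlations between the two bits, which is the property the abstract contrasts with linear interpolation. On real hardware the probabilities would be estimated from repeated measurements rather than read from a simulated statevector.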
-
How to Cite
Chakravarthy, D. G., Kannimuthu, S., & Arunkumar, P. M. (2025). Quantumboost: Leveraging Parameterized Quantum Circuits for Imbalanced Dataset Oversampling. International Journal of Basic and Applied Sciences, 14(6), 282-288. https://doi.org/10.14419/4qgehh16
