An Improved Ensemble Based Technique for Handling Noisy Class Imbalanced Education Data for Prediction of Students Dropout in HEI
-
https://doi.org/10.14419/z3vc6t60
Received date: May 28, 2025
Accepted date: June 27, 2025
Published date: July 8, 2025
-
Educational Data Mining (EDM); Student Performance; Imbalanced Data; Class Imbalance; Oversampling; Prediction; Ensemble Model.
Abstract
The education sector is increasingly interested in intelligent technology. The rapid growth of educational data means that standard processing methods may have limitations and may even distort results, which makes it increasingly important to rethink how data mining is applied in education. The amount of student information stored in educational databases grows daily, so the knowledge extracted from these databases must be updated regularly. When student data arrives as a constant flow, the challenge is to turn this enormous volume of data into information and to adapt to the new knowledge that arrives with the new data. Class imbalance is a crucial issue when some classes have few instances, and noisy, class-imbalanced datasets significantly degrade machine learning classification. This research proposes an enhanced hybrid bag-boost model that incorporates a proposed resampling technique for noisy, imbalanced datasets. The resampling technique combines K-Means SMOTE (Synthetic Minority Oversampling Technique) as the oversampling method with Edited Nearest Neighbor (ENN) as the undersampling method used to eliminate noise. The technique proceeds in three stages: first, the dataset is clustered using K-Means; second, imbalance is handled by applying SMOTE within clusters, which introduces synthetic instances of the minority class; and third, instances that generate noise are removed using ENN. Experimental results indicate that the proposed model outperforms the other models it was compared with.
Furthermore, the results confirm that the proposed method performs better on binary imbalanced datasets when the noise proportion is increased.
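The three resampling stages described in the abstract (K-Means clustering, SMOTE within clusters, ENN cleaning) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-class data, the function names, the uniform choice among clusters, and the values of k are all assumptions made for illustration, and the full K-Means SMOTE algorithm additionally filters and weights clusters, which is omitted here.

```python
import math
import random
from collections import Counter

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def smote_in_clusters(X, y, minority, k_clusters=3, k_neighbors=3, seed=0):
    """Stages 1 and 2: cluster the data, then interpolate minority points inside clusters."""
    rng = random.Random(seed)
    counts = Counter(y)
    deficit = max(counts.values()) - counts[minority]
    labels = kmeans(X, k_clusters, seed=seed)
    by_cluster = {}
    for p, cls, c in zip(X, y, labels):
        if cls == minority:
            by_cluster.setdefault(c, []).append(p)
    # Interpolation needs at least two minority points in the same cluster.
    groups = [g for g in by_cluster.values() if len(g) >= 2]
    synthetic = []
    while groups and len(synthetic) < deficit:
        group = rng.choice(groups)  # simplification: clusters are chosen uniformly
        base = rng.choice(group)
        others = sorted((q for q in group if q is not base), key=lambda q: math.dist(base, q))
        mate = rng.choice(others[:k_neighbors])
        t = rng.random()
        synthetic.append(tuple(b + t * (m - b) for b, m in zip(base, mate)))
    return X + synthetic, y + [minority] * len(synthetic)

def edited_nearest_neighbours(X, y, k=3):
    """Stage 3: drop instances (of any class) whose k nearest neighbours outvote their label."""
    keep_X, keep_y = [], []
    for i, (p, cls) in enumerate(zip(X, y)):
        nearest = sorted((j for j in range(len(X)) if j != i),
                         key=lambda j: math.dist(p, X[j]))[:k]
        vote = Counter(y[j] for j in nearest).most_common(1)[0][0]
        if vote == cls:
            keep_X.append(p)
            keep_y.append(cls)
    return keep_X, keep_y

# Hypothetical toy data: 60 majority points, 12 minority points, 3 flipped (noisy) labels.
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
X += [(data_rng.gauss(4, 1), data_rng.gauss(4, 1)) for _ in range(12)]
y = [0] * 60 + [1] * 12
for i in (0, 1, 2):
    y[i] = 1  # label noise: majority-region points marked as minority

X_bal, y_bal = smote_in_clusters(X, y, minority=1)
X_clean, y_clean = edited_nearest_neighbours(X_bal, y_bal)
print("original:", Counter(y))
print("after clustered SMOTE:", Counter(y_bal))
print("after ENN:", Counter(y_clean))
```

Oversampling comes before cleaning in this sketch because interpolating between mislabeled points can itself create noisy synthetic instances, which the ENN pass then has a chance to remove.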
-
How to Cite
Sangeetha, M. S., & Shanmugapriya, D. S. (2025). An Improved Ensemble Based Technique for Handling Noisy Class Imbalanced Education Data for Prediction of Students Dropout in HEI. International Journal of Basic and Applied Sciences, 14(SI-1), 258-263. https://doi.org/10.14419/z3vc6t60
