PCA-Based Model to Enhance The Performance of Data Clustering Using A Metaheuristic Algorithm
-
https://doi.org/10.14419/gcjafw37
Received date: July 15, 2025
Accepted date: August 25, 2025
Published date: September 9, 2025
-
Keywords: Clustering; Preprocessing; Enhanced Grey Wolf Optimizer; PCA; Homogeneity Level.
Abstract
This paper explores how preprocessing methods and nature-inspired algorithms can be combined to improve clustering. Producing high-quality clusters from a given dataset is a difficult task. Many algorithms have been proposed in the literature to form useful clusters from raw data, but data keep changing in character and complexity (higher dimensionality, growing size, and many other parameters), so clustering techniques still need considerable improvement. To address the limitations of traditional clustering approaches, the study emphasizes preprocessing methods such as PCA and KNN. Optimizer algorithms handle high-dimensional data poorly; PCA addresses this limitation by reducing dimensionality, which improves examination of the dataset's features before further processing. The paper also examines several nature-inspired algorithms that can be applied to clustering tasks and demonstrates their efficacy in optimizing and refining clustering results. Overall, the study presents a framework, KNN+PCA+EGWO, that combines these elements and reports a 45-50% improvement in result performance. The effect of each preprocessing step is also shown, along with the low error and other evaluation parameters reported in the paper.
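As a rough illustration of the pipeline the abstract names, the sketch below chains KNN, PCA, and a grey wolf search in NumPy. It rests on several assumptions that the abstract does not confirm: KNN is used here for missing-value imputation, the optimizer is the standard Grey Wolf Optimizer rather than the paper's enhanced variant (EGWO), and the objective is the sum of squared errors (SSE) to the nearest centroid. All function names and parameters (`knn_impute`, `pca`, `sse`, `gwo_cluster`, `n_wolves`, `n_iter`) are hypothetical.

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaNs in a row with the mean of its k nearest rows (assumed use of KNN)."""
    X = X.copy()
    # Mean-filled copy, used only to measure distances and to read neighbour values.
    filled = np.where(np.isnan(X), np.nanmean(X, axis=0), X)
    for i in np.argwhere(np.isnan(X).any(axis=1)).ravel():
        d = np.linalg.norm(filled - filled[i], axis=1)
        d[i] = np.inf                      # a row is not its own neighbour
        nn = np.argsort(d)[:k]
        miss = np.isnan(X[i])
        X[i, miss] = filled[nn][:, miss].mean(axis=0)
    return X

def pca(X, n_components):
    """Project centred data onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def sse(centroids, X):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def gwo_cluster(X, k, n_wolves=20, n_iter=100, seed=0):
    """Standard GWO over centroid sets; NOT the paper's enhanced variant."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Each wolf is one candidate solution: k centroids in the reduced space.
    wolves = rng.uniform(lo, hi, size=(n_wolves, k, X.shape[1]))
    best_pos, best_fit = None, np.inf
    for t in range(n_iter + 1):
        fitness = np.array([sse(w, X) for w in wolves])
        order = np.argsort(fitness)
        if fitness[order[0]] < best_fit:   # remember the best solution seen so far
            best_fit = fitness[order[0]]
            best_pos = wolves[order[0]].copy()
        if t == n_iter:
            break
        # Simplification: leaders are re-picked from the current pack each iteration.
        leaders = [wolves[order[j]].copy() for j in range(3)]
        a = 2.0 * (1 - t / n_iter)         # decreases linearly from 2 towards 0
        new = np.zeros_like(wolves)
        for leader in leaders:             # alpha, beta, delta
            A = 2 * a * rng.random(wolves.shape) - a
            C = 2 * rng.random(wolves.shape)
            new += leader - A * np.abs(C * leader - wolves)
        wolves = np.clip(new / 3.0, lo, hi)
    labels = ((X[:, None, :] - best_pos[None]) ** 2).sum(axis=2).argmin(axis=1)
    return best_pos, labels, float(best_fit)
```

A typical call chain under these assumptions would be `Z = pca(knn_impute(X), 2)` followed by `centroids, labels, err = gwo_cluster(Z, k=3)`. Running the optimizer on the PCA-reduced matrix `Z` rather than on `X` is what shrinks the search space each wolf has to cover.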
-
How to Cite
Yadav, S. S., & Sutaria, K. (2025). PCA-Based Model to Enhance The Performance of Data Clustering Using A Metaheuristic Algorithm. International Journal of Basic and Applied Sciences, 14(5), 282-288. https://doi.org/10.14419/gcjafw37
