PCA-Based Model to Enhance The Performance of Data Clustering Using A Metaheuristic Algorithm

  • Authors

    • Shailendra Singh Yadav, Department of Computer Science & Engineering, Parul Institute of Engineering & Technology, Faculty of Engineering and Technology, Parul University, Waghodia Vadodara, India
    • Kamal Sutaria, Department of Computer Science & Engineering, Parul Institute of Engineering & Technology, Faculty of Engineering and Technology, Parul University, Waghodia Vadodara, India
    https://doi.org/10.14419/gcjafw37

    Received date: July 15, 2025

    Accepted date: August 25, 2025

    Published date: September 9, 2025

  • Keywords: Clustering; Preprocessing; Enhanced Grey Wolf Optimizer; PCA; Homogeneity Level.
  • Abstract

    This paper explores how preprocessing methods and nature-inspired algorithms can be combined to enhance clustering. Producing accurate clusters from a given dataset is a crucial task, and many algorithms have been proposed in the literature to build useful clusters from raw data. However, because the properties and complexity of data keep changing (multidimensional values, growing size, and many other parameters), clustering techniques still require substantial enhancement. To address the limitations of traditional clustering approaches, the research emphasizes preprocessing methodologies such as PCA and KNN for improving clustering outcomes. PCA addresses the optimizer algorithm's difficulty in handling high-dimensional data by improving the examination of the dataset's features at a lower dimensionality for the subsequent steps. The paper also examines various nature-inspired algorithms that can be applied to clustering tasks and demonstrates their efficacy in optimizing and refining clustering results. Overall, the study presents a framework, KNN+PCA+EGWO, that combines these elements to achieve superior clustering performance, improving results by 45-50%. The effects of the preprocessing steps are also reported, including low error and other favorable evaluation parameters.
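
    The pipeline described in the abstract (KNN, then PCA, then a grey wolf search over cluster centroids) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: KNN is taken to mean nearest-neighbour imputation of missing values, the fitness is taken to be the sum of squared errors (SSE), the optimizer uses the standard Grey Wolf Optimizer update with the three best solutions retained between iterations rather than the paper's enhanced variant (EGWO), and the function names knn_impute, pca, sse, and gwo_cluster are hypothetical.

```python
import numpy as np


def knn_impute(X, k=3):
    """Fill NaNs in each row with the mean of its k nearest complete rows.

    Assumes at least one row of X has no missing values.
    """
    filled = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]  # donor rows without gaps
    for i in np.flatnonzero(np.isnan(X).any(axis=1)):
        miss = np.isnan(X[i])
        # Distance uses only the features that are observed in row i.
        dist = np.linalg.norm(complete[:, ~miss] - X[i, ~miss], axis=1)
        nearest = complete[np.argsort(dist)[:k]]
        filled[i, miss] = nearest[:, miss].mean(axis=0)
    return filled


def pca(X, n_components):
    """Project centered data onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def gwo_cluster(X, n_clusters, n_wolves=20, n_iter=100, seed=0):
    """Search for centroids with the standard GWO position update.

    Each wolf is a flattened set of n_clusters centroids. Returns
    (centroids, labels, history), where history holds the best SSE
    seen at each evaluation.
    """
    rng = np.random.default_rng(seed)
    lo = np.tile(X.min(axis=0), n_clusters)
    hi = np.tile(X.max(axis=0), n_clusters)
    dim = lo.size
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    leaders = np.empty((0, dim))
    history = []

    def rank_leaders(leaders, wolves):
        # Keep previous leaders in the pool so the best solution is never lost.
        pool = np.vstack([leaders, wolves])
        scores = np.array([sse(X, w.reshape(n_clusters, -1)) for w in pool])
        order = np.argsort(scores)[:3]  # alpha, beta, delta
        return pool[order], scores[order[0]]

    for t in range(n_iter):
        leaders, best = rank_leaders(leaders, wolves)
        history.append(best)
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        moved = np.zeros_like(wolves)
        for leader in leaders:
            A = a * (2.0 * rng.random(wolves.shape) - 1.0)
            C = 2.0 * rng.random(wolves.shape)
            moved += leader - A * np.abs(C * leader - wolves)
        # New position is the average of the three leader-guided moves.
        wolves = np.clip(moved / 3.0, lo, hi)

    leaders, best = rank_leaders(leaders, wolves)
    history.append(best)
    centroids = leaders[0].reshape(n_clusters, -1)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1), history
```

    A typical call chains the three steps, for example gwo_cluster(pca(knn_impute(X), 2), n_clusters=3), where X is a numeric array with NaN marking missing entries. Reducing the dimensionality first shrinks the search space each wolf has to cover, which is the role the abstract assigns to PCA.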

  • How to Cite

    Yadav, S. S., & Sutaria, K. (2025). PCA-Based Model to Enhance The Performance of Data Clustering Using A Metaheuristic Algorithm. International Journal of Basic and Applied Sciences, 14(5), 282-288. https://doi.org/10.14419/gcjafw37