A hybrid model of ordinal ranking-based clustering using G+Rank K-Means

 
 
 
  • Abstract


    K-Means is a clustering technique that maps object features onto multidimensional coordinates and groups objects by spatial proximity. However, nearest-distance grouping can be unreliable when the rank order of ordinal-scale objects is not taken into account, so the distribution of objects across clusters may violate that rank order. For example, objects of the same rank may be assigned to different clusters. To address this issue, an enhanced K-Means algorithm is proposed to produce better and more meaningful ranking-based clusters. It integrates a ranking algorithm that sorts objects into a ranked list, in which position also represents how close objects are to one another. An additional step in K-Means uses the ranking attribute to reassign objects whose nearest-centroid assignment is not aligned with the ranking, which eventually accelerates the clustering process. The AHP ranking algorithm is integrated into K-Means to obtain ranking-based clusters. The enhancement was evaluated on three ordinal datasets covering 67 Java programs, 92 students' marks in a computer architecture subject, and 456 entries in UEFA's football club coefficient ranking list. The results show that integrating a ranking algorithm into K-Means, as proposed in G+Rank K-Means, successfully yields a rank-consistent cluster representation. The purity value, which measures correctness against a given group classification, also increased.
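The rank violation and the purity measure described in the abstract can be illustrated with a minimal Python sketch. It is not the G+Rank K-Means algorithm itself: the abstract does not specify the reassignment rule, so `realign_by_rank` assumes a simple majority vote among objects that share a rank, and the function names and toy data are invented for illustration. Purity is computed in the usual way, as the fraction of objects that belong to the majority class of their cluster.

```python
from collections import Counter, defaultdict


def purity(cluster_labels, class_labels):
    """Fraction of objects that fall in the majority class of their cluster."""
    members = defaultdict(list)
    for cluster, cls in zip(cluster_labels, class_labels):
        members[cluster].append(cls)
    majority_total = sum(Counter(v).most_common(1)[0][1] for v in members.values())
    return majority_total / len(cluster_labels)


def split_ranks(ranks, cluster_labels):
    """Return the ranks whose objects are spread over more than one cluster."""
    clusters_by_rank = defaultdict(set)
    for rank, cluster in zip(ranks, cluster_labels):
        clusters_by_rank[rank].add(cluster)
    return sorted(r for r, c in clusters_by_rank.items() if len(c) > 1)


def realign_by_rank(ranks, cluster_labels):
    """Assumed rule (not from the paper): move each object to the cluster
    that is most common among the objects sharing its rank."""
    votes = defaultdict(Counter)
    for rank, cluster in zip(ranks, cluster_labels):
        votes[rank][cluster] += 1
    return [votes[rank].most_common(1)[0][0] for rank in ranks]


# Toy data, invented for illustration only.
ranks = [1, 1, 1, 2, 2, 2]                     # ordinal rank of each object
kmeans_labels = [0, 0, 1, 1, 1, 1]             # distance-based cluster labels
true_classes = ["A", "A", "A", "B", "B", "B"]  # reference classification

realigned = realign_by_rank(ranks, kmeans_labels)
```

In this toy case the third object has rank 1 but sits in cluster 1, so `split_ranks` reports rank 1 as split. After the majority-vote realignment, each rank maps to a single cluster and the purity against `true_classes` rises.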

     

     


  • Keywords


    K-Means, Ranking-Based Clustering



 


Article ID: 11209
 
DOI: 10.14419/ijet.v7i2.15.11209




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.