AK-means: an automatic clustering algorithm based on K-means

Omar Kettani; Faical Ramdani; Benaissa Tadili

doi:10.14419/jacst.v4i2.4749

Authors
- Omar Kettani, Mohamed V University, Rabat
- Faical Ramdani, Mohamed V University, Rabat
- Benaissa Tadili, Mohamed V University, Rabat

Received date: May 11, 2015

Accepted date: June 8, 2015

Published date: June 13, 2015

Keywords: Automatic Clustering, G-Means, K-Means, Parameter-Free Clustering.

Abstract

In data mining, K-means is a simple and fast algorithm for solving clustering problems, but it requires the user to specify the exact number of clusters (k) in advance, which is often not obvious. This paper addresses that problem by proposing a parameter-free algorithm for automatic clustering, based on successive, suitably chosen restarts of the K-means algorithm. Experiments conducted on several standard data sets demonstrate that the proposed approach is effective and outperforms the related, well-known G-means algorithm in terms of clustering accuracy and estimation of the correct number of clusters.
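
The abstract does not say how AK-means chooses its restarts, so the following is not the authors' algorithm. It is a minimal Python sketch of the general idea the abstract motivates: rerun K-means for several candidate values of k and keep the value that scores best on a cluster validity index. The Calinski-Harabasz index, the farthest-first seeding, the candidate range 2..k_max, the toy data, and all function names are assumptions chosen for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def farthest_first(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from its nearest chosen seed."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, k, max_iter=100):
    """Standard Lloyd iterations; returns (centers, labels)."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = centroid(members)
    return centers, labels


def calinski_harabasz(points, centers, labels):
    """Between-cluster over within-cluster dispersion, each scaled by its
    degrees of freedom; larger values mean better-separated clusters."""
    n, k = len(points), len(centers)
    overall = centroid(points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(labels.count(j) * sq_dist(centers[j], overall)
                  for j in range(k))
    return (between / (k - 1)) / (within / (n - k))


def choose_k(points, k_max):
    """Rerun K-means for k = 2..k_max; return (k, labels) with the best score."""
    best = None
    for k in range(2, k_max + 1):
        centers, labels = kmeans(points, k)
        score = calinski_harabasz(points, centers, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best[1], best[2]


# Toy data (hypothetical): three well-separated groups of five 2-D points.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
data = [(cx + dx, cy + dy)
        for cx, cy in [(0, 0), (10, 0), (0, 10)]
        for dx, dy in offsets]

best_k, best_labels = choose_k(data, k_max=6)
print(best_k)  # prints 3 for this toy data
```

On this toy input the sweep recovers the three groups without k being supplied by the user. A sweep of this kind still needs an upper bound k_max, which is one reason a parameter-free method such as the one proposed in the paper is attractive.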
How to Cite
Kettani, O., Ramdani, F., & Tadili, B. (2015). AK-means: an automatic clustering algorithm based on K-means. Journal of Advanced Computer Science & Technology, 4(2), 231-236. https://doi.org/10.14419/jacst.v4i2.4749