Multi-label Classification: a survey

  • Abstract

    Widespread use of the internet generates huge volumes of data that need proper organization, which motivates text categorization. A document was initially assumed to describe a single category, but it was soon recognized that a document can describe multiple categories simultaneously. This scenario calls for multi-label classification, a supervised learning approach that assigns a set of predefined labels to an object based on its characteristics. First used in text categorization, it soon became the choice of researchers for a wide range of applications such as marketing, multimedia annotation, and bioinformatics. The two most common approaches to multi-label classification are transformation, which converts multi-label data to single-label data so that existing single-label classifiers can be applied, and adaptation, which designs classifiers that handle multi-label data directly. Another popular approach is an ensemble of multiple classifiers that combines the votes of all its members. Other approaches are also available, namely the algorithm-independent and algorithm-dependent approaches. Depending on whether the prediction is binary or a ranking, a suitable metric is used for example-wise or label-wise evaluation of the results. Every approach has benefits and issues, such as loss of label dependency in transformation, complexity in adaptation, and improved results with ensembles, all of which should be considered when designing the underlying application.
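
    The transformation, ensemble voting, and evaluation ideas summarized above can be illustrated with a small sketch. It is not taken from the survey: the function names, the toy documents, and the label names are all hypothetical. The two transformations shown are commonly known as binary relevance and label powerset, and Hamming loss stands in for an example-wise metric on binary predictions.

```python
from collections import Counter

def binary_relevance_split(X, Y, labels):
    """Transformation: build one single-label (binary) dataset per label.
    Each label is treated independently, so label dependency is lost."""
    return {lab: [(x, int(lab in y)) for x, y in zip(X, Y)] for lab in labels}

def label_powerset(Y):
    """Transformation: treat each distinct label set as one class.
    This keeps label co-occurrence but can create many classes."""
    return [frozenset(y) for y in Y]

def majority_vote(votes):
    """Ensemble: keep a label if more than half of the models predict it."""
    counts = Counter(label for v in votes for label in v)
    return {label for label, c in counts.items() if c > len(votes) / 2}

def hamming_loss(Y_true, Y_pred, labels):
    """Example-wise metric for binary predictions: the fraction of
    (example, label) slots that are predicted incorrectly."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(Y_true, Y_pred))
    return wrong / (len(Y_true) * len(labels))

# Hypothetical toy data: three documents, three possible labels.
labels = ["sports", "politics", "tech"]
X = ["doc0", "doc1", "doc2"]
Y = [{"sports"}, {"politics", "tech"}, {"sports", "tech"}]

per_label = binary_relevance_split(X, Y, labels)   # one binary task per label
classes = label_powerset(Y)                        # one multi-class task

# Three hypothetical models voting on a single document.
votes = [{"sports", "tech"}, {"sports"}, {"tech", "politics"}]
combined = majority_vote(votes)

# Hypothetical predictions for the three documents.
Y_pred = [{"sports"}, {"politics"}, {"sports", "politics"}]
loss = hamming_loss(Y, Y_pred, labels)
```

    In this sketch each binary dataset in `per_label` could be handed to any existing single-label classifier, which is the benefit of transformation noted above, while the independence of those datasets shows where label dependency is lost.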



  • Keywords

    classification; machine learning; multi-label; supervised

Article ID: 28284
DOI: 10.14419/ijet.v7i4.19.28284

Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.