Feature selection using ant lion optimization algorithm in text categorization

  • Authors

    • B. Sunil Srinivas, Farah Institute of Technology, Affiliated to JNTUH
    • A. Govardhan, JNTU College of Engineering
    2019-12-15
    https://doi.org/10.14419/ijet.v8i4.10898
  • Antlion Optimization Algorithm, Classification, Dimensionality Reduction, Feature Selection, Text Categorization, Support Vector Machine
  • This is the decade of Big Data, marked by a rapid increase in textual information, and text classification is an important approach for processing and organizing that information. Text categorization refers to the process of automatically assigning documents to the relevant classes. A key characteristic of the text classification problem is the very high dimensionality of text data. Meta-heuristic approaches are readily employed to obtain optimal solutions for high-dimensional datasets in text categorization. However, some of these approaches, such as genetic algorithms and particle swarm optimization, give sub-optimal solutions, take longer to converge than other approaches, and cannot guarantee the global maximum for text categorization. Thus, in this paper, a nature-inspired optimization approach based on the hunting mechanism of antlions, known as the Ant Lion Optimizer (ALO), is applied to reduce dimensionality prior to text classification. The precision and recall values of the proposed approach compare favourably with those of existing dimensionality reduction techniques for text categorization.
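
    The idea of antlion-style feature selection can be illustrated with a minimal sketch. It is not the authors' implementation: the toy fitness (hypothetical per-feature relevance scores standing in for classifier quality), the 0.5 threshold that turns a continuous position into a feature mask, the shrink schedule, and all names and parameter values are assumptions made for illustration. It keeps the usual ingredients of the Ant Lion Optimizer: roulette-wheel choice of an antlion, bounded random walks around that antlion and around the elite, and replacement of antlions by fitter ants.

```python
import random


def fitness(mask, relevance, alpha=0.9):
    """Toy stand-in for classifier quality (assumption, not the paper's fitness)."""
    if not any(mask):
        return 0.0
    gain = sum(r for r, m in zip(relevance, mask) if m) / sum(relevance)
    kept = sum(mask) / len(mask)
    # Reward relevant features, mildly penalise large subsets.
    return alpha * gain + (1 - alpha) * (1 - kept)


def random_walk(steps, lo, hi, t, rng):
    """Cumulative +/-1 walk, min-max scaled into [lo, hi]; returns the value at step t."""
    walk = [0]
    for _ in range(steps):
        walk.append(walk[-1] + (1 if rng.random() > 0.5 else -1))
    a, b = min(walk), max(walk)  # b > a whenever steps >= 1
    return lo + (walk[t] - a) / (b - a) * (hi - lo)


def roulette(weights, rng):
    """Pick an index with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        return rng.randrange(len(weights))
    pick, acc = rng.random() * total, 0.0
    for i, w in enumerate(weights):
        acc += w
        if pick <= acc:
            return i
    return len(weights) - 1


def alo_select(relevance, n_agents=10, max_iter=30, seed=0):
    """Simplified antlion-style search over feature masks (illustrative only)."""
    rng = random.Random(seed)
    dim = len(relevance)

    def to_mask(pos):
        return [p > 0.5 for p in pos]

    def score(pos):
        return fitness(to_mask(pos), relevance)

    antlions = [[rng.random() for _ in range(dim)] for _ in range(n_agents)]
    fits = [score(p) for p in antlions]
    best = max(range(n_agents), key=fits.__getitem__)
    elite, elite_fit = antlions[best][:], fits[best]

    for t in range(1, max_iter + 1):
        # The walk window shrinks over time (assumed schedule), mimicking the trap.
        radius = 0.5 / (1 + 10 * t / max_iter)
        ants = []
        for _ in range(n_agents):
            chosen = antlions[roulette(fits, rng)]
            pos = []
            for j in range(dim):
                around = []
                for centre in (chosen[j], elite[j]):
                    lo, hi = max(0.0, centre - radius), min(1.0, centre + radius)
                    around.append(random_walk(max_iter, lo, hi, t, rng))
                pos.append(sum(around) / 2)  # average of the two walks
            ants.append(pos)
        # Antlions are replaced by fitter ants: keep the best n_agents overall.
        pool = sorted(
            zip([score(p) for p in ants] + fits, ants + antlions),
            key=lambda pair: pair[0],
            reverse=True,
        )[:n_agents]
        fits = [f for f, _ in pool]
        antlions = [p for _, p in pool]
        if fits[0] > elite_fit:
            elite, elite_fit = antlions[0][:], fits[0]

    return to_mask(elite), elite_fit


# Hypothetical relevance scores for eight terms; three carry signal.
relevance = [0.9, 0.8, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0]
mask, fit = alo_select(relevance)
print(mask, round(fit, 3))
```

    In a real text categorization pipeline the toy fitness would presumably be replaced by a classifier-based score, for example the cross-validated precision and recall of a Support Vector Machine trained on the selected terms.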


  • How to Cite

    Sunil Srinivas, B., & Govardhan, A. (2019). Feature selection using ant lion optimization algorithm in text categorization. International Journal of Engineering & Technology, 8(4), 582-589. https://doi.org/10.14419/ijet.v8i4.10898