A Comparative analysis of machine learning algorithms applied to multi lingual texts summarization

 
 
 
  • Abstract


    Over the last few years the World Wide Web (WWW) has grown enormously, and huge volumes of information in the form of news articles are now available online. Readers often lack the time and patience to read whole news sections or long articles, which creates the need for automated text summarization. If a summary of the actual contents of a news item is available, it becomes easier for the user to get the gist of the article, and it saves a lot of time. There are numerous approaches to text summarization, which can be classified on the basis of factors such as the level of processing, the kind of information being processed, etc. The work proposed in this paper tries to integrate these approaches with modern computational linguistics, semantic technologies and machine learning algorithms to devise a novel technique for multilingual text summarization that can produce summaries for single as well as multiple documents. The proposed method specifically addresses two major languages: English, being the language used worldwide, and Hindi, being the national language of India. The machine learning techniques used for extraction are neural networks and fuzzy logic systems. Finally, a comparison of these techniques shows that fuzzy logic systems give better precision than neural networks for summarization in both languages. The average difference in precision is around 8-10% for Hindi and around 45-50% for English text documents.

     

     


  • Keywords


    multi lingual text summarization, computational linguistics, machine learning techniques, semantic technologies



 


Article ID: 24568
 
DOI: 10.14419/ijet.v7i3.24.24568




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.