Comparison of Algorithms in Authorship Identification using Bengali Poems

 
 
 
  • Abstract


    This paper addresses authorship identification for Bengali poems: given a poem, determine which author wrote it. We train the system on a dataset of features extracted from poems by various authors. The features considered are the counts of characters, words, spaces, vowels, and consonants in each poem. Many learning algorithms can be used to identify authors, including J48, SVM, PCA, RDM, Random Forest, Logistic Regression, and Naive Bayes. Each algorithm has its own advantages and disadvantages. The most commonly used algorithm is the J48 decision tree. It handles missing values, prunes decision trees, supports continuous attribute value ranges, and derives rules, all of which help when classifying larger datasets.
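The count features named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' extraction code: the function names, the choice to count only non-whitespace code points as characters, and the decision to treat both independent vowel letters and dependent vowel signs as vowels are all assumptions made for illustration. Signs such as anusvara and hasanta are counted as characters but as neither vowels nor consonants.

```python
import unicodedata

# Assumption: consonant letters that sit outside the main consonant range
# (khanda ta, rra, rha, yya) in the Unicode Bengali block.
EXTRA_CONSONANTS = {0x09CE, 0x09DC, 0x09DD, 0x09DF}


def is_bengali_vowel(ch):
    """True for independent vowel letters and dependent vowel signs (assumed definition)."""
    cp = ord(ch)
    return 0x0985 <= cp <= 0x0994 or 0x09BE <= cp <= 0x09CC


def is_bengali_consonant(ch):
    """True for Bengali consonant letters."""
    cp = ord(ch)
    return 0x0995 <= cp <= 0x09B9 or cp in EXTRA_CONSONANTS


def extract_features(poem):
    """Return the five count features for one poem as a dict."""
    # Normalize so that composed and decomposed spellings count the same way.
    text = unicodedata.normalize("NFC", poem)
    return {
        "characters": sum(1 for ch in text if not ch.isspace()),
        "words": len(text.split()),
        "spaces": text.count(" "),
        "vowels": sum(1 for ch in text if is_bengali_vowel(ch)),
        "consonants": sum(1 for ch in text if is_bengali_consonant(ch)),
    }


if __name__ == "__main__":
    print(extract_features("আমার সোনার বাংলা"))
```

Each poem then becomes one row of five numbers plus an author label, which is the kind of table a decision tree learner such as J48 takes as input.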

     


  • Keywords


    Authorship identification; Bengali poems; J48 decision tree.



 


Article ID: 21989
 
DOI: 10.14419/ijet.v7i4.19.21989




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.