Onto based cluster labelling and incremental system for information retrieval

  • Authors

    • Seifedine Kadry, Beirut Arab University
    • Irfan Ahamed Mohammed Saleem, Hindusthan College of Engineering and Technology
    • Lakshmana Kumar Ramasamy, Nehru Institute of Engineering and Technology
    • N. Kannammal, Surya Engineering College
    2019-04-12
    https://doi.org/10.14419/ijet.v7i4.22914
  • Semantic Document Clustering, K-Means, Labels, Synsets.
  • Document clustering is used for information retrieval, and assigning labels to cluster members in advance speeds up retrieval. Existing systems assign labels based on the common terms that appear in the documents. Semantic labels, however, first take the semantic relationships among documents into account, and an incremental algorithm keeps the system scalable. The proposed work assigns onto-labels based on a taxonomy, i.e. it is ontology-based, using WordNet synsets and gloss matching, and incremental dynamicity is achieved through the labelling. The evaluation uses F-measure and computation speed, compared against benchmark K-Means and K-Means without labels. Semantic labelling is thus shown to be more efficient than traditional document clustering methods and can be applied to real-time internet document clustering.
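The idea of label-driven incremental assignment described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a tiny hand-written taxonomy (`TOY_TAXONOMY`, hypothetical) in place of WordNet, uses the hypernym of known terms as the cluster label, and falls back to a simple gloss-overlap match when no known term appears. All function names and the stopword list are illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy taxonomy standing in for WordNet: term -> (hypernym, gloss).
TOY_TAXONOMY = {
    "dog": ("animal", "a domesticated carnivorous animal kept as a pet"),
    "cat": ("animal", "a small domesticated feline animal kept as a pet"),
    "car": ("vehicle", "a motor vehicle with four wheels used on a road"),
    "bus": ("vehicle", "a large motor vehicle carrying passengers on a road"),
}

# Illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "the", "on", "with", "as", "in", "of", "and", "is"}


def tokens(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,;:!?").lower() for w in text.split()]


def label_document(text):
    """Vote for the hypernym of every known term; return the winner or None."""
    votes = Counter(
        TOY_TAXONOMY[t][0] for t in tokens(text) if t in TOY_TAXONOMY
    )
    return votes.most_common(1)[0][0] if votes else None


def gloss_overlap_label(text):
    """Fallback: pick the hypernym whose term gloss shares most words with the text."""
    words = set(tokens(text)) - STOPWORDS
    best_label, best_score = None, 0
    for hypernym, gloss in TOY_TAXONOMY.values():
        score = len(words & (set(tokens(gloss)) - STOPWORDS))
        if score > best_score:
            best_label, best_score = hypernym, score
    return best_label


def add_document(clusters, text):
    """Route one new document to the cluster for its label, without re-clustering."""
    label = label_document(text) or gloss_overlap_label(text) or "unlabelled"
    clusters.setdefault(label, []).append(text)
    return label


clusters = {}
for doc in [
    "The dog chased the cat.",
    "A bus and a car collided.",
    "Passengers waited on the road.",
    "Quarterly revenue grew.",
]:
    print(add_document(clusters, doc), "<-", doc)
```

Because each incoming document is routed by its label alone, existing clusters are never recomputed, which is the sense in which labelling supports incremental operation in this sketch. A real system would replace the toy dictionary with WordNet lookups and would need word-sense handling that is omitted here.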


  • How to Cite

    Kadry, S., Ahamed Mohammed Saleem, I., Kumar Ramasamy, L., & Kannammal, N. (2019). Onto based cluster labelling and incremental system for information retrieval. International Journal of Engineering & Technology, 7(4), 5699-5704. https://doi.org/10.14419/ijet.v7i4.22914