Onto based cluster labelling and incremental system for information retrieval

Seifedine Kadry; Irfan Ahamed Mohammed Saleem; Lakshmana Kumar Ramasamy; N. Kannammal

doi:10.14419/ijet.v7i4.22914

Authors

Seifedine Kadry
Beirut Arab University
Irfan Ahamed Mohammed Saleem
Hindusthan College of Engineering and Technology
Lakshmana Kumar Ramasamy
Nehru Institute of Engineering and Technology
N. Kannammal
Surya Engineering College

Received date: December 2, 2018

Accepted date: March 29, 2019

Published date: April 12, 2019

DOI:

https://doi.org/10.14419/ijet.v7i4.22914

Keywords:

Semantic Document Clustering, K-Means, Labels, Synsets.

Abstract

Document clustering is used for information retrieval, and assigning labels to cluster members further speeds up retrieval. Existing systems assign labels based on the common terms present in the documents. Semantic labels, by contrast, first take the semantic relationships between documents into account, and an incremental algorithm makes the system adaptable. The proposed work assigns ontology-based ("onto") labels derived from a taxonomy, using WordNet synsets and gloss matching, and achieves incremental dynamism through labelling. The evaluation uses F-measure and computation speed, compared against baseline K-Means and K-Means without labels. Semantic labelling is thus reported to be more efficient than traditional document clustering methods, and it can be applied to real-time internet document clustering.
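The labelling and incremental-assignment idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy taxonomy `TOY_HYPERNYMS` and the function names `semantic_label` and `assign_incrementally` are hypothetical stand-ins, and a real system would look up WordNet synsets and glosses rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical toy taxonomy standing in for WordNet hypernym links
# (term -> broader concept). This is an assumption for illustration only.
TOY_HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "horse": "animal",
    "car": "vehicle",
    "bus": "vehicle",
    "truck": "vehicle",
}


def semantic_label(cluster_docs):
    """Label a cluster with the broader concept most of its terms map to."""
    concepts = Counter()
    for doc in cluster_docs:
        for term in doc.lower().split():
            concept = TOY_HYPERNYMS.get(term)
            if concept is not None:
                concepts[concept] += 1
    if not concepts:
        return None
    return concepts.most_common(1)[0][0]


def assign_incrementally(new_doc, labelled_clusters):
    """Add new_doc to the cluster whose label matches most of its terms.

    Returns the chosen label, or None when no term maps to an existing label.
    No re-clustering is performed; the labels alone drive the assignment.
    """
    votes = Counter(
        TOY_HYPERNYMS[t] for t in new_doc.lower().split() if t in TOY_HYPERNYMS
    )
    for label, _count in votes.most_common():
        if label in labelled_clusters:
            labelled_clusters[label].append(new_doc)
            return label
    return None


# Two clusters, assumed to come from an earlier clustering step such as K-Means.
clusters = [
    ["the dog chased the cat", "a horse and a dog"],
    ["the bus passed a truck", "car and bus traffic"],
]
labelled = {semantic_label(docs): list(docs) for docs in clusters}
```

In this sketch a new document is routed by comparing its terms against the existing labels, so the clusters do not have to be recomputed when documents arrive one at a time.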

