A new approach for finding semantic similar scientific articles

Masumeh Islami Nasab; Reza Javidan

doi:10.14419/jacst.v4i1.4012

Authors and Affiliations

Masumeh Islami Nasab, MSc Student
Reza Javidan, Assistant Professor, Computer Engineering and IT Department, Shiraz University of Technology

About this article

DOI:

https://doi.org/10.14419/jacst.v4i1.4012

Received:

10-12-2014

Accepted:

05-01-2015

Published:

16-02-2015

Keywords:

Similarities, Semantic Similarities, Text Preprocessing, WordNet.

Abstract

Calculating article similarity enables users to find related articles and documents in a collection. Identifying similar documents is useful for text applications such as document-to-document similarity search, plagiarism checking, text mining for repetition, and text filtering. This paper proposes a new method for calculating the semantic similarity of articles. WordNet is used to find semantic associations between words. The proposed technique first compares the parts of the two articles pairwise. The final similarity is then calculated as a weighted mean of the part-level results. The results are compared with human scores using Pearson's correlation coefficient. The proposed system achieves a correlation coefficient above 87 percent with the human scores, which indicates that it identifies similarities accurately.
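
The two numerical steps named in the abstract, the weighted mean over article parts and the comparison with human scores by Pearson's correlation coefficient, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The part names (`title`, `abstract`), the weights, and the helper names are hypothetical, and `token_overlap` is only a stand-in for the WordNet-based word similarity, used so that the sketch runs without external data.

```python
from math import sqrt
from typing import Callable, Dict, Sequence


def token_overlap(a: str, b: str) -> float:
    """Stand-in part similarity: Jaccard overlap of lowercased tokens.

    Assumption: the paper uses WordNet word associations at this step;
    this placeholder only keeps the sketch self-contained.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def article_similarity(
    art_a: Dict[str, str],
    art_b: Dict[str, str],
    weights: Dict[str, float],
    part_sim: Callable[[str, str], float] = token_overlap,
) -> float:
    """Weighted mean of per-part similarities over the parts named in `weights`."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(
        w * part_sim(art_a.get(part, ""), art_b.get(part, ""))
        for part, w in weights.items()
    )
    return score / total


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

In use, `article_similarity` would be called once per article pair, and `pearson` would then be applied to the list of system scores and the matching list of human scores. Passing a different `part_sim` function is how a WordNet-based measure could be substituted for the placeholder.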

How to Cite

Islami Nasab, M., & Javidan, R. (2015). A new approach for finding semantic similar scientific articles. Journal of Advanced Computer Science & Technology, 4(1), 53-59. https://doi.org/10.14419/jacst.v4i1.4012
