Attributes Correspondence Discovery in Ontology Instance-based Matching and RDF Data Linkage using Clustering Method

  • Authors

    • Mansir Abubakar
    • Hazlina Hamdan
    • Norwati Mustapha
    • Teh Noranis Mohd Aris
    2018-12-09
    https://doi.org/10.14419/ijet.v7i4.31.23383
  • Ontology Matching, Instance Matching, Clustering Method, Feature Value Identification, Attribute Discovery, Mapping Generation.
  • An important aspect of the ontology instance matching process is element (attribute) discovery. It specifies element correspondences in order to produce potential matching elements; without it, every element of a class in the source ontology has to be compared with every element of a class in the target ontology. This exhaustive comparison is time-consuming, degrades the performance of the matching system, and leaves the matching incomplete. Matching two or more ontologies and RDF datasets requires complete instance matching in order to establish logically equivalent relations among semantically related entities of the data sources. This is challenging because semantic heterogeneity and irregular data in the RDF data sources make element discovery and feature value extraction difficult. We therefore propose a four-step element discovery method that uses the unsupervised K-Medoids clustering algorithm to discover potential matching element pairs. To ensure generalization, we take the unsupervised Canopy Clustering method as the baseline for our evaluation. In terms of scalability, our method outperforms the baseline, achieving approximately 99% in both Pair Completeness and Reduction Ratio, against 60% and 86% respectively for the baseline. In mapping pattern generation, our method also outperforms the baseline algorithm, with an overall F-Measure of ~91% against ~85%. Comparison with other methods confirms the significant effect of clustering attributes in the initial stage of instance matching, which can save about 50% of the comparisons.
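
The abstract reports Pair Completeness, Reduction Ratio, and F-Measure without defining them. The sketch below computes all three using the definitions common in the record-linkage literature; whether the paper uses exactly these formulas is an assumption. The attribute names and candidate pairs are hypothetical toy data, not the paper's datasets or results.

```python
def pair_completeness(candidates, true_matches):
    """Fraction of the true matching pairs retained in the candidate set."""
    if not true_matches:
        return 0.0
    return len(candidates & true_matches) / len(true_matches)


def reduction_ratio(candidates, n_source, n_target):
    """Fraction of the full source x target comparison space that is pruned."""
    return 1.0 - len(candidates) / (n_source * n_target)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical attributes of one source class and one target class.
source_attrs = ["name", "birthDate", "city", "phone"]
target_attrs = ["label", "dob", "town", "tel", "email"]

# Hypothetical gold-standard correspondences.
true_matches = {
    ("name", "label"),
    ("birthDate", "dob"),
    ("city", "town"),
    ("phone", "tel"),
}

# Hypothetical candidate pairs kept after a clustering step.
candidates = {
    ("name", "label"),
    ("birthDate", "dob"),
    ("city", "town"),
    ("city", "email"),
    ("phone", "email"),
}

pc = pair_completeness(candidates, true_matches)
rr = reduction_ratio(candidates, len(source_attrs), len(target_attrs))
print(f"Pair Completeness: {pc:.2f}")
print(f"Reduction Ratio:   {rr:.2f}")
```

In this toy case 3 of the 4 true pairs survive, and 5 candidate pairs replace the 20 exhaustive comparisons, so both measures come out at 0.75. A high Reduction Ratio is only useful when Pair Completeness stays high as well, which is why the abstract reports the two together.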

  • How to Cite

    Abubakar, M., Hamdan, H., Mustapha, N., & Noranis Mohd Aris, T. (2018). Attributes Correspondence Discovery in Ontology Instance-based Matching and RDF Data Linkage using Clustering Method. International Journal of Engineering & Technology, 7(4.31), 290-297. https://doi.org/10.14419/ijet.v7i4.31.23383