Semantic similarity between ontologies at different scales

Qingpeng Zhang, David Haglin

    Research output: Journal Publications and Reviews; RGC 22 - Publication in policy or professional journal

    8 Citations (Scopus)

    Abstract

    In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. These ontologies are often very large in terms of the number of concepts and relationships, which makes analyzing them and the knowledge graphs they represent computationally expensive and time-consuming. As the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-offs between ontological scale and the preservation and precision of results when we analyze ontologies. This paper presents the first effort to examine the feasibility of this idea by studying the relationship between the scaling of biomedical ontologies at different levels and the resulting semantic similarity values. We evaluate the semantic similarity between three Gene Ontology slims (plant, yeast, and candida, of which the latter two belong to the same kingdom, fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results of this study demonstrate that, with proper selection of scaling levels and similarity measures, we can significantly reduce the size of ontologies without losing substantial detail. In particular, Jiang-Conrath and Lin perform more reliably and stably than the other two measures in this experiment, as shown by 1) consistently indicating that yeast and candida are more similar to each other than to plant at different scales, and 2) small deviations in the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
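
    The abstract names four information-content-based measures without stating their formulas. As a rough illustration only, the sketch below applies their commonly cited textbook definitions to a tiny hypothetical hierarchy. The term names, annotation counts, and the conversion of the Jiang-Conrath distance into a similarity via 1 / (1 + distance) are all assumptions made for this example; the paper's own implementation and data are not described in this record.

```python
import math

# Hypothetical toy hierarchy (child -> parents); not taken from any GO slim.
PARENTS = {"root": [], "a": ["root"], "b": ["root"], "a1": ["a"], "a2": ["a"]}

# Hypothetical number of annotations attached directly to each term.
DIRECT_COUNTS = {"root": 0, "a": 2, "b": 4, "a1": 1, "a2": 1}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def probability(term):
    """p(term): share of annotations on the term or any of its descendants."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return covered / total


def ic(term):
    """Information content: IC(c) = log(1 / p(c))."""
    return math.log(1.0 / probability(term))


def mica(c1, c2):
    """Most informative common ancestor of two terms."""
    return max(ancestors(c1) & ancestors(c2), key=ic)


def resnik(c1, c2):
    return ic(mica(c1, c2))


def lin(c1, c2):
    denom = ic(c1) + ic(c2)
    # Both terms carry no information (e.g. the root); 0.0 is a convention
    # chosen for this sketch to avoid dividing by zero.
    return 2.0 * resnik(c1, c2) / denom if denom > 0 else 0.0


def jiang_conrath(c1, c2):
    # Jiang-Conrath is defined as a distance; the mapping to a similarity
    # used here is one common choice and an assumption of this sketch.
    distance = ic(c1) + ic(c2) - 2.0 * resnik(c1, c2)
    return 1.0 / (1.0 + distance)


def sim_rel(c1, c2):
    # Lin's value, down-weighted when the common ancestor is very general.
    return lin(c1, c2) * (1.0 - probability(mica(c1, c2)))


if __name__ == "__main__":
    for name, measure in [("Resnik", resnik), ("Lin", lin),
                          ("Jiang-Conrath", jiang_conrath), ("SimRel", sim_rel)]:
        print(f"{name}: {measure('a1', 'a2'):.3f}")
```

    In this toy setting, pruning the leaf terms `a1` and `a2` would correspond to moving to a coarser scale, which is the kind of reduction whose effect on the similarity values the study examines.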
    Original language: English
    Article number: 7451100
    Pages (from-to): 132-140
    Journal: IEEE/CAA Journal of Automatica Sinica
    Volume: 3
    Issue number: 2
    Publication status: Published - 10 Apr 2016

    Research Keywords

    • biomedical informatics
    • computational biology
    • knowledge representation
    • semantic web
