A user-oriented semantic annotation approach to knowledge acquisition and conversion

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal

1 Scopus Citations

Author(s)

Detail(s)

Original language: English
Pages (from-to): 393–411
Journal / Publication: Journal of Information Science
Volume: 43
Issue number: 3
Online published: 1 Apr 2016
Publication status: Published - 1 Jun 2017

Abstract

Semantic annotation of natural language texts labels the meaning of an annotated element in a specific context, and is thus an essential procedure for domain knowledge acquisition. An extensible and coherent annotation method is crucial for knowledge engineers, because it reduces the human effort needed to keep annotations consistent. This article proposes a comprehensive semantic annotation approach supported by a user-oriented markup language, UOML, which is intended to enhance annotation efficiency with the aim of building a high-quality knowledge base. UOML is operable by human annotators and convertible to formal knowledge representation languages. A pattern-based annotation conversion method, PAC, is further proposed for knowledge exchange; it relies on automatic pattern learning. We designed and implemented a semantic annotation platform, Annotation Assistant, to test the effectiveness of the approach. By applying this platform for more than three years in a long-term international research project aimed at high-quality knowledge acquisition from a classical Chinese poetry corpus containing 52,621 Chinese characters, we acquired 150,624 qualified annotations. Our test shows that the approach improved operational efficiency by 56.8% on average compared with text-based manual annotation. Using UOML, PAC achieved a conversion error ratio of 0.2% on average, significantly improving annotation consistency compared with baseline annotations. The results indicate that the approach is feasible for practical use in knowledge acquisition and conversion.

Research Area(s)

  • Knowledge engineering, methodologies and tools, ontologies, pattern learning, semi-structured data and XML