Constructing a linguistic resource of verbs: an ontology engineering approach
Student thesis: Doctoral Thesis
This research constructs a linguistic resource in the form of a knowledge base that automatically classifies English verbs into semantic categories, referred to as Process types in Systemic Functional Grammar (SFG) (Halliday, 1994 & 2004). The semantics of the verb as Process is described and explained both in terms of its computational realization in an ontological framework and in terms of latent connections across different linguistic resources. Each Process type is instantiated with English verbs. Lexical data has been imported from WordNet (Fellbaum, 1998), including more than 11,000 verb lemmas and 13,650 sense entries, a total of 24,890 lemma-sense pairs. This research presents the semantics of verbs in an ontological knowledge representation. The semantic description is modeled on SFG as a concept hierarchy in which the concepts are explicitly defined within an interrelated network. The lexical data in the ontological knowledge base not only serves as a dictionary database but also links to the SFG Process concepts, which enables axiomatic exploration of the whole systemic framework defined in the knowledge base. Verb semantic classification is the main issue in constructing the knowledge base. The problem is resolved through the interoperation of two linguistic resources, WordNet and FrameNet (Baker et al., 1998), and two ontologies, the Generalized Upper Model (Bateman, 1995) and the Suggested Upper Merged Ontology (SUMO) (Niles and Pease, 2001). Cognitive categorization models, including prototypes, feature frequency and family resemblances, are applied in the implementation of automatic categorization. The four systemic Process types – Material Process, Mental Process, Verbal Process and Relational Process – form the four semantic categories, and this research aims to categorize the 11,000 verbs from WordNet into these four categories.
This research also aims to provide an explicit and delicate description of Experiential Meaning in SFG. The semantics of Experiential Meaning is ontologically defined and linked with the verbs classified under the four systemic Process types, forming a linguistic resource that comprises both theoretical meta-concepts and intensive instances of English verbs. All exploited resources are ontologically mapped and consolidated into a unified resource. This allows the process of semantic categorization to be automated through various ontology engineering methods, including ontology mapping, conceptual relations, semantic similarity and clustering, and the application of axioms and inference. The knowledge in the generated knowledge base extends the mapped ontologies and databases into a semantic network elaborated in terms of the semantic properties described in Systemic Functional Grammar. Data is drawn from FrameNet, WordNet and SUMO; together they provide intensive dictionary information, case frames of verbs, lexical relations among verbs, encyclopedic world knowledge and the systemic semantic categories of English verbs. The constructed knowledge base explains the meaning of verbs across both the lexical semantic layer and the clausal conceptual layer. A web-based interface has been developed as a data retrieval tool for the knowledge base and is available at http://wcm.cityu.edu.hk/ctian/process/, where users can pose queries about the lexicographical information of English verbs and their semantic categories.
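As a rough illustration of the prototype and feature-frequency categorization mentioned in the abstract, the sketch below builds one prototype per Process type from the relative frequency of features among example members, then assigns a verb to the category whose prototype it overlaps most. It is a minimal sketch only: the feature names, the toy training sets and the scoring function are hypothetical assumptions made for illustration, and they are not taken from the thesis or from the knowledge base.

```python
from collections import Counter

# Hypothetical toy data: each category lists the feature sets of a few
# example verbs. Feature names are illustrative assumptions only.
TRAINING = {
    "Material": [{"agent", "motion", "affects_object"},
                 {"agent", "affects_object", "creation"}],
    "Mental": [{"senser", "perception"},
               {"senser", "cognition"}],
    "Verbal": [{"sayer", "communication"},
               {"sayer", "communication", "receiver"}],
    "Relational": [{"attribute", "state"},
                   {"identity", "state"}],
}


def build_prototypes(training):
    """Map each category to the relative frequency of each feature among its members."""
    prototypes = {}
    for category, members in training.items():
        counts = Counter(f for features in members for f in features)
        prototypes[category] = {f: n / len(members) for f, n in counts.items()}
    return prototypes


def categorize(features, prototypes):
    """Score each category by the summed frequency of the features it shares
    with the verb (a simple family-resemblance-style score) and return the
    best-scoring category together with all scores."""
    scores = {
        category: sum(w for f, w in proto.items() if f in features)
        for category, proto in prototypes.items()
    }
    return max(scores, key=scores.get), scores


PROTOTYPES = build_prototypes(TRAINING)
best, scores = categorize({"sayer", "communication"}, PROTOTYPES)
print(best, scores)
```

Features shared by every member of a category weigh most, so a verb matching the frequent features of a prototype is pulled toward that category even when it lacks the rarer ones. A real system would derive the features from resources such as WordNet or FrameNet rather than from hand-written sets.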
- Ontology, Verb, English language