This research investigates the terminological data in terminology databases (termbases) and
in corresponding corpora from commercial sources. Four companies in the information
technology (IT) sector are used as case studies. Our broad objective is to raise awareness
of some of the issues and challenges faced by terminologists in commercial
settings. We demonstrate that there are significant gaps between the termbases and the
corresponding corpora, that such gaps reduce the effectiveness of the termbases, and that
they can be minimised by adopting a corpus-based approach to term identification.

We begin by establishing that the language used in a company contains terminology. After
reviewing the conventional theories and methodologies of the field of terminology, we
challenge the suitability of some of their precepts for companies that require terminological
resources that are both repurposable and production-oriented. We then reveal features in the
termbases that depart from established norms. Using a batch concordance technique, we
quantify the gap between the termbase terms and the corpora. We then attempt to explain
this gap by examining termbase terms that occur in various frequency ranges within the
corpora. Using empirical observations, we formulate some guiding principles for selecting
terms for termbases with respect to various features including term length, part of speech,
term variation, and the use of certain types of modifiers.

We discover that keywords hold potential for discovering multi-word terms that, if documented
in termbases, would significantly increase the correspondence between termbases
and corpora. We conclude that termbases developed in companies would increase in value
if corpus-based approaches to term identification were adopted. This conclusion challenges the conventional
understanding of what a term is; to open the field of terminology to commercial
applications and environments, termhood needs to be established based on communicative
purpose and the end-use of terminological resources, in addition to purely semantic criteria.
| Date of Award | 15 Jul 2014 |
|---|---|
| Original language | English |
| Awarding Institution | City University of Hong Kong |
| Supervisor | Chengyu Alex FANG (Supervisor) |
- Commerce
- Terminology
- Corpora (Linguistics)
Narrowing the gap between termbases and corpora in commercial environments
WARBURTON, K. C. (Author). 15 Jul 2014
Student thesis: Doctoral Thesis