Abstract
In this paper, we describe the construction of a machine learning framework that exploits syntactic information in the recognition of biomedical terms, and we present the limits of machine learning in generating a list of novel term candidates. Conditional random fields (CRF) are used as the basis of this framework. We investigate how best to use syntactic information, including parent nodes, syntactic paths, and term ratios, within this framework. The experimental results show that a CRF model can achieve good precision in term recognition when trained with a list of known terms. However, for discovering potential novel terms for terminology lexicon editors, the CRF model performs poorly when it is trained only on the known term list and then asked to predict novel terms in the test corpus. This result suggests that more semantic information may be needed to determine whether a word is a novel term during a specific period. © 2010 by Jianguo Chen, John Smith, and Kinko Yamada.
| Original language | English |
|---|---|
| Title of host publication | PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation |
| Pages | 583-592 |
| Publication status | Published - 2010 |
| Event | 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24 - Sendai, Japan. Duration: 4 Nov 2010 → 7 Nov 2010 |
Conference
| Conference | 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24 |
|---|---|
| Place | Japan |
| City | Sendai |
| Period | 4/11/10 → 7/11/10 |
Research Keywords
- Conditional random fields
- Novel term recognition
- Term recognition
Fingerprint
Dive into the research topics of 'How well conditional random fields can be used in novel term recognition'. Together they form a unique fingerprint.