Unlocking the potential of veterinary clinical records using a data mining approach

Research output: Conference Papers; RGC 32 - Refereed conference paper (without host publication); peer-review

Author(s)

  • W. Kwok
  • N. Kennedy
  • D. Brodbelt
  • J. Speelman

Detail(s)

Original language: English
Publication status: Published - Nov 2018

Conference

Title: 15th International Symposium of Veterinary Epidemiology and Economics (ISVEE 15)
Location: Empress Chiang Mai Hotel
Place: Thailand
City: Chiang Mai
Period: 12 - 16 November 2018

Abstract

Objectives: Clinical records from veterinary practices are a rich source of data for epidemiological research. These data are typically stored in veterinary practice databases in an unstructured, free-text narrative format, which is difficult to analyse using standard computational and epidemiological methods. Processing large volumes of such unstructured data, where manual examination of records is not feasible, poses a number of challenges, e.g. the lack of standardization in data entry and format, the use of different terms to describe the same concept (syntactic irregularities) and the extraction of contextual information from free text. Data mining approaches have been little used for accessing veterinary clinical records, in contrast to human health records, where medical informatics techniques are well established. In this study, a natural language processing (NLP) and text mining methodology is presented to transform unstructured veterinary clinical data into structured data and so make them accessible for further analysis.

Materials and methods: Clinical entries are categorised by diagnosis and body system via an NLP and data mining approach. Standard veterinary ontologies such as Veterinary Nomenclature (VeNom) are used as a framework. These structured data are then systematically evaluated by applying algorithms for machine learning and case-based reasoning. The data are also compiled into a relational database allowing evaluation using SQL.
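
As an illustration only, the categorisation and SQL steps described above might be sketched in Python as follows. This is not the authors' implementation: the lexicon, the negation rule, the table layout and the names LEXICON, categorise and build_database are all assumptions made for the example, and the lexicon entries are invented rather than taken from VeNom. The sketch maps synonymous free-text terms to one standardised diagnosis and body system, skips mentions that follow a negation cue in the same clause, and loads the result into an in-memory SQLite table.

```python
import re
import sqlite3

# Hypothetical mini-lexicon: surface term -> (standardised diagnosis, body system).
# Entries are illustrative only and are NOT taken from VeNom.
LEXICON = {
    "otitis externa": ("Otitis externa", "Ear"),
    "ear infection": ("Otitis externa", "Ear"),
    "vomiting": ("Vomiting", "Gastrointestinal"),
    "emesis": ("Vomiting", "Gastrointestinal"),
    "pruritus": ("Pruritus", "Skin"),
    "itchy": ("Pruritus", "Skin"),
}

# Assumed, deliberately crude context rule: a negation cue earlier in the
# same clause (no '.', ';' or ',' in between) negates the mention.
NEGATION = re.compile(r"\b(no|not|without|denies)\b[^.;,]*$")


def categorise(note):
    """Return sorted (diagnosis, body_system) pairs asserted in a free-text note."""
    text = note.lower()
    found = set()
    for term, label in LEXICON.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", text):
            if NEGATION.search(text[:match.start()]):
                continue  # negated mention, e.g. "no vomiting"
            found.add(label)
    return sorted(found)


def build_database(notes):
    """Load categorised findings into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE finding (record_id INTEGER, diagnosis TEXT, body_system TEXT)"
    )
    for record_id, note in notes:
        for diagnosis, body_system in categorise(note):
            conn.execute(
                "INSERT INTO finding VALUES (?, ?, ?)",
                (record_id, diagnosis, body_system),
            )
    conn.commit()
    return conn


# Invented example notes, not real clinical records.
NOTES = [
    (1, "Ear infection in left ear; started emesis yesterday."),
    (2, "No vomiting, but itchy skin."),
    (3, "Owner reports pruritus and otitis externa."),
]

conn = build_database(NOTES)
counts = conn.execute(
    "SELECT body_system, COUNT(DISTINCT record_id) FROM finding "
    "GROUP BY body_system ORDER BY body_system"
).fetchall()
print(counts)
```

A dictionary lookup of this kind only handles exact term matches; misspellings, abbreviations and more subtle context would need the fuller NLP and machine learning methods the abstract refers to.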

Results: Implementation of the method in R and Python will be described, and its efficiency and accuracy tested using a database of small animal clinical records comprising over 5 million entries.

Conclusion: The use of NLP and text mining to convert unstructured veterinary clinical records into structured data, extract clinical context and classify diagnoses can make available a wealth of data which are currently inaccessible for veterinary epidemiological analysis.

Research Area(s)

  • Medical informatics, natural language processing, clinical records, veterinary ontology

Citation Format(s)

Unlocking the potential of veterinary clinical records using a data mining approach. / Kwok, W.; Kennedy, N.; Brodbelt, D. et al.
2018. Paper presented at 15th International Symposium of Veterinary Epidemiology and Economics (ISVEE 15), Chiang Mai, Thailand.
