A comparison study of topic modeling based literature analysis by using full texts and abstracts of scientific articles : a case of COVID-19 research
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Journal / Publication | Library Hi Tech |
Online published | 10 May 2022 |
Publication status | Online published - 10 May 2022 |
Link(s)
DOI | DOI |
---|---|
Document Link | |
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85129490795&origin=recordpage |
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(a5b7879d-d0ac-488b-b7a0-f09d7cee2948).html |
Abstract
Purpose - Extracting useful information from a very large volume of literature is a great challenge for librarians. Topic modeling, a machine learning technique for uncovering latent thematic structures in large collections of documents, is a widespread approach to literature analysis, especially given the rapid growth of academic literature. This paper compares topic modeling based literature analysis using the full texts of articles with the same analysis using their abstracts.
Design/methodology/approach - The authors conduct a comparison study of topic modeling on full-text papers and their corresponding abstracts to assess how the type of document used as input influences the results. In particular, the authors use the large volume of COVID-19 research literature as a case study for topic modeling based literature analysis. The authors illustrate the research topics, research trends and topic similarity of COVID-19 research by using latent Dirichlet allocation (LDA) and a topic visualization method.
Findings - The authors found 14 research topics for COVID-19 research. The authors also found that the topic similarity between full-text papers and their corresponding abstracts is higher when more documents are analyzed.
Originality/value - First, this study contributes to the literature analysis approach. The comparison study helps clarify how the type of input document influences the results of topic modeling analysis. Second, the authors present an overview of COVID-19 research by summarizing its 14 research topics. This automated literature analysis can help specialists in the health and medical domain, as well as other readers, quickly grasp the overall structure of current COVID-19 studies.
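As a rough illustration of what "topic similarity" between two topic models might look like in practice, the sketch below compares two sets of topic-word distributions. It is a minimal example under stated assumptions, not the authors' method: the abstract does not say which similarity measure was used, so cosine similarity with a best-match average is assumed here, and the topic-word weights are invented placeholders rather than results from the paper. The function names `cosine` and `mean_best_match` are hypothetical.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(weight * q.get(term, 0.0) for term, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def mean_best_match(topics_a, topics_b):
    """Average, over topics in topics_a, of the similarity to the closest topic in topics_b."""
    best = [max(cosine(a, b) for b in topics_b) for a in topics_a]
    return sum(best) / len(best)


# Invented topic-word weights (assumption: NOT taken from the paper).
# In a real analysis these would come from an LDA model fitted on each corpus.
full_text_topics = [
    {"vaccine": 0.5, "trial": 0.3, "dose": 0.2},
    {"lockdown": 0.6, "policy": 0.4},
]
abstract_topics = [
    {"lockdown": 0.6, "policy": 0.4},
    {"vaccine": 0.5, "trial": 0.5},
]

score = mean_best_match(full_text_topics, abstract_topics)
print(f"mean best-match topic similarity: {score:.3f}")
```

A score near 1 would suggest that the two inputs yield nearly the same topics. Note that the best-match average is not symmetric, so swapping the two arguments can give a different value.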
Research Area(s)
- Literature analysis, Research trend, Topic modeling, Topic similarity
Citation Format(s)
A comparison study of topic modeling based literature analysis by using full texts and abstracts of scientific articles : a case of COVID-19 research. / Cao, Qiang; Cheng, Xian; Liao, Shaoyi.
In: Library Hi Tech, 10.05.2022.