User-concerned actionable hot topic mining: enhancing interpretability via semantic–syntactic association matrix factorization
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 50-65 |
Journal / Publication | Journal of Electronic Business & Digital Economics |
Volume | 1 |
Issue number | 1/2 |
Online published | 13 Oct 2022 |
Publication status | Published - 2022 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(a61438ec-c5d5-4b3b-9349-a37e07ea606d).html |
---|---|
Abstract
Purpose - Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models primarily integrate word embedding and matrix decomposition, an approach that generates only keyword-based hot topics with weak interpretability and therefore struggles to meet the specific needs of users. Mining phrase-based hot topics with syntactic dependency structure has been shown to model structure information effectively. A key challenge lies in effectively integrating semantic association and syntactic dependency information into the hot topic mining process.
Design/methodology/approach - This paper proposes a nonnegative matrix factorization (NMF)-based hot topic mining method, the semantics syntax-assisted hot topic model (SSAHM), which combines semantic association and syntactic dependency structure. First, a semantic–syntactic component association matrix is constructed. Then, the matrix is incorporated as a constraint into the block coordinate descent (BCD)-based matrix decomposition process. Finally, a hot topic information-driven phrase extraction algorithm is applied to describe hot topics.
Findings - The efficacy of the developed model is demonstrated on two real-world datasets, and the effects of dependency structure information on different topics are compared. The qualitative examples further explain the application of the method in real scenarios.
Originality/value - Most prior research focuses on keyword-based hot topics. This work advances the literature by mining phrase-based hot topics with syntactic dependency structure, which can effectively analyze the semantics. Building the syntactic dependency structure from the combination of word order and part-of-speech (POS) is a step forward, because word order and POS are used only separately in the prior literature. Ignoring this synergy may miss important information, such as grammatical structure coherence and logical relations between syntactic components. © Linzi Wang, Qiudan Li, Jingjun David Xu and Minjie Yuan. Published in Journal of Electronic Business & Digital Economics. Published by Emerald Publishing Limited.
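The Design/methodology/approach paragraph describes coupling an association matrix into an NMF that is solved by block coordinate descent. The sketch below illustrates that general idea only and is not the authors' SSAHM. It assumes one common formulation, in which a word-by-document matrix `X` and a word-by-word association matrix `S` are factorized jointly through a shared word-topic factor `W`, with each factor column updated in turn (a HALS-style BCD scheme). The function name `joint_nmf_bcd`, the weight `alpha`, and the matrices `H` and `C` are hypothetical names introduced for illustration.

```python
import numpy as np


def joint_nmf_bcd(X, S, n_topics, alpha=1.0, n_iter=50, seed=0, eps=1e-12):
    """Illustrative joint NMF solved by column-wise block coordinate descent.

    Minimizes 0.5*||X - W H^T||_F^2 + 0.5*alpha*||S - W C^T||_F^2
    subject to W, H, C >= 0. This is an assumed formulation, not SSAHM.

    X: (n_words, n_docs) nonnegative word-document matrix.
    S: (n_words, n_words) nonnegative association matrix (a stand-in for
       a semantic-syntactic association matrix).
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape
    W = rng.random((n_words, n_topics))  # word-topic factor (shared)
    H = rng.random((n_docs, n_topics))   # document-topic factor
    C = rng.random((n_words, n_topics))  # association-side factor

    def loss():
        fit = 0.5 * np.linalg.norm(X - W @ H.T) ** 2
        assoc = 0.5 * alpha * np.linalg.norm(S - W @ C.T) ** 2
        return fit + assoc

    history = [loss()]
    for _ in range(n_iter):
        # Block 1: update each column of W with H and C held fixed.
        HtH, CtC = H.T @ H, C.T @ C
        XH, SC = X @ H, S @ C
        for k in range(n_topics):
            gram_kk = HtH[k, k] + alpha * CtC[k, k]
            num = (XH[:, k] + alpha * SC[:, k]
                   - W @ (HtH[:, k] + alpha * CtC[:, k])
                   + W[:, k] * gram_kk)
            W[:, k] = np.maximum(num / max(gram_kk, eps), 0.0)

        # Block 2: update each column of H and C with W held fixed.
        WtW = W.T @ W
        XtW, StW = X.T @ W, S.T @ W
        for k in range(n_topics):
            denom = max(WtW[k, k], eps)
            H[:, k] = np.maximum(
                (XtW[:, k] - H @ WtW[:, k] + H[:, k] * WtW[k, k]) / denom, 0.0)
            C[:, k] = np.maximum(
                (StW[:, k] - C @ WtW[:, k] + C[:, k] * WtW[k, k]) / denom, 0.0)

        history.append(loss())
    return W, H, C, history
```

Under these assumptions, the largest entries in each column of `W` would indicate the words most associated with a topic. How the association matrix is built from dependency structure, and how phrases are extracted afterwards, are specific to the paper and are not reproduced here.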
Research Area(s)
- Phrase-based hot topic mining, User-concerned action element, Word embedding, Matrix factorization
Citation Format(s)
User-concerned actionable hot topic mining: enhancing interpretability via semantic–syntactic association matrix factorization. / Wang, Linzi; Li, Qiudan; Xu, Jingjun David et al.
In: Journal of Electronic Business & Digital Economics, Vol. 1, No. 1/2, 2022, p. 50-65.