Abstract
As online social media has grown, the volume of user-generated reviews has increased dramatically. Such text-based user-generated content is useful for estimating public sentiment. Many sentiment analysis works assume that the sentiment expressed in online reviews can be retrieved from general text features. However, text redundancy and volume can degrade analysis performance, especially under strict corpus size constraints. This paper proposes a sentiment subset selection framework (SentiSS) that constructs a small set of documents from the original corpus to convey a subjective representation. The framework filters irrelevant sentiment information using topic modeling and selects subsets by submodular maximization subject to a cardinality constraint. Compared with the conventional submodular-based score function, our proposed score function helps the framework capture fine-grained sentiment features expressed in reviews. We conduct an empirical analysis of the efficacy of SentiSS on different context domains, together with a comparative study of the subset's metric impact across sentiment levels, namely positive, neutral, and negative. Experimental results show that the SentiSS framework can compress the sentiment corpus while maintaining the classifier's performance on the evaluated metrics.
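The subset-selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the SentiSS score function is not given here, so the sketch assumes a standard facility-location objective over a hypothetical document-similarity matrix, and the names `facility_location` and `greedy_select` are illustrative. For monotone submodular objectives under a cardinality constraint, the greedy rule shown is known to achieve at least a (1 - 1/e) fraction of the optimal value.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def facility_location(similarity: Matrix, selected: Sequence[int]) -> float:
    """Coverage of the corpus by `selected`: each document contributes its
    highest similarity to any selected document (0 if nothing is selected).
    With non-negative similarities this objective is monotone and submodular.
    It is an assumed stand-in, not the score function proposed in the paper."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity: Matrix, k: int) -> List[int]:
    """Pick at most k document indices, adding at each step the document
    with the largest marginal gain in the objective."""
    selected: List[int] = []
    remaining = list(range(len(similarity)))
    current = 0.0
    while remaining and len(selected) < k:
        # Marginal gain of adding each remaining candidate to the subset.
        gains = [
            facility_location(similarity, selected + [j]) - current
            for j in remaining
        ]
        best = max(range(len(remaining)), key=gains.__getitem__)
        current += gains[best]
        selected.append(remaining.pop(best))
    return selected


# Hypothetical similarities for five reviews; {0, 1} and {2, 3, 4} form two groups.
similarity = [
    [1.0, 0.8, 0.1, 0.1, 0.0],
    [0.8, 1.0, 0.0, 0.1, 0.1],
    [0.1, 0.0, 1.0, 0.9, 0.5],
    [0.1, 0.1, 0.9, 1.0, 0.9],
    [0.0, 0.1, 0.5, 0.9, 1.0],
]
subset = greedy_select(similarity, k=2)
```

In this toy corpus the first pick is the review most similar to everything else, and the second pick comes from the other group, because a near-duplicate of an already selected review adds little marginal gain. That diminishing-returns behavior is what lets a small subset stay representative of a redundant corpus.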
| Original language | English |
|---|---|
| Pages (from-to) | 12381–12396 |
| Journal | Neural Computing and Applications |
| Volume | 33 |
| Issue number | 19 |
| Online published | 15 Apr 2021 |
| Publication status | Published - Oct 2021 |
Research Keywords
- Sentiment analysis
- Sentiment subset selection
- Submodular function optimization
- Subset selection
Fingerprint
Dive into the research topics of 'Monotone submodular subset for sentiment analysis of online reviews'. Together they form a unique fingerprint.
Projects
- 1 Finished
GRF: New Factorization and Multi-Label Based Matrix Completion Methods for Heterogeneous Data and Emojis Recommender System
CHOW, W. S. T. (Principal Investigator / Project Coordinator) & VERLEYSEN, M. (Co-Investigator)
1/01/21 → 21/08/24
Project: Research