Imbalanced text sentiment classification using universal and domain-specific knowledge

Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review

74 Scopus Citations

Author(s)

  • Yijing Li
  • Haixiang Guo
  • Qingpeng Zhang
  • Mingyun Gu
  • Jianying Yang

Detail(s)

Original language: English
Pages (from-to): 1-15
Journal / Publication: Knowledge-Based Systems
Volume: 160
Online published: 5 Jul 2018
Publication status: Published - 15 Nov 2018

Abstract

This paper proposes a sentiment classification model that addresses two predominant issues in sentiment classification: domain sensitivity and data imbalance. Because a word may carry different sentiment polarities in different contexts, sentiment classification is widely regarded as a domain-sensitive task. Accordingly, the paper uses label propagation to induce universal and domain-specific sentiment lexicons, and builds a domain-adaptive sentiment classification model that incorporates universal and domain-specific knowledge into a unified learning framework. In addition, sentiment-related corpora usually have a skewed polarity distribution, because individuals tend to share similar assessment criteria for a given object and hence their sentiment polarities toward that object are likely to be similar. The paper addresses this imbalanced-data problem with a novel over-sampling technique. Unlike existing over-sampling approaches, which generate minority-class samples in a numerical feature space, the proposed method creates synthetic texts directly in the word space. Several experiments are conducted to verify the effectiveness of the proposed lexicon generation method, learning framework, and over-sampling method. The results show that the induced sentiment lexicons are interpretable and that the proposed model is effective for imbalanced and domain-specific text sentiment classification.
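
As a rough illustration of the label-propagation idea mentioned in the abstract, the sketch below spreads the polarity of a few seed words over a weighted word-similarity graph. It is a generic, textbook-style propagation and not the paper's actual lexicon-induction procedure; the function name `propagate_polarity`, the toy graph, the edge weights, and the seed words are all assumptions made up for this example.

```python
def propagate_polarity(edges, seeds, iterations=50):
    """Spread seed polarities over a weighted, undirected word graph.

    edges: dict mapping (word_a, word_b) -> positive similarity weight.
    seeds: dict mapping word -> fixed polarity in [-1.0, 1.0].
    Returns a dict mapping every word in the graph to a polarity score.
    """
    # Build a symmetric adjacency map from the undirected edge weights.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    # Seed words start at their known polarity; all other words start neutral.
    scores = {word: seeds.get(word, 0.0) for word in neighbors}

    for _ in range(iterations):
        updated = {}
        for word, adjacent in neighbors.items():
            if word in seeds:
                # Seed polarities are clamped and never overwritten.
                updated[word] = seeds[word]
                continue
            # A non-seed word takes the weighted average of its neighbors.
            total = sum(adjacent.values())
            updated[word] = sum(w * scores[n] for n, w in adjacent.items()) / total
        scores = updated
    return scores


# Hypothetical toy graph: weights stand in for word-similarity scores.
toy_edges = {
    ("good", "great"): 0.9,
    ("great", "reliable"): 0.6,
    ("bad", "awful"): 0.9,
    ("awful", "flimsy"): 0.6,
    ("reliable", "flimsy"): 0.1,
}
toy_seeds = {"good": 1.0, "bad": -1.0}
lexicon = propagate_polarity(toy_edges, toy_seeds)
```

Under these assumptions, words closer to the positive seed end up with positive scores and words closer to the negative seed with negative ones. A domain-specific lexicon could in principle be induced the same way by building the graph from in-domain text, though how the paper constructs its graphs and combines the two lexicons is not stated in this record.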

Research Area(s)

  • Ensemble learning, Imbalanced data, Label propagation, Sentiment analysis

Citation Format(s)

Imbalanced text sentiment classification using universal and domain-specific knowledge. / Li, Yijing; Guo, Haixiang; Zhang, Qingpeng et al.
In: Knowledge-Based Systems, Vol. 160, 15.11.2018, p. 1-15.
