Statistical models for social emotion and emerging event detection from online news

  • Yanghui RAO

Student thesis: Doctoral Thesis

Abstract

In this thesis, we investigate several statistical models for detecting social emotions and emerging events from online news. With the broad availability of portable devices such as mobile phones and tablets, online users can now conveniently express their emotions through news portals, and their willingness to engage in social interactions has increased tremendously. Given the large scale of user-generated content, it has become both useful and necessary to detect the social emotions evoked by online news automatically. Leveraging crowd-contributed data from real-world websites, a lexicon-based framework is developed to associate each word, feature, or topic with a distribution over a set of emotions. To discriminate between affective and background topics, three joint-labeled affective topic models, namely the multi-label supervised topic model (MSTM), the sentiment latent topic model (SLTM), and the affective exponential topic model (AETM), are further designed to detect social emotions. Social emotion detection by affective topic modeling is challenging because it requires modeling multiple labels jointly. Both MSTM and SLTM represent the set of social emotion ratings as a bag of emotion labels, whereas AETM employs the exponential distribution to generate user ratings over each emotion label. The proposed affective topic models can be applied to two tasks: (i) classifying social emotions, and (ii) generating social emotion lexicons. Experimental analysis on the task of social emotion classification validates the effectiveness of our models. The generated emotion lexicons can be conveniently used to measure the public's attitudes towards people, cities, aspects, topics, and other elements of social events, in addition to supporting emotion-based information retrieval systems.
Emerging event detection (EED) aims to detect the first news articles that discuss an emerging event and has practical applications in many domains, such as intelligence gathering, news analysis, and national security. Compared to subject-based tasks, EED is event-based and thus must handle multiple events on the same subject as well as the evolution of events. In this thesis, we present a new statistical term weighting model, the LGT scheme, which captures the local element, the global element, and topical association simultaneously, in addition to two nonparametric feature reduction strategies and an online model for EED. We evaluate our model on the TDT5 dataset and compare it to three existing models. The results show that our approach outperforms these baselines. EED is further used to tackle the challenging problem of domain adaptation in social emotion classification, taking advantage of both domain-independent and domain-dependent emotion classifiers by distinguishing emerging events from old ones.
Date of Award: 3 Oct 2014
Original language: English
Awarding Institution
  • City University of Hong Kong
Supervisor: Qing LI

Keywords

  • Statistical methods
  • Electronic newspapers
  • Social aspects
