The Internet has dramatically changed people's everyday lives. This virtual world holds a vast amount of information spread across different websites. In this thesis, we focus on affective website content, that is, content that contains people's opinions and emotions or that can stimulate people's emotions. A typical kind of affective website content is the consumer product review, including ratings and comments, because people like to share their experiences with products such as movies, video games, and books. In addition to critical evaluations and comments, consumers may also assign a product several ratings to indicate its relative merit. Another type of affective website content is online documents, such as news articles and tweets, which may evoke strong emotions in the people who read them. Such affective content on news websites, blogs, and social networks has had a significant impact on people's lives. For example, consumers usually seek information about the quality of target products and about other people's experiences from online reviews before making purchasing decisions. As another example, governments and companies are paying increasing attention to the public's opinions on news events reported online.
This thesis focuses on the analysis and applications of social sentiments in affective website content. Before the in-depth study of this problem, the thesis first presents two preliminary works on short text analysis, since most affective website content takes the form of short text. The first preliminary work studies term weighting schemes for short text. One of the fundamental tasks in data mining and machine learning is representing objects (e.g., documents) as vectors based on the vector space model (VSM). Term weighting assigns appropriate weights to the components (such as terms) of each object when representing it. Both unsupervised and supervised term weighting methods are investigated for short text snippets. The second preliminary work investigates the problem of calculating the similarity between short text snippets, an increasingly urgent task in text-related research and applications. We propose a new method for measuring the similarity between two short text snippets by comparing each of them with probabilistic topics.
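As a rough illustration of what term weighting in the vector space model looks like, the sketch below applies a standard unsupervised TF-IDF weighting with a smoothed IDF and compares snippets by cosine similarity. It is a generic textbook baseline, not the weighting schemes or the topic-based similarity method proposed in the thesis. The function names `tfidf_vectors` and `cosine`, the whitespace tokenization, and the example snippets are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document (generic TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF keeps weights positive even for terms in every document.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical short snippets, tokenized by whitespace for illustration only.
snippets = [
    "great hotel friendly staff".split(),
    "friendly staff clean room".split(),
    "boring movie weak plot".split(),
]
vecs = tfidf_vectors(snippets)
sim_01 = cosine(vecs[0], vecs[1])  # snippets sharing two terms
sim_02 = cosine(vecs[0], vecs[2])  # snippets sharing no terms
```

In this sketch, two snippets that share no terms always receive a similarity of zero, however related their meaning. This is a commonly noted weakness of term-overlap measures on short text.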
The next part of this thesis analyzes the social sentiments in online reviews. We examine the impact of social sentiments on two business problems: product sales and business site selection. We first study the longitudinal impact of online consumer reviews on hotel sales. Unlike previous efforts that rely on ratings alone, this study explores the extraction and representation of consumer sentiments within review comments using an iterative approach. Each comment is assigned a score that reflects the consumer sentiment it expresses. Both consumer ratings and sentiment ratings are used to examine the economic effect of reviews on hotel sales. The other work analyzes the influence of consumer reviews on business site selection. In many industries, choosing an appropriate site is one of the most important decisions a firm makes. This study proposes a graph-based method that addresses the business site selection problem from the perspective of "intraspecific competition", which takes into account the fact that most business firms are not isolated but are connected and can be clustered into geographical agglomerations. The proposed method treats the complex interconnections among business establishments as a graph, in which each link reflects the geographical distribution and quality of establishments. Quality is measured from user reviews of different aspects such as price, service, and facilities. After constructing the graph, this study applies an automatic learning algorithm to derive the optimal locale for a new business establishment.
The final part of this thesis studies social emotion detection for news articles, which aims to identify the emotions that online news articles evoke in the public and is an important research topic in public opinion mining. A novel latent discriminative model (LDM) is proposed for this task. LDM uses intermediate hidden variables to model the latent structure of the input domain and defines a joint distribution over the emotions and latent variables conditioned on the observations. Furthermore, we demonstrate that social emotions are not independent but are correlated with each other through their homogeneous or heterogeneous nature. The dependencies between emotions can guide the training of LDM so that it reflects more of the emotional connections in online news articles, and they are therefore beneficial to the detection of social emotions. Based on this observation, we propose the emotion dependency based latent discriminative model (eLDM), which incorporates the emotion dependency into LDM.
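To make the phrase "a joint distribution over the emotions and latent variables conditioned on the observations" concrete, the block below writes out the generic form that hidden-variable discriminative models usually take. It is a sketch under stated assumptions, not the formulation given in the thesis: the symbols $x$ (observed article), $e$ (emotion label), $h$ (latent variables), $\theta$ (parameters), and the potential function $\Psi$ are all introduced here for illustration.

```latex
% Generic hidden-variable discriminative model (assumed notation):
% x = observations, e = emotion label, h = latent variables,
% \theta = parameters, \Psi = a real-valued potential function.
\begin{align}
P(e, h \mid x; \theta)
  &= \frac{\exp\bigl(\Psi(e, h, x; \theta)\bigr)}
          {\sum_{e'} \sum_{h'} \exp\bigl(\Psi(e', h', x; \theta)\bigr)}, \\
P(e \mid x; \theta)
  &= \sum_{h} P(e, h \mid x; \theta).
\end{align}
```

Under this assumed form, predicting an emotion amounts to marginalizing out the latent variables and choosing the label $e$ with the largest $P(e \mid x; \theta)$.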
| Date of Award | 3 Oct 2012 |
|---|---|
| Original language | English |
| Awarding Institution | City University of Hong Kong |
| Supervisor | Wenyin LIU (Supervisor) & Qing LI (Co-supervisor) |
- Social psychology
- Web sites
- Psychological aspects
Social sentiment analysis of affective web content
QUAN, X. (Author). 3 Oct 2012
Student thesis: Doctoral Thesis