With the explosive growth in the amount of user-generated review data in the era of
Web 2.0 (social networking sites, blogs, mini-blogs, discussion forums, online
shopping websites), there is a pressing need to develop effective methods and tools to
automatically extract valuable business intelligence from these user opinion data. In
addition, user reviews often contain information about competitors and have
become a new source for mining competitive intelligence. Many
companies have begun to use social networking sites (SNS) as an important channel
and platform for online marketing and reputation management. Thus, analyzing
users' sentiments on SNS has become key for these business applications.
Semantically annotating opinion data is an effective way to mine valuable information
from the large number of customer opinions. Although supervised machine learning
approaches have been explored for semantically annotating user opinions to facilitate
market intelligence generation, such approaches often require numerous manually
labelled training examples to produce accurate semantic annotations. In this study, we
propose an active learning approach that can train a state-of-the-art, large-margin
classifier with substantially fewer labelled training examples, and yet produce accurate
semantic annotations of user opinions. In particular, our active learning method is
underpinned by a novel query function that can efficiently locate the most informative
unlabeled examples such that a large-margin classifier can learn the optimal parameter
values based on them. Rigorous evaluation involving a benchmark test and an
empirical test with real-world opinion data extracted from Amazon.com reveals that
the proposed active learning method can train effective classifiers with far fewer
training examples while achieving performance similar to that of a typical
state-of-the-art classifier trained without active learning.
To mine competitive intelligence effectively from Web 2.0 customer opinions,
we propose a novel graphical model to extract and visualize comparative relations
between products from customer reviews, with the interdependencies among relations
taken into consideration, to help companies discover potential risks and design
new products and marketing strategies. Our experiments on a corpus of Amazon
customer reviews show that our proposed method can extract comparative relations
more accurately than the benchmark methods.
To analyze users' sentiments on SNS, we propose the "sentiment community" as a tool.
Sentiment communities with different polarities on SNS usually represent groups
of users who share certain preferences. Thus, discovering sentiment
communities is very useful for enterprises in customer segmentation and targeted
marketing. A novel method based on an optimization technique is proposed for
discovering users' sentiment communities, and comprehensive experimental
evaluations are conducted to demonstrate the method's effectiveness.
In summary, this dissertation covers the topics of semantically annotating opinion data,
extracting comparative opinions, and analyzing users' sentiments on SNS. This work
opens the door to analyzing the rich consumer-generated data of Web 2.0 and SNS for
enterprises to use in business applications.
| Date of Award | 3 Oct 2011 |
|---|---|
| Original language | English |
| Awarding Institution | City University of Hong Kong |
| Supervisor | Shaoyi Stephen LIAO (Supervisor) |
- Consumer satisfaction
- Web 2.0
- Data mining
- Evaluation
Mining and analyzing customer opinions/sentiments of Web 2.0 for business applications
XU, K. (Author). 3 Oct 2011
Student thesis: Doctoral Thesis