Investigating Predictive Power of Social Media Sentiment in Financial Markets


Student thesis: Doctoral Thesis

Awarding Institution
  • Kin Keung LAI (Supervisor)
  • Yimin YU (Supervisor)
Award date: 26 Mar 2020


With the development of the Internet, social media websites such as Twitter and Weibo have received a lot of attention because of their enormous user bases and the explosive growth of user-generated data. Social media websites have recently become an additional communication medium between firms, governments, and investors, and they play an important role in public discussions, which helps to reduce information asymmetry between institutions and individual investors. However, unlike the numerical data in financial reports, textual posts on social media websites are difficult to interpret and use because of their narrative nature. To leverage the information in these posts, a large body of research has been conducted on sentiment analysis and opinion mining for social media data, and the results have been applied to different aspects of financial markets.

This work uses sentiment analysis and modeling techniques to shed light on the information underlying narrative social media posts. Specifically, we investigate the predictive power of social media sentiment for the prices of different assets, including stocks, indices, and crude oil. Two social media websites, Weibo and Twitter, are used as the data sources in this research. Twitter is a microblogging website that is used globally. Weibo is a similar, China-based social media website used mainly in China.
• Theoretical advancement. By studying several well-known information theories, we propose a new topic-based modeling theory to aggregate market signals from a large number of social media posts and guide our technical efforts in sentiment analysis and prediction models.
• Technical innovations. Several sentiment analysis methods (for both Chinese and English) are developed to extract public mood from social media websites. A discriminative topic-based public mood learning framework is discussed and applied to predict the price movements of several assets.
• Empirical validation. We conduct empirical studies on different data sets, testing the statistical significance of the information that public mood carries. The empirical results demonstrate that public mood on social media websites is correlated with asset prices in financial markets.
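
The pipeline outlined in the points above (score posts, aggregate the scores into a public mood series, relate that series to later price movements) can be illustrated with a minimal sketch. This is not the thesis's method: the word lists, the function names, the integer day indices, and the one-day lag are all assumptions made for illustration, and the topic-modeling step is omitted entirely. A simple lexicon count stands in for the sentiment models, and a Pearson correlation stands in for the statistical tests.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy lexicon (an assumption); a real study would use a
# trained sentiment model for each language.
POSITIVE = {"gain", "bullish", "strong"}
NEGATIVE = {"loss", "bearish", "weak"}


def post_sentiment(text):
    """Score one post as (positive - negative) / matched words, in [-1, 1].

    Words are split on whitespace only, so punctuation is not stripped.
    """
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def daily_mood(posts):
    """Average post sentiment per day; posts is an iterable of (day, text)."""
    by_day = defaultdict(list)
    for day, text in posts:
        by_day[day].append(post_sentiment(text))
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(mood, returns, lag=1):
    """Correlate mood on day t with the asset return on day t + lag.

    Both arguments are dicts keyed by integer day index.
    """
    days = [d for d in sorted(mood) if d + lag in returns]
    xs = [mood[d] for d in days]
    ys = [returns[d + lag] for d in days]
    return pearson(xs, ys)
```

Calling `daily_mood` on a list of `(day, text)` pairs yields the mood series, and `lagged_correlation(mood, returns)` then measures how closely that series tracks next-day returns. A correlation of this kind shows association only; assessing predictive power would additionally require out-of-sample evaluation.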

The contributions of this thesis are threefold. First, we extend topic modeling theory to understand financial assets from social media data. Second, we develop new sentiment analysis models to analyze posts on social media websites; sentiment analysis of large-scale textual posts enables us to investigate public mood states. Third, we propose an efficient prediction model that uses topic-based public mood to forecast asset prices. The experimental results demonstrate the effectiveness of the proposed model and show that public mood on social media websites is correlated with asset pricing and has predictive power for the price movements of different assets.