News Co-mention: A Big Data Approach for Stock Prediction

Project: Research


This study proposes a new mechanism to explain stock return movements based on news co-mentions. Research in empirical finance has found that the stocks of firms in the same industry tend to move in the same direction, a phenomenon known as returns comovement. Investors who are interested in a stock pay close attention to its competitive environment (e.g., its industry). When favorable (unfavorable) news affects an industry, the prices of stocks in that industry increase (decrease) accordingly, and returns comovement results. However, traditional industry classifications are static and do not capture how business operations evolve over time. For example, Microsoft was once a pure software company but now also operates in smartphones and other hardware businesses. As companies span multiple business domains, it becomes difficult to assign each of them a single industry code under traditional classification schemes (e.g., SIC, NAICS, and GICS). Several studies show empirically that traditional schemes do not explain stock return movements precisely. To address this limitation, we propose a new approach that identifies the dynamic boundaries of industries based on news co-mentions collected from online infomediaries (e.g., Factiva and Reuters). News is a major source of information that investors use to make investment decisions. When two firms are frequently mentioned together in the news, investors are likely to pay close attention to both companies. Frequently co-mentioned companies may form an “information habitat”. According to information diffusion theory in finance, firms in the same habitat are likely to exhibit returns comovement. Taking into account the coverage and persistency of co-mentioned news, we construct a dynamic industry classification scheme, with the aim of classifying firms into relevant industries more accurately than traditional approaches. A prototype has been developed to visualize industry boundaries based on news co-mentions.
In the long run, we will enhance the prototype to include real-time news and heat-map visualization to alert users to immediate changes in information habitats. Our study contributes to both Information Systems (IS) and finance. First, it enriches the existing IS literature on network economics by demonstrating the inferential power and economic implications of news co-mention networks. Second, the new industry classification is useful for capital market research. Third, we use large-scale network analysis to process a large volume of news, showcasing a big data application in finance. Finally, we provide industry practitioners with new prediction models and visual analytic tools to forecast stock returns.
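
As a purely illustrative aid, the co-mention idea described above can be sketched in a few lines of Python. This is not the project's actual method or prototype. It assumes each news article has already been reduced to a list of the firms it mentions, and it treats any pair of firms co-mentioned in at least `min_count` articles as linked. The function names (`co_mention_counts`, `habitats`), the threshold, and the toy data are all hypothetical. Coverage and persistency over time, which the project takes into account, are not modeled here beyond this simple count threshold.

```python
from collections import Counter
from itertools import combinations


def co_mention_counts(articles):
    """Count how many articles mention each unordered pair of firms."""
    counts = Counter()
    for firms in articles:
        # Deduplicate and sort so each pair is counted once per article
        # and always appears in the same (alphabetical) order.
        for pair in combinations(sorted(set(firms)), 2):
            counts[pair] += 1
    return counts


def habitats(counts, min_count):
    """Group firms connected by pairs co-mentioned at least min_count times.

    Groups are the connected components of the thresholded co-mention graph.
    """
    neighbours = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal to collect one connected component.
        stack, group = [start], set()
        while stack:
            firm = stack.pop()
            if firm in group:
                continue
            group.add(firm)
            stack.extend(neighbours[firm] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical toy data: each inner list is the set of firms named in one article.
articles = [
    ["FirmA", "FirmB"],
    ["FirmA", "FirmB", "FirmC"],
    ["FirmB", "FirmA"],
    ["FirmD", "FirmE"],
    ["FirmD", "FirmE"],
    ["FirmC", "FirmD"],
]

counts = co_mention_counts(articles)
groups = habitats(counts, min_count=2)
print(groups)
```

With this toy input, FirmC is co-mentioned with other firms only once each, so it falls below the threshold and joins no group. A real system would presumably weight links by how widely and how persistently the co-mentions occur instead of using a single cut-off.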


Project number: 9048087
Grant type: ECS
Effective start/end date: 1/01/17 to 24/06/20

    Research areas

  • News, Big Data, Stock Prediction, Network Analysis, Industry