Bringing Context to Financial Content Analysis

Project: Research


Content analysis of textual data is increasingly popular in financial research. In a common approach, a researcher counts the frequencies of positive or negative words in texts from sources such as financial newspapers, uses those frequencies as a content measure, and then empirically investigates the measure’s statistical relationship with other economic variables. However, that approach does not account for economic context and therefore cannot explain clearly what economic information is reflected in such content, or how. An economic theory that explains such content measures is necessary, but the literature still lacks one.

Based on my dissertation, I propose a novel method for analyzing financial textual data using economic theory. My dissertation shows that such frequency-based measures can be written as an analytical expression of economic fundamentals and contextual variables such as market beliefs, investment opportunities in different economic states, and investors’ preferences. That relationship is the asymptotic limit of an equilibrium derived in a benchmark static setting. It captures the channel through which content writers, such as business news editors, must report information selectively because of physical limits such as newspaper space, broadcast time slots, or conventional article lengths, and it models how selective coverage strategies change with the economic context. The model performs well in predicting stylized facts about news content biases, such as sensationalism or appeals to the audience’s beliefs or preferences. Its tractability makes it convenient for empirical applications.

The proposed project extends the benchmark model by incorporating more realistic and detailed assumptions. I account for the speed of the news cycle, which matters because the intensity of new shocks differs between booms and busts, so reporting strategies differ as well. I also extend the model to include dynamics, aiming to capture decisions on reporting timeliness, and I analyze the effects of preference differences between the sender and the audience. Using the benchmark model and its extensions, I then empirically evaluate the past literature for potential misspecification problems and develop a novel method to estimate a model of texts structurally using economic theory, a first in the literature.

The research contributes by filling a gap in the literature: it models context in content analysis specific to finance. To the best of my knowledge, it is the first paper to model texts for finance from a microeconomic perspective.
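
For concreteness, the conventional frequency-based content measure discussed in this abstract could be computed roughly as follows. This is a minimal Python sketch of the word-counting approach that the project critiques, not the project's own method. The word lists, the function name `tone`, and the choice to normalize by total word count are all illustrative assumptions; applied work would use a curated sentiment dictionary.

```python
import re
from collections import Counter

# Hypothetical toy word lists, for illustration only.
POSITIVE = {"gain", "growth", "profit", "strong"}
NEGATIVE = {"loss", "decline", "weak", "risk"}


def tone(text):
    """Return (positive share, negative share, net tone) of a text.

    Shares are word counts divided by the total number of tokens
    (an assumed normalization; other choices are possible).
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0, 0.0, 0.0
    counts = Counter(tokens)
    pos = sum(counts[w] for w in POSITIVE) / len(tokens)
    neg = sum(counts[w] for w in NEGATIVE) / len(tokens)
    return pos, neg, pos - neg


pos_share, neg_share, net_tone = tone("Strong profit growth despite risk")
```

A measure of this kind depends only on word counts, so the same score can arise under very different market beliefs or economic states, which is the gap the project aims to address.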


Project number: 9048284
Grant type: ECS
Status: Not started
Effective start/end date: 1/01/24 → …